Evaluation of Speaker Recognition System Using Different Distance Metrics

Authors

  • Sukhvinder Kaur  SDDIET, Barwala, Golpura-134009, Haryana, India
  • Monica  SDDIET, Barwala, Golpura-134009, Haryana, India
  • J. S. Sohal  Director, LCET, Ludhiana-141113, Punjab, India

Keywords:

Bayesian Information Criterion (BIC); Kullback-Leibler Distance Metric (KL2); T-test Distance Metric; Mel Frequency Cepstral Coefficients (MFCC); Nonlinear Energy Operator (NEO); Detection Error Tradeoff (DET); Receiver Operating Characteristic (ROC); Area Under Curve (AUC).

Abstract

Speaker recognition is widely used for voice-based identity verification and for controlling access to services. In this paper, speaker identification and verification are performed using feature extraction followed by different matching algorithms, and a new approach to speaker recognition is introduced. Speech signals are first framed and then compressed using the discrete wavelet transform (DWT) for noise reduction and a better sampling frequency. Features of the compressed signal are then extracted using Mel Frequency Cepstral Coefficients (MFCC) and the nonlinear energy operator (NEO), and these features are used to identify and verify the speaker's voice. Three distance metrics are compared: the delta Bayesian Information Criterion (delta BIC), the Kullback-Leibler distance metric (KL2), and the T-test metric. The results are evaluated with the Detection Error Tradeoff (DET) curve and the Receiver Operating Characteristic (ROC) curve, using the area under the curve (AUC). The T-test metric with MFCC features gives the best result.
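
The abstract names the NEO feature and the three distance metrics without giving formulas. The following is a minimal sketch of textbook forms of each, not the authors' implementation. It assumes each speech segment is a list of feature frames modelled by a diagonal-covariance Gaussian; the function names, the diagonal-covariance simplification, the penalty weight `lam`, and the Welch-style aggregation used for the T-test distance are all assumptions made for illustration.

```python
import math

def neo(x):
    """Teager-Kaiser nonlinear energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def _stats(frames, eps=1e-8):
    """Frame count, per-dimension mean, and (biased) variance of a segment."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(d)]
    # eps keeps the variance strictly positive (assumed regularisation)
    var = [sum((f[k] - mean[k]) ** 2 for f in frames) / n + eps for k in range(d)]
    return n, mean, var

def kl2(a, b):
    """Symmetric Kullback-Leibler divergence between diagonal Gaussians."""
    _, ma, va = _stats(a)
    _, mb, vb = _stats(b)
    return 0.5 * sum(x / y + y / x - 2.0 + (p - q) ** 2 * (1.0 / x + 1.0 / y)
                     for p, q, x, y in zip(ma, mb, va, vb))

def delta_bic(a, b, lam=1.0):
    """Delta BIC with diagonal covariances; positive values favour two models."""
    na, _, va = _stats(a)
    nb, _, vb = _stats(b)
    n, _, vz = _stats(a + b)  # merged segment (list concatenation)
    logdet = lambda v: sum(math.log(x) for x in v)
    # each extra diagonal Gaussian adds d means and d variances
    penalty = 0.5 * lam * (2 * len(vz)) * math.log(n)
    return 0.5 * (n * logdet(vz) - na * logdet(va) - nb * logdet(vb)) - penalty

def t_test_distance(a, b):
    """Welch-style t statistic per dimension, combined by Euclidean norm (assumed form)."""
    na, ma, va = _stats(a)
    nb, mb, vb = _stats(b)
    return math.sqrt(sum((p - q) ** 2 / (x / na + y / nb)
                         for p, q, x, y in zip(ma, mb, va, vb)))
```

In this sketch, larger KL2 and T-test values, and a positive delta BIC, suggest that two segments come from different speakers. Sweeping a decision threshold over such scores is what produces the DET and ROC curves used for evaluation.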

Published

2017-10-31

Section

Research Articles

How to Cite

[1]
Sukhvinder Kaur, Monica, J. S. Sohal, "Evaluation of Speaker Recognition System Using Different Distance Metrics," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 5, pp. 541-546, September-October 2017.