Evaluation of Speaker Recognition System Using Different Distance Metrics

Authors: Sukhvinder Kaur, Monica, J. S. Sohal

Speaker recognition systems are widely used in voice verification for identity checks and access control to services. In this paper, speaker identification and verification are performed using feature extraction and several matching algorithms, and a new approach to speaker recognition is introduced. In this system, speech signals are first framed and then compressed using the discrete wavelet transform (DWT) for noise reduction and a better sampling frequency. Features of the compressed signal are then extracted using Mel Frequency Cepstral Coefficients (MFCC) and the nonlinear energy operator (NEO). These features are used to identify and verify the speaker's voice. The distance metrics compared are the delta Bayesian Information Criterion (delta BIC), the Kullback-Leibler distance metric (KL2), and the T-test metric. Finally, results are evaluated with the Detection Error Tradeoff (DET) curve and the Receiver Operating Characteristics (ROC) curve, using the area under the curve (AUC). The T-test metric with MFCC features gives the best result.
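The NEO feature, the three distance metrics, and the AUC score named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, each segment is assumed to be modelled by a diagonal-covariance Gaussian, and the T-test distance is written as a Welch-style statistic combined across dimensions by a Euclidean norm, which may differ from the formulation used in the paper. The DWT compression and MFCC extraction steps are left out because they would normally rely on external libraries. Every feature dimension is assumed to have non-zero variance.

```python
import math
from statistics import fmean, pvariance


def teager_neo(x):
    """Teager-Kaiser nonlinear energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def column_stats(frames):
    """Per-dimension mean and population variance of a list of feature vectors."""
    cols = list(zip(*frames))
    return [fmean(c) for c in cols], [pvariance(c) for c in cols]


def kl2_distance(a, b):
    """Symmetric KL divergence between diagonal Gaussians fitted to a and b."""
    (mu_a, var_a), (mu_b, var_b) = column_stats(a), column_stats(b)
    total = 0.0
    for ma, va, mb, vb in zip(mu_a, var_a, mu_b, var_b):
        d2 = (ma - mb) ** 2
        total += va / vb + vb / va - 2.0 + d2 * (1.0 / va + 1.0 / vb)
    return 0.5 * total


def t_test_distance(a, b):
    """Welch-style t statistic per dimension, combined by a Euclidean norm (assumption)."""
    (mu_a, var_a), (mu_b, var_b) = column_stats(a), column_stats(b)
    t2 = 0.0
    for ma, va, mb, vb in zip(mu_a, var_a, mu_b, var_b):
        t2 += (ma - mb) ** 2 / (va / len(a) + vb / len(b))
    return math.sqrt(t2)


def delta_bic(a, b, penalty_weight=1.0):
    """Delta BIC with diagonal covariances; positive values favour two speakers."""
    n_a, n_b = len(a), len(b)
    n = n_a + n_b
    _, var_a = column_stats(a)
    _, var_b = column_stats(b)
    _, var_ab = column_stats(list(a) + list(b))

    def logdet(variances):
        # Log-determinant of a diagonal covariance matrix.
        return sum(math.log(v) for v in variances)

    d = len(var_ab)
    # A diagonal Gaussian has 2 * d free parameters (d means, d variances).
    penalty = 0.5 * penalty_weight * (2 * d) * math.log(n)
    gain = 0.5 * (n * logdet(var_ab) - n_a * logdet(var_a) - n_b * logdet(var_b))
    return gain - penalty


def auc_from_distances(same_speaker, diff_speaker):
    """Probability that a same-speaker distance is below a different-speaker one."""
    wins = sum((s < d) + 0.5 * (s == d) for s in same_speaker for d in diff_speaker)
    return wins / (len(same_speaker) * len(diff_speaker))
```

In this sketch, `a` and `b` are lists of per-frame feature vectors (for example MFCC vectors) from two speech segments. A smaller KL2 or T-test distance, or a negative delta BIC, suggests the segments come from the same speaker, and `auc_from_distances` summarises how well a metric separates same-speaker trials from different-speaker trials.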

Authors and Affiliations

Sukhvinder Kaur
SDDIET, Barwala, Golpura-134009, Haryana, India
Monica
SDDIET, Barwala, Golpura-134009, Haryana, India
J. S. Sohal
Director, LCET, Ludhiana-141113, Punjab, India

Keywords: Bayesian Information Criterion (BIC); Kullback-Leibler Distance Metric (KL2); T-test Distance Metric; Mel Frequency Cepstral Coefficients (MFCC); Nonlinear Energy Operator (NEO); Detection Error Tradeoff (DET); Receiver Operating Characteristics (ROC); Area Under Curve (AUC).

Publication Details

Published in : Volume 2 | Issue 5 | September-October 2017
Date of Publication : 2017-10-31
License : This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 541-546
Manuscript Number : CSEIT172562
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Sukhvinder Kaur, Monica, J. S. Sohal, "Evaluation of Speaker Recognition System Using Different Distance Metrics", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 5, pp.541-546, September-October-2017.
Journal URL : http://ijsrcseit.com/CSEIT172562