Robust Speaker Recognition using Enhanced Spectrogram

Authors: Sukhvinder Kaur, J. S. Sohal

The aim of this paper is to present an efficient, fast, and optimized system that identifies the speaker in an automatic speaker recognition (ASR) system. It can be used in voice biometrics. In the proposed technique, the Daubechies wavelet transform is used to compress the audio stream in the ratio of 1:4 while retaining 99% of the energy; features are then extracted by an enhanced spectrogram with a non-linear energy operator. Finally, three different distance metrics, the T-test, deltaBIC, and KL2, are used for feature matching of different speakers. The proposed technique, using the enhanced spectrogram with the T-test distance metric, gives faster and better results than deltaBIC and KL2.
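
As a rough illustration of two pieces named in the abstract, the sketch below implements the discrete Teager-Kaiser operator (one common form of non-linear energy operator) and simple one-dimensional versions of the T-test and KL2 distances. It is not the authors' implementation: the function names, the Welch-style t statistic, and the single-Gaussian form of KL2 are assumptions chosen for brevity, and the wavelet compression, enhanced spectrogram, and deltaBIC steps are omitted.

```python
import numpy as np


def teager_kaiser_energy(x):
    """Discrete Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Assumed here as the non-linear energy operator; the output is two
    samples shorter than the input because the end points have no neighbour.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def t_test_distance(a, b):
    """Absolute Welch t statistic between two 1-D feature sequences (assumed form)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Unbiased sample variances, each scaled by its own sample size.
    pooled = a.var(ddof=1) / a.size + b.var(ddof=1) / b.size
    return abs(a.mean() - b.mean()) / np.sqrt(pooled)


def kl2_distance(a, b):
    """Symmetric KL divergence, modelling each 1-D sequence as one Gaussian."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    var_a, var_b = a.var(), b.var()
    d2 = (a.mean() - b.mean()) ** 2
    # KL(a||b) + KL(b||a); the log-variance terms cancel in the sum.
    return (var_a + d2) / (2.0 * var_b) + (var_b + d2) / (2.0 * var_a) - 1.0
```

With these definitions, two segments from the same speaker would be expected to give a smaller distance than two segments from different speakers, which is the property a feature-matching stage relies on. Real systems typically work on multi-dimensional feature vectors rather than the scalar sequences assumed here.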

Authors and Affiliations

Sukhvinder Kaur
I.K. Gujral PTU, Jalandhar, Kapurthala, India
J. S. Sohal
Director, LCET, Ludhiana, Punjab, India

Keywords

Bayesian Information Criteria, Kullback Leibler Distance Metric, Enhanced Spectrogram, Non-Linear Energy Operator, T-Test, Wavelet Transform

Publication Details

Published in : Volume 2 | Issue 4 | July-August 2017
Date of Publication : 2017-08-31
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 637-640
Manuscript Number : CSEIT1724135
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Sukhvinder Kaur, J. S. Sohal, "Robust Speaker Recognition using Enhanced Spectrogram", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 4, pp.637-640, July-August-2017.