Robust Speaker Recognition using Enhanced Spectrogram

Authors

  • Sukhvinder Kaur  I.K. Gujral PTU, Jalandhar, Kapurthala, India
  • J. S. Sohal  Director, LCET, Ludhiana, Punjab, India

Keywords:

Bayesian Information Criterion, Kullback-Leibler Distance Metric, Enhanced Spectrogram, Non-Linear Energy Operator, T-Test, Wavelet Transform

Abstract

The aim of this paper is to present an efficient, fast and optimized system that identifies the speaker in an automatic speaker recognition (ASR) system. It can be used in voice biometrics. In the proposed technique, the Daubechies wavelet transform is used to compress the audio stream at a ratio of 1:4 while retaining 99% of the signal energy; features are then extracted using an enhanced spectrogram with a non-linear energy operator. Finally, three different distance metrics (T-test, delta BIC and KL2) are used for feature matching between speakers. The proposed technique, using the enhanced spectrogram with the T-test distance metric, is faster and gives better results than delta BIC and KL2.
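The following is a minimal sketch of two building blocks named in the abstract, not the authors' implementation. It assumes the non-linear energy operator takes the common Teager-Kaiser form, that the T-test distance is a per-dimension two-sample t statistic averaged over feature dimensions, and that KL2 is the symmetric Kullback-Leibler divergence between diagonal Gaussians fitted to each feature set. All function names are hypothetical.

```python
import numpy as np

def teager_energy(x):
    """Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1] * x[n+1] (assumed form)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def t_test_distance(a, b):
    """Mean absolute two-sample t statistic across feature dimensions.

    a, b: arrays of shape (frames, dims), one row per feature vector.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mean_diff = a.mean(axis=0) - b.mean(axis=0)
    # Standard error of the difference of means (unequal variances allowed).
    se2 = a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b)
    return float(np.mean(np.abs(mean_diff) / np.sqrt(se2 + 1e-12)))

def kl2_distance(a, b):
    """Symmetric KL divergence between diagonal Gaussians fitted to a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    eps = 1e-12  # guards against zero variance
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    var_a, var_b = a.var(axis=0) + eps, b.var(axis=0) + eps
    d2 = (mu_a - mu_b) ** 2
    per_dim = 0.5 * (var_a / var_b + var_b / var_a - 2.0) \
        + 0.5 * d2 * (1.0 / var_a + 1.0 / var_b)
    return float(per_dim.sum())
```

In this sketch a larger distance suggests that two feature sets come from different speakers. For a pure sinusoid cos(Ωn), the Teager-Kaiser output is the constant sin²(Ω), which makes a convenient sanity check.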

Published

2017-08-31

Section

Research Articles

How to Cite

[1]
Sukhvinder Kaur, J. S. Sohal, "Robust Speaker Recognition using Enhanced Spectrogram," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 4, pp. 637-640, July-August 2017.