Robust Speaker Recognition using Enhanced Spectrogram

Sukhvinder Kaur; J. S. Sohal

doi:10.32628/CSEIT1724135

Authors

Sukhvinder Kaur, I.K. Gujral PTU, Jalandhar, Kapurthala, India
J. S. Sohal, Director, LCET, Ludhiana, Punjab, India

Keywords:

Bayesian Information Criteria, Kullback Leibler Distance Metric, Enhanced Spectrogram, Non-Linear Energy Operator, T-Test, Wavelet Transform

Abstract

The aim of this paper is to present an efficient, fast, and optimized system that identifies the speaker in an automatic speaker recognition (ASR) system. Such a system can be used in voice biometrics. In the proposed technique, the Daubechies wavelet transform is used to compress the audio stream in a ratio of 1:4 while retaining 99% of the energy; features are then extracted by an enhanced spectrogram with a non-linear energy operator. Finally, three different distance metrics (T-test, delta BIC, and KL2) are used for feature matching of different speakers. The proposed technique, using the enhanced spectrogram with the T-test distance metric, gives faster and better results than delta BIC and KL2.
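
To make the feature-matching step concrete, the following is a minimal sketch of a non-linear energy operator and the three distance metrics named above. It is not the authors' implementation. It assumes the Teager-Kaiser form of the energy operator, scalar (1-D) features modeled as single Gaussians, and a Welch-style t statistic with maximum-likelihood variances. All function names and the `penalty_weight` parameter are hypothetical.

```python
import math
from statistics import fmean, pvariance


def teager_kaiser(x):
    """Teager-Kaiser energy operator (assumed form): x[n]^2 - x[n-1]*x[n+1]."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def _stats(segment):
    # Sample count, mean, and maximum-likelihood variance of one segment.
    return len(segment), fmean(segment), pvariance(segment)


def t_test_distance(a, b):
    """Welch-style t statistic between two segments (larger = more different)."""
    na, ma, va = _stats(a)
    nb, mb, vb = _stats(b)
    return abs(ma - mb) / math.sqrt(va / na + vb / nb)


def kl2_distance(a, b):
    """Symmetric KL divergence between Gaussians fitted to each segment."""
    _, ma, va = _stats(a)
    _, mb, vb = _stats(b)
    variance_term = 0.5 * (va / vb + vb / va - 2.0)
    mean_term = 0.5 * (ma - mb) ** 2 * (1.0 / va + 1.0 / vb)
    return variance_term + mean_term


def delta_bic(a, b, penalty_weight=1.0):
    """Delta BIC for 1-D Gaussians; positive values favour two separate models."""
    na, _, va = _stats(a)
    nb, _, vb = _stats(b)
    n = na + nb
    pooled_variance = pvariance(list(a) + list(b))
    gain = 0.5 * (n * math.log(pooled_variance)
                  - na * math.log(va) - nb * math.log(vb))
    # A 1-D Gaussian has two free parameters (mean and variance).
    return gain - penalty_weight * 0.5 * 2 * math.log(n)


if __name__ == "__main__":
    # Two toy "speakers" with the same spread but different means.
    speaker_a = [0.0, 1.0, 2.0, 3.0, 4.0]
    speaker_b = [10.0, 11.0, 12.0, 13.0, 14.0]
    print("t-test  :", t_test_distance(speaker_a, speaker_b))
    print("KL2     :", kl2_distance(speaker_a, speaker_b))
    print("deltaBIC:", delta_bic(speaker_a, speaker_b))
```

In this sketch all three metrics grow as the two segments diverge, and delta BIC turns negative when a segment is compared with itself. A real system would apply the metrics to multi-dimensional spectrogram features rather than scalars.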

