One-Decade Survey on Speaker Diarization for Telephone and Meeting Speech

Authors

  • Ajit Das  Department of CST, Bodoland University, Kokrajhar, Assam, India
  • Utpal Bhattacharjee  Department of Mathematical Sciences, Bodoland University, Kokrajhar, Assam, India
  • Dipak Kr. Mitra  

Keywords:

Speaker Diarization, Meeting Speech, Telephone Speech, Segmentation and Clustering.

Abstract

Speaker diarization is the task of finding speaker turns and identifying the speakers; in effect, it answers the question "who spoke when?". In other words, its task is to determine the speaker turns in an audio or video recording that contains unknown speech and an unknown number of speakers. In recent years, telephone and meeting speech have received the most research attention within the speaker diarization community. Diarization is used in many audio processing applications, such as information retrieval from telephone conversations, meeting speech, and broadcast news. In this paper, we review the current state of the art, focusing on research on speaker diarization for telephone and meeting speech developed since the beginning of diarization research.

Published

2017-10-31

Section

Research Articles

How to Cite

[1]
Ajit Das, Utpal Bhattacharjee, Dipak Kr. Mitra, "One-Decade Survey on Speaker Diarization for Telephone and Meeting Speech", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 5, pp. 990-994, September-October 2017.