Bird Sound Classification: Leveraging Deep Learning for Species Identification

Authors

  • Ardon Kotey, Student, Dwarkadas J. Sanghvi College of Engineering, Mumbai, Maharashtra, India
  • Allan Almeida, Student, Dwarkadas J. Sanghvi College of Engineering, Mumbai, Maharashtra, India
  • Nihal Gupta, Student, Dwarkadas J. Sanghvi College of Engineering, Mumbai, Maharashtra, India
  • Dr. Vinaya Sawant, Dwarkadas J. Sanghvi College of Engineering, Mumbai, Maharashtra, India

DOI:

https://doi.org/10.32628/CSEIT24103127

Keywords:

Convolutional Neural Network, Bird Sound Recognition, Transfer Learning, Audio Classification

Abstract

Birds matter to a wide audience, including the general public. They live in almost every type of environment and occupy almost every niche within those environments. Monitoring species diversity and migration is important for almost all conservation efforts, and the analysis of long-term audio data is vital to support those efforts, but it relies on complex algorithms that must adapt to changing environmental conditions. Convolutional neural networks (CNNs) are powerful machine learning models that have proven effective in image processing and sound recognition. In this paper, a CNN system for classifying bird sounds is presented and tested across different configurations and hyperparameters. The pre-trained MobileNet CNN model is fine-tuned on a dataset acquired from the Xeno-canto bird song sharing portal, which provides a large collection of labeled and categorized recordings. Spectrograms generated from the downloaded recordings form the input to the neural network. The experiments compare various configurations, including the number of classes (bird species) and the color scheme of the spectrograms. Results suggest that choosing a color map in line with the images the network was pre-trained on provides a measurable advantage. The presented system is viable only for a small number of classes.
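
To make the pipeline described in the abstract concrete, the sketch below shows one plausible way to turn a Xeno-canto recording into a color spectrogram image and fine-tune a pre-trained MobileNet on such images with Keras. It is illustrative only: the sampling rate, mel-band count, "viridis" color map, image size, and classifier head are assumptions, not values reported in the paper.

    import numpy as np
    import matplotlib.pyplot as plt
    import librosa
    import librosa.display
    import tensorflow as tf

    def audio_to_spectrogram_image(wav_path, out_path, sr=22050, n_mels=128, cmap="viridis"):
        # Load the recording and compute a log-scaled mel spectrogram.
        y, _ = librosa.load(wav_path, sr=sr)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
        mel_db = librosa.power_to_db(mel, ref=np.max)
        # Render with a color map; the abstract reports that matching the color
        # scheme to the network's pre-training images gives a measurable advantage.
        fig, ax = plt.subplots(figsize=(2.24, 2.24), dpi=100)  # roughly 224x224 px for MobileNet
        ax.axis("off")
        librosa.display.specshow(mel_db, sr=sr, cmap=cmap, ax=ax)
        fig.savefig(out_path, bbox_inches="tight", pad_inches=0)
        plt.close(fig)

    def build_classifier(num_classes):
        # Transfer learning: reuse ImageNet features, train only a small new head.
        base = tf.keras.applications.MobileNet(
            input_shape=(224, 224, 3), include_top=False, weights="imagenet")
        base.trainable = False
        model = tf.keras.Sequential([
            base,
            tf.keras.layers.GlobalAveragePooling2D(),
            tf.keras.layers.Dropout(0.2),
            tf.keras.layers.Dense(num_classes, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="categorical_crossentropy",
                      metrics=["accuracy"])
        return model

Once the new classification head has converged, some of the deeper MobileNet layers are typically unfrozen and trained further with a lower learning rate, which is the usual second stage of fine-tuning.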

Published

05-06-2024

Section

Research Articles
