Determining Species of Bird Using Their Voice
Keywords:
Bird Sound Recognition, Deep Learning, CNN, LSTM, WaveNet, Feature Extraction, Zero-Crossing Rate, Root Mean Square (RMS) Energy, MFCCs, Data Augmentation, Noise Addition, Pitch Shifting, Species Classification
Abstract
This project develops a deep learning model that identifies bird species from their vocalizations. Because bird-audio datasets often have imbalanced class distributions, we curate a selection of species that have enough audio samples for effective model training. We use Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and WaveNet, alongside feature extraction from the audio. The extracted features are zero-crossing rate, root mean square (RMS) energy, and Mel-frequency cepstral coefficients (MFCCs), which help distinguish the vocal characteristics of different species. Data augmentation, namely noise addition and pitch shifting, increases the diversity of the training samples and supports robust classification even when only a limited number of recordings is available per species. The resulting model demonstrates the potential to predict bird species accurately from audio input, contributing to biodiversity studies and ecological monitoring.
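As an illustration of two of the features and one of the augmentations named in the abstract, the following minimal sketch computes zero-crossing rate and RMS energy per frame and adds white Gaussian noise at a chosen signal-to-noise ratio. It is not the authors' implementation. The function names, the frame and hop sizes, the 20 dB SNR, and the synthetic 440 Hz test tone are all assumptions made for the example. MFCC extraction and pitch shifting are omitted because they are usually delegated to an audio library (for example `librosa.feature.mfcc` and `librosa.effects.pitch_shift`).

```python
import math
import random


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root mean square amplitude of the samples in a frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def frame_features(signal, frame_len=400, hop=200):
    """Return one (zcr, rms) pair per frame (frame sizes are illustrative)."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        features.append((zero_crossing_rate(frame), rms_energy(frame)))
    return features


def add_noise(signal, snr_db, rng):
    """Noise-addition augmentation: white Gaussian noise at a target SNR in dB."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Synthetic stand-in for a recording: one second of a 440 Hz tone at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

features = frame_features(tone)
noisy = add_noise(tone, snr_db=20, rng=random.Random(0))
print(len(features), "frames; first frame (zcr, rms):", features[0])
```

In a pipeline of the kind the abstract describes, per-frame values like these would be combined with MFCCs into the feature input, and augmented copies such as `noisy` would be added to the training set only, not to the evaluation data.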
License
Copyright (c) 2025 International Journal of Scientific Research in Computer Science, Engineering and Information Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.