Speech Emotion Recognition Using MLP Classifier
DOI:
https://doi.org/10.32628/CSEIT217446
Keywords:
Emotion, RAVDESS Dataset, Speech Emotion Recognition, Convolutional neural network.
Abstract
Language is humanity's most important means of communication, and speech is its basic medium. Emotion plays a crucial role in social interaction. Recognizing emotion in speech is both important and challenging because it involves human-machine interaction. Emotion varies from person to person, and even the same person has different emotions and different ways of expressing them. When people express emotion, each utterance has a different energy, pitch, and tone variation, and these are grouped together depending on the subject. Speech emotion recognition is therefore a future goal of computer vision. The aim of our project is to develop a smart speech emotion recognition system based on a convolutional neural network, which uses different modules for emotion recognition; classifiers are used to differentiate emotions such as happy, sad, angry, and surprise. The machine converts the human speech signal into a waveform, processes it, and finally displays the emotion. The data are speech samples, and features are extracted from them using the librosa package. We use the RAVDESS dataset as the experimental dataset. This study shows that for our dataset all classifiers achieve an accuracy of 68%.
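The pipeline described in the abstract (librosa feature extraction on RAVDESS clips, followed by a classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which features or hyperparameters were used, so the mean-MFCC feature vector, the value n_mfcc=40, the hidden layer size, the train/test split, and the helper names emotion_from_filename, extract_features, and train_classifier are all hypothetical. The sketch uses scikit-learn's MLPClassifier, following the paper's title, rather than the convolutional neural network the abstract mentions. The label lookup relies on the RAVDESS file-naming convention, in which the third hyphen-separated field encodes the emotion.

```python
from pathlib import Path

# RAVDESS encodes metadata in the file name, e.g. "03-01-06-01-02-01-12.wav";
# the third hyphen-separated field is the emotion code.
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def emotion_from_filename(path):
    """Return the emotion label encoded in a RAVDESS file name."""
    code = Path(path).stem.split("-")[2]
    return RAVDESS_EMOTIONS[code]


def extract_features(path, n_mfcc=40):
    """Return the time-averaged MFCC vector of one clip.

    Assumption: the abstract only says features come from librosa;
    mean MFCCs are one common choice, not necessarily the authors'.
    """
    # Imported here so the label helper above works without librosa installed.
    import librosa
    import numpy as np

    signal, sample_rate = librosa.load(path, sr=None)  # keep native sample rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)
    return np.mean(mfcc, axis=1)  # shape: (n_mfcc,)


def train_classifier(wav_paths, wanted=("happy", "sad", "angry", "surprised")):
    """Fit an MLP on clips whose label is in `wanted`; return (model, accuracy)."""
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    kept = [p for p in wav_paths if emotion_from_filename(p) in wanted]
    features = [extract_features(p) for p in kept]
    labels = [emotion_from_filename(p) for p in kept]

    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.25, random_state=0
    )
    # Hyperparameters are illustrative placeholders, not tuned values.
    model = MLPClassifier(hidden_layer_sizes=(300,), max_iter=500, random_state=0)
    model.fit(x_train, y_train)
    return model, model.score(x_test, y_test)
```

With a local copy of the dataset, train_classifier could be called on a list of .wav paths (for example, one gathered with Path.rglob("*.wav")); the accuracy it returns depends on the split and settings above and is not expected to reproduce the 68% reported in the abstract.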
License
Copyright (c) IJSRCSEIT

This work is licensed under a Creative Commons Attribution 4.0 International License.