Audio Assistance for Visually Impaired Using Image Captioning

Krunal Tule; Krishna Patil; Manas Yeole; Shrenik Shingi; Dr. Rashmi Phalnikar

doi:10.32628/CSEIT21822

Authors

Krunal Tule, Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India
Krishna Patil, Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India
Manas Yeole, Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India
Shrenik Shingi, Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India
Dr. Rashmi Phalnikar, Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India

Keywords:

CNN, RNN, Image Captioning, Text-To-Speech, Raspberry Pi

Abstract

Blind people can navigate a familiar room safely because they hold a strong mental model of where objects are located. If something has been moved, added, or removed, it can cause difficulty and may pose a danger. The eyes are among the most important organs for understanding and interacting with one's surroundings, and most learning about and recognition of nearby objects relies on vision. Given recent advances in imaging systems and the ever-increasing processing power of microprocessors, it is now feasible to develop audio assistance systems for the visually impaired using image captioning. As an initial system, we propose a camera-equipped microprocessor that captures images and generates descriptive text from them. This will ultimately help visually impaired people perform their day-to-day activities independently.
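
The flow described in the abstract (capture an image, generate descriptive text, deliver it as audio) can be sketched roughly as below. This is only an illustrative outline, not the authors' implementation: the class name AssistPipeline, its three fields, and the demo function are hypothetical, and the stand-in lambdas take the place of a real camera, a CNN/RNN captioning model, and a text-to-speech engine, none of which the abstract specifies in detail.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical placeholder for a captured frame; a real system would
# likely use an image array from the camera instead of raw bytes.
Image = bytes


@dataclass
class AssistPipeline:
    """Hypothetical three-stage pipeline: capture -> caption -> speak."""

    capture: Callable[[], Image]     # assumed: reads one frame from a camera
    caption: Callable[[Image], str]  # assumed: CNN encoder + RNN decoder
    speak: Callable[[str], None]     # assumed: text-to-speech output

    def step(self) -> str:
        """Capture one frame, caption it, speak the caption, and return it."""
        frame = self.capture()
        text = self.caption(frame)
        self.speak(text)
        return text


def demo() -> List[str]:
    """Run one pipeline step with stand-in stages and collect what was 'spoken'."""
    spoken: List[str] = []
    pipeline = AssistPipeline(
        capture=lambda: b"\x00" * 16,  # fake 16-byte frame
        caption=lambda img: f"placeholder caption for {len(img)}-byte frame",
        speak=spoken.append,           # record text instead of playing audio
    )
    pipeline.step()
    return spoken


if __name__ == "__main__":
    print(demo())
```

Keeping the three stages as separate callables means each one could be swapped (for example, a different captioning model or speech engine) without changing the loop that ties them together.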

