Audio Assistance for Visually Impaired Using Image Captioning

Authors

  • Krunal Tule  Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India
  • Krishna Patil  Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India
  • Manas Yeole  Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India
  • Shrenik Shingi  Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India
  • Dr. Rashmi Phalnikar  Dept. of Computer Science and Engineering, MIT-WPU, Pune, Maharashtra, India

Keywords

CNN, RNN, Image Captioning, Text-To-Speech, Raspberry Pi

Abstract

Blind people navigate a familiar room safely by relying on a strong sense of where objects are located. If something has been moved, added, or removed, it can present a difficulty and, potentially, a danger. The eyes are among the most important organs for understanding and interacting with one's surroundings, and most learning and recognition of the objects around us is accomplished through sight. Given recent advances in imaging systems and the ever-increasing processing power of microprocessors, it is now feasible to develop audio assistance systems for the visually impaired based on image captioning. As an initial system, we propose a camera-equipped microprocessor that captures images and generates descriptive text from them. This will ultimately help visually impaired people perform their day-to-day activities independently.
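
The capture-then-describe flow outlined in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name AssistancePipeline and the three stage functions are hypothetical, and the stand-ins below replace the real camera, the captioning model (the keywords suggest a CNN encoder with an RNN decoder), and the text-to-speech back end, none of which are specified on this page.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AssistancePipeline:
    """Hypothetical three-stage loop: capture a frame, caption it, speak it."""
    capture: Callable[[], bytes]      # assumed: returns one raw camera frame
    caption: Callable[[bytes], str]   # assumed: image captioning model
    speak: Callable[[str], None]      # assumed: text-to-speech back end

    def step(self) -> str:
        frame = self.capture()        # 1. grab an image
        text = self.caption(frame)    # 2. generate descriptive text
        self.speak(text)              # 3. hand the text to the audio stage
        return text


# Stand-ins so the sketch runs without a camera, a model, or a speaker.
def fake_capture() -> bytes:
    return b"\x00" * 16


def fake_caption(frame: bytes) -> str:
    return f"a placeholder caption for a {len(frame)}-byte frame"


spoken: List[str] = []  # records what would have been read aloud

pipeline = AssistancePipeline(fake_capture, fake_caption, spoken.append)
result = pipeline.step()
print(result)
```

Keeping each stage behind a plain callable means a real camera reader, captioning model, or speech engine could be swapped in later without changing the loop itself.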

Published

2021-03-13

Section

Research Articles

How to Cite

[1] Krunal Tule, Krishna Patil, Manas Yeole, Shrenik Shingi, Dr. Rashmi Phalnikar, "Audio Assistance for Visually Impaired Using Image Captioning," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 8, Issue 2, pp. 05-10, March-April 2021.