VI-Assist Using AI for Visually Impaired Person

Riyanshu Rai; Neha Singh; Ashish Pal; Adil Khan; Dr. Vinayak Shinde

doi:10.32628/CSEIT2410232

Authors

Riyanshu Rai, Department of Computer Engineering, Shree L. R. Tiwari College of Engineering, Mumbai, Maharashtra, India
Neha Singh, Department of Computer Engineering, Shree L. R. Tiwari College of Engineering, Mumbai, Maharashtra, India
Ashish Pal, Department of Computer Engineering, Shree L. R. Tiwari College of Engineering, Mumbai, Maharashtra, India
Adil Khan, Department of Computer Engineering, Shree L. R. Tiwari College of Engineering, Mumbai, Maharashtra, India
Dr. Vinayak Shinde, Department of Computer Engineering, Shree L. R. Tiwari College of Engineering, Mumbai, Maharashtra, India

Keywords:

Vi-Assist, Object Detection, Path Navigation Algorithm, Depth Estimation, AI Speech Synthesis

Abstract

Vi-Assist is an assistive application that offers a broad set of capabilities to address the varied challenges faced by people with visual impairments. It uses YOLOv5 for object detection, BLIP for environment description, and a path navigation algorithm based on A* to provide real-time information, enabling users to navigate, interact with their surroundings, and find objects of interest more effectively. Vi-Assist also uses DeepFace for facial recognition, helping users recognize known faces and interpret non-verbal cues in social interactions. The application integrates MiDaS for depth estimation, OpenCV, deep learning, PyQt, and ElevenLabs for AI speech synthesis, with the aim of going beyond simple assistance to promote confidence, independence, and an improved overall standard of living for visually impaired people.
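
To make the A*-based path navigation mentioned in the abstract more concrete, the following is a minimal sketch of A* search on a small occupancy grid. It is an illustration only, not the authors' implementation: the grid encoding (0 for free, 1 for obstacle), the 4-connected movement, the unit step cost, and the function name `astar` are all assumptions introduced here.

```python
import heapq


def astar(grid, start, goal):
    """Return a list of (row, col) cells from start to goal, or None if unreachable.

    Assumed (hypothetical) encoding: grid is a list of equal-length lists,
    where 0 marks a free cell and 1 marks an obstacle. Movement is
    4-connected and every step costs 1.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance: never overestimates for 4-connected unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    # Heap entries are (f, g, cell), where f = g + heuristic(cell).
    open_heap = [(heuristic(start), 0, start)]
    came_from = {}          # cell -> predecessor on the best known path
    best_g = {start: 0}     # cell -> cheapest known cost from start

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk predecessors back to the start, then reverse.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > best_g.get(cell, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = new_g
                    came_from[(nr, nc)] = cell
                    heapq.heappush(
                        open_heap, (new_g + heuristic((nr, nc)), new_g, (nr, nc))
                    )
    return None
```

In a system like the one described, the obstacle cells might plausibly be derived from object detections and depth estimates, but how Vi-Assist actually builds its navigation map is not stated in the abstract.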

