Audio Feedback for Object Detection using Deep Learning

S. Sohail; Dr. Srinivasan Jagannathan; Mr. Suresh

doi:10.32628/CSEIT228449

Authors

S. Sohail, Department of Computer Application, Madanapalle Institute of Technology and Science, Madanapalle, Andhra Pradesh, India
Dr. Srinivasan Jagannathan, Department of Computer Application, Madanapalle Institute of Technology and Science, Madanapalle, Andhra Pradesh, India
Mr. Suresh, Department of Computer Application, Madanapalle Institute of Technology and Science, Madanapalle, Andhra Pradesh, India

Keywords:

TensorFlow, YOLOv3, Web Speech API, Deep Learning.

Abstract

Object recognition is one of the most challenging applications of computer vision and has been widely applied in many areas, e.g. autonomous cars, robotics, security tracking, and guiding visually impaired people. With the rapid development of deep learning, many algorithms have improved the relationship between video analysis and image understanding. These algorithms differ in their network architectures but share the same aim: detecting multiple objects within a complex image. Vision impairment restricts a person's movement in unfamiliar places, so it is essential to use these technologies, and to train them, to guide blind people whenever they need help.

