Detection and Recognition for Reading Text in Images

Pooja Kumari; Mamta Yadav

doi:10.32628/CSEIT1835251

Authors

Pooja Kumari M.Tech Scholar CSE, M.D.U Rohtak, YCET Narnaul, Mahendergarh, India
Mamta Yadav Assistant Professor CSE, M.D.U Rohtak, YCET Narnaul, Mahendergarh, India

Keywords:

Detection and Recognition, ASCII Code, Optical Character Recognition, Hausdorff Distance, Euclidean Distance

Abstract

Detecting and recognizing text in images is a difficult but important problem. It can be summarized as follows: how can a computer be enabled to recognize letters and digits from a predefined alphabet, possibly using contextual information? Various attempts have been made to solve this problem using different selections of features and classifiers. Automated text recognition systems have matched human accuracy, and surpassed human speed, for text of a single size and a single font, of high quality, with a known layout and a known background. When one or more of these parameters change, the problem becomes increasingly difficult. In particular, attaining human performance in recognizing cursive script of varying size and style, with unknown layout and unknown background, is far beyond the reach of today's algorithms, despite continuous research effort for almost four decades. In this report, we analyze the problem in detail, present the associated difficulties, and propose a coherent framework for addressing automated text recognition. It is often said that the world is overwhelmed with information that is becoming harder and harder to deal with, both for individuals and for the technology they use. The popularity of mobile devices equipped with cameras has recently influenced people's lives in many ways. One of these changes is that people have started to take photos as notes about things that are not visual in nature, such as opening hours or traffic schedules. Taking a picture of a sign has become a very convenient way of storing such information; however, later retrieval of such "photographic notes" without any metadata may become very time consuming.

