Optical Character Recognition from Printed Text Images

Dr. T. Kameswara Rao; K. Yashwanth Chowdary; I. Koushik Chowdary; K. Prasanna Kumar; Ch. Ramesh

doi:10.32628/CSEIT1952175

Authors

Dr. T. Kameswara Rao, Professor, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India
K. Yashwanth Chowdary, B. Tech, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India
I. Koushik Chowdary, B. Tech, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India
K. Prasanna Kumar, B. Tech, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India
Ch. Ramesh, B. Tech, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India

DOI:

https://doi.org/10.32628/CSEIT1952175

Keywords:

Corner Point, FAST (Features from Accelerated Segment Test), OCR, Multilingual Documents, Handwritten Documents

Abstract

In recent years, text extraction from document images has been one of the most widely studied topics in image analysis and optical character recognition (OCR). The extracted text can be used for document analysis, content analysis, document retrieval, and more. Many complex techniques, such as Maximum Likelihood (ML), edge point detection, and corner point detection, are used to extract text from document images. In this article, we use the corner point approach, in the form of a very simple method based on the FAST algorithm. First, we divide the image into blocks and check the density of corner points in each block. The denser blocks are labeled as text blocks, and the less dense blocks are treated as image regions or noise. We then check the connectivity of the blocks and group them so that the text part can be isolated from the image. The method is fast and versatile: it can be used to detect text in various languages, handwriting, and even images with a lot of noise and blur. Although the program is very simple, the precision of the method is close to or above 90%. In conclusion, this method allows more accurate and less complex detection of text in document images.
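
The block-density and connectivity steps described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the corner points have already been produced by a FAST detector and are passed in as (x, y) pairs, and the function names, the block size, the `min_corners` threshold, and the use of 4-connectivity are all hypothetical choices that the abstract does not specify.

```python
from collections import deque


def block_density(points, width, height, block):
    """Count how many corner points fall in each block-by-block cell."""
    rows = (height + block - 1) // block
    cols = (width + block - 1) // block
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if 0 <= x < width and 0 <= y < height:
            grid[int(y) // block][int(x) // block] += 1
    return grid


def text_regions(grid, min_corners):
    """Group touching dense blocks (4-connectivity, an assumption) into regions.

    Returns bounding boxes in block coordinates as
    (first_row, first_col, last_row, last_col), all inclusive.
    """
    rows = len(grid)
    cols = len(grid[0]) if grid else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < min_corners:
                continue  # already grouped, or too sparse to count as text
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:  # breadth-first flood fill over dense neighbours
                cr, cc = queue.popleft()
                r0, c0 = min(r0, cr), min(c0, cc)
                r1, c1 = max(r1, cr), max(c1, cc)
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] >= min_corners):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append((r0, c0, r1, c1))
    return regions


# Hypothetical corner points: two dense neighbouring blocks and one stray point.
corners = [(1, 1), (2, 3), (5, 5), (12, 2), (15, 5), (18, 8), (35, 35)]
grid = block_density(corners, width=40, height=40, block=10)
boxes = text_regions(grid, min_corners=2)
```

In this toy input the two neighbouring dense blocks merge into a single candidate text region, while the isolated point falls below the threshold and is discarded as noise. In practice the block size and threshold would need tuning to the image resolution and font size.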

