Optical Character Recognition from Printed Text Images
DOI:
https://doi.org/10.32628/CSEIT1952175
Keywords:
Corner Point, FAST (Features from Accelerated Segment Test), OCR, Multilingual Documents, Handwritten Documents
Abstract
In recent years, text extraction from document images has become one of the most widely studied topics in image analysis and optical character recognition. The extracted text can be used for document analysis, content analysis, document retrieval and more. Many complex techniques, such as maximum likelihood (ML), edge point detection and corner point detection, are used to extract text from images. In this article, we use the corner point approach, with a very simple method based on the FAST algorithm. First, we divide the image into blocks and check the density of corner points in each block. The denser blocks are labeled as text blocks, and the less dense blocks are treated as image regions or noise. We then check the connectivity of the blocks and group connected blocks so that the text can be isolated from the rest of the image. The method is fast and versatile: it can be used to detect text in various languages, handwriting, and even images with a lot of noise and blur. Although the program is very simple, its precision is close to or above 90%. In conclusion, this method enables more accurate and less complex detection of text in document images.
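The block-density and connectivity steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the corner points have already been detected (for example with a FAST detector such as OpenCV's `cv2.FastFeatureDetector_create`) and are passed in as `(x, y)` pixel coordinates. The function names, the block size of 16 pixels, the threshold of 3 corners per block, and the use of 4-connectivity are all hypothetical choices made for illustration.

```python
from collections import deque


def dense_blocks(corners, width, height, block=16, min_corners=3):
    """Count corner points per block; return the (row, col) blocks
    whose count reaches min_corners (assumed threshold)."""
    rows = (height + block - 1) // block
    cols = (width + block - 1) // block
    counts = [[0] * cols for _ in range(rows)]
    for x, y in corners:
        if 0 <= x < width and 0 <= y < height:
            counts[int(y) // block][int(x) // block] += 1
    return {(r, c) for r in range(rows) for c in range(cols)
            if counts[r][c] >= min_corners}


def group_blocks(blocks):
    """Group dense blocks that touch horizontally or vertically
    (4-connectivity, an assumption) using breadth-first search."""
    remaining = set(blocks)
    groups = []
    while remaining:
        start = remaining.pop()
        queue = deque([start])
        group = [start]
        while queue:
            r, c = queue.popleft()
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    queue.append(nb)
                    group.append(nb)
        groups.append(group)
    return groups


def text_regions(corners, width, height, block=16, min_corners=3):
    """Return sorted pixel bounding boxes (x0, y0, x1, y1), one per
    group of connected dense blocks, clipped to the image size."""
    boxes = []
    dense = dense_blocks(corners, width, height, block, min_corners)
    for group in group_blocks(dense):
        rs = [r for r, _ in group]
        cs = [c for _, c in group]
        boxes.append((min(cs) * block,
                      min(rs) * block,
                      min((max(cs) + 1) * block, width),
                      min((max(rs) + 1) * block, height)))
    return sorted(boxes)
```

In this sketch, two adjacent blocks with three corners each merge into one candidate text region, while an isolated corner falls below the threshold and is discarded as noise. In practice the block size and threshold would need tuning to the image resolution and font size.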
License
Copyright (c) IJSRCSEIT

This work is licensed under a Creative Commons Attribution 4.0 International License.