Optical Character Recognition from Printed Text Images

Authors

  • Dr. T. Kameswara Rao  Professor, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India
  • K. Yashwanth Chowdary  B. Tech, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India
  • I. Koushik Chowdary  B. Tech, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India
  • K. Prasanna Kumar  B. Tech, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India
  • Ch. Ramesh  B. Tech, Department of CSE, VVIT, Nambur, Guntur, Andhra Pradesh, India

DOI:

https://doi.org/10.32628/CSEIT1952175

Keywords:

Corner Point, FAST (Features from Accelerated Segment Test), OCR, Multilingual Documents, Handwritten Documents

Abstract

In recent years, text extraction from document images has been one of the most widely studied topics in image analysis and optical character recognition. The extracted text can be used for document analysis, content analysis, document retrieval and more. Many complex text extraction techniques, such as Maximum Likelihood (ML), edge point detection and corner point detection, are used to extract text from document images. In this article, we use the corner point approach. To extract text from images, we use a very simple approach based on the FAST algorithm. First, we divide the image into blocks and check the density of corner points in each block. The denser blocks are labeled as text blocks, and the less dense blocks are treated as image regions or noise. We then check the connectivity of the blocks and group them so that the text can be isolated from the rest of the image. This method is fast and versatile: it can be used to detect text in various languages, handwriting, and even images with a lot of noise and blur. Although the program is very simple, the precision of this method is close to or higher than 90%. In conclusion, this method enables more accurate and less complex detection of text in document images.
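
The block-density and connectivity steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the corner points have already been produced by some FAST detector and are passed in as (x, y) pixel pairs. The function names, the block size, the min_corners threshold and the use of 4-connectivity are all hypothetical choices made for illustration.

```python
from collections import deque


def block_density(corners, width, height, block):
    """Count corner points falling in each block x block cell of the image."""
    rows = (height + block - 1) // block
    cols = (width + block - 1) // block
    grid = [[0] * cols for _ in range(rows)]
    for x, y in corners:
        if 0 <= x < width and 0 <= y < height:
            grid[y // block][x // block] += 1
    return grid


def text_regions(grid, block, min_corners):
    """Group 4-connected dense blocks and return their bounding boxes.

    Each box is (left, top, right, bottom) in block-aligned pixel
    coordinates. Blocks with fewer than min_corners points (an assumed
    threshold) are treated as image region or noise and skipped.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < min_corners:
                continue
            # Breadth-first search over neighbouring dense blocks.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                cr, cc = queue.popleft()
                r0, r1 = min(r0, cr), max(r1, cr)
                c0, c1 = min(c0, cc), max(c1, cc)
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and grid[nr][nc] >= min_corners):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            boxes.append((c0 * block, r0 * block, (c1 + 1) * block, (r1 + 1) * block))
    return boxes


# Hypothetical corner points: two adjacent dense blocks plus one stray point.
corners = [(1, 1), (2, 5), (3, 3), (7, 8), (9, 9),
           (11, 1), (12, 5), (13, 3), (17, 8), (19, 9),
           (35, 35)]
grid = block_density(corners, width=40, height=40, block=10)
boxes = text_regions(grid, block=10, min_corners=3)
```

In this toy input the two dense blocks are merged into a single candidate text region, while the isolated point falls below the threshold and is discarded as noise. On real images the block size and threshold would need tuning.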

Published

2019-04-30

Section

Research Articles

How to Cite

[1]
Dr. T. Kameswara Rao, K. Yashwanth Chowdary, I. Koushik Chowdary, K. Prasanna Kumar, Ch. Ramesh, "Optical Character Recognition from Printed Text Images", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 5, Issue 2, pp. 597-604, March-April 2019. Available at doi: https://doi.org/10.32628/CSEIT1952175