Optical Character Recognition - A Review

Authors: Dr. S. Vijayarani, M. Geetha

Optical Character Recognition (OCR) is the transformation of digital document images into an editable text format. Nowadays, most of us download digital documents and e-books from the internet. Most of these downloaded documents are available only as images, so it is impossible to edit them or to search them for information. OCR is the process of detecting and recognizing the characters in document images. Documents are converted into digital form with scanners and mobile phones: scanners scan the documents, whereas mobile phones take snapshots of them. OCR converts handwritten or printed document images into an editable form, after which the document can be edited and searched. The main disadvantage of OCR tools is accuracy, i.e., they are unable to translate the document image accurately into editable form; some characters, numbers, and special symbols in document images are not converted properly. Therefore, there is a need to develop accurate OCR techniques that support search and edit operations successfully. This paper studies the fundamental concepts of OCR, its significant steps, and related work.
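The abstract does not spell out the individual OCR steps, so the following is only an illustrative sketch of two stages commonly found in such pipelines: binarization, followed by character segmentation with a vertical projection profile. The function names, the fixed threshold of 128, and the toy image are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map grayscale pixels (0-255) to 1 for ink (dark) and 0 for background.

    The fixed threshold is an assumption; real systems often choose it adaptively.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_columns(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Split one binarized text line into character spans.

    Counts ink pixels per column (the vertical projection profile) and treats
    each run of non-empty columns as one character candidate.
    Returns (start, end) column indices with end exclusive.
    """
    if not binary:
        return []
    width = len(binary[0])
    profile = [sum(row[x] for row in binary) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = x                   # a character candidate begins
        elif ink == 0 and start is not None:
            spans.append((start, x))    # an empty column closes it
            start = None
    if start is not None:
        spans.append((start, width))    # candidate touching the right edge
    return spans


# Toy 2x7 grayscale "text line": 0 is black ink, 255 is white paper.
toy_line = [
    [255, 0, 0, 255, 255, 0, 255],
    [255, 0, 255, 255, 255, 0, 255],
]
spans = segment_columns(binarize(toy_line))
print(spans)  # [(1, 3), (5, 6)]
```

Each span would then be cropped and passed to feature extraction and classification. Characters that touch or overlap defeat this simple column test, which is one reason recognition accuracy suffers on handwritten or degraded input.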

Authors and Affiliations

Dr. S. Vijayarani
Assistant Professor, Department of Computer Science, Bharathiar University, Coimbatore, Tamil Nadu, India
M. Geetha
M.Phil Research Scholar, Department of Computer Science, Bharathiar University, Coimbatore, Tamil Nadu, India

Keywords: Document Image, Optical Character Recognition (OCR), Segmentation, Feature Extraction.

Publication Details

Published in : Volume 2 | Issue 5 | September-October 2017
Date of Publication : 2017-10-31
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 942-947
Manuscript Number : CSEIT1725216
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Dr. S. Vijayarani, M. Geetha, "Optical Character Recognition - A Review", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 5, pp.942-947, September-October-2017.
