A Review on OCR Post Processing Error Correction Algorithm

Authors

  • Mayur Burhan  Department of Computer Science & Engineering, Rajiv Gandhi College of Engineering and Research, Nagpur, Maharashtra, India
  • Mimoh Samarth  Department of Computer Science & Engineering, Rajiv Gandhi College of Engineering and Research, Nagpur, Maharashtra, India
  • Mrunal Talokar  Department of Computer Science & Engineering, Rajiv Gandhi College of Engineering and Research, Nagpur, Maharashtra, India
  • Nikhil Gaikwad  Department of Computer Science & Engineering, Rajiv Gandhi College of Engineering and Research, Nagpur, Maharashtra, India

Keywords:

Optical Character Recognition, Post-processing, Text Mining, Error Correction

Abstract

Building an effective method for recognizing characters in images with a low error rate is a major challenge. Our aim is to present an algorithm that produces error-free recognition of text from a given image. Optical character recognition (OCR) was developed to translate scanned text into editable computer text. Unfortunately, OCR is still imperfect: it misidentifies scanned text, which leads to misspellings and other errors in the OCR output. Hence, a post-processing technique is used to detect and correct OCR non-word and real-word errors. The aim of our project is to develop an Android app, named TransLocator, that translates text into a known language.
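To make the idea of post-processing concrete, here is a minimal Python sketch of one common approach to non-word error correction: any OCR token that is not found in a lexicon is replaced by the closest lexicon word under Levenshtein edit distance. This is an illustration only, not the algorithm reviewed in the paper. The function names, the tiny LEXICON, and the max_distance threshold are all assumptions made for the example. Real-word errors (valid words that are wrong in context) are not handled here, since they need contextual information such as n-gram statistics.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct_non_words(text: str, lexicon: set, max_distance: int = 2) -> str:
    """Replace tokens missing from the lexicon with the nearest lexicon word.

    Simplifications (assumptions of this sketch): tokens are split on
    whitespace, compared in lowercase, and punctuation is not stripped.
    """
    corrected = []
    for token in text.split():
        word = token.lower()
        if word in lexicon:
            corrected.append(token)  # known word: treated as correct
            continue
        # Nearest candidate; ties are broken alphabetically for determinism.
        best = min(lexicon, key=lambda w: (edit_distance(word, w), w))
        if edit_distance(word, best) <= max_distance:
            corrected.append(best)
        else:
            corrected.append(token)  # too far from any known word: keep as-is
    return " ".join(corrected)


# Hypothetical toy lexicon; a real system would load a full dictionary.
LEXICON = {"the", "quick", "brown", "fox", "text", "scanned"}

print(correct_non_words("the qu1ck brovvn fox", LEXICON))
# prints: the quick brown fox
```

The linear scan over the lexicon keeps the sketch short; with a realistic dictionary, candidate generation would normally be restricted (for example, by word length or an index) before computing edit distances.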

Published

2018-04-30

Section

Research Articles

How to Cite

Mayur Burhan, Mimoh Samarth, Mrunal Talokar, Nikhil Gaikwad, "A Review on OCR Post Processing Error Correction Algorithm", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 3, Issue 3, pp. 258-262, March-April 2018.