A Review on OCR Post Processing Error Correction Algorithm

Authors (4): Mayur Burhan, Mimoh Samarth, Mrunal Talokar, Nikhil Gaikwad

Building an effective method for recognizing characters in images with a low error rate is a challenging task. Our aim is to present an algorithm that produces error-free recognition of text from a given image. Optical character recognition (OCR) was developed to translate scanned text into editable computer text. Unfortunately, OCR is still imperfect: it misidentifies scanned characters, which leads to misspellings and other errors in the OCR output. Hence, a post-processing technique is used to detect and correct OCR non-word and real-word errors. The aim of our project is to develop an Android app, named TransLocator, that translates text into a known language.
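To make the idea of post-processing concrete, the following is a minimal sketch of non-word error correction only. It is not the algorithm reviewed in the paper. It assumes a tiny, hypothetical lexicon (`LEXICON`) and uses Python's standard `difflib` similarity matching as a stand-in for a real spelling model; the function names and the `cutoff` value are illustrative choices.

```python
from difflib import get_close_matches

# Hypothetical toy lexicon; a real system would load a full dictionary.
LEXICON = ["optical", "character", "recognition", "scanned", "text", "image", "error"]


def correct_token(token, lexicon=LEXICON, cutoff=0.8):
    """Return the token unchanged if it is a known word; otherwise return
    the most similar lexicon entry, or the token itself if none is close."""
    lower = token.lower()
    if lower in lexicon:
        return token  # known word: not a non-word error
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
    matches = get_close_matches(lower, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def correct_text(text):
    """Correct each whitespace-separated token independently."""
    return " ".join(correct_token(tok) for tok in text.split())


if __name__ == "__main__":
    print(correct_text("optical charactcr recogniti0n"))
```

Because each token is checked only against the lexicon, this sketch cannot catch real-word errors (a valid word that is wrong in context); those would require context information such as word n-grams.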

Authors and Affiliations

Mayur Burhan
Department of Computer Science & Engineering, Rajiv Gandhi College of Engineering and Research, Nagpur, Maharashtra, India
Mimoh Samarth
Department of Computer Science & Engineering, Rajiv Gandhi College of Engineering and Research, Nagpur, Maharashtra, India
Mrunal Talokar
Department of Computer Science & Engineering, Rajiv Gandhi College of Engineering and Research, Nagpur, Maharashtra, India
Nikhil Gaikwad
Department of Computer Science & Engineering, Rajiv Gandhi College of Engineering and Research, Nagpur, Maharashtra, India

Keywords: Optical Character Recognition, Post-Processing, Text Mining, Error Correction


Publication Details

Published in : Volume 3 | Issue 3 | March-April 2018
Date of Publication : 2018-04-30
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 258-262
Manuscript Number : CSEIT183380
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Mayur Burhan, Mimoh Samarth, Mrunal Talokar, Nikhil Gaikwad, "A Review on OCR Post Processing Error Correction Algorithm", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 3, Issue 3, pp.258-262, March-April-2018.
Journal URL : http://ijsrcseit.com/CSEIT183380
