Digitization of Data Using OCR

Rutwik Shete

doi:10.32628/CSEIT21827

Authors

Rutwik Shete, School of Computer Science, MIT WPU, Pune, Maharashtra, India

Keywords:

Artificial Intelligence; Machine Learning; Data; Digitize; Preprocessed; Google Vision AI API; Machine Learning Models.

Abstract

In the modern world we hear buzzwords like Artificial Intelligence and Machine Learning, whose applications in the tech industry not only mesmerise us but also leave a lasting mark on our minds. Interestingly, the second words of both terms, intelligence and learning respectively, are closely entangled with each other: both emphasise the importance of past data. We all learn from the data our ancestors produced, and we are creating new data for future generations. Unfortunately, our ancestors could not keep that data on computers or in the cloud due to a lack of resources. Instead, they put it on rocks and paper. This paper is an attempt to develop a system that digitizes data on paper so that it can be consumed by Machine Learning models to achieve better precision in predictions. The process starts with taking a clear photo of a bill, printed document, invoice or any other data on paper. The image is then preprocessed for better end results by adjusting its saturation, brightness and other characteristics. Next, the system calls Google's Vision AI API (Application Programming Interface), which can read the document and return the text, which may or may not be in linear order. Hence the text needs to be post-processed so that it can be stored or used in Machine Learning models.
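The post-processing step in the abstract, turning OCR text that is not in linear order back into readable lines, can be illustrated with a minimal sketch. This is not the paper's method, which the abstract does not specify. It assumes each recognized word comes with the top-left corner and height of its bounding box. The `Word` class, the `words_to_lines` function and the `tolerance` parameter are hypothetical names introduced here for illustration, and the input is a simplified stand-in rather than the actual Vision API response format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """Hypothetical OCR word: text plus top-left corner and height of its box (pixels)."""
    text: str
    x: float
    y: float
    height: float


def words_to_lines(words: List[Word], tolerance: float = 0.5) -> List[str]:
    """Group words into rows by vertical position, then order each row left to right."""
    rows: List[List[Word]] = []
    for word in sorted(words, key=lambda w: w.y):
        center = word.y + word.height / 2
        if rows:
            # Compare against the first word of the current row (assumed heuristic).
            anchor = rows[-1][0]
            anchor_center = anchor.y + anchor.height / 2
            if abs(center - anchor_center) <= tolerance * anchor.height:
                rows[-1].append(word)
                continue
        rows.append([word])
    # Within each row, reading order is taken to be increasing x.
    return [" ".join(w.text for w in sorted(row, key=lambda w: w.x)) for row in rows]


# Words arrive in arbitrary order, as they might from an OCR service.
sample = [
    Word("12.50", 300, 52, 20),
    Word("Total", 10, 100, 20),
    Word("Coffee", 10, 50, 20),
    Word("14.00", 300, 103, 20),
    Word("x2", 120, 49, 20),
]
lines = words_to_lines(sample)
print("\n".join(lines))
```

For the sample above this prints `Coffee x2 12.50` followed by `Total 14.00`. Grouping by vertical centre is only one possible heuristic; skewed or multi-column documents would likely need deskewing or layout analysis first.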

