Document Analysis using Similarity Measures : A Case Study on Text Retrieval System

Authors (1) : Suresha M

A document is a container of information, in printed or handwritten form, and a medium for transferring knowledge. Human vision is the most accurate language identification system in the world: within a few seconds of looking at a document, a person can determine its language without deskewing or segmenting the image, a capability that computer vision cannot yet match. Today there is a growing need for automatic, computer-supported language identification. As the world moves from paper to the paperless office, more and more communication and document storage is performed digitally, which allows quicker additions, searches, and modifications and extends the life of such records.

Authors and Affiliations

Suresha M
Department Of Computer Science, Kuvempu University, India

Keywords : Document Analysis, Similarity Measures, Text Retrieval

Publication Details

Published in : Volume 2 | Issue 6 | November-December 2017
Date of Publication : 2017-12-31
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 125-130
Manuscript Number : CSEIT172638
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Suresha M, "Document Analysis using Similarity Measures : A Case Study on Text Retrieval System", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 6, pp. 125-130, November-December 2017.