Manuscript Number : CSEIT174406
Parallel Corpora : A Much-Needed Linguistic Resource for Low Computational Resource Languages
Authors(1) : Preeti Dubey
Abstract : Natural Language Processing (NLP) is one of the emerging research areas of computer science. NLP has many applications, but in the last decade most of the effort in this field has been directed towards machine translation. A lot of work is available on machine translation for English and Hindi. Some work has also been undertaken on the translation of Indian languages, and this has driven substantial research into developing text in machine-readable form. Efforts are currently being made to develop large parallel corpora for most Indian languages; such corpora are a much-needed linguistic resource for the development of Statistical Machine Translation systems. This paper introduces the concept of a parallel corpus, its need, and its applications in natural language processing. The various projects undertaken to develop parallel corpora are presented, followed by tools in which parallel corpora are applied. The need to develop this resource for languages with low computational resources is also discussed.
Keywords : Text Corpus, Speech Corpus, Parallel Corpora, Natural Language Processing, Low Resource Languages
Preeti Dubey, Assistant Professor, Department of Computer Science, J&K Higher Education Department, India
Publication Details
Published in : Volume 2 | Issue 7 | September 2017
Date of Publication : 2017-09-30
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 41-44
Publisher : Technoscience Academy