Detecting and Extracting Named Entities with Particular Reference to Marathi Language

Authors

  • Sumalatha D. Bandari  Department of CSE, Dr. Daulatrao Aher College of Engineering, Karad, Maharashtra, India
  • Laxman L. Kumarwad  Department of MCA, Government College of Engineering, Karad, Maharashtra, India

Keywords:

Data Mining, J48, SMO, Naïve Bayes, Classification Algorithms

Abstract

Named entities are names of organizations, persons, locations, brands and similar items. The purpose of detecting and extracting named entities is to recognize every named entity in a document and extract it. Detection of named entities is a two-step procedure: identification of proper nouns, followed by classification of the identified proper nouns. In the first step, proper nouns are recognized in the text. In the second step, the proper nouns are classified into classes such as organization, person, location, brand and others. Recognition of named entities is used in many applications, such as Natural Language Processing, Machine Translation and Machine Learning. Indian languages are morphologically rich and have free word order. Recognition of named entities is difficult in Indian languages such as Marathi, Hindi, Urdu, Telugu and Bengali. The objective of this paper is to survey the recognition of named entities in different Indian languages and to compare the performance metrics of different named entity recognition approaches. The paper also discusses the challenges of Named Entity Recognition in Marathi, such as morphological features, the absence of capitalization, writing variations and ambiguity.
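
The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is not the method evaluated in the paper: it assumes a tiny hand-made gazetteer and a short, incomplete list of Marathi case markers, and every name in it (GAZETTEER, SUFFIXES, extract_entities and the helper functions) is hypothetical. Because Marathi has no capitalization, step 1 relies on a lexicon lookup after suffix stripping rather than on letter case.

```python
# Toy gazetteer: illustrative entries only, not data from the paper.
GAZETTEER = {
    "सचिन": "PERSON",
    "मुंबई": "LOCATION",
    "टाटा": "ORGANIZATION",
}

# A few common Marathi case markers (an assumed, far from complete list).
SUFFIXES = ("ने", "ला", "त")


def candidate_stems(token):
    """Yield the token itself, then the token with one known suffix removed."""
    yield token
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix):
            yield token[: -len(suffix)]


def identify_proper_nouns(tokens):
    """Step 1: keep tokens whose surface form or stem is a known name."""
    found = []
    for token in tokens:
        for stem in candidate_stems(token):
            if stem in GAZETTEER:
                found.append((token, stem))
                break
    return found


def classify(proper_nouns):
    """Step 2: assign each identified proper noun to an entity class."""
    return [(token, GAZETTEER[stem]) for token, stem in proper_nouns]


def extract_entities(text):
    """Run both steps on whitespace-separated text."""
    tokens = text.replace(".", " ").split()
    return classify(identify_proper_nouns(tokens))
```

For a sentence such as "सचिनने मुंबईत सामना खेळला", the sketch would tag the inflected forms "सचिनने" and "मुंबईत" as PERSON and LOCATION and ignore the other words. A real system would replace the lookup with a trained tagger and would also have to handle the spelling variations and ambiguity mentioned in the abstract.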

Published

2018-02-27

Section

Research Articles

How to Cite

[1]
Sumalatha D. Bandari, Laxman L. Kumarwad, "Detecting and Extracting Named Entities with Particular Reference to Marathi Language", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 3, Issue 1, pp. 1972-1984, January-February 2018.