Lightweight multilingual Named Entity Resource Extremely Extraction and Linking Using Page Rank and Semantic Graphs

S N V A S R K Prasad; K Gurnadha Gupta; M Manasa

doi:10.32628/CSEIT172441

Authors

S N V A S R K Prasad, CSE, Sri Indu College of Engineering and Technology, JNTU Hyderabad, Hyderabad, India
K Gurnadha Gupta, CSE, Sri Indu College of Engineering and Technology, JNTU Hyderabad, Hyderabad, India
M Manasa, CSE, Sri Indu College of Engineering and Technology, JNTU Hyderabad, Hyderabad, India

Keywords:

KBP, ERD, Semantic Graphs, Wikipedia, NER, NEL, JAGADISH

Abstract

Text analytics systems often rely heavily on detecting and linking entity mentions in documents to knowledge bases for downstream applications such as sentiment analysis, question answering, and recommender systems. A major challenge for this task is to accurately detect entities in new languages with limited labeled resources. In this paper we present an accurate and lightweight multilingual named entity recognition (NER) and linking (NEL) system. The contributions of this paper are three-fold: 1) lightweight named entity recognition with competitive accuracy; 2) candidate entity retrieval that uses search click-log data and entity embeddings to attain high precision with a low memory footprint; and 3) efficient entity disambiguation. Our system achieves state-of-the-art performance on TAC KBP 2013 trilingual data and on English AIDA-CoNLL data. We also describe Group, a multilingual named entity recognizer and linker. Group relies on the links in Wikipedia to resolve mappings between entities and their different names, and on Wikidata as a language-agnostic reference of entity identifiers. Group extracts mentions from text using a string matching engine and links them to entities with a combination of rules, PageRank, and feature vectors based on Wikipedia categories. We evaluated Group with the evaluation protocol of ERD'14 (Carmel et al., 2014) and reached a competitive F1-score of 0.746 on the development set. Group is designed to be multilingual and has versions in English, French, and Swedish.
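
The abstract describes a pipeline that finds mentions by string matching against known entity names and then links them using, among other signals, PageRank. The following is a minimal Python sketch of that general idea, not the authors' implementation. The surface-form dictionary, the link graph, the entity identifiers (placeholders, not real Wikidata IDs), and the function names `find_mentions`, `pagerank`, and `link` are all assumptions made for illustration. The rules and category-based feature vectors mentioned in the abstract are omitted.

```python
# Hypothetical surface-form dictionary: mention string -> candidate entity IDs.
# The IDs are made-up placeholders, not real Wikidata identifiers.
SURFACE_FORMS = {
    "paris": ["E_paris_city", "E_paris_person"],
    "france": ["E_france"],
    "seine": ["E_seine"],
}

# Hypothetical link graph: entity -> entities its page links to.
LINKS = {
    "E_paris_city": ["E_france", "E_seine"],
    "E_france": ["E_paris_city", "E_seine"],
    "E_seine": ["E_paris_city", "E_france"],
    "E_paris_person": [],
}


def find_mentions(text, max_len=3):
    """Greedy longest-match lookup of token n-grams in the dictionary."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    mentions, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n])
            if span in SURFACE_FORMS:
                mentions.append(span)
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return mentions


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict-of-lists graph."""
    nodes = list(graph)
    rank = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / len(nodes) for v in nodes}
        for v in nodes:
            targets = graph[v] or nodes  # dangling node: spread evenly
            share = damping * rank[v] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank


def link(text):
    """Pick, for each mention, the candidate with the highest PageRank
    in the subgraph induced by all candidates found in the text."""
    mentions = find_mentions(text)
    candidates = sorted({c for m in mentions for c in SURFACE_FORMS[m]})
    sub = {c: [t for t in LINKS.get(c, []) if t in candidates]
           for c in candidates}
    ranks = pagerank(sub)
    return {m: max(SURFACE_FORMS[m], key=lambda c: ranks[c]) for m in mentions}


print(link("Paris is the capital of France, on the Seine."))
```

In this toy graph the city sense of "paris" is linked from the other entities found in the same text, so it outranks the person sense. A real system would build the dictionary and graph from Wikipedia links rather than hard-coding them.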

References

Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. Freebase: A collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, SIGMOD ’08, pages 1247–1250, New York, NY, USA. ACM.
Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30(1-7):107–117, April.
Razvan C Bunescu and Marius Pasca. 2006. Using encyclopedic knowledge for named entity disambiguation. In European Chapter of the Association for Computational Linguistics, volume 6, pages 9–16.
David Carmel, Ming-Wei Chang, Evgeniy Gabrilovich, Bo-June Paul Hsu, and Kuansan Wang. 2014. ERD’14: Entity recognition and disambiguation challenge. In ACM SIGIR Forum, volume 48, pages 63–77. ACM.
Silviu Cucerzan. 2014. Name Entities Made Obvious: The Participation in the ERD 2014 Evaluation. In Proceedings of the First International Workshop on Entity Recognition & Disambiguation, ERD ’14, pages 95–100, New York, NY, USA. ACM.
Alan Eckhardt, Juraj Hreško, Jan Procházka, and Otakar Smrž. 2014. Entity linking based on the co-occurrence graph and entity probability. In Proceedings of the First International Workshop on Entity Recognition & Disambiguation, ERD ’14, pages 37–44, New York, NY, USA. ACM.
Paolo Ferragina and Ugo Scaiella. 2010. Tagme: On-the-fly annotation of short text fragments (by Wikipedia entities). In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM ’10, pages 1625–1628, New York, NY, USA. ACM.
Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust disambiguation of named entities in text. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 782–792, Edinburgh.
Marek Lipczak, Arash Koushkestani, and Evangelos Milios. 2014. Tulip: Lightweight entity recognition and disambiguation using Wikipedia-based topic centroids. In Proceedings of the First International Workshop on Entity Recognition & Disambiguation, ERD ’14, pages 31–36, New York, NY, USA. ACM.
Erik F. Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4, CONLL ’03, pages 142–147, Stroudsburg, PA, USA. Association for Computational Linguistics.
