An Empirical Analysis of Word Sense Disambiguation through Machine Learning Approaches

Authors

  • Vaishnavi Wasankar  BE Scholar, Department of Computer Science Engineering, Nagpur Institute of Technology, Nagpur, Maharashtra, India
  • Yashika S. Nimje  BE Scholar, Department of Computer Science Engineering, Nagpur Institute of Technology, Nagpur, Maharashtra, India
  • Kajal Pardhi  BE Scholar, Department of Computer Science Engineering, Nagpur Institute of Technology, Nagpur, Maharashtra, India
  • Divyani C. Shende  BE Scholar, Department of Computer Science Engineering, Nagpur Institute of Technology, Nagpur, Maharashtra, India
  • Aparitosh Gahankari  Assistant Professor, Department of Computer Science Engineering, Nagpur Institute of Technology, Nagpur, Maharashtra, India

Keywords:

Classification of WSD Technique, Word Sense Disambiguation, Applications of WSD

Abstract

Word Sense Disambiguation (WSD) is the task of identifying the appropriate meaning of a particular word in an ambiguous statement. It is a difficult problem because it requires information from a variety of sources. Since the early days of machine learning, considerable time and effort has been devoted to this challenge, and the work is still ongoing. A variety of WSD methodologies have been applied to corpora representing practically all languages. This paper groups WSD algorithms into three categories: supervised, unsupervised, and knowledge-based. Each category is examined thoroughly, with details given for nearly all of the algorithms within it. Sample works for each technique were selected based on the language, the corpora used, and other considerations, and the advantages and disadvantages of each method are documented. Some of these techniques have limitations in certain scenarios, and this work is intended to help machine learning researchers select the most appropriate algorithm for their specific WSD problem. Comparing the surveyed works and the procedures they employed makes it possible to see what is distinctive about each piece of work. The study observed that (i) the size of the dataset has a considerable impact on an algorithm's performance, (ii) some methodologies achieve high accuracy for one language but low accuracy for another, (iii) a few of these methodologies run quickly but with limited accuracy, and (iv) a large number of these methodologies have been implemented successfully for a wide range of languages.
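
To make the task concrete, the following is a minimal sketch of one knowledge-based approach, a simplified Lesk-style gloss overlap. It is an illustration only and not the method evaluated in the paper: the sense inventory `SENSES`, the stopword list, and the function names `tokenize` and `simplified_lesk` are all hypothetical and were written for this example.

```python
# Toy sense inventory: hypothetical glosses written for this illustration,
# not taken from WordNet or from any corpus discussed in the paper.
SENSES = {
    "bank": {
        "financial_institution": "an institution that accepts deposits and lends money",
        "river_side": "sloping land beside a river or other body of water",
    }
}

# Small hand-picked stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "at", "by", "other"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return {w for w in words if w and w not in STOPWORDS}


def simplified_lesk(word, sentence, inventory=SENSES):
    """Return the sense whose gloss shares the most words with the sentence."""
    context = tokenize(sentence) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in inventory[word].items():
        overlap = len(context & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


river = simplified_lesk("bank", "They sat on the bank of the river and watched the water")
money = simplified_lesk("bank", "She went to the bank to deposit money")
print(river)  # river_side
print(money)  # financial_institution
```

The sketch picks the sense whose dictionary gloss overlaps most with the surrounding words, so it needs no labelled training data, but its accuracy depends heavily on how well the glosses happen to match the context.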

Published

2022-04-30

Section

Research Articles

How to Cite

[1]
Vaishnavi Wasankar, Yashika S. Nimje, Kajal Pardhi, Divyani C. Shende, Aparitosh Gahankari, "An Empirical Analysis of Word Sense Disambiguation through Machine Learning Approaches", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 8, Issue 2, pp. 104-114, March-April 2022.