Improving Accuracy of The Sentence-Level Lexicon-Based Sentiment Analysis Using Machine Learning

Authors

  • Titya Eng  University of Battambang, Battambang, Cambodia
  • Md Rashed Ibn Nawab  Northwestern Polytechnical University, Xi'an, China
  • Kazi Md Shahiduzzaman  Jatiya Kabi Kazi Nazrul Islam University

DOI:

https://doi.org/10.32628/CSEIT21717

Keywords:

Sentiment Analysis; Machine Learning; Support Vector Machine; Lexicon; Natural Language Processing

Abstract

Sentiment analysis studies people's attitudes, opinions, evaluations, emotions, and sentiments toward entities such as products, topics, individuals, services, and issues, and classifies whether an opinion or evaluation is favorable toward the entity or not. It has attracted growing research attention in recent years because of its scientific and commercial value. This research aims to develop a better approach to sentence-level sentiment analysis by combining lexicon resources with a machine learning method. Because review data on the internet is unstructured and noisy, this research applies several preprocessing techniques to clean the data before it is passed to the algorithms discussed in subsequent sections. The lexicon-building process, and how the lexicon is handled and combined with the machine learning algorithm to predict sentiment, are also discussed. In sentiment analysis, a sentence's sentiment can be classified into three classes: positive, negative, or neutral. In this work, however, we exclude neutral sentiment to avoid ambiguity and unnecessary complexity. The experimental results show that the proposed algorithm outperforms the baseline machine learning algorithms. We used four distinct datasets and several performance measures to validate the robustness of the proposed method.
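
As a rough illustration of the general idea of feeding lexicon scores into a learned binary classifier, the following minimal sketch may help. It is not the authors' method: the toy lexicon, the negation rule, the feature layout, and all function names are assumptions made for illustration, and a simple perceptron stands in for the paper's machine learning algorithm so that the example needs only the standard library.

```python
import re

# Hypothetical toy lexicon; the paper's actual lexicon is not given in the abstract.
LEXICON = {"good": 1.0, "great": 2.0, "love": 2.0,
           "bad": -1.0, "terrible": -2.0, "hate": -2.0}
NEGATORS = {"not", "no", "never"}  # assumed negation cues


def preprocess(sentence):
    """Lowercase, drop URLs, and keep only alphabetic tokens."""
    text = re.sub(r"https?://\S+", " ", sentence.lower())
    return re.findall(r"[a-z']+", text)


def lexicon_features(tokens):
    """Return [positive score, negative score, bias]; a negator flips the next word."""
    pos = neg = 0.0
    for i, tok in enumerate(tokens):
        score = LEXICON.get(tok, 0.0)
        if i > 0 and tokens[i - 1] in NEGATORS:
            score = -score
        if score > 0:
            pos += score
        elif score < 0:
            neg += -score
    return [pos, neg, 1.0]


def train_perceptron(samples, epochs=10):
    """Fit a linear classifier on lexicon features; labels are +1 or -1."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for sentence, label in samples:
            x = lexicon_features(preprocess(sentence))
            if label * sum(w * v for w, v in zip(weights, x)) <= 0:
                weights = [w + label * v for w, v in zip(weights, x)]
    return weights


def predict(weights, sentence):
    """Binary decision only; neutral is excluded, so a zero score falls to 'negative'."""
    x = lexicon_features(preprocess(sentence))
    return "positive" if sum(w * v for w, v in zip(weights, x)) > 0 else "negative"


# Made-up training sentences, for illustration only.
TRAIN = [("I love this phone, great screen!", 1),
         ("Terrible battery, I hate it.", -1),
         ("Not bad at all", 1),
         ("The food was not good", -1)]
WEIGHTS = train_perceptron(TRAIN)
```

Calling `predict(WEIGHTS, "The service was great")` classifies a new sentence. In practice the lexicon features would likely be combined with other text features and a stronger classifier, but the abstract does not specify those details.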

Published

2021-02-28

Section

Research Articles

How to Cite

[1]
Titya Eng, Md Rashed Ibn Nawab, Kazi Md Shahiduzzaman, "Improving Accuracy of The Sentence-Level Lexicon-Based Sentiment Analysis Using Machine Learning," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 7, Issue 1, pp. 57-69, January-February 2021. Available at doi: https://doi.org/10.32628/CSEIT21717