Enhancement of Performance of Clustering Technique during Data Mining To Investigating Sentiment Analysis Using KDD Process

Authors

  • Dr Kapil Kumar Kaswan, Assistant Professor, Department of CSE, CDLU, Sirsa, Haryana, India
  • Monika, M.Tech. Scholar, Department of CSE, CDLU, Sirsa, Haryana, India

Keywords:

Big Data, Data Mining, Clustering, MapReduce, Performance

Abstract

Data mining is the process of searching large data sets for patterns and correlations that, once analyzed, can help solve problems faced by businesses. Its methods and tools let businesses forecast future trends and make better-informed decisions. The objective of clustering is to find distinct groupings, or "clusters," within a data set. Using a machine learning algorithm, a clustering tool produces groups whose members, in most cases, share characteristics with one another. The major challenge in big data processing is handling unmanaged data. A MapReduce function is used to compute the frequency of items in unmanaged data and so make the data manageable. In addition, soft computing mechanisms may be used to improve the performance of clustering operations. The present research focuses on enhancing the performance of clustering techniques used in data mining.
Researchers analyze tweets and user comments with sentiment analysis in order to gauge public opinion. Technological development is changing the world at a rapid pace, and Internet connectivity is crucial in today's society. People increasingly use social network applications to voice their opinions on current events. Collecting and analyzing customer feedback is essential when trying to sell a product or improve a government service. Opinion mining, also known as sentiment analysis, is often carried out ahead of a discussion to uncover the attitudes behind different points of view. The use of sentiment analysis to gauge consumer sentiment has grown sharply in recent years. In the latest study, the tweets of Twitter users are analyzed using neural networks. The use of Twitter data for survey research has risen in recent years, and researchers are increasingly interested in "tweets" (comments) and the content of these expressions.
Accordingly, this research aims to evaluate the efficacy of several approaches to sentiment analysis applied to Twitter data. Sentiment analysis researchers have examined how people feel about a wide range of subjects, such as movies, commercial products, and everyday social problems. Twitter is a very popular microblogging platform where customers can share their opinions, and opinion research using Twitter data has received considerable attention over the last decade. Given this strong interest in sentiment analysis, the proposed work uses an RNN model to predict sentiment from text and graphic sentiments, which are drawn from Elon Musk's tweets. The research applies user-configurable filters to classify and remove useless content before training. This user-defined classification and filtering system reduces training time, and removing useless content increases prediction accuracy. The proposed RNN-based approach offers a more reliable and intelligent solution, and it is flexible, scalable, and efficient for Twitter sentiment analysis.
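
The filtering and frequency-counting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sample tweets, the filter rule (drop empty tweets and retweets), and the function names `is_useful`, `map_tokens`, `reduce_counts`, and `token_frequencies` are all assumptions made for illustration, and the map and reduce steps run in a single process rather than on a MapReduce cluster.

```python
import re
from collections import Counter
from functools import reduce

# Hypothetical sample data; not taken from the paper's dataset.
TWEETS = [
    "Great launch today! https://example.com",
    "RT great launch",
    "",
    "Launch delayed again, not great",
]


def is_useful(tweet: str) -> bool:
    """User-defined filter (assumed rule): drop empty tweets and retweets."""
    text = tweet.strip()
    return bool(text) and not text.startswith("RT ")


def map_tokens(tweet: str) -> Counter:
    """Map step: emit token counts for one tweet, ignoring URLs and punctuation."""
    no_urls = re.sub(r"https?://\S+", " ", tweet.lower())
    return Counter(re.findall(r"[a-z']+", no_urls))


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial frequency tables."""
    return left + right


def token_frequencies(tweets) -> Counter:
    """Filter the tweets, then map and reduce to a global token frequency table."""
    kept = [t for t in tweets if is_useful(t)]
    return reduce(reduce_counts, map(map_tokens, kept), Counter())


freqs = token_frequencies(TWEETS)
print(freqs.most_common(3))
```

Filtering before the map step means discarded tweets never reach the counting or training stages, which is the mechanism the abstract credits for the shorter training time.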

Published

04-04-2025

Section

Research Articles