Performance of Clustering Technique During Data Mining to Analyze Big Data

Authors

  • Dr. Kapil Kumar Kaswan, Assistant Professor, Department of CSE, CDLU, Sirsa, Haryana, India
  • Preeti, M.Tech. Scholar, Department of CSE, CDLU, Sirsa, Haryana, India

Keywords:

Big Data, Data Mining, Clustering, MapReduce, Performance

Abstract

Data mining is the process of searching large data sets for patterns and correlations that, once analyzed, can help solve problems faced by businesses. Data mining methods and tools let businesses forecast future trends and make better-informed decisions. The objective of clustering is to find distinct groups, or "clusters," within a data set. Using a machine learning algorithm, the technique produces groups whose members, in most cases, share characteristics with one another. The major challenge in big data processing is the management of unmanaged data. A MapReduce function is used to compute the frequency of unmanaged data and so make it manageable. In addition, soft computing mechanisms may be used to improve the performance of clustering operations. The present research focuses on improving the performance of the clustering techniques used in data mining.
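
The two steps named in the abstract, a MapReduce-style frequency count followed by clustering, can be illustrated with a minimal sketch. This is not the authors' implementation: it runs in a single process rather than on a Hadoop cluster, and the function names (map_phase, shuffle, reduce_phase, kmeans_1d), the sample records, and the choice of plain k-means with fixed starting centroids are all assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (token, 1) pair for every token in one raw record."""
    return [(token.lower(), 1) for token in record.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def term_frequencies(records):
    """Run map, shuffle, and reduce in sequence over all records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on scalar values with caller-supplied initial centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unstructured log lines standing in for "unmanaged" data.
records = ["error disk full", "Error network down", "disk ok", "error disk retry"]
freqs = term_frequencies(records)

# Cluster tokens by how often they occur (assumed starting centroids: 1 and 3).
centroids, clusters = kmeans_1d(list(freqs.values()), centroids=[1.0, 3.0])
```

Here the frequency table turns free-form text into numeric features, which is one way raw records could be made "manageable" before clustering. A real deployment would distribute the map and reduce steps across nodes and would likely use richer features than a single count per token.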

Published

2023-08-30

Section

Research Articles

How to Cite

[1]
Dr. Kapil Kumar Kaswan, Preeti, "Performance of Clustering Technique During Data Mining to Analyze Big Data," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 9, Issue 4, pp. 124-130, July-August 2023.