Efficient Large Scale Frequent Itemset Mining with Hybrid Partitioning Approach

Authors

  • Priyanka R., Department of Computer Science and Engineering, Dr. Mahalingam College of Engineering and Technology, Pollachi, Coimbatore, Tamil Nadu, India
  • Mohammed Ibrahim M., Department of Computer Science and Engineering, Dr. Mahalingam College of Engineering and Technology, Pollachi, Coimbatore, Tamil Nadu, India
  • Ranjith Kumar M., Department of Computer Science and Engineering, Dr. Mahalingam College of Engineering and Technology, Pollachi, Coimbatore, Tamil Nadu, India

DOI:

https://doi.org/10.32628/CSEIT1952206

Keywords:

Dist-Eclat Algorithm, Frequent Itemset Mining, MapReduce, K-Itemsets, Large Data, Data Mining

Abstract

Today, voluminous data are generated from various sources and in various forms. Mining or analyzing such large-scale data efficiently enough to make them useful is difficult with existing approaches. Frequent itemset mining is one technique used for such analysis in fields like finance and health care, where the main focus is to gather frequent patterns and group them meaningfully in order to draw useful insights from the data. Major applications include customer segmentation in marketing, shopping cart analysis, relationship management, web usage mining, and player tracking. Many parallel algorithms, such as Dist-Eclat and BigFIM, are available for large-scale frequent itemset mining. In the Dist-Eclat algorithm, datasets are partitioned using a round-robin technique; the proposed system instead uses a hybrid partitioning approach, which can improve the overall efficiency of the system. The system works as follows. First, the collected data are distributed by MapReduce. Then the local frequent k-itemsets are computed using an FP-Tree and sent to the map phase. Next, the mining results are combined at the center node. Finally, the global frequent itemsets are gathered by MapReduce. The proposed system is expected to improve efficiency by applying hybrid partitioning to the datasets based on the identification of frequent items.
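
The workflow described in the abstract can be sketched as a two-phase, partition-based miner. This is a minimal single-process illustration under stated assumptions, not the authors' implementation: the function names and the `max_k` cap are hypothetical, the MapReduce phases are simulated with plain loops, the local step uses brute-force counting as a stand-in for the FP-Tree, and round-robin partitioning is used because the abstract does not specify the hybrid scheme.

```python
import math
from collections import Counter
from itertools import combinations


def round_robin_partition(transactions, n_parts):
    """Deal transactions to partitions in turn (round-robin baseline)."""
    parts = [[] for _ in range(n_parts)]
    for i, t in enumerate(transactions):
        parts[i % n_parts].append(frozenset(t))
    return parts


def local_frequent_itemsets(partition, min_support_ratio, max_k):
    """Itemsets of size <= max_k that are frequent within one partition.

    Assumption: brute-force enumeration replaces the FP-Tree step.
    """
    threshold = math.ceil(min_support_ratio * len(partition))
    counts = Counter()
    for t in partition:
        items = sorted(t)  # sorted so each itemset has one canonical tuple
        for k in range(1, max_k + 1):
            counts.update(combinations(items, k))
    return {itemset for itemset, c in counts.items() if c >= threshold}


def mine(transactions, min_support_ratio, n_parts=2, max_k=2):
    """Return {itemset: global_count} for globally frequent itemsets."""
    parts = round_robin_partition(transactions, n_parts)

    # Phase 1: every partition proposes its locally frequent itemsets.
    candidates = set()
    for p in parts:
        candidates |= local_frequent_itemsets(p, min_support_ratio, max_k)

    # Phase 2: count each candidate over all partitions, then filter
    # against the global support threshold.
    global_counts = Counter()
    for p in parts:
        for t in p:
            for c in candidates:
                if t.issuperset(c):
                    global_counts[c] += 1
    threshold = math.ceil(min_support_ratio * len(transactions))
    return {c: n for c, n in global_counts.items() if n >= threshold}


transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = mine(transactions, min_support_ratio=0.5)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

Because any itemset that meets the global support ratio must also meet it in at least one partition, the union of local results in phase 1 cannot miss a globally frequent itemset; phase 2 only removes false positives. A different partitioning function could be swapped in for `round_robin_partition` without changing the two phases.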

Published

2019-04-30

Section

Research Articles

How to Cite

[1]
Priyanka R., Mohammed Ibrahim M., Ranjith Kumar M., "Efficient Large Scale Frequent Itemset Mining with Hybrid Partitioning Approach", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 5, Issue 2, pp. 845-852, March-April 2019. Available at doi: https://doi.org/10.32628/CSEIT1952206