Mining Association Rules in Cloud Computing Environments using Modified Apriori Algorithm

Avinash Sharma; Dr. N. K. Tiwari

doi:10.32628/CSEIT1831144

Authors

Avinash Sharma, Research Scholar, Bansal Group of Institutions, Bhopal, Madhya Pradesh, India
Dr. N. K. Tiwari, Director, Bansal Group of Institutions, Bhopal, Madhya Pradesh, India

Keywords:

Data Mining, Cloud Computing, Association Rules

Abstract

Association rule mining helps find relationships between the items or item sets in a given data set. Association rules are dependency rules that predict the occurrence of an item based on the occurrences of other items. The rules are derived from the frequent item sets generated from the data, and here the frequent item sets were generated with the Apriori algorithm, the best-known algorithm for mining association rules. Because the input data and the number of distinct items in the data set are large, a lot of storage space and memory is required. The Apriori algorithm also has a major drawback: it makes multiple scans through the entire data, which costs considerable space and time. The modification proposed in this paper avoids scanning the whole database to count the support of every attribute. This is done by keeping the minimum support count and comparing it with the running support of each attribute; the support of an attribute is counted only until it reaches the minimum support value. In this paper we use the Modified Apriori algorithm with association rules to mine data in the cloud using the Sector/Sphere framework. The performance of the algorithm was evaluated in the cloud (EC2) by increasing the number of nodes in the test setup.
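The early-stopping idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`reaches_min_support`, `frequent_items`), the sample transactions, and the use of an absolute support count are all assumptions made for illustration, and the sketch covers only single-machine counting, not the Sector/Sphere or EC2 setup.

```python
from typing import FrozenSet, Iterable, List


def reaches_min_support(candidate: FrozenSet[str],
                        transactions: Iterable[FrozenSet[str]],
                        min_support: int) -> bool:
    """Return True as soon as `candidate` has been seen in `min_support`
    transactions; the remaining transactions are not scanned."""
    count = 0
    for transaction in transactions:
        if candidate <= transaction:      # candidate is a subset of the transaction
            count += 1
            if count >= min_support:
                return True               # threshold reached: stop counting early
    return False                          # whole data scanned, threshold never reached


def frequent_items(transactions: List[FrozenSet[str]],
                   min_support: int) -> List[str]:
    """Hypothetical helper: single items that reach the minimum support."""
    items = sorted(set().union(*transactions))
    return [item for item in items
            if reaches_min_support(frozenset([item]), transactions, min_support)]


# Illustrative data only (not from the paper).
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"jam", "bread"}),
]

if __name__ == "__main__":
    print(frequent_items(TRANSACTIONS, min_support=3))
```

One consequence of stopping early is that the sketch only reports whether an item set is frequent, not its exact support. Computing rule confidence needs exact counts, so a full implementation would presumably still have to obtain them for the item sets that pass the threshold.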

