Efficient High Utility Top-K Frequent Pattern Mining from High Dimensional Datasets

Authors

  • J. Krishna  AITS, Department of CSE Research Scholar, RU, Kurnool, India
  • M. Rupesh Kumar Reddy  M.Tech. (PG Scholar), Department of CSE, Annamacharya Institute of Technology & Sciences, Rajampet, Kadapa, Andhra Pradesh, India
  • Dr. M. Rudra Kumar  Professor, Department of CSE, Annamacharya Institute of Technology & Sciences, Rajampet, Kadapa, Andhra Pradesh, India

Keywords:

Utility mining, high utility itemset mining, top-k pattern mining, top-k high utility itemset mining, data mining, frequent itemset, transactional database.

Abstract

High utility pattern mining discovers sets of items that not only co-occur but also carry high profit. In two-phase pattern mining, an Apriori algorithm is used for candidate generation. However, candidate generation is costly, and when the number of candidates is huge, scalability and efficiency become bottlenecks. In addition, finding an appropriate minimum utility threshold (min_util) by trial and error is a tedious process for users. If min_util is set too low, too many high utility itemsets (HUIs) are generated, which can make the mining process very inefficient. Conversely, if min_util is set too high, it is likely that no HUIs will be found. In this paper, we address these issues by proposing a new framework for top-k high utility itemset mining, where k is the desired number of HUIs to be mined. Two efficient algorithms, TKU (mining Top-K Utility itemsets) and TKO (mining Top-K utility itemsets in One phase), are proposed for mining such itemsets without the need to set min_util. We provide a structural comparison of the two algorithms and discuss their advantages and limitations. Empirical evaluations on both real and synthetic datasets show that the performance of the proposed algorithms is close to that of the optimal case of state-of-the-art utility mining algorithms.
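
To make the top-k idea concrete, the following is a minimal brute-force sketch in Python. It is not TKU or TKO, which are not specified on this page; it only illustrates how a miner can start with an internal threshold of zero and raise it as better itemsets are found, so that the user supplies k instead of min_util. The transactions, the unit profits, and the names `utility` and `top_k_high_utility` are hypothetical and chosen for illustration only.

```python
from itertools import combinations
import heapq

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, transaction):
    """Utility of an itemset in one transaction (0 if it is not fully contained)."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * profit[item] for item in itemset)


def top_k_high_utility(k):
    """Score every itemset exhaustively and keep the k with the highest utility."""
    items = sorted(profit)
    heap = []      # min-heap of (utility, itemset); heap[0] is the current k-th best
    min_util = 0   # internal threshold, raised automatically during the search
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            total = sum(utility(itemset, t) for t in transactions)
            if len(heap) < k:
                heapq.heappush(heap, (total, itemset))
            elif total > heap[0][0]:
                heapq.heapreplace(heap, (total, itemset))
            if len(heap) == k:
                min_util = heap[0][0]
    return sorted(heap, reverse=True), min_util


top, threshold = top_k_high_utility(2)
print(top)        # [(20, ('a',)), (19, ('a', 'c'))]
print(threshold)  # 19
```

In this toy data the itemset {c} has utility 5 while its superset {a, c} has utility 19, so utility is not anti-monotone the way support is. Enumerating every itemset, as this sketch does, is exponential in the number of items, which is why practical algorithms rely on pruning strategies instead.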

Published

2017-08-31

Section

Research Articles

How to Cite

[1]
J. Krishna, M. Rupesh Kumar Reddy, Dr. M. Rudra Kumar, "Efficient High Utility Top-K Frequent Pattern Mining from High Dimensional Datasets," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 4, pp. 625-631, July-August 2017.