Chui : Mining Closed High Utility Itemsets

Authors

  • Khushali Kumari  Computer Science, SPPU, Pune, Maharashtra, India
  • A.R. Deshpande  Computer Science, SPPU, Pune, Maharashtra, India

Keywords:

High Utility Itemsets, Itemsets Utility, Transactional Database, Transactional Utility

Abstract

In association rule mining, a transaction is a set of items called an itemset, where each item represents a product or service; in an e-commerce application, for example, an itemset is the set of items that a customer bought in one transaction. Frequent Itemset Mining (FIM) is a popular data mining approach that is essential to a wide range of applications. For a transactional database, FIM generates frequent itemsets, i.e. groups of items that appear frequently in transactions. However, one drawback of FIM is that it assumes that each item appears at most once in a transaction and that all items have the same importance (weight, unit profit or value). High-Utility Itemset Mining (HUIM) has been defined to address these issues. As opposed to FIM, HUIM considers the case where items can appear any number of times in a transaction and where each item has a weight called its utility (e.g. unit profit). Mining high-utility itemsets can therefore be used to discover itemsets of high importance (e.g. high profit). An itemset is called a high-utility itemset (HUI) only if its utility is not less than a user-specified minimum utility threshold, minutil. Discovering high-utility itemsets in transactional databases is a popular data mining task. A limitation of traditional algorithms is that they may present too many high-utility itemsets to the user, some of which are redundant. To provide a concise and lossless representation of the results, the support count measure can also be taken into account, so that the concept of closed itemset mining can be applied to high-utility itemsets.
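
The definitions in the abstract can be illustrated with a small brute-force sketch in Python. It is not the algorithm proposed in the paper, only a naive enumeration under stated assumptions: the toy database, the unit profits, the threshold value 15 and all function names are hypothetical, and "closed" is taken in the usual sense that no proper superset of the itemset has the same support count.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its purchase quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]
# Hypothetical unit profit (utility weight) of each item.
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1, "d": 4}


def support(itemset, database):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in database if all(i in t for i in itemset))


def utility(itemset, database, profit):
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in database
        if all(i in t for i in itemset)
    )


def high_utility_itemsets(database, profit, minutil):
    """Enumerate every itemset and keep those whose utility is at least minutil."""
    items = sorted({i for t in database for i in t})
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            u = utility(combo, database, profit)
            if u >= minutil:
                result[frozenset(combo)] = u
    return result


def closed_high_utility_itemsets(database, profit, minutil):
    """Keep only the high-utility itemsets with no same-support proper superset."""
    items = {i for t in database for i in t}
    closed = {}
    for itemset, u in high_utility_itemsets(database, profit, minutil).items():
        s = support(itemset, database)
        # Support never increases when an item is added, so testing the
        # one-item extensions is enough to decide closedness.
        if all(support(itemset | {x}, database) < s for x in items - itemset):
            closed[itemset] = u
    return closed


if __name__ == "__main__":
    for itemset, u in closed_high_utility_itemsets(TRANSACTIONS, UNIT_PROFIT, 15).items():
        print(sorted(itemset), u)
```

On this toy data the full enumeration yields seven high-utility itemsets, while the closed filter keeps three of them, which is the kind of reduction the closed representation aims for. The exhaustive enumeration is exponential in the number of items and is meant only to make the definitions concrete.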



Published

2018-08-30

Section

Research Articles

How to Cite

[1]
Khushali Kumari, A.R. Deshpande, "Chui : Mining Closed High Utility Itemsets", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 3, Issue 6, pp. 435-438, July-August 2018.