Hybrid Data Cost Setting using K-Means & ACO to Optimize Data Cost

Authors

  • Anita Bishnoi  Rajasthan College of Engineering for Women, Jaipur, Rajasthan, India
  • Mr. Vinod Todwal  Rajasthan College of Engineering for Women, Jaipur, Rajasthan, India

Keywords:

ACO, Clusters, K-means, Mahalanobis Distance.

Abstract

In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, that minimizes the mean squared distance from each data point to its nearest center. A widely used heuristic for k-means clustering is Lloyd's algorithm. In this paper, we present a simple and efficient implementation of Lloyd's k-means clustering algorithm, which we call the filtering algorithm. This algorithm is easy to implement, requiring a kd-tree as the only major data structure. We establish the practical efficiency of the filtering algorithm in two ways. First, we present a data-sensitive analysis of the algorithm's running time, which shows that the algorithm runs faster as the separation between clusters increases. Second, we present a number of empirical studies, both on synthetically generated data and on real data sets from applications in color quantization, data compression, and image segmentation.
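The basic Lloyd iteration described in the abstract can be sketched as follows. This is a minimal illustration only, not the filtering algorithm itself: it uses a brute-force nearest-center search instead of a kd-tree, plain Euclidean distance, and random initial centers drawn from the data. The function names (`lloyd_kmeans`, `mean_squared_distance`) and the parameters `max_iters` and `seed` are assumptions made for this example and do not come from the paper.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_center(point, centers):
    """Index of the center closest to `point` (brute force, no kd-tree)."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(point, centers[j]))


def lloyd_kmeans(points, k, max_iters=100, seed=0):
    """Plain Lloyd iteration: assign points, recompute centroids, repeat.

    Assumption: initial centers are k distinct data points chosen at random.
    """
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        labels = [nearest_center(p, centers) for p in points]
        # Update step: each center moves to the centroid of its points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_centers.append(tuple(
                    sum(m[i] for m in members) / len(members)
                    for i in range(dim)))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # No center moved, so the assignment is stable.
        centers = new_centers
    labels = [nearest_center(p, centers) for p in points]
    return centers, labels


def mean_squared_distance(points, centers):
    """The k-means objective: mean squared distance to the nearest center."""
    return sum(min(squared_distance(p, c) for c in centers)
               for p in points) / len(points)
```

Calling `lloyd_kmeans(points, k)` on a list of coordinate tuples returns the final centers and one cluster label per point, and `mean_squared_distance(points, centers)` evaluates the objective being minimized. Each pass of this sketch compares every point with every center; the filtering algorithm reduces that work by organizing the points in a kd-tree.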

Published

2018-02-28

Section

Research Articles

How to Cite

[1]
Anita Bishnoi, Mr. Vinod Todwal, "Hybrid Data Cost Setting using K-Means & ACO to Optimize Data Cost", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 3, Issue 1, pp. 755-760, January-February 2018.