Hybrid Data Cost Setting using K-Means & ACO to Optimize Data Cost

Authors: Anita Bishnoi, Mr. Vinod Todwal

In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k, and the problem is to determine a set of k points in R^d, called centers, that minimizes the mean squared distance from each data point to its nearest center. A widely used heuristic for k-means clustering is Lloyd's algorithm. In this paper, we present a simple and efficient implementation of Lloyd's k-means clustering algorithm, which we call the filtering algorithm. This algorithm is easy to implement, requiring a kd-tree as the only major data structure. We establish the practical efficiency of the filtering algorithm in two ways. First, we present a data-sensitive analysis of the algorithm's running time, which shows that the algorithm runs faster as the separation between clusters increases. Second, we present a number of empirical studies, both on synthetically generated data and on real data sets from applications in color quantization, data compression, and image segmentation.
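
The objective and the Lloyd iteration described in the abstract can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the authors' implementation: it uses the plain Lloyd iteration with a brute-force nearest-center search rather than the kd-tree based filtering algorithm, and the names lloyd_kmeans, mean_squared_distance, and initial_centers are hypothetical. Choosing good initial centers is outside the scope of the sketch, so they are passed in explicitly.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_squared_distance(points, centers):
    """The k-means objective: mean squared distance to the nearest center."""
    total = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return total / len(points)


def lloyd_kmeans(points, initial_centers, max_iterations=100):
    """Plain Lloyd iteration (brute force, no kd-tree).

    initial_centers is a hypothetical parameter: the caller supplies k
    starting centers; the result is a local optimum that depends on them.
    """
    centers = [tuple(c) for c in initial_centers]
    k = len(centers)
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the centroid of its cluster.
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                centroid = tuple(
                    sum(m[i] for m in members) / len(members) for i in range(dim)
                )
                new_centers.append(centroid)
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # Converged: no center moved.
        centers = new_centers
    return centers


# Two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centers = lloyd_kmeans(points, initial_centers=[(0, 0), (1, 1)])
print(centers)                                 # [(0.5, 0.5), (10.5, 10.5)]
print(mean_squared_distance(points, centers))  # 0.5
```

Each iteration of this sketch compares every point with every center, so its cost grows with n times k; the filtering algorithm named in the abstract is aimed at reducing that work with a kd-tree. Like any Lloyd iteration, the sketch only reaches a local minimum, and a poor choice of starting centers can leave it at a worse solution.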

Authors and Affiliations

Anita Bishnoi
Rajasthan College of Engineering for Women, Jaipur, Rajasthan, India
Mr. Vinod Todwal
Rajasthan College of Engineering for Women, Jaipur, Rajasthan, India

Keywords: ACO, Clusters, K-means, Mahalanobis Distance

Publication Details

Published in: Volume 3 | Issue 1 | January-February 2018
Date of Publication: 2018-02-28
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s): 755-760
Manuscript Number: CSEIT1831178
Publisher: Technoscience Academy
ISSN: 2456-3307

Cite This Article:

Anita Bishnoi, Mr. Vinod Todwal, "Hybrid Data Cost Setting using K-Means & ACO to Optimize Data Cost", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 3, Issue 1, pp. 755-760, January-February 2018.
Journal URL: http://ijsrcseit.com/CSEIT1831178
