Improve of Fuzzy C-Means Clustering in Feature Extraction Phase on the Breast Cancer Analysis

Authors: A. Josekin, D. Sudhakar

Cancer analysis is one of the most widely studied areas in the healthcare domain. The objective of the breast cancer problem is to predict whether a new tumor is malignant or benign. The existing method hybridizes the K-means algorithm with a support vector machine (K-SVM) for breast cancer diagnosis. To reduce the high dimensionality of the feature space, it extracts abstract malignant and benign tumor patterns separately before the original data is used to train the classifier. To improve the quality of prediction, this work hybridizes fuzzy C-means clustering with SVM (F-SVM). An improved fuzzy C-means algorithm is proposed to deal with the cancer data. It improves on the traditional fuzzy C-means algorithm in how the initial cluster centres are selected, thereby avoiding the basic drawback of fuzzy C-means and improving the quality of prediction over the K-means algorithm. Based on the derived memberships, each tumor pattern is considered as a model. An SVM is then used to obtain the new classifier, which discriminates between benign and malignant tumors.
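The clustering stage described in the abstract can be illustrated with a minimal pure-Python sketch of fuzzy C-means. The abstract does not say how the improved algorithm selects its initial centres, so the farthest-point seeding in `init_centres` is only an illustrative stand-in and not the authors' rule. The function names, the fuzzifier `m = 2`, the tolerance, and the toy 2-D points are all assumptions made for this example. The SVM stage is not shown; in an F-SVM style pipeline the membership rows in `U` would presumably serve as the reduced features passed to the classifier.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centres(points, c):
    """Deterministic farthest-point seeding (an assumed stand-in, not the paper's rule)."""
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    # First centre: the data point closest to the overall mean.
    centres = [min(points, key=lambda p: sq_dist(p, mean))]
    # Each further centre: the point farthest from all centres chosen so far.
    while len(centres) < c:
        centres.append(max(points, key=lambda p: min(sq_dist(p, v) for v in centres)))
    return [list(v) for v in centres]

def memberships(points, centres, m=2.0):
    """Standard FCM update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))."""
    power = 1.0 / (m - 1.0)  # exponent applied to squared distances
    U = []
    for p in points:
        d2 = [sq_dist(p, v) for v in centres]
        if min(d2) == 0.0:
            # The point sits exactly on a centre: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            U.append([h / sum(hits) for h in hits])
        else:
            inv = [(1.0 / d) ** power for d in d2]
            total = sum(inv)
            U.append([w / total for w in inv])
    return U

def update_centres(points, U, m=2.0):
    """Each centre is the mean of all points weighted by membership^m."""
    dim = len(points[0])
    centres = []
    for k in range(len(U[0])):
        w = [row[k] ** m for row in U]
        total = sum(w)
        centres.append([sum(wi * p[d] for wi, p in zip(w, points)) / total
                        for d in range(dim)])
    return centres

def fuzzy_c_means(points, c=2, m=2.0, tol=1e-6, max_iter=100):
    """Alternate membership and centre updates until the centres stop moving."""
    centres = init_centres(points, c)
    for _ in range(max_iter):
        U = memberships(points, centres, m)
        new_centres = update_centres(points, U, m)
        shift = max(sq_dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return centres, memberships(points, centres, m)

# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
          [5.0, 5.2], [5.3, 4.9], [4.8, 5.1]]
centres, U = fuzzy_c_means(points, c=2)
for p, row in zip(points, U):
    print(p, [round(u, 3) for u in row])
```

Each row of `U` sums to 1 and gives the degree to which a sample belongs to each cluster. Deterministic seeding makes repeated runs reproducible, which is one common motivation for replacing random initial centres.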

Authors and Affiliations

A. Josekin
Scholar, Department of Computer Science, CSI Bishop Appasamy College of Arts and Science, Race Course, Coimbatore, Tamil Nadu, India
D. Sudhakar
Assistant Professor, Department of Computer Science, CSI Bishop Appasamy College of Arts and Science, Race Course, Coimbatore, Tamil Nadu, India

Keywords : K-means, Fuzzy C-means, K-SVM, F-SVM


Publication Details

Published in : Volume 2 | Issue 6 | November-December 2017
Date of Publication : 2017-12-31
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 411-417
Manuscript Number : CSEIT1726110
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

A. Josekin, D. Sudhakar, "Improve of Fuzzy C-Means Clustering in Feature Extraction Phase on the Breast Cancer Analysis", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 6, pp.411-417, November-December-2017.