Classification Algorithms in Data Mining : A Survey

Authors

  • C. Parimala  PG Scholar, Department of Computer Science, Bharathiar University, Coimbatore, Tamilnadu, India
  • R. Porkodi  Assistant Professor, Department of Computer Science, Bharathiar University, Coimbatore, Tamilnadu, India

Keywords:

Classification Algorithm, Bayesian Net, J48, LMT, Random Tree, REP Tree.

Abstract

Classification is a data mining task that assigns items in a collection to target categories or classes. The goal of classification is to accurately predict the target class for each case in the data. In the model building (training) process, a classification algorithm finds relationships between the values of the predictors and the values of the target. Different classification algorithms use different techniques for finding these relationships. The relationships are summarized in a model, which can then be applied to a different data set in which the class assignments are unknown. Classification has many applications in customer segmentation, business modeling, marketing, credit analysis, and biomedical and drug response modeling. This paper presents a study and analysis of five classification algorithms, namely Bayesian network, J48, logistic model tree (LMT), random tree and REP tree, on the liver disorders dataset. The performance of these algorithms is compared using performance metrics such as Precision, Recall and F-measure, and the random tree algorithm achieves 100% accuracy. The experimental results show that random tree provides higher accuracy than the Bayesian network, J48, logistic model tree and REP tree algorithms.
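As an illustration of the metrics named in the abstract, the following minimal sketch computes Precision, Recall and F-measure for one class from true and predicted labels. The function name and the label lists are hypothetical and chosen only for the example; they are not taken from the liver disorders dataset, and this is not the authors' experimental setup.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F-measure) treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels for eight cases (1 = positive class, 0 = negative class).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here F-measure is taken to be the harmonic mean of Precision and Recall (the balanced F1 form), which is an assumption about the variant intended in the abstract.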

Published

2018-02-28

Section

Research Articles

How to Cite

[1]
C. Parimala, R. Porkodi, "Classification Algorithms in Data Mining : A Survey", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 3, Issue 1, pp.349-355, January-February-2018.