Comparative Analysis of Classification Methods in R Environment with two Different Data Sets

Authors: B Nithya, Dr. V Ilango

Machine learning methods are widely used across many domains because they are effective for classification and prediction. Classification is a frequently used supervised machine learning task. There are many classification algorithms, each with strengths and weaknesses that suit different types of input data. This paper describes the implementation of a few classification methods, namely Decision Tree, K Nearest Neighbour and Naïve Bayes, on different datasets in the R environment, and presents a comparative study of these methods using the open source tool R. The aim is to analyse the performance of these methods on two different datasets using evaluation metrics such as accuracy and error rate. The implementation shows that the performance of a classification algorithm depends on the attribute types and characteristics of the dataset. The paper concludes that a specific algorithm and tool can be chosen for implementation based on the constraints, the requirements and the type of input dataset.
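
The abstract names accuracy and error rate as the evaluation metrics. As a minimal sketch of how the two relate, assuming the usual definitions (accuracy is the fraction of correctly classified observations and error rate is its complement), the snippet below computes both from actual and predicted class labels. It is written in Python purely for illustration, whereas the paper itself works in R; the function name and the toy labels are hypothetical and are not taken from the paper.

```python
def accuracy_and_error_rate(actual, predicted):
    """Return (accuracy, error_rate) for two equal-length label sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    # Count the positions where the predicted label matches the actual one.
    correct = sum(a == p for a, p in zip(actual, predicted))
    accuracy = correct / len(actual)
    # Error rate is the complement of accuracy.
    return accuracy, 1.0 - accuracy


# Hypothetical labels, made up for illustration only (not from the paper).
actual = ["yes", "no", "yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]

accuracy, error_rate = accuracy_and_error_rate(actual, predicted)
print(f"accuracy = {accuracy:.3f}, error rate = {error_rate:.3f}")
```

The same function can be applied to the predictions of each classifier on each dataset, which gives one accuracy and error rate pair per method and dataset for a side-by-side comparison.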

Authors and Affiliations

B Nithya
Senior Assistant Professor & Research Scholar, Department of MCA, New Horizon College of Engineering, Bangalore, India
Dr. V Ilango
Professor, Department of MCA, New Horizon College of Engineering, Bangalore, India

Keywords: Machine Learning, Classification, Decision Tree, K Nearest Neighbour, Naïve Bayes Classifier, Performance, R Tool.

Publication Details

Published in : Volume 2 | Issue 6 | November-December 2017
Date of Publication : 2017-12-31
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 136-141
Manuscript Number : CSEIT172612
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

B Nithya, Dr. V Ilango, "Comparative Analysis of Classification Methods in R Environment with two Different Data Sets", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 6, pp.136-141, November-December-2017.
