Development of a Rule Based Classification System to Identify a Suitable Classifier for a Particular Dataset.

Authors (3): Subham Datta, Gautam, Tapas Saha

Big data, together with machine learning, offers tremendous scope for research. Performing operations on large volumes of data is tedious, and one such operation is classification. Many classifiers are available, and the same dataset generally yields different results under each of them. For a known dataset and a given set of classifiers, we already know which classifier gives the best result. For an unknown, newly created, or modified dataset, however, identifying the classifier that gives the best result is a real challenge [14]. In this paper, we apply 16 classifiers to the IRIS dataset, and the experimental results show that the GRID search classifier provides the best accuracy. We therefore conclude that, for datasets similar to the IRIS dataset, the GRID search classifier can be applied to obtain higher accuracy than the other classifiers.
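The comparison described in the abstract can be illustrated with a minimal sketch: split a dataset with the holdout method, score each candidate classifier on the held-out part, and keep the most accurate one. Everything in the sketch is an assumption made for illustration and is not taken from the paper: the three hand-written toy classifiers stand in for the 16 classifiers the authors evaluated, the synthetic three-class data stands in for the IRIS dataset, and the 30% test fraction and all function names are hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into (train, test) parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def majority_classifier(train):
    """Baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def nearest_centroid_classifier(train):
    """Predict the class whose mean feature vector is closest."""
    sums, counts = {}, Counter()
    for x, y in train:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}
    return lambda x: min(centroids, key=lambda y: sq_dist(x, centroids[y]))


def one_nn_classifier(train):
    """Predict the label of the single closest training row."""
    return lambda x: min(train, key=lambda row: sq_dist(x, row[0]))[1]


def accuracy(predict, test):
    """Fraction of held-out rows that are labelled correctly."""
    return sum(predict(x) == y for x, y in test) / len(test)


def select_best(candidates, rows):
    """Score every candidate on one holdout split and return the winner."""
    train, test = holdout_split(rows)
    scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
    return max(scores, key=scores.get), scores


def make_toy_data(seed=1):
    """Synthetic, well-separated 2-D clusters (NOT the IRIS data)."""
    rng = random.Random(seed)
    centers = {"a": (0.0, 0.0), "b": (5.0, 5.0), "c": (0.0, 5.0)}
    return [
        ((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)), label)
        for label, (cx, cy) in centers.items()
        for _ in range(40)
    ]


# Hypothetical candidate set; the paper's 16 classifiers are not listed here.
CANDIDATES = {
    "majority": majority_classifier,
    "nearest_centroid": nearest_centroid_classifier,
    "one_nn": one_nn_classifier,
}

best, scores = select_best(CANDIDATES, make_toy_data())
print("best classifier:", best)
print("holdout accuracy per classifier:", scores)
```

The name returned by `select_best` could then be recorded as a rule of the form "for datasets similar to this one, use this classifier", which is the kind of conclusion the abstract draws for the IRIS dataset. A single holdout split gives a noisy estimate on small datasets, so repeating the split with several seeds would be a reasonable refinement.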

Authors and Affiliations

Subham Datta
Department of Computer Science, Pondicherry University, Puducherry, India
Gautam
Department of Computer Science, Pondicherry University, Puducherry, India
Tapas Saha
Department of Computer Science, Pondicherry University, Puducherry, India

Keywords

Classification, Holdout Method, Rule Based Classification, GRID Search.

References

  1. A. M. Law and W. D. Kelton. 2000. Simulation Modeling and Analysis. Boston, MA, USA: McGraw Hill.
  2. A. M. Molinaro, R. Simon, and R. M. Pfeiffer. 2005. “Prediction error estimation: a comparison of resampling methods,” Bioinformatics, vol. 21, no. 15, pp. 3301–3307.
  3. Briand, L. C. 2008. Novel Applications of Machine Learning in Software Testing. 2008 The Eighth International Conference on Quality Software, 3–10.
  4. Chen, X.-W. and Lin, X. 2014. Big data deep learning: challenges and perspectives. IEEE Access, 2, 514–525.
  5. Chung, C.-T., Tsai, C.-Y., Liu, C.-H., & Lee, L.-S. 2017. Unsupervised Iterative Deep Learning of Speech Features and Acoustic Tokens with Applications to Spoken Term Detection, 25(10), 1914–1928. Retrieved from
  6. Dong, W., & Zhou, M. 2017. A Supervised Learning and Control Method to Improve Particle Swarm Optimization Algorithms. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 47(7), 1135–1148.
  7. Dougherty, E. R., Brun, M., & Xu, Q. 2008. Which is better: Holdout or full-sample classifier design? Eurasip Journal on Bioinformatics and Systems Biology.
  8. H. He and E. A. Garcia. 2009. “Learning from imbalanced data,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 9, pp. 1263–1284 (Sep. 2009).
  10. J. Karimi, H. Nobahari, and S. H. Pourtakdoust. 2012. “A new hybrid approach for dynamic continuous optimization problems,” Appl. Soft Comput., vol. 12, no. 3, pp. 1158–1167 (Mar. 2012).
  11. J. Zhu and T. Hastie. 2005. “Kernel logistic regression and the import vector machine,” J. Comput. Graph. Statist., vol. 14, no. 1, pp. 185–205.
  12. Kumar, S. 2016. Kinematic Control of Redundant Manipulators using Neural Networks. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 1–12.
  13. Li, J., Modares, H., Chai, T., Lewis, F. L., & Xie, L. 2017. Off-Policy Reinforcement Learning for Synchronization in Multiagent Graphical Games. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 2434–2445.
  14. M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, and F. Herrera. 2012. “A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches,” IEEE Trans. Syst. Man Cybern. Part C Appl. Rev., vol. 42, no. 4, pp. 463–484 (Jul. 2012).
  15. Machine, R., Algorithms, L., Data, B., & Systems, D. V. 2017. Efficient and Rapid Machine Learning Algorithms for Big Data and Dynamic Varying Systems, 47(10), 2625–2626.
  16. Ohsaki, M., Wang, P., Matsuda, K., Katagiri, S., Watanabe, H., & Ralescu, A. 2017. Confusion-matrix-based Kernel Logistic Regression for Imbalanced Data Classification. IEEE Transactions on Knowledge and Data Engineering, 29(9), 1–1.
  17. R. Salakhutdinov and A. Mnih. 2011. “Probabilistic matrix factorization,” in Proc. NIPS, vol. 20, pp. 1–8.
  18. Rodrigues, C. N. M., Gonçalves, A. B., Silva, G. G., & Pistori, H. 2015. Evaluation of Machine Learning and Bag of Visual Words Techniques for Pollen Grains Classification. IEEE Latin America Transactions, 13(10), 3498–3504.
  19. S. Wang and X. Yao. 2013. “Relationships between diversity of classification ensembles and single-class performance measures,” IEEE Trans. Knowl. Data Eng., vol. 25, no. 1, pp. 206–219 (Jan. 2013).
  20. U. M. Braga-Neto and E. R. Dougherty. 2004. “Is cross-validation valid for small-sample microarray classification?” Bioinformatics, vol. 20, no. 3, pp. 374–380.
  21. V. Roth. 2001. “Probabilistic discriminative kernel classifiers for multi-class problems,” Lecture Notes Comput. Sci., vol. 2191, pp. 246–253.
  22. W. Wang, J. Xi, and H. Chen. 2014. “Modeling and recognizing driver behavior based on driving data: A survey,” Math. Probl. Eng., Art. no. 245641 (Feb. 2014).
  23. Wang, W., Xi, J., Chong, A., & Li, L. 2017. Driving Style Classification Using a Semisupervised Support Vector Machine. IEEE Transactions on Human-Machine Systems, 47(5), 650–660.
  24. Xu, J., Moon, K. H., & Van Der Schaar, M. 2017. A Machine Learning Approach for Tracking and Predicting Student Performance in Degree Programs. IEEE Journal of Selected Topics in Signal Processing, 11(5), 1–13.
  25. Xu, X., & Hua, Q. 2017. Industrial Big Data Analysis in Smart Factory: Current Status and Research Strategies. IEEE Access, 5, 1–1.
  26. Y. Koren et al. 2009. “Matrix factorization techniques for recommender systems,” Computer, vol. 42, no. 8, pp. 30–37.
  27. Yang, Z., Zhang, T., Lu, J., Zhang, D., & Kalui, D. 2017. Optimizing area under the ROC curve via extreme learning machines. Knowledge-Based Systems, 130, 74–89.
  28. Zhang, L., Tan, J., Han, D., & Zhu, H. 2017. From machine learning to deep learning: progress in machine intelligence for rational drug discovery. Drug Discovery Today, 0(0), 1–6.
  29. Zhen, X., Yu, M., Islam, A., Bhaduri, M., Chan, I., & Li, S. 2016. Descriptor Learning via Supervised Manifold Regularization for Multioutput Regression. IEEE Transactions on Neural Networks and Learning Systems, 28(9), 2035–2047.

Publication Details

Published in : Volume 2 | Issue 5 | September-October 2017
Date of Publication : 2017-10-31
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 478-487
Manuscript Number : CSEIT1725106
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Subham Datta, Gautam, Tapas Saha, "Development of a Rule Based Classification System to Identify a Suitable Classifier for a Particular Dataset.", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 5, pp.478-487, September-October-2017.
