Development of a Rule Based Classification System to Identify a Suitable Classifier for a Particular Dataset.

Authors

  • Subham Datta  Department of Computer Science, Pondicherry University, Puducherry, India
  • Gautam  Department of Computer Science, Pondicherry University, Puducherry, India
  • Tapas Saha  Department of Computer Science, Pondicherry University, Puducherry, India

Keywords:

Classification, Holdout Method, Rule Based Classification, GRID Search.

Abstract

Big data, together with machine learning, offers tremendous scope for research in the present world. Given a large amount of data, performing operations on it is a tedious task. One such operation is the classification of these large datasets. To classify a dataset, we use a set of different classifiers. When the dataset is tested on these classifiers, the results differ from one classifier to another. For a known dataset and a given set of classifiers, we know which classifier gives the best result. However, for an unknown, newly created, or modified dataset, identifying the classifier that gives the best result is a real challenge [14]. In this paper, we applied 16 classifiers to the IRIS dataset, and the experimental results show that the GRID search classifier provides the best accuracy. We therefore conclude that, for a dataset similar to the IRIS dataset, the GRID search classifier can be applied to obtain higher accuracy than the other classifiers.
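The comparison described in the abstract can be illustrated with a minimal sketch: split the data with the holdout method, fit several candidate classifiers, and keep the one with the highest test accuracy. The code below is an illustration under stated assumptions, not the authors' implementation. It uses a small synthetic dataset as a stand-in for IRIS, three simple candidates instead of the 16 classifiers, and a grid search over the k of a k-nearest-neighbour classifier. All function names and parameter values are hypothetical.

```python
import random
from collections import Counter
from math import dist


def make_toy_data(seed=0, n_per_class=40):
    """Synthetic 2-feature, 3-class data. A stand-in for IRIS, NOT the real dataset."""
    rng = random.Random(seed)
    centers = {0: (0.0, 0.0), 1: (3.0, 3.0), 2: (0.0, 4.0)}  # assumed class centers
    data = [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
            for label, (cx, cy) in centers.items()
            for _ in range(n_per_class)]
    rng.shuffle(data)
    return data


def holdout_split(data, test_fraction=0.3):
    """Holdout method: reserve the last test_fraction of the shuffled data for testing."""
    cut = len(data) - round(len(data) * test_fraction)
    return data[:cut], data[cut:]


def knn_classifier(train, k):
    """k-nearest-neighbour classifier using Euclidean distance and majority vote."""
    def predict(x):
        nearest = sorted(train, key=lambda item: dist(x, item[0]))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def centroid_classifier(train):
    """Nearest-centroid classifier: assign the class whose mean point is closest."""
    sums = {}
    for (x, y), label in train:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    centroids = {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

    def predict(x):
        return min(centroids, key=lambda label: dist(x, centroids[label]))
    return predict


def accuracy(predict, test):
    """Fraction of test points whose predicted label matches the true label."""
    return sum(predict(x) == label for x, label in test) / len(test)


def grid_search_knn(train, k_grid=(1, 3, 5, 7)):
    """Pick k by accuracy on an inner holdout split of the training data only."""
    inner_train, inner_val = holdout_split(train, test_fraction=0.25)
    best_k = max(k_grid,
                 key=lambda k: accuracy(knn_classifier(inner_train, k), inner_val))
    return knn_classifier(train, best_k), best_k


def compare_classifiers(seed=0):
    """Score every candidate on the same holdout test set and name the best one."""
    train, test = holdout_split(make_toy_data(seed))
    tuned_knn, best_k = grid_search_knn(train)
    candidates = {
        "nearest centroid": centroid_classifier(train),
        "1-NN": knn_classifier(train, 1),
        f"grid-searched k-NN (k={best_k})": tuned_knn,
    }
    scores = {name: accuracy(clf, test) for name, clf in candidates.items()}
    best = max(scores, key=scores.get)
    return scores, best


if __name__ == "__main__":
    scores, best = compare_classifiers()
    for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {acc:.3f}")
    print("best on this holdout split:", best)
```

The grid search here tunes k on an inner split of the training data, so the outer holdout test set is used only for the final comparison. The ranking it produces holds for this one split of this toy dataset and should not be read as a reproduction of the paper's results.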

References

  1. A. M. Law and W. D. Kelton. 2000. Simulation Modeling and Analysis. Boston, MA, USA: McGraw Hill.
  2. A. M. Molinaro, R. Simon, and R. M. Pfeiffer. 2005. “Prediction error estimation: a comparison of resampling methods,” Bioinformatics, vol. 21, no. 15, pp. 3301–3307.
  3. Briand, L. C. 2008. Novel Applications of Machine Learning in Software Testing. 2008 The Eighth International Conference on Quality Software, 3–10. https://doi.org/10.1109/QSIC.2008.29
  4. Chen, X.W. and Lin, X. 2014. Big data deep learning: challenges and perspectives. IEEE Access. 2, 514–525
  5. Chung, C.-T., Tsai, C.-Y., Liu, C.-H., & Lee, L.-S. 2017. Unsupervised Iterative Deep Learning of Speech Features and Acoustic Tokens with Applications to Spoken Term Detection, 25(10), 1914–1928. Retrieved from http://arxiv.org/abs/1707.05315
  6. Dong, W., & Zhou, M. 2017. A Supervised Learning and Control Method to Improve Particle Swarm Optimization Algorithms. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 47(7), 1135–1148. https://doi.org/10.1109/TSMC.2016.2560128
  7. Dougherty, E. R., Brun, M., & Xu, Q. 2008. Which is better: Holdout or full-sample classifier design? Eurasip Journal on Bioinformatics and Systems Biology. https://doi.org/10.1155/2008/297945
  8. H. He and E. A. Garcia. 2009. “Learning from imbalanced data,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 9, pp. 1263–1284 (Sep. 2009).
  9. Hea, Tate. 2007. COGNITIVE WIRELESS NETWORKS APPLICATIONS OF MACHINE LEARNING TO COGNITIVE RADIO NETWORKS., (2007,August), 47–52.
  10. J. Karimi, H. Nobahari, and S. H. Pourtakdoust. 2012. “A new hybrid approach for dynamic continuous optimization problems,” Appl. Soft Comput., vol. 12, no. 3, pp. 1158–1167 (Mar. 2012).
  11. J. Zhu and T. Hastie. 2005. “Kernel logistic regression and the import vector machine,” J. Comput. Graph. Statist., vol. 14, no. 1, pp. 185–205.
  12. Kumar, S. 2016. Kinematic Control of Redundant Manipulators using Neural Networks. Ieee Transactions on Neural Networks and Learning Systems, 28(10), 1–12. https://doi.org/10.1109/TNNLS.2016.2574363
  13. Li, J., Modares, H., Chai, T., Lewis, F. L., & Xie, L. 2017. Off-Policy Reinforcement Learning for Synchronization in Multiagent Graphical Games. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 2434–2445. https://doi.org/10.1109/TNNLS.2016.2609500
  14. M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, and F. Herrera. 2012. “A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches,” IEEE Trans. Syst. Man Cybern. Part C Appl. Rev., vol. 42, no. 4, pp. 463–484 (Jul. 2012).
  15. Machine, R., Algorithms, L., Data, B., & Systems, D. V. 2017. Efficient and Rapid Machine Learning Algorithms for Big Data and Dynamic Varying Systems, 47(10), 2625–2626.
  16. Ohsaki, M., Wang, P., Matsuda, K., Katagiri, S., Watanabe, H., & Ralescu, A. 2017. Confusion-matrix-based Kernel Logistic Regression for Imbalanced Data Classification. IEEE Transactions on Knowledge and Data Engineering, 29(9), 1–1. https://doi.org/10.1109/TKDE.2017.2682249
  17. R. Salakhutdinov and A. Mnih. 2011. “Probabilistic matrix factorization,” in Proc. NIPS, vol. 20,  pp. 1–8.
  18. Rodrigues, C. N. M., Gonçalves, A. B., Silva, G. G., & Pistori, H. 2015. Evaluation of Machine Learning and Bag of Visual Words Techniques for Pollen Grains Classification. IEEE Latin America Transactions, 13(10), 3498–3504. https://doi.org/10.1109/TLA.9907
  19. S. Wang and X. Yao. 2013. “Relationships between diversity of classification ensembles and single-class performance measures,” IEEE Trans. Knowl. Data Eng., vol. 25, no. 1, pp. 206–219 (Jan. 2013).
  20. U. M. Braga-Neto and E. R. Dougherty. 2004. “Is cross-validation valid for small-sample microarray classification?” Bioinformatics, vol. 20, no. 3, pp. 374–380.
  21. V. Roth. 2001. “Probabilistic discriminative kernel classifiers for multi-class problems,” Lecture Notes Comput. Sci., vol. 2191, pp. 246–253.
  22. W. Wang, J. Xi, and H. Chen. 2014. “Modeling and recognizing driver behavior based on driving data: A survey,” Math. Probl. Eng., vol. (Feb. 2014), Art. no. 245641.
  23. Wang, W., Xi, J., Chong, A., & Li, L. 2017. Driving Style Classification Using a Semisupervised Support Vector Machine. IEEE Transactions on Human-Machine Systems, 47(5), 650–660. https://doi.org/10.1109/THMS.2017.2736948
  24. Xu, J., Moon, K. H., & Van Der Schaar, M. 2017. A Machine Learning Approach for Tracking and Predicting Student Performance in Degree Programs. IEEE Journal of Selected Topics in Signal Processing, 11(5), 1–13. https://doi.org/10.1109/JSTSP.2017.2692560
  25. Xu, X., & Hua, Q. 2017. Industrial Big Data Analysis in Smart Factory: Current Status and Research Strategies. IEEE Access, 5, 1–1 https://doi.org/10.1109/ACCESS.2017.2741105
  26. Y. Koren et al. 2009. “Matrix factorization techniques for recommender systems,” Computer, vol. 42, no. 8, pp. 30–37.
  27. Yang, Z., Zhang, T., Lu, J., Zhang, D. & Kalui, D. 2017. Optimizing area under the ROC curve via extreme learning machines. Knowledge-Based Systems, 130, 74–89. https://doi.org/10.1016/j.knosys.2017.05.013
  28. Zhang, L., Tan, J., Han, D., & Zhu, H. 2017. From machine learning to deep learning: progress in machine intelligence for rational drug discovery. Drug Discovery Today, 0(0), 1–6. https://doi.org/10.1016/j.drudis.2017.08.010
  29. Zhen, X., Yu, M., Islam, A., Bhaduri, M., Chan, I., & Li, S. 2016. Descriptor Learning via Supervised Manifold Regularization for Multioutput Regression. IEEE Transactions on Neural Networks and Learning Systems, 28(9), 2035–2047. https://doi.org/10.1109/TNNLS.2016.2573260

Published

2017-10-31

Section

Research Articles

How to Cite

[1]
Subham Datta, Gautam, Tapas Saha, "Development of a Rule Based Classification System to Identify a Suitable Classifier for a Particular Dataset", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 5, pp. 478-487, September-October 2017.