Development of a Rule Based Classification System to Identify a Suitable Classifier for a Particular Dataset.

Subham Datta; Gautam; Tapas Saha

doi:10.32628/CSEIT1725106

Authors

Subham Datta, Department of Computer Science, Pondicherry University, Puducherry, India
Gautam, Department of Computer Science, Pondicherry University, Puducherry, India
Tapas Saha, Department of Computer Science, Pondicherry University, Puducherry, India

Keywords:

Classification, Holdout Method, Rule Based Classification, GRID Search.

Abstract

Big data, together with machine learning, currently offers tremendous scope for research. Given a large amount of data, performing operations on it is a tedious task. One such operation is the classification of these large datasets. To classify a dataset, we use a set of different classifiers. When a dataset is tested on these classifiers, the results differ from one classifier to another. For a known dataset and a given set of classifiers, we know which classifier gives the best result. However, for an unknown, newly created, or modified dataset, identifying the classifier that gives the best result is a real challenge [14]. In this paper, we applied 16 classifiers to the IRIS dataset, and the experimental results show that the GRID search classifier provides the best accuracy. We therefore conclude that, for a dataset similar to the IRIS dataset, the GRID search classifier can be applied to obtain higher accuracy than the other classifiers.
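
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it makes several assumptions: the data are synthetic (a stand-in for the IRIS dataset), only two simple classifiers stand in for the 16 used in the paper, and all function names (make_toy_data, holdout_split, grid_search_knn, and so on) are hypothetical. It shows a holdout split, a small grid search over the k of a k-nearest-neighbour classifier, and the selection of the candidate with the highest holdout accuracy.

```python
import random
from collections import Counter


def make_toy_data(n_per_class=30, seed=0):
    """Synthetic 2-feature, 3-class data (an assumption, NOT the IRIS dataset)."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (3.0, 3.0), (6.0, 0.0)]
    return [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
            for label, (cx, cy) in enumerate(centers)
            for _ in range(n_per_class)]


def holdout_split(data, test_fraction=0.3, seed=0):
    """Holdout method: shuffle once, reserve a fraction for testing."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda item: sq_dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def centroid_predict(train, x):
    """Assign x to the class whose mean feature vector is closest."""
    grouped = {}
    for features, label in train:
        grouped.setdefault(label, []).append(features)
    centroids = {label: [sum(col) / len(rows) for col in zip(*rows)]
                 for label, rows in grouped.items()}
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


def accuracy(predict, train, test):
    return sum(predict(train, x) == y for x, y in test) / len(test)


def grid_search_knn(train, ks=(1, 3, 5, 7)):
    """Grid search: try every k, score each on an inner validation split."""
    inner_train, inner_val = holdout_split(train, 0.3, seed=1)
    return max(ks, key=lambda k: accuracy(
        lambda tr, x: knn_predict(tr, x, k), inner_train, inner_val))


data = make_toy_data()
train, test = holdout_split(data)
best_k = grid_search_knn(train)

# Candidate classifiers (stand-ins for the paper's 16 classifiers).
candidates = {
    "nearest_centroid": centroid_predict,
    "1-NN": lambda tr, x: knn_predict(tr, x, 1),
    "k-NN (grid-searched k)": lambda tr, x: knn_predict(tr, x, best_k),
}
scores = {name: accuracy(fn, train, test) for name, fn in candidates.items()}
best_name = max(scores, key=scores.get)
print(best_k, scores, best_name)
```

In this sketch, grid search is a way of tuning a classifier's parameters rather than a classifier in its own right; the tuned model is then compared with the other candidates on the same holdout test set.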

References

A. M. Law and W. D. Kelton. 2000. Simulation Modeling and Analysis. Boston, MA, USA: McGraw Hill.
A. M. Molinaro, R. Simon, and R. M. Pfeiffer. 2005. “Prediction error estimation: a comparison of resampling methods,” Bioinformatics, vol. 21, no. 15, pp. 3301–3307.
Briand, L. C. 2008. Novel Applications of Machine Learning in Software Testing. 2008 The Eighth International Conference on Quality Software, 3–10. https://doi.org/10.1109/QSIC.2008.29
Chen, X.W. and Lin, X. 2014. Big data deep learning: challenges and perspectives. IEEE Access, 2, 514–525.
Chung, C.-T., Tsai, C.-Y., Liu, C.-H., & Lee, L.-S. 2017. Unsupervised Iterative Deep Learning of Speech Features and Acoustic Tokens with Applications to Spoken Term Detection, 25(10), 1914–1928. Retrieved from http://arxiv.org/abs/1707.05315
Dong, W., & Zhou, M. 2017. A Supervised Learning and Control Method to Improve Particle Swarm Optimization Algorithms. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 47(7), 1135–1148. https://doi.org/10.1109/TSMC.2016.2560128
Dougherty, E. R., Brun, M., & Xu, Q. 2008. Which is better: Holdout or full-sample classifier design? Eurasip Journal on Bioinformatics and Systems Biology. https://doi.org/10.1155/2008/297945
H. He and E. A. Garcia. 2009. “Learning from imbalanced data,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 9, pp. 1263–1284, Sep. 2009.
Hea, Tate. 2007. COGNITIVE WIRELESS NETWORKS APPLICATIONS OF MACHINE LEARNING TO COGNITIVE RADIO NETWORKS., (2007,August), 47–52.
J. Karimi, H. Nobahari, and S. H. Pourtakdoust. 2012. “A new hybrid approach for dynamic continuous optimization problems,” Appl. Soft Comput., vol. 12, no. 3, pp. 1158–1167, Mar. 2012.
J. Zhu and T. Hastie. 2005. “Kernel logistic regression and the import vector machine,” J. Comput. Graph. Statist., vol. 14, no. 1, pp. 185–205.
Kumar, S. 2016. Kinematic Control of Redundant Manipulators using Neural Networks. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 1–12. https://doi.org/10.1109/TNNLS.2016.2574363
Li, J., Modares, H., Chai, T., Lewis, F. L., & Xie, L. 2017. Off-Policy Reinforcement Learning for Synchronization in Multiagent Graphical Games. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 2434–2445. https://doi.org/10.1109/TNNLS.2016.2609500
M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, and F. Herrera. 2012. “A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches,” IEEE Trans. Syst. Man Cybern. Part C Appl. Rev., vol. 42, no. 4, pp. 463–484, Jul. 2012.
Machine, R., Algorithms, L., Data, B., & Systems, D. V. 2017. Efficient and Rapid Machine Learning Algorithms for Big Data and Dynamic Varying Systems, 47(10), 2625–2626.
Ohsaki, M., Wang, P., Matsuda, K., Katagiri, S., Watanabe, H., & Ralescu, A. 2017. Confusion-matrix-based Kernel Logistic Regression for Imbalanced Data Classification. IEEE Transactions on Knowledge and Data Engineering, 29(9), 1–1. https://doi.org/10.1109/TKDE.2017.2682249
R. Salakhutdinov and A. Mnih. 2011. “Probabilistic matrix factorization,” in Proc. NIPS, vol. 20, pp. 1–8.
Rodrigues, C. N. M., Gonçalves, A. B., Silva, G. G., & Pistori, H. 2015. Evaluation of Machine Learning and Bag of Visual Words Techniques for Pollen Grains Classification. IEEE Latin America Transactions, 13(10), 3498–3504. https://doi.org/10.1109/TLA.9907
S. Wang and X. Yao. 2013. “Relationships between diversity of classification ensembles and single-class performance measures,” IEEE Trans. Knowl. Data Eng., vol. 25, no. 1, pp. 206–219, Jan. 2013.
U. M. Braga-Neto and E. R. Dougherty. 2004. “Is cross-validation valid for small-sample microarray classification?” Bioinformatics, vol. 20, no. 3, pp. 374–380.
V. Roth. 2001. “Probabilistic discriminative kernel classifiers for multi-class problems,” Lecture Notes Comput. Sci., vol. 2191, pp. 246–253.
W. Wang, J. Xi, and H. Chen. 2014. “Modeling and recognizing driver behavior based on driving data: A survey,” Math. Probl. Eng., Feb. 2014, Art. no. 245641.
Wang, W., Xi, J., Chong, A., & Li, L. 2017. Driving Style Classification Using a Semisupervised Support Vector Machine. IEEE Transactions on Human-Machine Systems, 47(5), 650–660. https://doi.org/10.1109/THMS.2017.2736948
Xu, J., Moon, K. H., & Van Der Schaar, M. 2017. A Machine Learning Approach for Tracking and Predicting Student Performance in Degree Programs. IEEE Journal of Selected Topics in Signal Processing, 11(5), 1–13. https://doi.org/10.1109/JSTSP.2017.2692560
Xu, X., & Hua, Q. 2017. Industrial Big Data Analysis in Smart Factory: Current Status and Research Strategies. IEEE Access, 5, 1–1. https://doi.org/10.1109/ACCESS.2017.2741105
Y. Koren et al. 2009. “Matrix factorization techniques for recommender systems,” Computer, vol. 42, no. 8, pp. 30–37.
Yang, Z., Zhang, T., Lu, J., Zhang, D. & Kalui, D. 2017. Optimizing area under the ROC curve via extreme learning machines. Knowledge-Based Systems, 130, 74–89. https://doi.org/10.1016/j.knosys.2017.05.013
Zhang, L., Tan, J., Han, D., & Zhu, H. 2017. From machine learning to deep learning: progress in machine intelligence for rational drug discovery. Drug Discovery Today, 0(0), 1–6. https://doi.org/10.1016/j.drudis.2017.08.010
Zhen, X., Yu, M., Islam, A., Bhaduri, M., Chan, I., & Li, S. 2016. Descriptor Learning via Supervised Manifold Regularization for Multioutput Regression. IEEE Transactions on Neural Networks and Learning Systems, 28(9), 2035–2047. https://doi.org/10.1109/TNNLS.2016.2573260
