An Improved Data Reduction Technique Based On KNN & NB with Hybrid Selection Method for Effective Software Bugs Triage

Authors (3): Kapil Sahu, Dr. Umesh Kumar Lilhore, Prof. Nitin Agarwal

In the software development process, testing supports quality management by helping to deliver a bug-free product. Existing bug triage methods are based on Naïve Bayes and SVM, and suffer from several issues such as poor precision, recall, TPR and accuracy. In this research, we present an improved data reduction technique based on kNN and NB with a hybrid selection method for effective software bug triage. The two methods were selected because the k-nearest neighbor technique helps with word counts from bug report data, while the Naïve Bayes method helps to measure word frequency. The proposed method covers bug report classification, bug report retrieval, and bug report triage. It also uses a hybrid selection method, combining feature selection and instance selection, to reduce the data set. The existing method (Naïve Bayes) and the proposed method (kNN + NB with hybrid selection) are implemented in MATLAB, and performance measures such as precision, recall, accuracy, detection time and TPR are calculated. The experimental study shows that the proposed method outperforms the existing method for bug triage and data reduction on all of these performance measures.
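The performance measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation; the function name `triage_metrics`, the developer names and the sample labels are hypothetical and chosen only for illustration. The sketch treats one developer as the positive class and computes precision, recall and accuracy from the counts of correct and incorrect assignments. Note that TPR (true positive rate) is, by definition, the same quantity as recall.

```python
from collections import Counter


def triage_metrics(actual, predicted, positive):
    """Precision, recall (= TPR) and accuracy for one developer label.

    `actual` and `predicted` are equal-length lists of developer names;
    `positive` is the developer treated as the positive class.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["tp"] += 1  # bug correctly assigned to this developer
        elif p == positive:
            counts["fp"] += 1  # bug wrongly assigned to this developer
        elif a == positive:
            counts["fn"] += 1  # bug of this developer assigned elsewhere
        else:
            counts["tn"] += 1  # bug correctly not assigned to this developer

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(actual) if actual else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "tpr": recall,  # TPR is the same quantity as recall
        "accuracy": accuracy,
    }


# Hypothetical triage results for six bug reports.
actual = ["alice", "bob", "alice", "carol", "alice", "bob"]
predicted = ["alice", "alice", "alice", "carol", "bob", "bob"]
metrics = triage_metrics(actual, predicted, positive="alice")
print(metrics)
```

In a multi-developer setting, the same function could be called once per developer and the results averaged; whether the paper averages in this way is not stated in the abstract.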

Authors and Affiliations

Kapil Sahu
M. Tech Scholar, Department of CSE, NIIST Bhopal, Madhya Pradesh, India
Dr. Umesh Kumar Lilhore
Head PG, Department of CSE, NIIST Bhopal, Madhya Pradesh, India
Prof Nitin Agarwal
Assistant Professor, Department of CSE, NIIST Bhopal, Madhya Pradesh, India

Bug triage, Data Mining, Naïve Bayes, kNN, Instance selection, Feature selection



Publication Details

Published in : Volume 3 | Issue 5 | May-June 2018
Date of Publication : 2018-06-30
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 633-639
Manuscript Number : CSEIT1835146
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Kapil Sahu, Dr. Umesh Kumar Lilhore, Prof Nitin Agarwal, "An Improved Data Reduction Technique Based On KNN & NB with Hybrid Selection Method for Effective Software Bugs Triage", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 3, Issue 5, pp.633-639, May-June-2018.