Performance Analysis of Data Mining Classification Algorithms to Predict Diabetes

Authors

  • Rakesh Singh Sambyal  Department of Information Technology and Engineering, Baba Ghulam Shah Badshah University Rajouri, Jammu and Kashmir, India
  • Tanzeela Javid  Department of Computer Science and Engineering, Baba Ghulam Shah Badshah University Rajouri, Jammu and Kashmir, India
  • Abhinav Bansal  Department of Computer Science and Engineering, PEC University of Technology, Chandigarh, India

Keywords:

Data Mining, Diabetes, Classification and Prediction, Neural Networks, Microsoft Azure, Python diagnosis System.

Abstract

Data mining is the non-trivial extraction of valid, implicit, novel, potentially useful, and ultimately understandable patterns from enormous volumes of data. Classification and prediction are two forms of data analysis that can be used to extract models describing important data classes or to predict future data trends. One of the most important applications of data mining is disease prediction. In this paper we present a classification model, developed on the Microsoft Azure cloud platform, that predicts the occurrence of diabetes in an individual on the basis of non-pathological parameters: age, gender, family history of diabetes, smoking and drinking habits, frequency of thirst and urination, weight, height, and fatigue. Six different algorithms were compared; the model built with the “Two-Class Neural Network” algorithm achieved the highest accuracy, 98.3%, and was therefore deployed as a web service. Finally, a GUI was developed in Python to access the web service.
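
The comparison described above ranks classifiers by accuracy, which is conventionally computed from a two-class confusion matrix as (TP + TN) / (TP + FP + FN + TN). The following is a minimal Python sketch of that ranking step, not the authors' Azure experiment: the function names, the model labels, and every count in `hypothetical_results` are illustrative assumptions and are not results from the paper.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives, true negatives)
ConfusionCounts = Tuple[int, int, int, int]


def accuracy(counts: ConfusionCounts) -> float:
    """Fraction of all predictions that were correct."""
    tp, fp, fn, tn = counts
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    return (tp + tn) / total


def best_model(results: Dict[str, ConfusionCounts]) -> str:
    """Return the name of the model with the highest accuracy."""
    return max(results, key=lambda name: accuracy(results[name]))


# Hypothetical placeholder counts, NOT figures reported in the paper.
hypothetical_results: Dict[str, ConfusionCounts] = {
    "model_a": (45, 5, 5, 45),
    "model_b": (48, 1, 1, 50),
    "model_c": (40, 10, 10, 40),
}

if __name__ == "__main__":
    for name, counts in hypothetical_results.items():
        print(f"{name}: accuracy = {accuracy(counts):.3f}")
    print("selected:", best_model(hypothetical_results))
```

Accuracy alone can be misleading when the diabetic and non-diabetic classes are unbalanced, so a fuller comparison would usually also report precision and recall from the same four counts.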

Published

2018-04-25

Section

Research Articles

How to Cite

[1]
Rakesh Singh Sambyal, Tanzeela Javid, Abhinav Bansal, "Performance Analysis of Data Mining Classification Algorithms to Predict Diabetes", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 4, Issue 1, pp. 56-63, March-April 2018.