Risk Factor Analysis of Diseases Using Machine Learning Techniques

Authors

  • Vanishri Arun  Department of Information Science and Engineering, S.J.C.E., JSS S&T University, Mysuru, Karnataka, India
  • Rakshitha Hathwar  Department of Information Science and Engineering, S.J.C.E., JSS S&T University, Mysuru, Karnataka, India
  • Keerthana Basavaraj  Department of Information Science and Engineering, S.J.C.E., JSS S&T University, Mysuru, Karnataka, India
  • Sonali C H  Department of Information Science and Engineering, S.J.C.E., JSS S&T University, Mysuru, Karnataka, India
  • Chaitra J P  Department of Information Science and Engineering, S.J.C.E., JSS S&T University, Mysuru, Karnataka, India
  • Dr. Murali Krishna  Consultant Psychiatrist, FRAME, Mysuru, Karnataka, India
  • Dr. Arun Kumar B V  Department of Anaesthesiology, BGS Apollo Hospital, Mysuru, Karnataka, India

Keywords:

Electronic Health Records, Correlation, Regression Analysis, Random Forests

Abstract

Analysing the risk factors of mental health conditions from Electronic Health Records is challenging because the prevalence of diseases is difficult to assess in the absence of culturally adapted and validated assessments. In this study, we identify the risk factors of memory deterioration by applying Correlation, Regression Analysis and Random Forest algorithms to the MYNAH cohort (Mysore Studies of Natal effect on Ageing and Health), a study carried out at the Epidemiological Research Unit, CSI Holdsworth Memorial Hospital, Mysuru, South India. Correlation is used to measure how strongly one parameter is associated with another, which helps in identifying the risk factors of memory deterioration. Regression analysis estimates the relationships among the parameters used for disease prediction. The Random Forest (random decision forest) algorithm is an ensemble learning method for classification, regression and other tasks: it constructs a multitude of decision trees at training time and adds randomness to the model by searching for the best parameter among a random subset of parameters. In a classification problem, the ensemble of simple trees votes for the most popular class. In a regression problem, the responses of the trees are averaged to obtain an estimate of the dependent parameter. Tree ensembles have led to significant improvements in prediction accuracy. This work enables health care organizations to use Electronic Health Records to analyse the sections of the population prone to various diseases and to educate people about disease risk factors, so that effective therapy can be provided at the right time and place.
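The three techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (pearson, fit_line, forest_classify, forest_regress) and all data values are hypothetical and are not taken from the MYNAH cohort. The sketch shows a Pearson correlation, a single-predictor least squares regression, and only the aggregation step of a random forest (majority vote for classification, averaging for regression), assuming the per-tree predictions are already available.

```python
from collections import Counter
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def forest_classify(tree_votes):
    """Classification: return the class label predicted by the most trees."""
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_outputs):
    """Regression: return the mean of the per-tree responses."""
    return mean(tree_outputs)


# Made-up illustrative values (assumption), not data from the study.
age = [60, 65, 70, 75, 80]
memory_score = [28, 27, 25, 24, 21]

r = pearson(age, memory_score)
slope, intercept = fit_line(age, memory_score)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")

# Hypothetical predictions from five trees for a single participant.
label = forest_classify(["impaired", "normal", "impaired", "impaired", "normal"])
estimate = forest_regress([22.0, 24.0, 23.0, 21.0, 25.0])
print(label, estimate)
```

A complete random forest would also train each tree on a bootstrap sample and restrict each split to a random subset of parameters; that training step is omitted here to keep the sketch short.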

Published

2018-05-08

Section

Research Articles

How to Cite

[1]
Vanishri Arun, Rakshitha Hathwar, Keerthana Basavaraj, Sonali C H, Chaitra J P, Dr. Murali Krishna, Dr. Arun Kumar B V, "Risk Factor Analysis of Diseases Using Machine Learning Techniques", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 4, Issue 6, pp. 318-324, May-June 2018.