Disease Prediction by Machine Learning Over Big Data Lung Cancer

Authors

  • T. Shanmuga Priya  Department of Computer Science, Alagappa University, Karaikudi, Tamil Nadu, India
  • Dr. T. Meyyappan  Department of Computer Science, Alagappa University, Karaikudi, Tamil Nadu, India

DOI:

https://doi.org/10.32628/CSEIT206669

Keywords:

Data mining, Lung Cancer, Naive Bayes (NB), Support Vector Machine (SVM), Random Forest

Abstract

Lung cancer is one of the deadliest diseases in the world today. It is caused by genetic factors, environmental factors, modern lifestyle, or a combination of these. Lung cancer has become a leading cause of death in developed countries. The most effective way to reduce lung cancer deaths is to detect the disease early. Early detection is not easy, but lung cancer that is detected early can be cured. Many studies have addressed lung cancer prediction, applying a variety of data mining approaches and algorithms. Each has limitations, such as a lack of intelligent prediction and shortcomings in system structure, which motivated us to take up the problem and implement the Data Mining Based Cancer Prediction System (DMBCPS). This paper proposes a lung cancer prediction system based on data mining. The system is validated by comparing its predicted results with patients' prior medical information, and it was analyzed using the Weka tool. We analyzed lung cancer prediction using the classification algorithms Naive Bayes, SVM, and Random Forest. The dataset has 782 instances and 31 attributes. The main aims of this paper are to provide early warning to users and to analyze the performance of the classification algorithms.
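As a rough illustration of the kind of classification and validation the abstract describes, the sketch below trains a small categorical Naive Bayes classifier and compares its predictions with known labels. It is not the authors' implementation: the paper used the Weka tool, and the attribute names (smoker, chronic cough, family history), the records, and the class labels here are all invented for the example and do not come from the 782-instance dataset.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes for categorical attributes with Laplace (add-one) smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        n_features = len(rows[0])
        # value_counts[(label, feature_index)][value] -> number of occurrences
        self.value_counts = defaultdict(Counter)
        # Distinct values seen per feature, needed for the smoothing denominator
        self.feature_values = [set() for _ in range(n_features)]
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.value_counts[(label, j)][value] += 1
                self.feature_values[j].add(value)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the sum of smoothed log likelihoods
            score = math.log(self.class_counts[c] / total)
            for j, value in enumerate(row):
                count = self.value_counts[(c, j)][value]
                denom = self.class_counts[c] + len(self.feature_values[j])
                score += math.log((count + 1) / denom)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


def accuracy(model, rows, labels):
    """Fraction of records whose predicted label matches the known label."""
    correct = sum(model.predict(r) == y for r, y in zip(rows, labels))
    return correct / len(labels)


# Hypothetical records: (smoker, chronic_cough, family_history) -> risk label.
# These values are made up for illustration only.
train_rows = [
    ("yes", "yes", "yes"), ("yes", "yes", "no"),
    ("yes", "no", "yes"), ("no", "yes", "yes"),
    ("no", "no", "no"), ("no", "no", "yes"),
    ("no", "yes", "no"), ("yes", "no", "no"),
]
train_labels = ["high", "high", "high", "high", "low", "low", "low", "low"]

# Held-out records with known outcomes, standing in for prior medical information
test_rows = [("yes", "yes", "yes"), ("no", "no", "no")]
test_labels = ["high", "low"]

model = CategoricalNaiveBayes().fit(train_rows, train_labels)
print("hold-out accuracy:", accuracy(model, test_rows, test_labels))
```

The same hold-out comparison could be repeated for other classifiers, such as SVM or Random Forest, to obtain the kind of performance analysis the abstract mentions.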

Published

2021-01-30

Section

Research Articles

How to Cite

[1]
T. Shanmuga Priya, Dr. T. Meyyappan, "Disease Prediction by Machine Learning Over Big Data Lung Cancer," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 7, Issue 1, pp. 16-24, January-February 2021. Available at doi: https://doi.org/10.32628/CSEIT206669