Enhancing Medicare Fraud Detection through ML: Addressing Class Imbalance with SMOTE-ENN

Cheni Sruneethi; P Chandra Prakash

Authors

Cheni Sruneethi, Assistant Professor, Department of MCA, Annamacharya Institute of Technology & Sciences, Tirupati, Andhra Pradesh, India
P Chandra Prakash, Post Graduate, Department of MCA, Annamacharya Institute of Technology & Sciences, Tirupati, Andhra Pradesh, India

Keywords:

SMOTE-ENN, XGBoost, AdaBoost, LGBM, Decision Tree, Logistic Regression, Random Forest classifier

Abstract

Medicare fraud causes considerable financial losses and undermines the integrity of healthcare systems. Conventional fraud detection approaches often fall short because fraudsters use complex, constantly changing tactics. This project aims to improve Medicare fraud detection with machine learning (ML), with particular attention to class imbalance: fraudulent claims are far fewer than legitimate ones. We develop a classification system that distinguishes fraudulent from non-fraudulent Medicare claims using six ML algorithms: XGBoost, AdaBoost, LightGBM, Decision Tree, Logistic Regression, and Random Forest. To address class imbalance, we apply SMOTE-ENN, which combines the Synthetic Minority Over-sampling Technique (SMOTE) with Edited Nearest Neighbours (ENN) cleaning to balance the dataset and improve model performance. Our experiments show that SMOTE-ENN substantially improves the detection rate of fraudulent claims. Evaluating the models on both the imbalanced and the balanced datasets, we observe notable improvements in accuracy, precision, recall, and F1-score. Overall, our findings suggest that combining SMOTE-ENN with ensemble learning techniques is an effective approach to Medicare fraud detection.
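To make the resampling step concrete, the following is a minimal sketch of the SMOTE-ENN idea in plain Python. It is not the authors' implementation. The two-feature toy data, the class sizes, the helper names (`nearest`, `smote`, `edited_nearest_neighbours`), and the choice of k = 3 are all assumptions made for illustration, and the ENN step here uses a simple majority-vote rule. In practice a maintained library such as imbalanced-learn would normally be used, and its defaults may differ from this sketch.

```python
import math
import random
from collections import Counter


def nearest(point, pool, k):
    """Return the indices of the k points in pool closest to point."""
    order = sorted(range(len(pool)), key=lambda i: math.dist(point, pool[i]))
    return order[:k]


def smote(X, y, minority, k=3, rng=None):
    """Oversample the minority class of a two-class dataset to equal size.

    Assumes the minority class has at least two samples.
    """
    rng = rng or random.Random(0)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    needed = (len(X) - len(minority_pts)) - len(minority_pts)
    synthetic = []
    for _ in range(max(needed, 0)):
        base = rng.choice(minority_pts)
        # Skip the first neighbour: it is the base point itself (distance 0).
        neighbours = nearest(base, minority_pts, k + 1)[1:]
        other = minority_pts[rng.choice(neighbours)]
        gap = rng.random()
        # New point lies on the segment between base and its neighbour.
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return X + synthetic, y + [minority] * len(synthetic)


def edited_nearest_neighbours(X, y, k=3):
    """Drop samples whose label disagrees with most of their k neighbours."""
    keep = []
    for i, point in enumerate(X):
        others = [j for j in range(len(X)) if j != i]
        closest = sorted(others, key=lambda j: math.dist(point, X[j]))[:k]
        votes = Counter(y[j] for j in closest)
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: 40 "legitimate" (0) and 6 "fraudulent" (1) claims.
rng = random.Random(42)
legit = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
fraud = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(6)]
X = legit + fraud
y = [0] * 40 + [1] * 6

X_over, y_over = smote(X, y, minority=1, k=3, rng=rng)
X_res, y_res = edited_nearest_neighbours(X_over, y_over, k=3)

print("original   :", Counter(y))
print("after SMOTE:", Counter(y_over))
print("after ENN  :", Counter(y_res))
```

In a real pipeline the resampling would be applied to the training split only, so that the test set keeps the original class distribution and the reported precision, recall, and F1-score reflect performance on realistically imbalanced data.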


