Enhancing Predictive Models in E-Commerce: A Comparative Analysis using LightGBM, CatBoost and Prophet for Sales and Demand Forecasting
DOI:
https://doi.org/10.32628/CSEIT2612139
Keywords:
E-commerce Analytics, Sales Forecasting, Demand Prediction, LightGBM, CatBoost, Prophet, Time-Series Analysis, Inventory Optimization, Machine Learning, Revenue Prediction, Amazon Sales Dataset, Trend Analysis
Abstract
The rapid growth of e-commerce platforms has increased the need for accurate predictive models to forecast sales, understand customer demand, and optimize inventory management. This study presents a comparative analysis of three predictive approaches (LightGBM, CatBoost, and Prophet) using the Amazon Sales Dataset to improve sales and revenue forecasting in an e-commerce environment. The dataset consists of product-level attributes such as category, pricing, ratings, and textual reviews, which provide high-dimensional structured information for predictive modeling and trend analysis. LightGBM and CatBoost are used to capture complex relationships among product features and demand indicators, while Prophet is used to analyze time-series patterns, including seasonality, promotional spikes, and festival-based sales variations. The experiments are implemented in the Python ecosystem on a Windows-based system and evaluated with classification metrics (Accuracy, Precision, Recall, F1-score, AUC-ROC) and forecasting metrics (RMSE, MAPE). The findings demonstrate that accurate sales prediction models significantly improve inventory planning, demand forecasting, and peak-season preparedness. The models capture seasonal fluctuations, discount-driven demand surges, and customer engagement factors, enabling e-commerce platforms to increase stock availability during high-demand periods and improve operational efficiency. This research highlights the importance of machine learning-driven forecasting in enhancing revenue prediction, optimizing resource allocation, and supporting strategic decision-making. The analysis further shows that LightGBM and CatBoost perform strongly on structured data, while Prophet excels at identifying time-based trends and seasonal patterns.
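The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names and the sample numbers are hypothetical, and it uses only the Python standard library to show the usual definitions of RMSE, MAPE, and Precision/Recall/F1 for a binary label.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    MAPE is undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actual values")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical numbers for illustration only (not taken from the study).
actual_units = [100, 200, 400]
forecast_units = [110, 180, 400]
print(f"RMSE: {rmse(actual_units, forecast_units):.2f}")
print(f"MAPE: {mape(actual_units, forecast_units):.2f}%")

high_demand_true = [1, 1, 0, 0, 1]
high_demand_pred = [1, 0, 0, 1, 1]
print(precision_recall_f1(high_demand_true, high_demand_pred))
```

RMSE is expressed in the same units as the target and penalizes large misses more heavily, whereas MAPE is scale-free, which is one reason the two are often reported together when comparing forecasts across products with very different sales volumes.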
License
Copyright (c) 2026 International Journal of Scientific Research in Computer Science, Engineering and Information Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.