Real-Time Data Processing Pipelines: Enhancing Decision Intelligence with Apache Spark and Kafka
DOI:
https://doi.org/10.32628/CSEIT25112356
Keywords:
Real-time data processing, Apache Spark, Kafka, decision intelligence, big data, stream processing, data pipeline, distributed systems
Abstract
In an era of data-driven workflows, real-time data processing and analysis are essential to staying competitive. The rapid growth in the volume of generated data, together with the need to act on it as quickly as possible, has led organizations to adopt real-time data processing pipelines. In this paper, we discuss how Apache Spark and Kafka, two modern big data technologies, promote decision intelligence by enabling fast data-processing pipelines. We cover the integration of Apache Spark and Kafka, stream processing concepts, and the challenges and best practices of implementing real-time data pipelines. We also outline the implications and practical applications for developers working with these technologies, alongside case studies from sectors including finance, healthcare, and e-commerce.
License
Copyright (c) 2025 International Journal of Scientific Research in Computer Science, Engineering and Information Technology

This work is licensed under a Creative Commons Attribution 4.0 International License.