Real-Time Data Processing Pipelines: Enhancing Decision Intelligence with Apache Spark and Kafka

Authors

  • Nishitha Reddy Nalla, Software Application Engineer, WORKDAY INC, GA, USA

DOI:

https://doi.org/10.32628/CSEIT25112356

Keywords:

Real-time data processing, Apache Spark, Kafka, decision intelligence, big data, stream processing, data pipeline, distributed systems

Abstract

In an era of data-driven workflows, real-time data processing and analysis have become essential to staying competitive. The rapid growth in data volume, together with the need to act on that data as quickly as possible, has led organizations to adopt real-time data processing pipelines. This paper discusses how Apache Spark and Kafka, two modern big data technologies, promote decision intelligence by enabling fast data-processing pipelines. We cover the integration of Apache Spark and Kafka, stream processing concepts, and the challenges and best practices of implementing real-time data pipelines. We also outline the implications and practical applications for developers working with these technologies, alongside case studies from sectors including finance, healthcare, and e-commerce.

Published

03-03-2025

Section

Research Articles