Reaching Consensus for Async Distributed Systems: A Guide to Harmonized Data Decision-Making

Authors

  • Gnana Teja Reddy  Software Engineer, Google, USA
  • Nelavoy Rajendra  San Francisco Bay Area, USA

Keywords:

Consensus, Distributed Systems, Fault Tolerance, Paxos, Raft, Blockchain, Consistency, Byzantine Fault Tolerance (BFT).

Abstract

Consensus algorithms are central to distributed systems, where they are widely used in asynchronous environments to provide fault tolerance and data consistency. These systems require that multiple nodes, often spread across large geographic areas, agree on a common view or value even in the presence of hardware failures, network failures, or Byzantine failures, in which nodes behave arbitrarily or maliciously. This paper discusses the consensus mechanisms that are essential in cloud environments, blockchains, and real-time data management. It reviews Paxos, Raft, and Byzantine Fault Tolerance (BFT) and discusses their working models, advantages, and challenges. Paxos is safe under crash failures but can be difficult to implement. Raft simplifies leader election and log replication, which makes it practical for real-world applications, while BFT protocols limit the influence of malicious actors in security-sensitive settings. Issues that can hinder the consensus process include network partitions, leader elections, and security threats. The paper provides a comprehensive analysis of technical approaches to consensus, including quorum-based decision-making, conflict resolution, and observability practices. It also traces developments in consensus to show its importance for the integrity and availability of distributed applications such as distributed databases, blockchain systems, and microservices orchestration. Emerging trends such as HCM, Layer 2 solutions like Rollups and State Channels, and serverless infrastructure point to the continued evolution of the field. This guide is intended for engineers, architects, and researchers who want to use consensus to build systems that meet the operational requirements of distributed environments.

Published

2020-12-30

Section

Research Articles

How to Cite

[1]
Gnana Teja Reddy, Nelavoy Rajendra, "Reaching Consensus for Async Distributed Systems: A Guide to Harmonized Data Decision-Making", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 6, Issue 6, pp. 394-418, November-December 2020.