Impossibility of Distributed Consensus with One Faulty Process

Abstract

📜 Abstract

The consensus problem involves an asynchronous system of processes, some of which may be unreliable. The problem is to devise an algorithm in which the reliable processes agree on a binary value. This paper shows that every protocol for this problem has the possibility of nontermination, even with only one faulty process. This impossibility result stems from certain scenarios referred to as "valent" and "bivalent" configurations and applies to a wide class of failure models.

Description

✨ Summary

The paper titled “Impossibility of Distributed Consensus with One Faulty Process” by Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson is a landmark contribution to the field of distributed systems and computer science theory. It proves that distributed consensus is impossible in an asynchronous network if even a single process is faulty. This groundbreaking result is widely known as the FLP impossibility and has influenced a large body of subsequent research and practical applications in distributed computing.

The work disrupted the way consensus protocols were understood and paved the way for significant advancements in how distributed systems are designed and analyzed. Researchers have built upon this foundational result to develop Byzantine fault tolerance and consensus algorithms under different network assumptions, such as partial synchrony.

Key papers that reference this work include “Practical Byzantine Fault Tolerance” by Miguel Castro and Barbara Liskov (https://pmg.csail.mit.edu/papers/osdi99.pdf), which tackles practical implications of consensus in real-world systems, and “Viewstamped Replication: A New Primary Copy Method to Support Highly-Available Distributed Systems” (https://dspace.mit.edu/handle/1721.1/6455), which expands on strategies for fault-tolerant operation in distributed environments. Additionally, Leslie Lamport’s research on Paxos (http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf) provides a mechanism for achieving consensus in asynchronous networks, influenced by the impossibility result.

Despite its initial perception as a limitation, the impossibility result profoundly shaped distributed systems’ theory, leading to innovative fault-tolerant algorithms that address its constraints under different assumptions. The FLP impossibility has cemented its place as a critical theoretical baseline in distributed systems literature.