System Design Space

Updated: March 25, 2026 at 3:00 AM

Consensus: Paxos and Raft

Level: expert

How systems agree on a single value: quorums, two-phase Paxos, and leader-centric Raft.

Consensus is best seen not as a badge of engineering sophistication, but as an expensive tool for the cases where the system truly needs one agreed state history.

In practice, this chapter helps separate situations where Raft or Paxos are justified from those where the team is only buying extra latency and complexity without gaining anything critical in correctness.

In interviews and architecture discussions, it is strongest when you talk less about algorithm names and more about the cost of quorum: write latency, recovery behavior, leader failover, and debugging overhead.

Practical value of this chapter

Design in practice

Clarifies where consensus is truly required versus where simpler guarantees are enough.

Decision quality

Helps choose Raft/Paxos with awareness of workload shape and fault-tolerance goals.

Interview articulation

Enables concise explanation of quorum, term, and commit behavior.

Risk and trade-offs

Highlights consensus cost: write latency, recovery complexity, and debugging overhead.

Related book

Designing Data‑Intensive Applications

The chapter on consensus and replication is a must-read for understanding Paxos/Raft.


Consensus lets a distributed system agree on a single value despite failures, delays, and network partitions. Without consensus, you cannot reliably elect a leader, synchronize metadata, or provide linearizable writes to a quorum cluster.
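The quorum rule behind both Paxos and Raft is simple arithmetic: a quorum is any majority of nodes, and any two majorities of the same cluster must overlap in at least one node, which is how a new round can learn about previously committed values. A minimal sketch (function names are illustrative, not from any library):

```python
from itertools import combinations

def majority_size(n: int) -> int:
    """Smallest quorum size for an n-node cluster."""
    return n // 2 + 1

def quorums_intersect(q1: set, q2: set) -> bool:
    return len(q1 & q2) > 0

# Every pair of majorities in a 5-node cluster shares at least one node.
nodes = {"a", "b", "c", "d", "e"}
k = majority_size(len(nodes))
all_majorities = [set(c) for c in combinations(sorted(nodes), k)]
assert all(quorums_intersect(x, y)
           for x in all_majorities for y in all_majorities)
print(majority_size(5), majority_size(4))  # 3 3
```

Note that a 4-node cluster still needs 3 nodes per quorum, so an even cluster size buys no extra fault tolerance over the next smaller odd size.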

Foundation

TCP protocol

Consensus relies on network rounds and stable transport.


Where is consensus needed?

Leader election

Cluster coordination and coordinator/primary assignment.

Metadata

Configurations, membership, data schema and routing.

Consistent writes

Linearizable write operations across replicas.

Paxos

Paxos is Lamport's classic consensus algorithm. It guarantees that a single value is chosen through a two-phase protocol and a quorum of acceptors. Real systems usually run Multi-Paxos with a dedicated leader.

Interactive Paxos diagram

Select a step to highlight active participants and messages.

Proposer
Acceptors
Learners
Prepare(n): to an acceptor quorum
Promise(n, v?): reply to the proposer
Accept(n, v): value proposal
Accepted(n, v): notify learners
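The four messages above can be sketched as a single-decree Paxos round. This is an illustrative toy, not production code: the `Acceptor` and `propose` names are hypothetical, there is no networking, and failure handling is reduced to counting quorum replies.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1          # highest ballot promised so far
        self.accepted = (-1, None)  # (ballot, value) last accepted

    def prepare(self, n):
        """Phase 1: Promise(n) if n beats every earlier promise."""
        if n > self.promised:
            self.promised = n
            return ("promise", self.accepted)
        return ("nack", None)

    def accept(self, n, v):
        """Phase 2: accept (n, v) unless a higher ballot was promised."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, v)
            return "accepted"
        return "nack"

def propose(acceptors, n, value):
    """Run both phases against a quorum; return the chosen value or None."""
    quorum = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    oks = [acc for tag, acc in replies if tag == "promise"]
    if len(oks) < quorum:
        return None
    # Key safety rule: if any acceptor already accepted a value,
    # re-propose the one with the highest ballot instead of our own.
    _, prior = max(oks)
    v = prior if prior is not None else value
    acks = [a.accept(n, v) for a in acceptors]
    return v if acks.count("accepted") >= quorum else None

acceptors = [Acceptor() for _ in range(3)]
print(propose(acceptors, n=1, value="x"))  # x
# A later proposer with a higher ballot learns and keeps the chosen value:
print(propose(acceptors, n=2, value="y"))  # x
```

The second call shows why Paxos is safe: once "x" is chosen by a quorum, any higher-ballot proposer discovers it during Prepare and must re-propose it.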

Multi‑Paxos

An optimization of Paxos for a stream of commands: the leader runs the Prepare phase once, then performs only the Accept round for each log entry.

What it provides

  • Fewer network rounds per entry
  • Higher throughput
  • A stable leader makes progress easy under conflicts

Multi‑Paxos operation modes

Select a phase to see how the message flow changes.

Steady state

Each new log entry uses only the Accept round, reducing RTT.

Accept(n, v) to quorum
Accepted responses
Higher throughput
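The throughput gain is easy to quantify. Under a stable leader, basic Paxos pays two rounds (Prepare + Accept) per entry, while Multi-Paxos amortizes a single Prepare over the leader's whole tenure. A back-of-the-envelope sketch (assuming no leader changes and one round trip per phase):

```python
def rounds_basic_paxos(entries: int) -> int:
    # Prepare + Accept for every single entry.
    return 2 * entries

def rounds_multi_paxos(entries: int) -> int:
    # One Prepare at leader election, then one Accept per entry.
    return 1 + entries

print(rounds_basic_paxos(100), rounds_multi_paxos(100))  # 200 101
```

For long-lived leaders the cost per entry approaches one round trip, which is why Multi-Paxos (and Raft, which has the same steady-state shape) is the form deployed in practice.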

Raft

Raft was designed as an "understandable" consensus algorithm. It decomposes the problem into leader election, log replication, and safety, which makes the protocol easier to explain and implement.

Raft node states

Switch state to see how node role and message flow change.

Leader

Accepts client commands and replicates them to followers.

Client command
AppendEntries
Commit index
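Raft's election logic rests on two rules: a node grants at most one vote per term, and a higher term always forces it to step down and reset its vote. A simplified sketch of the vote-granting rule (the `Node` class is illustrative, and the "log up-to-date" check from the real protocol is omitted):

```python
class Node:
    def __init__(self):
        self.current_term = 0
        self.voted_for = None   # candidate id voted for in current_term

    def request_vote(self, term, candidate_id):
        if term > self.current_term:
            # Newer term seen: adopt it and forget any earlier vote.
            self.current_term = term
            self.voted_for = None
        if term < self.current_term:
            return False        # stale candidate
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False            # already voted for someone else this term

followers = [Node() for _ in range(4)]
votes = sum(f.request_vote(term=1, candidate_id="n1") for f in followers)
# The candidate votes for itself; it needs a majority of the 5-node cluster.
print(votes + 1 >= 3)  # True
```

Because each node votes once per term, two candidates can never both collect a majority in the same term, which guarantees at most one leader per term.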

Paxos vs Raft

Paxos

  • More difficult to understand and implement
  • More formal and "academic"
  • Often hidden behind Multi‑Paxos

Raft

  • Easier to explain to the team and support
  • Explicit leader-centered model
  • Used in etcd, Consul, CockroachDB

Key takeaways

  • Consensus is needed to agree on a single value despite failures.
  • Paxos is foundational but complex; Raft is engineering-friendly.
  • Both protocols require a majority quorum and tolerate failures of a minority of nodes.

Consensus makes the system more reliable, but increases latency and complexity. Use it only where strict consistency is truly needed.
