Knowledge graphSettings

Updated: April 30, 2026 at 10:03 PM

CAP theorem

medium

Foundational chapter on the CAP theorem: network partitions, the consistency-versus-availability choice, common misconceptions, and the relationship to ACID and BASE.

CAP matters not as a classroom triangle, but as a reminder of an ugly fact: once the network partitions, a clean architecture turns into a forced choice between conflicting properties.

In practice, this chapter helps teams leave the abstract debate behind and decide what will be lost first: parts of writes, parts of reads, parts of the user flow, or operational predictability.

In interviews and architecture reviews, it is strongest when you focus not on the definition of CAP, but on the consequences for APIs, UX, service degradation, and recovery rules under network faults.

Practical value of this chapter

Design in practice

Helps choose CP or AP behavior by domain scenario instead of treating the decision as pure theory.

Decision quality

Clarifies when stale answers are acceptable and when freshness is critical to the business.

Interview articulation

Makes it easier to explain why a specific trade-off fits the workload in front of you.

Risk and trade-offs

Makes partition consequences explicit: API degradation, user impact, and operational guardrails.

Original

Telegram: Book Cube

Original post with a walkthrough of the CAP theorem.

Перейти на сайт

The CAP theorem is more than twenty-five years old, yet it is still often reduced to the shallow slogan of “pick two out of three.” Using Eric Brewer's own retrospective, this chapter separates the original idea from the meme and connects it back to practical distributed-system behavior.

Foundation

TCP protocol

Network partitions and latency directly shape the system behavior discussed by CAP.

Читать обзор

What the CAP theorem says

The classical formulation describes the limits of a distributed system at the moment a network partition appears. It does not say that the whole system permanently picks two properties forever. It says that during a loss of connectivity, the system must make a concrete choice about what it will protect first.

"The CAP theorem states that any networked shared-data system can have at most two of three desirable properties"
CConsistency

Reads observe one up-to-date state, as if the system were working from a single coherent copy of the data.

AAvailability

Every request to a healthy node receives a response, even if the answer may not reflect the newest state.

PPartition tolerance

The system keeps operating even when one part of the cluster temporarily loses contact with another part of the network.

Foundation

OSI model

Useful for reasoning about the network layers and the mechanisms that can produce partitions.

Читать обзор

CAP visualization

Interactive CAP theorem diagram

Click a property or system type to inspect how the trade-offs connect.

CAPCPAPCA
Consistency

Linearizable view of data

Availability

Responds to every request

Partition tolerance

Keeps working through network splits

System types

Important nuance

In real distributed systems partition tolerance is not optional. Network partitions are inevitable, so the practical question is how the system behaves on the CP/AP spectrum.

Presentation

PODC 2000

Eric Brewer's original symposium presentation.

Перейти на сайт

How the theorem emerged

1998

Eric Brewer formulates the idea that will later become known as the CAP theorem.

1999

The idea enters wider technical discussion through "Harvest, Yield, and Scalable Tolerant Systems".

2000

Brewer presents the thesis at the Symposium on Principles of Distributed Computing.

2002

Seth Gilbert and Nancy Lynch formally prove the theorem and sharpen the meaning of consistency through linearizability.

Related chapter

DDIA

Adds the deeper context on consistency, replication, and the operational consequences of network failures.

Читать обзор

Common misconceptions

CAP is often remembered as “pick two out of three,” but Brewer points out several important corrections that make the theorem far more useful in practice.

1

Partitions are rare, but not optional

Most of the time a system can aim to preserve both consistency and availability. The theorem becomes relevant the moment connectivity breaks and the system must choose what to protect first.

2

The trade-off is local, not global

The choice is made per operation, per data path, or per user flow. One part of the system may insist on fresh reads while another can survive on stale data.

3

The properties are not binary switches

Availability is measured across a range, consistency comes in multiple levels, and partitions may look like an outright disconnection or simply an intolerably slow network.

What to do during a network partition

In normal operation the system tries to preserve both consistency and availability. Once connectivity breaks, it needs a clear failure-mode playbook.

Detect the problem

Recognize that communication between parts of the cluster is lost or too unstable.

Constrain operations

Decide explicitly which reads and writes should be rejected, degraded, or simplified.

Recover the state

Reconcile data, resolve conflicts, and run compensations after connectivity returns.

Related chapter

Database Internals

Provides the deeper background on transactions, isolation, and storage behavior.

Читать обзор

How CAP relates to ACID and BASE

CAP is often mixed together with ACID and BASE. The comparison is useful, but the three concepts answer different questions.

ACID

AAtomicity
CConsistency
IIsolation
DDurability

In ACID, consistency is about business invariants and transaction correctness, not the distributed-systems meaning used by CAP.

BASE

BABasic Availability
SSoft state
EEventual consistency

BASE describes an architectural style that favors availability and accepts temporary divergence between copies of data.

Why latency still matters

Latency is not named explicitly in the original theorem, but in practice it is how a system discovers that the network is behaving like a partition. The real decision is often made through timeouts and response deadlines.

Reject the operation

Reduce availability in order to preserve stricter consistency.

Continue the operation

Preserve availability while accepting the risk of stale or divergent data.

Key insight

From an engineering perspective, a partition often shows up as a timeout. The stricter the response-time boundary, the more often the system will behave as if the network were partitioned, even if the real problem is “just” a very slow link.

What to remember

  • A network partition is not a global flag: different nodes may observe the situation differently.
  • During a partition, the system is not choosing “two properties out of three” in the abstract. It is choosing a degradation strategy for a specific operation.
  • Aggressive timeouts make the system more sensitive to slow networks, even when there is no physical disconnection.
  • CAP is useful when it helps reason about API behavior, user impact, and recovery procedures, not as a decorative diagram.

Original

Telegram: Book Cube

Post covering the formal proof of the CAP theorem.

Перейти на сайт

Formal proof

In 2002, Seth Gilbert and Nancy Lynch published the paper that turned Brewer's conjecture into a formal theorem.

"It is impossible for a web service to provide the following three guarantees: consistency, availability, partition-tolerance"

How the properties are formalized

1. Consistency

The authors define it through linearizability: all operations look as if they happened instantaneously in one total order between invocation and response.

"Under this consistency guarantee, there must exist a total order on all operations such that each operation looks as if it were completed at a single instant."

In other words, the system behaves as though it were running on one node and processing operations one at a time, even if multiple replicas exist internally.

2. Availability

Every request that reaches a non-failing node must complete with a response. This is formally a weak definition because there is no upper bound on response time. Yet under partitions it becomes strong in practice: even with lost messages, the request is still expected to terminate.

3. Partition tolerance

The model allows an arbitrary number of messages to be lost between parts of the cluster:

"When a network is partitioned, all messages sent from nodes in one component of the partition to nodes in another component are lost."

Proof

1Theorem for asynchronous systems

Theorem 1:

It is impossible in the asynchronous network model to implement a read/write data object that guarantees Availability and Atomic consistency in all fair executions (including those in which messages are lost).

In an asynchronous system, availability and atomic consistency cannot both be guaranteed for all fair executions if message loss is possible.

The contradiction argument is straightforward:

  1. Assume such an algorithm exists.
  2. Consider a system with two nodes, G₁ and G₂.
  3. Assume all messages between them are lost.
  4. A write happens on G₁.
  5. A later read happens on G₂.
  6. G₂ cannot return the value written on G₁.
  7. The system violates either consistency or availability.

2Theorem for partially synchronous systems

Theorem 2:

It is impossible in the partially synchronous network model to implement a read/write data object that guarantees Availability and Atomic consistency in all executions (even those in which messages are lost).

Real systems are closer to partially synchronous ones: nodes have clocks, timeouts, and operational expectations about delay. Even in that stronger model, the result remains the same.

Note: The rest of the paper discusses Delayed-t consistency in partially synchronous systems, a compromise that relaxes freshness guarantees.

Related chapters

Enable tracking in Settings