System Design Space

Updated: May 3, 2026 at 6:53 AM

Jepsen and consistency models

expert

How Jepsen validates distributed-system guarantees: operation histories, Nemesis, network partitions, linearizability, serializability, and real-world anomalies.

Jepsen matters because it tests what a system actually does under failure, not what the documentation promises. In distributed systems, that is the real moment of truth.

In real work, this chapter helps teams define testable consistency properties for concrete workloads and avoid trusting vendor claims blindly where the cost of being wrong is high.

In interviews and architecture discussions, it is especially useful when you need to show the gap between claimed and actual guarantees under network faults, delay, and coordination loss.

Practical value of this chapter

Design in practice

Promotes guarantee validation before incidents instead of trusting vendor claims.

Decision quality

Shows how to define testable consistency properties for real workloads.

Interview articulation

Strengthens answers with practical linearizability/serializability testing strategies.

Risk and trade-offs

Exposes gaps between claimed and actual guarantees under network failures.

Official website

Jepsen.io

A project that tests distributed systems for correctness under failure.


Jepsen is an independent distributed-systems analysis and testing project created by Kyle Kingsbury, also known as Aphyr. It has uncovered critical correctness bugs in dozens of popular databases and became a practical standard for validating consistency claims.

Foundation

TCP protocol

Jepsen models network failures and partitions at the transport layer.


What is Jepsen?

Testing tool

Jepsen is a Clojure library for testing distributed systems. It generates load, injects network partitions, kills processes, shifts clocks, and checks whether the stated guarantees still hold.

Tags: Clojure, open source, black-box testing

Report series

Each analysis is published as a detailed report: test setup, detected anomalies, vendor response, and follow-up fixes. The reports became required reading for distributed-system architects.

Systems covered: MongoDB, PostgreSQL, Cassandra, etcd, +40 others

Related chapter

CAP theorem

A fundamental limitation of distributed systems.


Why Jepsen matters

Testing marketing claims

Many databases promise strong consistency or ACID semantics but do not always preserve those guarantees in practice. Jepsen has shown the loss of acknowledged writes in MongoDB, dirty reads in RethinkDB, and data loss in Redis Cluster even without process crashes.

A shared language for guarantees

The project made the hierarchy of consistency models easier to reason about and separated transaction isolation in databases from linearizability in distributed operations.

Better systems

Public reports push vendors to fix bugs. CockroachDB, TiDB, and YugabyteDB, for example, worked closely with Jepsen to test their stated isolation guarantees.

Source

Jepsen: Consistency Models

Interactive consistency-model hierarchy.


Consistency-model hierarchy

Jepsen collects consistency models into a hierarchy and shows where two traditions meet: transaction isolation in relational databases and linearizability for distributed operations.

Figure: Consistency-model hierarchy (source: Jepsen.io). Two branches: transaction serializability and linearizability for distributed operations.

Top, joining both branches: Strict Serializable

Transaction isolation: Serializable, Repeatable Read, Snapshot Isolation, Cursor Stability, Monotonic Atomic View, Read Committed, Read Uncommitted

Distributed reads/writes: Linearizable, Sequential, Causal, PRAM, Writes Follow Reads, Monotonic Reads, Monotonic Writes, Read Your Writes

Unavailable under partition

Unavailable during network faults. Nodes pause operations to preserve safety guarantees.

Sticky available

Available on healthy nodes if clients keep working with the same servers.

Available on healthy nodes

Available on all healthy nodes, even during full network partitions.

Key insight

Serializable comes from transactional SQL systems (transaction isolation). Linearizable comes from distributed systems (atomic reads/writes). They converge at the top in Strict Serializable, the strictest consistency model.

About Jepsen: Jepsen runs failure-oriented tests for distributed databases and validates their stated consistency guarantees. Many popular systems (Cassandra, MongoDB, CockroachDB, Redis) have gone through Jepsen analysis.

Related chapter

PACELC theorem

Trade-offs between latency and consistency.


Two branches of consistency

Transaction serializability

This branch comes from relational databases and describes transaction isolation levels, from reading uncommitted data to serializable execution.

Focus:

How transactions interact and which anomalies are allowed: dirty reads, phantom reads, and lost updates.
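As an illustration of one of these anomalies, here is a minimal Python sketch of a lost update. The function names and the deposit scenario are hypothetical, and the "database" is just an integer, so treat this as a picture of the interleaving rather than a model of any real engine.

```python
def run_serial(balance: int, deposits: list[int]) -> int:
    """Each transaction reads the balance and writes it back before the next starts."""
    for amount in deposits:
        balance = balance + amount
    return balance


def run_without_isolation(balance: int, deposits: list[int]) -> int:
    """Every transaction reads first, then every transaction writes.

    All of them compute their result from the same stale snapshot, so each
    write silently overwrites the one before it.
    """
    snapshots = [balance for _ in deposits]          # concurrent reads
    for snapshot, amount in zip(snapshots, deposits):
        balance = snapshot + amount                  # blind overwrite
    return balance


serial_result = run_serial(100, [10, 20])
interleaved_result = run_without_isolation(100, [10, 20])
print(serial_result, interleaved_result)
```

Both deposits are reported as successful in the second run, yet only the last one survives in the final balance. An isolation level that forbids lost updates must either reject or reorder one of the two transactions.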

Operation linearizability

This branch comes from distributed systems and describes atomic reads and writes across multiple nodes.

Focus:

Whether a distributed system looks like a single node where each operation has a precise place between invocation and response.

Strict serializability = linearizability + serializability

Strict serializability sits at the top of the hierarchy. It combines both models: transactions execute serializably, and the resulting serial order respects real-time order. Systems such as Google Spanner provide this guarantee, which Spanner calls external consistency, with the help of TrueTime.
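The difference between the two guarantees can be sketched with a brute-force search in Python. The `Txn` type, the `find_serial_order` helper, and the two-transaction history are assumptions made for this example; real checkers use far more efficient algorithms than trying every permutation.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Optional


@dataclass
class Txn:
    name: str
    reads: dict     # key -> value the transaction observed
    writes: dict    # key -> value the transaction wrote
    start: float    # wall-clock time the transaction began
    end: float      # wall-clock time the transaction committed


def _apply(state: dict, txn: Txn) -> bool:
    """Replay one transaction; fail if any of its reads disagree with the state."""
    if any(state.get(key) != value for key, value in txn.reads.items()):
        return False
    state.update(txn.writes)
    return True


def find_serial_order(txns: list, initial: dict, real_time: bool = False) -> Optional[list]:
    """Return a serial order that explains every read, or None if there is none.

    With real_time=True, a transaction that committed before another one
    started must also come first in the serial order (strict serializability).
    """
    for order in permutations(txns):
        if real_time and any(later.end < earlier.start
                             for i, earlier in enumerate(order)
                             for later in order[i + 1:]):
            continue
        state = dict(initial)
        if all(_apply(state, txn) for txn in order):
            return [txn.name for txn in order]
    return None


# T1 commits x=1, and only afterwards T2 starts and still reads the old x=0.
t1 = Txn("T1", reads={}, writes={"x": 1}, start=0, end=1)
t2 = Txn("T2", reads={"x": 0}, writes={}, start=2, end=3)

serializable_order = find_serial_order([t1, t2], {"x": 0})
strict_order = find_serial_order([t1, t2], {"x": 0}, real_time=True)
print(serializable_order, strict_order)
```

The history is serializable because the order T2, T1 explains the stale read, but no order is both valid and consistent with real time, so it is not strictly serializable.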

Key consistency models

Linearizability

Unavailable during partition

Every operation appears to take effect atomically at a single instant between its invocation and its response, and all observers see the same order of operations. The strictest model for single-object operations.
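A minimal sketch of what checking this definition means for a single register, in Python. The `Op` type and the `is_linearizable` function are hypothetical names for this example, and the search is exponential, so it only works for tiny histories. It is not how Jepsen's own checkers are implemented.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Op:
    kind: str        # "write" or "read"
    value: int       # value written, or value the read returned
    invoke: float    # time the client sent the request
    complete: float  # time the client received the response


def is_linearizable(history: list[Op], initial: int = 0) -> bool:
    """Try every total order that respects real time and replay it on a register."""
    for order in permutations(history):
        # If b completed before a was invoked, b may not be placed after a.
        if any(b.complete < a.invoke
               for i, a in enumerate(order)
               for b in order[i + 1:]):
            continue
        state = initial
        valid = True
        for op in order:
            if op.kind == "write":
                state = op.value
            elif op.value != state:   # a read must return the current value
                valid = False
                break
        if valid:
            return True
    return False


# The read overlaps the write, so it may observe the new value.
concurrent_history = [Op("write", 1, invoke=0, complete=10),
                      Op("read", 1, invoke=2, complete=5)]

# The write was acknowledged at t=1, yet a later read still returns 0.
stale_history = [Op("write", 1, invoke=0, complete=1),
                 Op("read", 0, invoke=2, complete=3)]

print(is_linearizable(concurrent_history), is_linearizable(stale_history))
```

The second history is the classic stale read: the only order that would explain the returned value puts the read before a write that had already completed, which the real-time constraint forbids.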

Serializability

Unavailable during partition

Transactions behave as if they executed sequentially in some order, but that order does not have to match real time. The strongest SQL isolation level.

Causal consistency

Sticky available

Causally related operations are observed in the correct order. If event A happened before B, the system should not expose B without its cause. Achievable in AP systems.
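One common way to enforce "do not expose B without its cause" is the vector-clock delivery rule used in causal broadcast. The sketch below assumes hypothetical node names and a `can_deliver` helper; it illustrates the rule, not the mechanism of any particular database.

```python
def can_deliver(msg_clock: dict[str, int], sender: str, local_clock: dict[str, int]) -> bool:
    """Causal delivery rule for a message stamped with the sender's vector clock.

    The message must be the next one expected from its sender, and it must not
    depend on any event from other nodes that this replica has not seen yet.
    """
    for node in msg_clock.keys() | local_clock.keys():
        seen = local_clock.get(node, 0)
        stamped = msg_clock.get(node, 0)
        if node == sender:
            if stamped != seen + 1:
                return False
        elif stamped > seen:
            return False
    return True


# alice posts; bob replies after seeing the post, so the reply depends on it.
post = {"alice": 1}
reply = {"alice": 1, "bob": 1}

fresh_replica: dict[str, int] = {}
reply_first = can_deliver(reply, "bob", fresh_replica)        # cause is missing
post_first = can_deliver(post, "alice", fresh_replica)        # no dependencies
reply_after_post = can_deliver(reply, "bob", {"alice": 1})    # cause already visible
print(reply_first, post_first, reply_after_post)
```

A replica that receives the reply first has to buffer it until the post arrives. That buffering needs no coordination with other replicas, which is why the model stays available during partitions.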

Eventual consistency

Available on healthy nodes

If no new writes arrive, all replicas eventually converge. The model does not promise that any particular read observes the latest value. The weakest useful guarantee.
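Convergence is often achieved with a deterministic merge function. The sketch below uses a last-writer-wins register as an assumed example; the `Versioned` type and the sample updates are hypothetical, and it assumes that no two updates share the same timestamp and node.

```python
from functools import reduce
from itertools import permutations
from typing import NamedTuple


class Versioned(NamedTuple):
    timestamp: int
    node: str
    value: str


def merge(a: Versioned, b: Versioned) -> Versioned:
    """Last-writer-wins: keep the higher timestamp, break ties by node name."""
    return max(a, b, key=lambda v: (v.timestamp, v.node))


updates = [
    Versioned(5, "n1", "blue"),
    Versioned(7, "n2", "green"),
    Versioned(7, "n3", "red"),
]

# Replicas may receive the same updates in any order and still agree.
outcomes = {reduce(merge, order) for order in permutations(updates)}
print(outcomes)
```

Every delivery order ends in the same value, so replicas converge once writes stop. The sketch also shows the cost: two of the three writes are discarded without any error, which is exactly the kind of behavior the model permits.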

Notable Jepsen findings

| System | Claim | Observed behavior | Status |
|---|---|---|---|
| MongoDB | Durable writes | Confirmed writes could be lost | Fixed |
| Cassandra | LWT atomicity | Lost and duplicated operations | Fixed |
| Redis Cluster | Consistency | Data loss without a network fault | By design |
| etcd | Linearizability | Confirmed ✓ | Verified |
| CockroachDB | Serializability | Confirmed ✓ | Verified |
| TiDB | Snapshot isolation | Anomalies found | Fixed |

Full list of reports: jepsen.io/analyses

How Jepsen testing works

1. Setup: deploy a cluster on N nodes.
2. Load: run reads, writes, and CAS operations.
3. Nemesis: inject partitions, process kills, and clock shifts.
4. History: record every call, response, and error.
5. Check: compare the history with the chosen model.

Nemesis is the failure-injection component. It breaks connectivity between nodes, kills processes, and shifts clocks. If a system claims linearizability, the history recorded through those faults must still be linearizable; a single acknowledged operation that cannot be placed in a valid order counts as a violation.
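The five steps can be sketched end to end with a toy simulation. Everything below is hypothetical: `ToyCluster` is an in-memory stand-in for a primary/replica store with asynchronous replication, and the Python code does not use or resemble Jepsen's actual Clojure API. The check is a simple set test: every acknowledged add must appear in the final read.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical primary/replica store with asynchronous replication."""
    primary: set = field(default_factory=set)
    replica: set = field(default_factory=set)
    partitioned: bool = False

    def add(self, value: int) -> str:
        self.primary.add(value)
        if not self.partitioned:
            self.replica.add(value)   # replication only works while connected
        return "ok"                   # acknowledged before replication is confirmed

    def failover(self) -> None:
        """Promote the replica; unreplicated writes on the old primary are dropped."""
        self.primary = set(self.replica)


def run_test() -> list[int]:
    cluster = ToyCluster()                        # 1. setup
    history = []                                  # 4. history of calls and outcomes
    for value in range(6):                        # 2. load
        if value == 3:
            cluster.partitioned = True            # 3. nemesis: cut the replication link
        history.append(("invoke", "add", value))
        history.append((cluster.add(value), "add", value))
    cluster.failover()
    cluster.partitioned = False                   # heal the network
    final_read = set(cluster.primary)
    acknowledged = {v for outcome, _, v in history if outcome == "ok"}
    return sorted(acknowledged - final_read)      # 5. check: acknowledged but missing


lost_writes = run_test()
print(lost_writes)
```

The toy store acknowledged all six writes, but the three made during the partition vanish after failover. The history is what makes the loss provable: without the recorded acknowledgements, the final state alone would look perfectly consistent.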

Practical conclusions

1. Do not trust claims without evidence

Strong consistency, ACID, and linearizability are precise guarantees, not marketing adjectives. Check Jepsen reports and vendor documentation for concrete limitations.

2. Understand the cost of a model

Stricter consistency models have a price: unavailability during network partitions under CAP or higher latency under PACELC. Choose the model from application requirements.

3. Test under failure

Correctness is not established in ideal conditions; it is tested during failures. Use chaos-engineering tools such as Jepsen, Chaos Monkey, and Toxiproxy to observe actual system behavior.

4. Separate isolation from consistency

Serializable isolation in a database is not the same as linearizable consistency in a distributed system. The first is about transactions; the second is about individual operations. Full correctness needs both sides: strict serializability.

What to study next

Jepsen is your ally

Before choosing a database for a critical system, check the Jepsen reports. If a system is not listed, that does not prove it is reliable; it only means no one has tested it publicly. Absence of evidence of bugs is not evidence of their absence.
