
Updated: April 30, 2026 at 10:03 PM

Designing Data-Intensive Applications, 2nd Edition (short summary)


DDIA matters because it replaces scattered data-systems terminology with one coherent engineering vocabulary for requirements, storage, failures, and growth.

In real work, the second edition helps teams connect data models, storage engines, replication, sharding, transactions, and stream processing as one chain of design choices.

In interviews and architecture discussions, the book is especially useful because it lets you explain growth effects: data redistribution, backpressure, schema evolution, and the price of consistency.

Practical value of this chapter

Design in practice

Systematizes storage, replication, sharding, and event-processing choices in data systems.

Decision quality

Helps choose indexes, storage engines, and replication models by workload and guarantee profile.

Interview articulation

Gives language for reliability, latency, consistency, maintainability, and the cost of future change.

Risk and trade-offs

Highlights partial failures, data redistribution, backpressure, and schema evolution.

Designing Data-Intensive Applications, 2nd Edition

Authors: Martin Kleppmann, Chris Riccomini
Publisher: O'Reilly Media, 2026
Length: 650 pages

A summary of DDIA's second edition: architecture trade-offs, data models, storage, sharding, consistency, streams, derived state, and data privacy.


Second edition

Official O’Reilly page for Designing Data-Intensive Applications, 2nd Edition: Martin Kleppmann and Chris Riccomini, released in 2026.


The second edition of DDIA is best read not as a database catalog, but as a map for data-intensive applications. It connects nonfunctional requirements, reliability, scalability, and maintainability with how storage, replication, transactions, and event processing actually behave.

The book is most useful when you need to explain why a particular storage engine fits a workload, where replication boundaries belong, how sharding changes failure modes, and why consistency cannot be discussed separately from latency, network partitions, and observed system behavior.

How the second edition is organized

The second edition expands the original arc: alongside storage, replication, and consensus, it connects architecture trade-offs, cloud operations, local-first applications, derived state, and responsibility in data systems.

Chapters 1-2

Architecture trade-offs, reliability, scalability, maintainability, and requirements.

Chapters 3-5

Data models, query languages, storage engines, indexes, encoding, and schema evolution.

Chapters 6-10

Replication, sharding, transactions, partial failures, consistency, and consensus.

Chapters 11-13

Batch and stream processing, derived state, data integration, and end-to-end correctness.

Chapter 14

Law, privacy, social consequences, and engineering responsibility for data systems.

Architecture, data models, and storage

Chapters 1-2: trade-offs and requirements

What changes in the second edition:

  • System design starts with requirements, workload, and expected failures.
  • Reliability, scalability, and maintainability are treated as observable properties.
  • Cloud services and managed infrastructure add new trade-offs instead of removing old ones.

What to take away:

  • Architecture is not a list of technologies; it is a set of choices under constraints.
  • A good design explains the cost of growth, failure, and future change.
  • Metrics and observed behavior matter more than elegant abstractions.

Chapter 3: data models and query languages

Relational model

SQL, normalization, relationships, and mature transactional semantics.

Document model

Flexible structure, read locality, and the risk of hidden relationships between documents.

Graph model

Nodes, edges, and queries where relationships matter more than isolated entities.

Chapter 4: storage and retrieval

This chapter explains why the same query can be cheap in one database and expensive in another: the answer is often hidden in the index, log, on-disk format, and read/write profile.

LSM tree

  • Optimizes for write-heavy workloads by buffering writes in memory and flushing them to disk sequentially.
  • Requires merging files in the background (compaction), which needs careful control so it does not starve foreground traffic.
  • Fits log-oriented and write-heavy systems well.
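A minimal sketch of the LSM idea, assuming a toy in-memory model: the class name `TinyLSM`, the `memtable_limit` parameter, and the use of Python lists as "on-disk" runs are all illustrative choices, not how any particular storage engine is implemented. It shows writes landing in a memtable, flushes producing sorted immutable runs, and compaction merging runs so that only the newest value per key survives.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: an in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}   # most recent writes, unsorted
        self.runs = []       # sorted runs of (key, value), oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one sorted, immutable run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search runs newest first so later writes shadow earlier ones.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, keeping only the newest value per key.
        merged = {}
        for run in self.runs:  # oldest to newest, so newer values overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

Reads may have to check several runs before compaction, which is one reason the read cost of an LSM tree depends on how well background merging keeps up with writes.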

B-tree

  • Keeps keys sorted in fixed-size pages and supports efficient point reads and range scans.
  • Often updates data in place and depends heavily on page cache behavior.
  • Remains a default index structure in many relational databases.

Chapter 5: encoding and evolution

Data formats are contracts between code versions, services, and long-lived data. That makes JSON, Avro, Protocol Buffers, and schema evolution part of the same design conversation.

JSON/XML

Readable for humans, but compatibility still needs discipline.

Thrift/Protocol Buffers

Compact and fast, but compatibility depends on keeping numeric field tags stable across schema versions.

Avro

Works well when the writer's schema travels with the stored data, so readers can resolve differences between schema versions.
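The compatibility rules behind tagged formats can be sketched as follows. This is a simplified illustration, not the actual Protocol Buffers or Thrift wire format: the schemas `SCHEMA_V1` and `SCHEMA_V2`, the field names, and the dictionary-based payload are assumptions made for the example. The point is that a reader ignores tags it does not know (forward compatibility) and fills defaults for tags that are missing (backward compatibility).

```python
# Hypothetical schemas: field tag -> (name, default). Tags are never reused.
SCHEMA_V1 = {1: ("user_id", None), 2: ("name", "")}
SCHEMA_V2 = {1: ("user_id", None), 2: ("name", ""), 3: ("email", "")}


def encode(record, schema):
    """Encode a record as {tag: value} using the writer's schema."""
    name_to_tag = {name: tag for tag, (name, _) in schema.items()}
    return {name_to_tag[name]: value for name, value in record.items()}


def decode(payload, schema):
    """Decode with the reader's schema: skip unknown tags, default missing ones."""
    record = {name: default for name, default in schema.values()}
    for tag, value in payload.items():
        if tag in schema:
            record[schema[tag][0]] = value
    return record


# New code reading old data: the missing field gets its default.
old_payload = encode({"user_id": 7, "name": "Ada"}, SCHEMA_V1)
new_reader_view = decode(old_payload, SCHEMA_V2)

# Old code reading new data: the unknown tag 3 is skipped.
new_payload = encode({"user_id": 7, "name": "Ada", "email": "ada@example.com"}, SCHEMA_V2)
old_reader_view = decode(new_payload, SCHEMA_V1)
```

Renaming a field is safe in this sketch because only the tag is stored; reusing or changing a tag is what breaks compatibility.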

Distributed data

Chapter 6: replication

Single leader

  • All writes go through the leader replica.
  • The model is easier to reason about and debug.
  • The hard part is failover when the leader is unavailable.

Multi-leader

  • Writes are accepted by several leaders, typically one per region or datacenter.
  • Useful for geographically distributed clients.
  • The main cost is conflict detection and resolution.

Leaderless

  • Clients talk to multiple replicas directly.
  • Read and write quorums make guarantees tunable.
  • The system must repair divergence between copies.
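The tunable part of leaderless replication is usually expressed as a quorum condition: with n replicas, w write acknowledgements, and r read responses, reads and writes are guaranteed to overlap on at least one replica when w + r > n. The sketch below checks that condition and confirms it by brute force; the function names are illustrative, and the model looks only at set overlap, ignoring concurrency and failures during a write.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """True when every read set of size r must share a replica with every write set of size w."""
    return w + r > n


def always_intersects(n, w, r):
    """Brute-force check: does every r-subset intersect every w-subset of n replicas?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )
```

For example, n = 3 with w = 2 and r = 2 satisfies the condition, while w = 1 and r = 1 does not, so a read may then miss the latest write entirely.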

This part of the second edition is also where local-first applications, CRDTs, and multi-device synchronization become especially relevant.
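As a concrete illustration of a CRDT, here is a minimal grow-only counter (G-Counter), one of the simplest such data types. The class name and replica IDs are assumptions for the example. Each replica increments only its own slot, and merging takes the per-replica maximum, so merges can be applied in any order, any number of times, and all replicas converge.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merged by taking the max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> increments observed from that replica

    def increment(self, amount=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Max is commutative, associative, and idempotent, which is what makes merging safe.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)


# Two devices count offline, then synchronize in both directions.
phone, laptop = GCounter("phone"), GCounter("laptop")
phone.increment(2)
laptop.increment(3)
phone.merge(laptop)
laptop.merge(phone)
```

After the exchange both devices report the same total, and repeating a merge changes nothing.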

Chapter 7: sharding

Partitioning strategies:

  • By hash of the key, for even distribution at the cost of efficient range queries.
  • By range, for time, geography, and naturally ordered data.
  • Consistent hashing reduces data movement when the cluster changes.
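A minimal consistent-hashing sketch, assuming a hash ring with virtual nodes: the class name `HashRing`, the `vnodes` parameter, and the choice of SHA-256 are illustrative, not taken from any specific system. Each key is owned by the first virtual node clockwise from its hash, so adding a node only takes over the keys that fall just before its new positions on the ring.

```python
import bisect
import hashlib


def _hash(key):
    # Any stable, well-distributed hash works; SHA-256 is an arbitrary choice here.
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


class HashRing:
    """Hash ring with virtual nodes; keys map to the next node clockwise."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (_hash(f"{node}#{i}"), node))

    def lookup(self, key):
        i = bisect.bisect(self.ring, (_hash(key),))
        return self.ring[i % len(self.ring)][1]  # wrap around the ring


keys = [f"user-{i}" for i in range(1000)]
ring = HashRing(["A", "B", "C"])
before = {key: ring.lookup(key) for key in keys}
ring.add("D")
after = {key: ring.lookup(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

In this scheme every key in `moved` now belongs to the new node `D`; no key is reshuffled between the existing nodes, which is the property that limits data movement.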

Operational risks:

  • Hot keys and uneven workload distribution.
  • Queries that must fan out across all shards.
  • Rebalancing work that competes with user traffic.

Chapter 8: transactions

DDIA is valuable here because it explains isolation levels through real anomalies rather than through dry database documentation.

| Level | What it gives | Where caution remains |
|---|---|---|
| Read Committed | Does not expose uncommitted writes (no dirty reads). | Still allows nonrepeatable reads (read skew). |
| Snapshot Isolation | Provides a consistent read snapshot. | Can still allow write skew. |
| Serializable | Guarantees a result equivalent to some serial execution. | Costs more in latency and conflicts. |
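Write skew is easier to see in a small simulation. The sketch below is not a real database: it models snapshot isolation by giving each transaction its own copy of the data, and it uses the commonly cited on-call scenario with hypothetical doctors `alice` and `bob`. The invariant is that at least one doctor stays on call.

```python
def request_leave(snapshot, doctor):
    """Return the writes a transaction would make, judged against its own snapshot."""
    on_call = sum(snapshot.values())
    if on_call >= 2:
        return {doctor: False}  # looks safe: someone else is still on call
    return {}


def run_write_skew():
    db = {"alice": True, "bob": True}

    # Both transactions start from the same consistent snapshot.
    snapshot_a, snapshot_b = dict(db), dict(db)
    writes_a = request_leave(snapshot_a, "alice")
    writes_b = request_leave(snapshot_b, "bob")

    # The write sets touch different keys, so a write-write conflict
    # check finds nothing to reject and both transactions commit.
    assert not (writes_a.keys() & writes_b.keys())
    db.update(writes_a)
    db.update(writes_b)
    return db


final_state = run_write_skew()
```

Each transaction was correct against its snapshot, yet the combined result leaves nobody on call. Preventing this requires serializable execution or an explicit lock on the rows the decision was based on.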

Chapters 9-10: failures, consistency, and consensus

Why distribution is hard:

  • Network partitions break the simple idea that a node is either alive or dead.
  • Partial failures create different views of reality across participants.
  • Clock skew makes event ordering less obvious than it looks.

Where consensus appears:

  • Paxos and Raft let nodes agree on a single order of writes.
  • Quorums define when a decision can be considered accepted.
  • The price of consensus is latency, recovery complexity, and dependence on a majority.

Processing, derived state, and responsibility

Chapter 11: batch processing

Unix pipeline idea:

Large-scale processing becomes easier to reason about when each step reads an input stream, writes an output stream, and remains reusable.

cat log.txt | grep ERROR | sort | uniq -c
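The same composition can be sketched with Python generators, where each stage consumes an iterable of lines and produces another. The helper names `grep` and `count_unique` are illustrative, and the result is a sorted list of `(line, count)` pairs rather than the exact text layout that `uniq -c` prints.

```python
from collections import Counter


def grep(lines, needle):
    """Lazily keep only the lines that contain the given substring."""
    return (line for line in lines if needle in line)


def count_unique(lines):
    """Roughly `sort | uniq -c`: sorted (line, count) pairs."""
    return sorted(Counter(line.rstrip("\n") for line in lines).items())


# An in-memory sample standing in for log.txt.
sample_log = [
    "ERROR disk full\n",
    "INFO started\n",
    "ERROR disk full\n",
    "ERROR network down\n",
]
report = count_unique(grep(sample_log, "ERROR"))
```

Because `grep` is a generator, it can be pointed at an open file object instead of a list and will stream lines without loading the whole file.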

Platform evolution:

  • MapReduce makes processing distributed, but pays for it with extra I/O from writing intermediate results to disk.
  • Spark speeds up iterative jobs by keeping intermediate data in memory and executing work as a DAG.
  • Flink brings batch and stream scenarios closer together.

Chapter 12: stream processing

Stream processing matters beyond real-time features: it lets systems treat data changes as an explicit event log.

Message brokers

Kafka, RabbitMQ, and Pulsar as different delivery and storage models.

Event sourcing

An immutable event log as the source of system state.

Change data capture

Moving database changes into analytics and search systems.
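A minimal event-sourcing sketch, assuming a hypothetical account with `deposited` and `withdrawn` events: the event shape, the field names, and the in-memory list standing in for the log are all illustrative. State is never stored directly; it is rebuilt by folding the log, and event IDs are used to skip redelivered events so that replay stays idempotent.

```python
def apply_event(balance, event):
    """Pure transition: return the new balance after one event."""
    kind, amount = event["type"], event["amount"]
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    raise ValueError(f"unknown event type: {kind}")


def replay(events):
    """Rebuild state from the log, skipping redelivered events by ID."""
    seen_ids = set()
    balance = 0
    for event in events:
        if event["id"] in seen_ids:
            continue  # duplicate delivery: already applied
        seen_ids.add(event["id"])
        balance = apply_event(balance, event)
    return balance


log = [
    {"id": 1, "type": "deposited", "amount": 100},
    {"id": 2, "type": "withdrawn", "amount": 30},
    {"id": 2, "type": "withdrawn", "amount": 30},  # redelivered by the broker
    {"id": 3, "type": "deposited", "amount": 5},
]
balance = replay(log)
```

The same log can feed other consumers, such as a search index or an analytics table, each deriving its own view from the same recorded facts.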

Chapters 13-14: derived state, law, and society

Derived state

The second edition connects materialized views, GraphQL, workflows, and data integration into one theme: how to derive new state safely from recorded facts.

Data privacy

The final chapter moves beyond performance: data systems must account for law, user consent, explainability, and the consequences of automation.

How to use DDIA in system design

DDIA does not give canned interview answers. Its strength is that it teaches you to explain designs through workload, guarantees, failures, and the cost of future change.

Database choice

Compare SQL, NoSQL, indexes, and storage engines by read, write, and schema-change profile.

Replication and regions

Explain where synchrony is required, where lag is acceptable, and how the system recovers.

Sharding

Choose partition keys, identify hot keys early, and plan data redistribution.

Transactions and isolation

Name the isolation level and explain which anomalies it still leaves possible.

Events and streams

Build processing around idempotency, deduplication, and an explicit change log.

Responsibility for data

Account for privacy, retention, explainability, and the risk of incorrect automated decisions.

Verdict

Strengths

  • Frames system design as a set of testable trade-offs.
  • Connects data models, storage, replication, transactions, and streams into one picture.
  • Updates the focus for modern cloud, local-first, and event-driven systems.
  • Adds law, privacy, and social consequences to the data-systems conversation.
  • Gives engineers a language for mature architecture discussions, not a set of templates.

Caveats

  • It is dense; reading by topic works better than trying to rush through it.
  • It explains why choices work, but does not replace practice designing concrete systems.
  • Many chapters become most useful when tied back to examples from your own work.
  • For interviews, pair it with case studies and workload-estimation practice.

Recommendation:

DDIA remains essential reading for engineers who design data systems. The second edition is especially useful as a bridge between classic distributed systems and modern products where data lives in the cloud, on devices, in event streams, and under regulatory constraints.
