System Design Space

Updated: February 21, 2026 at 11:59 PM

Designing Data-Intensive Applications (short summary)


Designing Data-Intensive Applications

Authors: Martin Kleppmann
Publisher: O'Reilly Media, 2017 (1st Edition), 2025 (2nd Edition)
Length: 616 pages

Analysis of the book by Martin Kleppmann: data models, replication, partitioning, transactions, batch and stream processing.


Primary source

Official page of Designing Data-Intensive Applications by Martin Kleppmann.

Open book page

Book structure

The book is divided into three parts, each of which widens the scope of discussion, from a single machine to globally distributed systems:

Part I: Foundations of Data Systems

Data models, storage, encoding. How data is represented and written to disk.

Part II: Distributed Data

Replication, partitioning, transactions, consensus. Scaling across multiple machines.

Part III: Derived Data

Batch and stream processing. Construction of data processing pipelines.

Part I: Foundations of Data Systems

Chapters 1-2: Reliability, Scalability, and Data Models

Three pillars of the system:

  • Reliability — the system keeps working correctly even when faults occur
  • Scalability — the ability to cope with growing load
  • Maintainability — ease of support and change

Data models:

  • Relational — tables, SQL, ACID
  • Document — JSON, nesting, flexible schemas
  • Graph — nodes and edges, relationships of arbitrary complexity

Chapter 3: Data Storage and Retrieval

One of the key chapters of the book: how data is physically stored on disk.

LSM-Tree (Log-Structured Merge-Tree)

  • Optimized for writes
  • Used in Cassandra, RocksDB, LevelDB
  • Memtable → SSTable → Compaction
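
The Memtable → SSTable flow can be sketched in a few lines. This is a deliberately toy model, not how Cassandra or RocksDB are actually implemented: writes land in a mutable in-memory table, which is flushed as a sorted immutable run once it grows too large, and reads check the memtable first and then the runs, newest first.

```python
# Toy LSM write path (illustrative sketch, not a real storage engine).
class ToyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}        # in-memory, mutable
        self.sstables = []        # sorted, immutable key/value runs
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Flush: sort the memtable and append it as an immutable SSTable.
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search SSTables newest-first; compaction would merge these runs.
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None

db = ToyLSM()
db.put("a", 1); db.put("b", 2)   # second put triggers a flush
db.put("a", 3)                   # newer value shadows the flushed one
print(db.get("a"))               # 3
print(db.get("b"))               # 2
```

Note the key property: the memtable absorbs random writes, and disk only ever sees sequential, sorted runs.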

B-Tree

  • Optimized for reading
  • Used in PostgreSQL, MySQL, Oracle
  • Fixed size pages, update-in-place
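
The B-tree read pattern can be illustrated with a flat two-level sketch (real B-trees are multi-level and update pages in place): keys live in fixed-size sorted pages, a binary search over the first key of each page picks the page, and a second binary search runs inside it.

```python
# Sketch of the B-tree lookup pattern: fixed-size sorted pages,
# binary search to pick a page, binary search inside the page.
import bisect

PAGE_SIZE = 4
keys = list(range(0, 100, 5))                       # sorted keys 0, 5, ..., 95
pages = [keys[i:i + PAGE_SIZE] for i in range(0, len(keys), PAGE_SIZE)]
fences = [p[0] for p in pages]                      # first key of each page

def lookup(key):
    i = bisect.bisect_right(fences, key) - 1        # which page could hold it
    if i < 0:
        return False
    j = bisect.bisect_left(pages[i], key)           # position inside the page
    return j < len(pages[i]) and pages[i][j] == key

print(lookup(35), lookup(36))  # True False
```

Both lookups cost O(log n), which is why B-trees favor reads, while updates must rewrite a page in place.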

Chapter 4: Encoding and Schema Evolution

How to serialize data and ensure backward/forward compatibility:

JSON/XML

Human readable, large size

Thrift/Protocol Buffers

Binary, schema-based

Avro

Schema evolution, Hadoop-friendly
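
The core schema-evolution idea can be shown in plain Python (field names here are made up for illustration; real Avro declares defaults in the schema file): a new reader fills fields missing from old records with defaults, and old readers simply ignore fields they don't know.

```python
# Schema evolution with defaults -- the Avro-style idea, in plain Python.
# v2 of the (hypothetical) schema added a "phone" field with a default.
SCHEMA_V2_DEFAULTS = {"name": None, "email": None, "phone": "unknown"}

def decode(record, defaults=SCHEMA_V2_DEFAULTS):
    # Backward compatibility: the new reader fills missing fields from defaults.
    # Forward compatibility: an old reader would just skip unknown fields.
    return {field: record.get(field, default) for field, default in defaults.items()}

old_record = {"name": "Ada", "email": "ada@example.com"}   # written under v1
print(decode(old_record))
# {'name': 'Ada', 'email': 'ada@example.com', 'phone': 'unknown'}
```

Defaults are what make rolling upgrades safe: old and new versions of the code can read each other's records during the transition.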

Part II: Distributed Data

Chapter 5: Replication

Single-Leader

  • A single leader accepts all writes
  • Simple model
  • Problem: single point of failure

Multi-Leader

  • Several masters
  • For multi-data centers
  • Problem: Write conflicts

Leaderless

  • All nodes are equal (Dynamo-style)
  • Quorum reads/writes
  • W + R > N for consistency
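
Why W + R > N gives consistency: any write quorum and any read quorum must share at least one node, so a read always touches a replica that saw the latest write. A brute-force check of that overlap property (a sketch of the synchronous case, ignoring sloppy quorums and concurrent writes):

```python
# Verify exhaustively that W + R > N forces every write quorum
# and every read quorum to intersect (quorum overlap property).
from itertools import combinations

def always_overlap(n, w, r):
    nodes = set(range(n))
    return all(set(wq) & set(rq)
               for wq in combinations(nodes, w)
               for rq in combinations(nodes, r))

print(always_overlap(3, 2, 2))  # True  (2 + 2 > 3)
print(always_overlap(3, 1, 2))  # False (1 + 2 = 3: quorums can miss each other)
```

With N = 3, W = 2, R = 2 (a common Dynamo-style configuration), the system also tolerates one unavailable replica for both reads and writes.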

Chapter 6: Partitioning (Sharding)

Partitioning strategies:

  • By key — hash(key) mod N
  • By range — time-series or geographic data
  • Consistent hashing — minimizes rebalancing
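
The difference between the first and last strategies is easy to demonstrate. A minimal consistent-hashing ring (a sketch without the virtual nodes real systems use): adding a node moves only the keys between the new node and its predecessor on the ring, instead of remapping almost everything as hash(key) mod N would.

```python
# Minimal consistent-hashing ring (no virtual nodes -- illustrative only).
import bisect, hashlib

def h(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        self.points = sorted((h(n), n) for n in nodes)

    def owner(self, key):
        ring = [p for p, _ in self.points]
        i = bisect.bisect(ring, h(key)) % len(ring)  # first node clockwise
        return self.points[i][1]

keys = [f"key{i}" for i in range(1000)]
before = {k: Ring(["a", "b", "c"]).owner(k) for k in keys}
after = {k: Ring(["a", "b", "c", "d"]).owner(k) for k in keys}
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved} of {len(keys)} keys moved")  # roughly a quarter, not ~3/4 as with mod N
```

With hash(key) mod N, going from 3 to 4 nodes would remap about 75% of the keys; here only the keys landing on the new node move.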

Problems:

  • Hot spots — uneven load on a single partition
  • Scatter-gather — queries that fan out to all shards
  • Rebalancing — redistributing data across nodes

Chapter 7: Transactions

An in-depth discussion of ACID and isolation levels is one of the strongest parts of the book:

| Isolation level | Protects against | Does not protect against |
| --- | --- | --- |
| Read Committed | Dirty reads, dirty writes | Non-repeatable reads |
| Snapshot Isolation | Non-repeatable reads | Write skew |
| Serializable | All anomalies | — |
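
Write skew, the anomaly snapshot isolation misses, is worth a concrete sketch. In the classic on-call example, two transactions each read the same consistent snapshot, each sees two doctors on call, and each removes themselves: both checks pass individually, yet together they break the "at least one on call" invariant. A pure-Python simulation of that interleaving:

```python
# Write skew under snapshot isolation, simulated (classic on-call example).
on_call = {"alice": True, "bob": True}

snapshot_t1 = dict(on_call)   # each transaction reads its own stable snapshot
snapshot_t2 = dict(on_call)

if sum(snapshot_t1.values()) >= 2:   # T1: "someone else is still on call"
    on_call["alice"] = False
if sum(snapshot_t2.values()) >= 2:   # T2: same check against a stale snapshot
    on_call["bob"] = False

print(sum(on_call.values()))  # 0 -- the "at least one on call" invariant is broken
```

Neither transaction overwrote the other's row, so snapshot isolation sees no conflict; only serializable isolation (or an explicit lock such as SELECT FOR UPDATE) prevents this.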

Chapters 8-9: The Trouble with Distributed Systems; Consistency and Consensus

What can go wrong:

  • Network partitions
  • Asymmetric failures
  • Problems with the clock (clock skew)
  • Byzantine faults

Consensus algorithms (and a key impossibility result):

  • Paxos — the classic, notoriously hard to understand
  • Raft — designed for understandability, used in etcd
  • Zab — used by ZooKeeper
  • FLP impossibility result — deterministic consensus cannot be guaranteed in a fully asynchronous system

Part III: Derived Data

Chapter 10: Batch Processing

Unix philosophy:

Kleppmann draws a parallel between Unix pipes and modern batch processing:

cat log.txt | grep ERROR | sort | uniq -c

MapReduce and its evolution:

  • MapReduce — simple model, heavy disk I/O
  • Spark — in-memory, DAG execution
  • Flink — unified batch/stream processing
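
The Unix pipeline above maps directly onto the map/reduce shape. A sketch with made-up log lines: the map step emits matching records (grep), and the reduce step groups and counts them (sort | uniq -c).

```python
# The "cat | grep ERROR | sort | uniq -c" pipeline as a tiny map/reduce job.
from collections import Counter

log_lines = [
    "INFO started",
    "ERROR disk full",
    "ERROR disk full",
    "ERROR timeout",
]

mapped = [(line, 1) for line in log_lines if "ERROR" in line]   # cat | grep ERROR
counts = Counter()
for key, value in mapped:                                       # sort | uniq -c
    counts[key] += value

print(dict(counts))  # {'ERROR disk full': 2, 'ERROR timeout': 1}
```

The point Kleppmann makes is that MapReduce distributes exactly this pattern: the "sort" becomes a shuffle between machines, and the grouping becomes the reduce phase.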

Chapter 11: Stream Processing

Real-time data processing is a key topic for modern systems:

Message Brokers

Kafka, RabbitMQ, Pulsar

Event Sourcing

Immutable event log as a source of truth
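
A minimal event-sourcing sketch (the account events are hypothetical): current state is never mutated in place, it is derived by replaying the immutable log, which also makes historical states reproducible.

```python
# Event sourcing sketch: state is derived by replaying an immutable event log.
events = [
    {"type": "deposited", "amount": 100},
    {"type": "withdrawn", "amount": 30},
    {"type": "deposited", "amount": 5},
]

def replay(log):
    balance = 0
    for e in log:
        if e["type"] == "deposited":
            balance += e["amount"]
        elif e["type"] == "withdrawn":
            balance -= e["amount"]
    return balance

print(replay(events))  # 75
```

New read models (projections) are just different replay functions over the same log, which is why the log, not any derived table, is the source of truth.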

Change Data Capture

Debezium, Maxwell

Chapter 12: The Future of Data Systems

Kleppmann concludes the book with philosophical reflections on how to build correct, sustainable and ethical data systems. He discusses:

  • Composition of services and data flow
  • End-to-end correctness guarantees
  • Ethical aspects of data processing

Key Concepts for System Design Interview

DDIA does not contain ready-made solutions to interview problems, but it provides the deep understanding needed to answer “why?” questions with confidence:

Selecting a Database

Understanding trade-offs between SQL and NoSQL, LSM vs B-Tree

Replication Strategies

When to use synchronous vs asynchronous replication

Partitioning

Selecting partition key, avoiding hot spots

Isolation levels

Explaining anomalies and preventing them

Exactly-once semantics

Idempotency and deduplication in stream processing
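
The standard building block here is the idempotent consumer: messages carry a unique id, and a set of processed ids filters out broker redeliveries, so at-least-once delivery behaves like exactly-once processing. A sketch with made-up ids and a made-up handler (a production version would persist the id set atomically with the effect):

```python
# Idempotent consumer sketch: deduplicate redelivered messages by id.
processed_ids = set()
total = 0

def handle(message):
    global total
    if message["id"] in processed_ids:      # duplicate redelivery: skip it
        return
    processed_ids.add(message["id"])
    total += message["amount"]

for msg in [{"id": "m1", "amount": 10},
            {"id": "m2", "amount": 5},
            {"id": "m1", "amount": 10}]:    # m1 redelivered by the broker
    handle(msg)

print(total)  # 15 -- the duplicate was not applied twice
```

The crucial design point is that the dedup check and the side effect must be committed together; otherwise a crash between them reintroduces duplicates or loss.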

Consensus

Understanding Raft/Paxos for distributed locking

📚 Verdict

✅ Strengths

  • Deep understanding of the “why”, not just the “how”
  • Great visualizations and examples
  • Covering the entire stack: from bytes to business logic
  • Lots of references to real systems
  • Honest discussion of trade-offs

⚠️ Features

  • Large book (~600 pages)
  • No ready-made interview solutions
  • Takes time to digest
  • Some sections may be too academic

🎯 Recommendation:

DDIA is a must-read for any engineer working with distributed systems. For interview preparation, pair it with practice-oriented books (Alex Xu, Stanley Chiang): DDIA gives you the “why”, while the practical books give you the “how”.


© 2026 Alexander Polomodov