System Design Space
Distributed Systems

23 chapters

This page contains all chapters in this theme. Open chapters in sequence or use this page as a section map.

1

Why are distributed systems and consistency needed?

Original Content · easy

Introductory chapter: why we need consistency models, consensus, and ways to handle failures.

2

CAP theorem

Original Content · medium

Fundamental limitation of distributed systems: consistency, availability, partition tolerance. History, misconceptions, ACID vs BASE.

3

PACELC theorem

Original Content · hard

Extending CAP: trade-offs between latency and consistency during normal operation, when there is no partition. System classification: PA/EL, PC/EC, PA/EC, PC/EL.

4

Clock Synchronization in Distributed Systems

Original Content · medium

Time synchronization in practice: physical vs logical clocks, NTP/PTP, the impact of clock skew, and architectural safeguards against clock drift.

5

Leader Election: patterns and implementations

Original Content · medium

How to design leader election: leases, quorum, failover, split-brain protection and practical implementations on Raft/ZooKeeper/etcd/Kubernetes.

6

Consensus: Paxos and Raft

Original Content · expert

How systems agree on a single value: quorums, two-phase Paxos, and leader-centric Raft.

7

Leslie Lamport: Causality, Paxos and Engineering Thinking

Documentary · hard

How Lamport's ideas (happens-before, logical clocks, Paxos, TLA+) grew out of physics and why they are critical for modern distributed systems.

8

Distributed Transactions: 2PC and 3PC

Original Content · hard

Practical analysis of distributed transactions: coordinator, prepare/commit phases, failure modes, blocking trade-offs and alternatives via Saga/outbox.

9

Jepsen and consistency models

Original Content · expert

The Jepsen project for testing distributed systems: hierarchy of consistency models, Serializable vs Linearizable, known findings.

10

Testing Distributed Systems

Original Content · hard

A practical approach to testing distributed systems: chaos engineering, contract testing and integration testing at scale.

11

Designing Data-Intensive Applications (short summary)

Book Summary · hard

Summary of Martin Kleppmann's book: data models, replication, partitioning, transactions, batch and stream processing.

12

Distributed Systems: Principles and Paradigms (short summary)

Book Summary · expert

The seminal work of Tanenbaum and van Steen: architectures, coordination, consistency, fault tolerance and security.

13

Google Global Network: Evolution and Architectural Principles for the AI Age

Original Content · hard

Evolution of the Google network from the internet, streaming and cloud eras to the AI era: WAN as the new LAN, multi-shard design, Protective ReRoute, intent-driven programmability and autonomous operations.

14

Streaming Data (short summary)

Book Summary · hard

Andrew Psaltis on stream processing: Collection/Queue/Analysis tiers, delivery semantics, windowing, stream algorithms.

15

Kafka: The Definitive Guide (short summary)

Book Summary · medium

Distributed stream processing platform: producers, consumers, partitions, replication, delivery semantics and Kafka Streams.

16

Kappa Architecture: stream-first alternative to Lambda

Original Content · hard

A single streaming pipeline without a separate batch layer: immutable log, materialized views, replay/backfill and comparison with Lambda.

17

Data Pipeline / ETL / ELT Architecture

Original Content · medium

How to design a data pipeline: batch + streaming, ETL vs ELT, orchestration, data quality, recovery, cost control and operational reliability.

18

Apache Iceberg: table architecture in the data lake

Original Content · hard

Practical analysis of Apache Iceberg: snapshots, manifests, ACID in the data lake, schema evolution, hidden partitioning, time travel and the role of Tableflow in the streaming pipeline.

19

Big Data (short summary)

Book Summary · hard

Nathan Marz on the Lambda Architecture: batch/serving/speed layers, data immutability, HyperLogLog and practical examples.

20

Data Mesh in Action

Book Summary · hard

A practical guide to adopting data mesh: domain ownership, data as a product, federated computational governance, self-serve platforms, and an MVP in one month.

21

Brief overview of the T-Bank data platform

Original Content · hard

Evolution of T-Bank's data platform: from DWH approaches to Lakehouse, key components of the platform, scale and practical architectural conclusions.

22

Data platforms: How to build them in 2025 - interview with Nikolay Golov

Documentary · hard

Research Insights Made Simple #6: centralization vs federation, data mesh in practice, OLTP/MPP limitations and the evolution of data platforms.

23

Local-First Software: Taking Back Control of Data

Documentary · easy

A short documentary about the local-first approach: offline experience, synchronization and user control of data.
