Why understand storage systems? — System Design Space

The database section matters not because it lists technologies, but because it brings the discussion back to the hard part: which data, query, and failure properties force a particular storage choice.

In day-to-day engineering work, this chapter helps split a system into distinct storage roles (transactional core, analytical projections, search layer, cache, and event logs) instead of forcing one database to do everything.

For interviews and design reviews, it sets the right frame: workload profile, consistency, latency, and operating cost first, then the name of a specific engine.

Practical value of this chapter

Workload map

Break the product into OLTP/OLAP/streaming profiles and define where strict consistency is required versus where delay is acceptable.

Storage boundaries

Assign data ownership by domain so source-of-truth systems stay separate from indexes, caches, and analytical projections.

Evolution roadmap

Plan migration from a single storage model to polyglot persistence without service interruption and with explicit risk control.

Interview framing

Defend decisions through CAP/PACELC trade-offs, latency budgets, and operating cost, not by naming technologies.
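One way to make the storage-boundaries point checkable is to write down which store plays which role for each dataset and verify that every dataset has exactly one source of truth. The sketch below is a minimal illustration; the store names, dataset names, and role labels are hypothetical and not taken from the chapter.

```python
from collections import Counter

# Hypothetical (store, dataset, role) assignments for illustration only.
ASSIGNMENTS = [
    ("orders-postgres", "orders", "source_of_truth"),
    ("orders-search", "orders", "index"),
    ("orders-cache", "orders", "cache"),
    ("warehouse", "orders", "analytical_projection"),
    ("warehouse", "clickstream", "analytical_projection"),
]


def ownership_problems(assignments):
    """Return one message per dataset that does not have exactly one owner."""
    owners = Counter(
        dataset for _, dataset, role in assignments if role == "source_of_truth"
    )
    datasets = sorted({dataset for _, dataset, _ in assignments})
    return [
        f"{dataset}: expected 1 source of truth, found {owners[dataset]}"
        for dataset in datasets
        if owners[dataset] != 1
    ]


for problem in ownership_problems(ASSIGNMENTS):
    print(problem)
```

Here the clickstream dataset exists only as a projection, so the check flags it as having no owner.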

Decision frame and editorial focus

Chapter focus

Storage architecture boundaries and core database-selection trade-offs.

Workload profile

Start from the data profile: source of truth, OLTP, analytics, search, cache, and event-stream responsibilities.

Good fit

Use this chapter as the entry frame: it sets the reading order before you jump to a favorite database engine.

Boundary and risk

The main risk is mixing taxonomy, technology choice, and operational guarantees into one implicit recommendation.

Connect next

Connect conclusions to the database-selection framework, DDIA, and practical engine overviews.

Context

Designing Data-Intensive Applications, 2nd Edition

A reference source on data models, consistency, replication and storage trade-offs — worth returning to whenever a decision is contested.

Read the overview

The Storage Systems section is about treating data architecture as a foundation, not as a detail you return to after the API is chosen. In production it is the storage decision that sets reliability, latency, cost profile and scaling limits for the whole system — and it is the most expensive one to redo on the fly.

This chapter connects System Design with concrete DB choices: which data model to choose, where strict consistency is actually required and where it only adds cost, and how to evolve storage in production without stopping the product.

Why this section matters

Storage choices define system boundaries

The shape of the data model and the choice of database set your API contracts, consistency level, latency and the work you do in operations every day. This decision pulls the rest of the architecture along with it.

Storage trade-offs are core architecture decisions

SQL, document, key-value, wide-column and graph databases cover different classes of problems and give different guarantees. There is no universal option — every choice gives something up.

Data reliability requires explicit engineering

Replication, transactions, recovery and backup strategy belong in the design up front. Leave them for later and the first failure turns into data loss instead of a managed incident.

Wrong database decisions are expensive to reverse

Late migration of the data model and storage strategy costs more than early domain validation: the data is already in production, and the move happens under load.

Storage competence is mandatory in system design

In interviews and in production alike, an engineer is expected to justify a database choice through workload profile, consistency requirements and cost of ownership, not to name whichever database is trendy.

How to go through storage systems step by step

Move from workload to operations: define critical paths and data shape, choose the data model, set guarantee boundaries, design scaling, and document how storage will evolve.

Step 1 of 5

Workload profile and critical paths

Start with the operations that truly matter: who writes, who reads, which data grows fastest, and where latency affects the user.

What to check

Read/write ratio, data volume, working-set size, and hot user journeys.
Latency budget, RPO/RTO, peak load, and query classes that cannot degrade.

Practice

Workload profile with read/write ratio, volumes, and SLOs.
Critical-path map from API to the actual storage system.

Self-check questions

Which operation will hurt user experience first as load grows?
What matters most here: latency, data freshness, or recovery after failure?

Mistake this catches

Choosing a database by popularity without validating workload shape, critical paths, and recovery targets.
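The workload profile from this step can be captured as a small record before any engine is discussed. The sketch below is one possible shape, not a prescribed format: the class name, field names, and the example numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical record of the numbers this step asks for."""

    name: str
    reads_per_sec: float
    writes_per_sec: float
    data_size_gb: float
    working_set_gb: float
    p99_latency_budget_ms: float
    rpo_seconds: float  # maximum tolerable window of lost data
    rto_seconds: float  # maximum tolerable time to restore service

    @property
    def read_write_ratio(self) -> float:
        """Reads per write; infinity for a read-only workload."""
        if self.writes_per_sec == 0:
            return float("inf")
        return self.reads_per_sec / self.writes_per_sec

    def working_set_fits(self, memory_gb: float) -> bool:
        """True if the hot data fits in the given amount of memory."""
        return self.working_set_gb <= memory_gb


# Illustrative numbers only, not measurements.
checkout = WorkloadProfile(
    name="checkout",
    reads_per_sec=2000,
    writes_per_sec=500,
    data_size_gb=800,
    working_set_gb=40,
    p99_latency_budget_ms=50,
    rpo_seconds=0,
    rto_seconds=300,
)

print(checkout.read_write_ratio)      # 4.0
print(checkout.working_set_fits(64))  # True
```

Writing the profile down per critical path makes it harder to pick an engine before the read/write ratio, working set, and recovery targets are known.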

Key storage trade-offs

Strong consistency vs latency and availability

The stricter the consistency guarantees, the more expensive distributed writes get and the harder it is to keep latency low. Every level of consistency is paid for in response time or in availability during a network partition.
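A quorum-replicated store is one common setting where this cost is easy to see. The sketch below assumes N replicas, writes that wait for W acknowledgements, and reads that consult R replicas; read and write sets are guaranteed to overlap only when R + W > N. The latency numbers are made up, and the model ignores retries, coordinator overhead, and failure handling.

```python
from typing import Sequence


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum intersects every write quorum (R + W > N)."""
    return r + w > n


def ack_latency_ms(replica_latencies_ms: Sequence[float], acks: int) -> float:
    """Latency of an operation that waits for `acks` replica responses.

    In this simplified model it equals the acks-th fastest replica.
    """
    if not 1 <= acks <= len(replica_latencies_ms):
        raise ValueError("acks must be between 1 and the number of replicas")
    return sorted(replica_latencies_ms)[acks - 1]


# Illustrative round-trip times; the third replica sits in a remote region.
latencies = [2.0, 3.0, 45.0]

for w in (1, 2, 3):
    r = 1  # reads consult a single replica
    print(w, quorums_overlap(3, r, w), ack_latency_ms(latencies, w))
# 1 False 2.0
# 2 False 3.0
# 3 True 45.0
```

With reads served by a single replica, only W = 3 makes the quorums overlap, and in this model that write waits for the slowest replica. It also cannot complete while any replica is unreachable, which is the availability side of the same trade-off.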

Normalization vs read performance

A normalized schema keeps integrity simple, but on read-heavy workloads you often have to denormalize — and take on the manual job of keeping the duplicated copies in sync.
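The trade-off can be shown with a small in-memory SQLite example. The table and column names below are hypothetical; the point is only that the denormalized read avoids a join, while every change to the duplicated field has to be applied in two places.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Normalized: orders reference the customer by id only.
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_cents INTEGER NOT NULL
    );
    -- Denormalized read model: the customer name is copied into each row.
    CREATE TABLE order_list_view (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        customer_name TEXT NOT NULL,
        total_cents INTEGER NOT NULL
    );
""")

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1, 2500)")
conn.execute("INSERT INTO order_list_view VALUES (10, 1, 'Ada', 2500)")


def rename_customer(conn, customer_id, new_name):
    """Update the source of truth and the duplicated copy in one transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?", (new_name, customer_id)
        )
        conn.execute(
            "UPDATE order_list_view SET customer_name = ? WHERE customer_id = ?",
            (new_name, customer_id),
        )


rename_customer(conn, 1, "Ada Lovelace")

# The normalized read needs a join; the denormalized read is a single-table lookup.
joined = conn.execute(
    "SELECT c.name FROM orders o JOIN customers c ON c.id = o.customer_id "
    "WHERE o.id = 10"
).fetchone()[0]
flat = conn.execute(
    "SELECT customer_name FROM order_list_view WHERE order_id = 10"
).fetchone()[0]
print(joined, flat)  # both reads return the updated name
```

Here both copies live in one database, so a single transaction keeps them in sync. When the duplicated copy lives in a separate store, that guarantee is no longer available for free, and the sync job becomes part of the design.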

Single DB standard vs specialized storage stack

One engine for everything lowers operational complexity but fits diverse data poorly. Polyglot persistence matches access patterns more precisely — at the price of one more technology you have to operate.

Managed DB speed vs infrastructure control

A managed service speeds up delivery and removes toil, but takes away low-level tuning, portability and cost control at scale. The signal to reconsider is when the bill grows faster than the load.
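The "bill grows faster than the load" signal is simple arithmetic: compare the growth factor of cost with the growth factor of load over the same period. The function name, units, and figures below are illustrative assumptions.

```python
def bill_outpaces_load(prev_cost, curr_cost, prev_load, curr_load):
    """True if cost grew by a larger factor than load between two periods."""
    if prev_cost <= 0 or prev_load <= 0:
        raise ValueError("previous cost and load must be positive")
    return (curr_cost / prev_cost) > (curr_load / prev_load)


# Cost up 1.5x while load is up 1.2x: the signal fires.
print(bill_outpaces_load(10_000, 15_000, 1_000, 1_200))  # True
# Cost up 1.1x while load is up 1.2x: unit cost is falling.
print(bill_outpaces_load(10_000, 11_000, 1_000, 1_200))  # False
```

Load can be any unit that tracks real usage, such as requests per second or stored terabytes, as long as the same unit is used for both periods.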

What this section covers

Storage foundations

Data models, DB selection principles and the groundwork for services that hit their data limits before their compute limits.

Introduction to data storage
Database Selection Framework
Database Guide
Designing Data-Intensive Applications, 2nd Edition

Database engines in practice

An OLTP/OLAP and NoSQL engine landscape: where each one is strong, where it hits a ceiling and what it costs to operate.

PostgreSQL
MySQL
MongoDB
Cassandra
ClickHouse
Redis

How to apply this in practice

Common pitfalls

Choosing a database by popularity rather than by workload profile and consistency requirements.

Deferring replication, backup and recovery until the first incident, by which point there is nothing left to recover.

Polyglot persistence without operational readiness: every new store is something you have to fix at three in the morning.

Leaving schema evolution and migration compatibility unwatched as the product grows, then hitting them on the largest table.

Recommendations

Start DB selection from concrete numbers: latency, consistency level, RPO/RTO targets and expected growth.

Validate the data model against real query patterns and failure scenarios before any staged production rollout.

Capture trade-offs in ADRs: the guarantees chosen, the limitations and the condition under which the decision gets revisited.

Build DB observability as part of the platform: query-class latency, resource saturation, replication lag and error budgets.
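Two of the signals named in the last recommendation, error budgets and replication lag, reduce to small calculations. The sketch below assumes a request-based SLO and a fixed lag threshold; the function names, the 99.9% target, and the threshold values are illustrative, not recommendations.

```python
def error_budget_remaining(total_requests, failed_requests, slo_target):
    """Fraction of the error budget still unspent; negative when overspent.

    slo_target is the required success ratio, for example 0.999.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests == 0:
        return 1.0  # no traffic, nothing spent
    allowed_failures = total_requests * (1 - slo_target)
    return 1 - failed_requests / allowed_failures


def lag_alert(replication_lag_seconds, threshold_seconds):
    """True if a replica is further behind than the agreed threshold."""
    return replication_lag_seconds > threshold_seconds


# A 99.9% target over 1,000,000 requests allows about 1,000 failures.
print(round(error_budget_remaining(1_000_000, 250, 0.999), 2))  # 0.75
print(lag_alert(replication_lag_seconds=12.0, threshold_seconds=5.0))  # True
```

If failover promotes an asynchronous replica, writes that had not yet been replicated can be lost, so a reasonable assumption is to derive the lag threshold from the RPO target rather than pick it arbitrarily.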

Section materials

Where to go next

Build a storage baseline first

Start with Data Storage Intro, the Database Selection Framework and DDIA. That is enough to stop arguing about databases by taste and to reason about the choice in a shared language.

Deepen engines and operations

Then work through the PostgreSQL/MySQL/MongoDB/Cassandra/ClickHouse overviews and DB internals — that is where you see how the trade-offs you picked behave under real load.

Sources for entering the topic

Designing Data-Intensive Applications - Martin Kleppmann's book site on reliable, scalable, and maintainable data-intensive systems.
PostgreSQL documentation: Tutorial - the official introduction to PostgreSQL, relational concepts, SQL, and core DBMS features.
MongoDB documentation: Data Modeling - the official guide to document modeling and shaping data around application access patterns.
Martin Fowler: Polyglot Persistence - a classic case for matching different storage technologies to different kinds of data and operations — and where that costs you in complexity.

Related chapters

Introduction to data storage - the next step in the route: how state evolves from files and OLTP toward NoSQL, NewSQL, HTAP, and integration paths.
Database Selection Framework - turns DB choice from guesswork into a decision flow: from workload profile and system constraints to a justified pick.
Database Guide (short summary) - anchors the section vocabulary: relational model, SQL, indexes, transactions, replication, and NoSQL.
Designing Data-Intensive Applications, 2nd Edition (short summary) - builds foundational reasoning around data models, replication and transactions in distributed systems.
PostgreSQL: architecture and practices - shows the OLTP path up close: where the relational model pays off and where its trade-offs surface in production.
Cassandra: architecture and trade-offs - extends scaling and consistency decisions for write-heavy distributed workloads.
ClickHouse: analytical DBMS and architecture - adds the OLAP angle: columnar design, partitioning and high-throughput analytical reads.