System Design Space

Updated: February 21, 2026 at 11:58 PM

Replication and sharding

Difficulty: mid

How to design replication and sharding: topologies, read/write path, rebalancing, hot shards, consistency and operational trade-offs.

Context

Principles of Designing Scalable Systems

This chapter deep-dives into one of the core scaling topics: replication and sharding.

Replication and sharding define a system's data scale, availability, and cost profile. Replication improves resilience and read scaling; sharding increases write throughput and total capacity. Mistakes here usually surface not at launch, but as traffic grows and product complexity expands.

Replication Models

Primary-Replica

One write leader with multiple read replicas. Works well for read-heavy workloads and straightforward operations.

Trade-offs: replica lag and read/write asymmetry; failover must be automated and regularly tested.
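The primary-replica read/write split can be sketched as a small router. This is a toy, assuming an in-memory Node class and a max_lag_ms threshold for read-your-writes; none of it is a real driver API, and the lag values mirror the illustrative 8ms/11ms replicas used later in this chapter:

```python
import time

class Node:
    """Toy in-memory store; lag_ms mimics replication lag (illustrative)."""
    def __init__(self, lag_ms=0):
        self.lag_ms = lag_ms
        self.store = {}
    def put(self, key, value):
        self.store[key] = value
    def get(self, key):
        return self.store.get(key)

class PrimaryReplicaRouter:
    """Writes go to the leader; reads go to replicas, with one exception."""
    def __init__(self, primary, replicas, max_lag_ms=50):
        self.primary = primary
        self.replicas = replicas
        self.max_lag_ms = max_lag_ms
        self.last_write_at = {}  # key -> time of last write

    def write(self, key, value):
        self.primary.put(key, value)
        # Real replication is asynchronous; the toy applies it immediately.
        for replica in self.replicas:
            replica.put(key, value)
        self.last_write_at[key] = time.monotonic()

    def read(self, key):
        # Read-your-writes: a just-written key may not have reached a
        # replica yet, so serve it from the primary for a short window.
        wrote = self.last_write_at.get(key)
        if wrote is not None and (time.monotonic() - wrote) * 1000 < self.max_lag_ms:
            return self.primary.get(key)
        # Otherwise read from the freshest (lowest-lag) replica.
        return min(self.replicas, key=lambda r: r.lag_ms).get(key)
```

Usage: `PrimaryReplicaRouter(Node(), [Node(8), Node(11)])` routes a read of a freshly written key to the primary, and reads of older keys to the lowest-lag replica.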

Multi-Primary

Writes are accepted in multiple regions/nodes, reducing latency for geo-distributed writes.

Trade-offs: conflict resolution and merge logic are hard to get right, and platform complexity rises significantly.

Leaderless

Quorum-based reads/writes with high availability for distributed KV/NoSQL scenarios.

Trade-offs: consistency is harder to reason about, and operators must understand read repair, hinted handoff, and tombstone behavior.
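The core of leaderless quorum reasoning is one inequality: with n replicas, writes acknowledged by w nodes and reads querying r nodes, any read set overlaps any write set when r + w > n, so a read sees at least one up-to-date copy. A minimal check:

```python
def quorum_ok(n: int, w: int, r: int) -> bool:
    """Overlapping-quorum rule for leaderless replication.

    If r + w > n, every read quorum intersects every write quorum,
    so at least one node in the read set holds the latest write.
    """
    return r + w > n

# Typical Dynamo-style configuration: n=3, w=2, r=2.
assert quorum_ok(3, 2, 2)       # overlapping quorums
assert not quorum_ok(3, 1, 1)   # w=1, r=1 can miss the latest write
```

Note this guarantees only that a fresh copy is *reachable*; resolving which copy is newest still requires versioning (e.g. vector clocks) and the repair mechanisms listed above.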

Replication Visualization

[Interactive simulator in the original page: a queue of read/write requests (e.g. TX-401 WRITE user:42, TX-402 READ user:42) is routed through a replication router to a write leader (Primary) and two read replicas (Replica A, lag 8ms; Replica B, lag 11ms), with per-node read/write counters.]

Sharding Models

  • Hash-based sharding: even distribution, but range queries are harder.
  • Range-based sharding: efficient for range queries, but hot-range risk is high.
  • Directory-based sharding: flexible data placement, requires a reliable metadata service.
  • Geo/tenant sharding: strong sovereignty and tenant isolation, but cross-tenant workflows are harder.
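The hash and range models above can be sketched as routing functions. The shard count, range boundaries, and use of MD5 below are illustrative assumptions, not a prescribed scheme:

```python
import hashlib

SHARDS = 3

def hash_shard(key: str) -> int:
    """Hash sharding: spreads keys evenly, but a range scan over ids
    must touch every shard."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % SHARDS

# Range sharding: boundaries partition the key space, so a range scan
# touches only the shards covering the interval, but a popular range
# concentrates load on one shard (the hot-range risk).
RANGE_BOUNDS = [1000, 5000]  # shard 0: id < 1000; shard 1: 1000-4999; shard 2: >= 5000

def range_shard(numeric_id: int) -> int:
    for shard, bound in enumerate(RANGE_BOUNDS):
        if numeric_id < bound:
            return shard
    return len(RANGE_BOUNDS)
```

With these boundaries, id=1200 lands on shard 1 and id=6710 on shard 2, while `hash_shard` scatters keys regardless of their numeric order.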

Sharding Visualization

[Interactive simulator in the original page: a queue of shard keys (e.g. acme user:42 id=1200, globex order:981 id=3520, initech invoice:77 id=6710) is routed by a hash-sharding router that hashes each key onto a ring of three shards, with per-shard request and hotness counters.]

Related

Performance Engineering [RU]

Hot shards and migration windows directly affect p95/p99 latency.


Shard Rebalancing and Hot-Shard Control

  • Use consistent hashing and virtual nodes to reduce migration volume when the node count changes.
  • Mitigate hot shards with adaptive partitioning, write buffering, workload-aware key routing, and temporal splits.
  • Run backfills and migrations through throttled pipelines with checksum verification.
  • Every migration operation must support pause/resume, a rollback plan, and observable progress.
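The first point can be sketched as a hash ring with virtual nodes. This is a minimal illustration (MD5 points, 100 vnodes per node are arbitrary choices), not a library API; the property to observe is that adding a node moves only a fraction of keys instead of forcing a full rehash:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent hashing with virtual nodes.

    Each physical node owns many points on the ring, so when a node is
    added or removed only the keys between neighboring points move,
    roughly 1/N of the total, rather than nearly all of them.
    """

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (point, node)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _point(s: str) -> int:
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def add(self, node: str):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._point(f"{node}#{i}"), node))

    def remove(self, node: str):
        self.ring = [(p, n) for p, n in self.ring if n != node]

    def locate(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or past the key.
        i = bisect.bisect(self.ring, (self._point(key), ""))
        return self.ring[i % len(self.ring)][1]
```

Under naive `hash(key) % N` sharding, going from 3 to 4 nodes remaps about 3/4 of all keys; with this ring, roughly 1/4 move, which is what keeps migration windows (and their checksum-verified backfills) small.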

Cloud Native

Cost Optimization & FinOps [RU]

Replication topology and rebalancing directly affect cloud cost and unit economics.


Practical Checklist

  • RPO/RTO are defined per data class and failure scenario.
  • Operations requiring strong consistency vs. eventual consistency are explicitly documented.
  • There is a shard-key evolution plan for product growth and changing access patterns.
  • Replication, failover, and rebalancing are validated regularly through game days.
  • Cross-zone/cross-region replication cost is accounted for in the FinOps model.

Common anti-pattern: picking a shard key for the current feature set without an evolution plan.



© 2026 Alexander Polomodov