System Design Space

Updated: March 24, 2026 at 11:23 AM

RAM and persistent storage

medium

How RAM differs from HDD/SSD in speed, latency, and storage cost.

This chapter matters because it is not really about two kinds of memory, but about dramatic jumps in latency, capacity, and durability across the hierarchy.

In real engineering work, it helps you design hot and cold paths deliberately: what belongs in memory, what moves to SSD or HDD, and how those choices reshape APIs, caches, and user experience.

In interviews and design discussions, it turns storage trade-offs into something concrete: latency cliffs, durability, and infrastructure budget.

Practical value of this chapter

Memory hierarchy

Guides hot/cold path design using realistic latency and cost differences.

Data placement

Clarifies what should live in RAM vs SSD/HDD and how that affects API behavior.

Load stability

Shows how to avoid degradation as working set grows and cache pressure increases.

Interview trade-offs

Strengthens discussion of speed, durability, and infrastructure-budget compromises.

Source

Random-access memory

RAM structure and key properties of volatile memory.


RAM and persistent storage are different levels of one hierarchy: RAM gives minimal latency for hot data, while SSD and HDD provide capacity and durability. In system design, you almost always need a balance between latency, capacity, and cost.

RAM vs persistent storage

How RAM works

  • DRAM is organized into cell matrices addressed by the memory controller.
  • Access is random, but reads within one row are usually faster because of the row buffer.
  • DRAM cells require periodic refresh, which briefly blocks access and adds occasional background latency.
  • The CPU accesses memory through L1/L2/L3 caches, which hide part of RAM latency.

How persistent storage works

  • Storage is block-based: reads and writes happen in pages and blocks.
  • HDD adds mechanical seek; SSD works with NAND pages and erase blocks.
  • Performance depends on I/O queues, concurrency, and block size.
  • File systems and DBMSs build journals, indexes, and compaction strategies on top.
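
Because storage is read in whole pages, a small request can cost far more physical I/O than its size suggests. The sketch below is a minimal illustration, assuming a 4 KiB page size (real devices and file systems vary); `pages_touched` and `read_amplification` are hypothetical helper names, not part of any library.

```python
PAGE_SIZE = 4096  # assumed page size in bytes; actual values depend on device and file system


def pages_touched(offset: int, length: int, page_size: int = PAGE_SIZE) -> int:
    """Number of whole pages that must be read to serve a byte range."""
    if length <= 0:
        return 0
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return last_page - first_page + 1


def read_amplification(offset: int, length: int, page_size: int = PAGE_SIZE) -> float:
    """Bytes physically read divided by bytes requested."""
    if length <= 0:
        raise ValueError("length must be positive")
    return pages_touched(offset, length, page_size) * page_size / length
```

A 100-byte read that straddles a page boundary touches two pages, which is one reason block size and record layout show up in storage performance.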

RAM

  • Volatile: data is lost when power is removed.
  • Very low access latency and high read/write speed.
  • Best for hot data, indexes, and caches.
  • More expensive per GB than persistent storage.

Persistent storage

  • Persistent: data survives power loss.
  • Access latency is orders of magnitude higher than RAM.
  • Available capacity is much larger.
  • Lower storage cost per GB.

Dynamic visualization: capacity and tiering calculator

This simplified model estimates how much data lands in RAM/SSD/HDD after replication and compression.

Daily ingest: 600 GB/day
Retention: 30 days
Replication: x3
Compression: 2.0x
Hot data: 20%
SSD share of warm data: 60%
Index overhead: 35%

Capacity calculator output

Total data: 26.37 TB
RAM for hot set: 3780 GB
SSD capacity: 12.66 TB
HDD capacity: 8.44 TB

Data distribution by tier

Hot 20% (5.27 TB)
Warm/SSD 48%
Cold/HDD 32%

Hot-data coverage by RAM tier: 70%

Estimated monthly cost: $13,781 (~$165,372/year).
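
The capacity figures above can be reproduced with a few lines of arithmetic. The sketch below is an inferred model, not the calculator's actual code: it assumes total = ingest × retention × replication / compression, binary units (1 TB = 1024 GB), and a 70% RAM coverage of the hot set taken from the output. The displayed totals match without applying the 35% index overhead, so it is left out here. The monthly cost is not modeled because unit prices are not given. `tier_capacity` is a hypothetical name.

```python
GB_PER_TB = 1024  # assumption: the displayed TB values match binary units


def tier_capacity(ingest_gb_day, retention_days, replication, compression,
                  hot_share, ssd_share_of_warm, ram_coverage):
    """Split the stored volume into RAM/SSD/HDD tiers (all results in GB)."""
    total = ingest_gb_day * retention_days * replication / compression
    hot = total * hot_share
    warm = total - hot
    ssd = warm * ssd_share_of_warm
    return {
        "total_gb": total,
        "hot_gb": hot,
        "ram_gb": hot * ram_coverage,  # only part of the hot set is held in RAM
        "ssd_gb": ssd,
        "hdd_gb": warm - ssd,
    }


plan = tier_capacity(
    ingest_gb_day=600, retention_days=30, replication=3, compression=2.0,
    hot_share=0.20, ssd_share_of_warm=0.60, ram_coverage=0.70,
)
for name, gb in plan.items():
    print(f"{name}: {gb:.0f} GB ({gb / GB_PER_TB:.2f} TB)")
```

Under these assumptions the warm remainder (80% of the total) splits 60/40 between SSD and HDD, which gives the 48% and 32% shares shown in the distribution.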

Source

Solid-state drive

SSD fundamentals and why SSD latency is much lower than HDD latency.


Dynamic visualization: latency calculator

Change load parameters and see how cache-hit, random I/O, and queue depth impact p95 latency for each storage layer.

Cache hit ratio: 85%
Random I/O: 70%
Queue depth: 16
Target IOPS: 50,000
Latency SLO (p95): 15 ms

RAM

p95: 0.10 us

Device-only: 0.10 us

Load pressure: 0.20x

Within SLO

NVMe SSD

p95: 46 us

Device-only: 306 us

Load pressure: 0.20x

Within SLO

SATA SSD

p95: 313 us

Device-only: 2.08 ms

Load pressure: 0.71x

Within SLO

HDD

p95: 1084.1 ms

Device-only: 7227.2 ms

Load pressure: 142.86x

Outside SLO

Recommended primary tier: NVMe SSD

Current parameters fit the target SLO for the selected cache-hit profile.
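
The SSD and HDD figures above are consistent with a simple miss-weighted model: the displayed p95 is roughly the device-only latency multiplied by the cache miss ratio (1 − 0.85). The sketch below encodes that inference. It is a simplification and not how a true percentile behaves under load; the function names are hypothetical. RAM is omitted because its two figures are identical, so the cache weighting does not appear to apply to it.

```python
def effective_latency_us(device_latency_us: float, cache_hit_ratio: float) -> float:
    """Device latency weighted by the share of requests that miss the cache."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return device_latency_us * (1.0 - cache_hit_ratio)


def within_slo(latency_us: float, slo_ms: float) -> bool:
    """Compare a latency in microseconds against an SLO given in milliseconds."""
    return latency_us <= slo_ms * 1000.0


# Device-only latencies copied from the calculator output, converted to microseconds.
DEVICE_ONLY_US = {"NVMe SSD": 306.0, "SATA SSD": 2080.0, "HDD": 7_227_200.0}

for tier, device_us in DEVICE_ONLY_US.items():
    latency = effective_latency_us(device_us, cache_hit_ratio=0.85)
    status = "within SLO" if within_slo(latency, slo_ms=15) else "outside SLO"
    print(f"{tier}: {latency:.0f} us, {status}")
```

The SATA SSD result differs from the displayed value by about 1 us, most likely because the device-only figure shown on the page is already rounded.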

Workload-aware selection calculator

Pick a workload profile and tune the performance/cost balance. The calculator proposes a target RAM/SSD/HDD split.

Cost priority: 45% (0 = performance-first, 100 = cost-first)
Yearly growth: +35%
Target p99: 6 ms
Target IOPS: 120,000

Recommended tier split

RAM 40%
SSD 55%
HDD/Cold 5%

Projected ingest: 540 GB/day -> projected storage envelope: 23.73 TB

Risk: validate growth and queue depth to avoid hidden peak-time degradation.
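
To validate growth, it helps to project the ingest rate forward and recompute the storage envelope. The sketch below assumes compound yearly growth and reuses the retention, replication, and compression values from the capacity calculator (30 days, x3, 2.0x). The 400 GB/day baseline is an assumption back-derived from 540 / 1.35 and is not stated on the page; both function names are hypothetical.

```python
def projected_ingest(base_gb_day: float, yearly_growth: float, years: int = 1) -> float:
    """Daily ingest after compounding a yearly growth rate."""
    return base_gb_day * (1.0 + yearly_growth) ** years


def storage_envelope_tb(ingest_gb_day: float, retention_days: int = 30,
                        replication: int = 3, compression: float = 2.0) -> float:
    """Stored volume in TB (1 TB = 1024 GB) for a given daily ingest."""
    return ingest_gb_day * retention_days * replication / compression / 1024


BASE_INGEST_GB_DAY = 400  # assumed baseline, inferred from 540 / 1.35

for year in range(1, 4):
    ingest = projected_ingest(BASE_INGEST_GB_DAY, yearly_growth=0.35, years=year)
    print(f"year {year}: {ingest:.0f} GB/day -> {storage_envelope_tb(ingest):.2f} TB")
```

Year 1 reproduces the 540 GB/day and 23.73 TB shown above; later years show how quickly the envelope compounds if the growth rate holds.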

How to choose under load

  • Use aggressive RAM caching for the hot key space and indexes.
  • Use SSD as the primary operational storage layer for read/write paths.
  • Cold tier share is small: this profile prioritizes latency over storage cost.

Source

Hard disk drive

Mechanical disk characteristics and random/sequential access behavior.


HDD vs SSD in practical architecture

Branch-heavy APIs and transactional systems usually require SSD for random I/O. HDD remains valuable for archives, historical reads, and long-term retention where latency is not critical.

HDD

  • Mechanical components with rotating platters.
  • High random I/O latency because of seek time.
  • Large capacity and low price per TB.
  • Good for cold data, backups, and archives.

SSD

  • Flash memory with no moving parts.
  • Low latency and high IOPS relative to HDD.
  • Much better random access behavior.
  • Finite write endurance, so lifecycle policy and monitoring matter.
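
The seek-time penalty for HDD can be made concrete with a back-of-the-envelope estimate: each random access pays an average seek plus, on average, half a platter rotation. The sketch below assumes a 7200 RPM drive and an 8.5 ms average seek (illustrative values, not taken from the text) and ignores transfer time, queueing, and command reordering. `hdd_random_iops` is a hypothetical name.

```python
def rotational_latency_ms(rpm: int) -> float:
    """Average rotational delay: half of one full revolution."""
    return 0.5 * 60_000.0 / rpm


def hdd_random_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rough upper bound on random IOPS for a single spindle."""
    service_time_ms = avg_seek_ms + rotational_latency_ms(rpm)
    return 1000.0 / service_time_ms


# Assumed drive parameters, for illustration only.
print(f"{hdd_random_iops(rpm=7200, avg_seek_ms=8.5):.0f} random IOPS per spindle")
```

Under these assumptions a single disk delivers well under a hundred random operations per second, which is why random-heavy paths are usually placed on SSD.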

Common mistakes

Keeping the whole dataset on SSD

Without tiering, cost grows quickly while cold data still occupies the expensive fast tier.

Ignoring the working set

If the hot working set does not fit in RAM, p95/p99 latency can degrade sharply.

Tracking only average latency

For storage, tail behavior matters most: p95/p99 and queue-driven degradation under load.

No lifecycle policy

Without automatic RAM -> SSD -> HDD transitions, storage cost and latency become unpredictable.
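
The "Ignoring the working set" mistake can be demonstrated with a small simulation. The sketch below is a toy model, assuming an LRU eviction policy and uniformly random key access (real workloads are usually skewed); `lru_hit_ratio` is a hypothetical helper. It shows how the hit ratio drops once the working set outgrows the cache.

```python
import random
from collections import OrderedDict


def lru_hit_ratio(cache_size: int, working_set: int,
                  requests: int = 50_000, seed: int = 1) -> float:
    """Simulate an LRU cache under uniform random access and return the hit ratio."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(requests):
        key = rng.randrange(working_set)
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = None
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / requests


for working_set in (1_000, 2_000, 4_000):
    ratio = lru_hit_ratio(cache_size=1_000, working_set=working_set)
    print(f"working set {working_set}: hit ratio {ratio:.2f}")
```

With uniform access the hit ratio approaches cache size divided by working set size, so every miss that now goes to SSD or HDD pulls tail latency toward the slower tier.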

Practical recommendations

Tiered storage by access pattern

Keep hot keys and indexes in RAM, operational data on SSD, and long-tail history on HDD or object storage.

Capacity planning with headroom

Plan for replication, compression, index overhead, and yearly traffic growth in one model.

SLO-first storage choice

Set p99 latency and IOPS targets first, then choose RAM/SSD/HDD, and only then optimize cost.

Storage observability

Monitor cache hit ratio, queue depth, fsync latency, and saturation for each storage tier.
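
As a starting point for fsync observability, the sketch below times `os.fsync` on a temporary file and reports the mean next to tail percentiles. It is a minimal probe under stated assumptions: the temporary file lives wherever the OS places it (which may not be the storage tier of interest), the sample count is small, and `percentile` uses the nearest-rank method. Both function names are hypothetical.

```python
import math
import os
import statistics
import tempfile
import time


def measure_fsync_ms(samples: int = 50, payload: bytes = b"x" * 4096) -> list:
    """Write a small payload and time each fsync call, in milliseconds."""
    latencies = []
    with tempfile.NamedTemporaryFile() as f:
        for _ in range(samples):
            f.write(payload)
            f.flush()                      # push Python's buffer to the OS
            start = time.perf_counter()
            os.fsync(f.fileno())           # ask the OS to persist to the device
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(values, p: float) -> float:
    """Nearest-rank percentile for 0 < p <= 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    data = measure_fsync_ms()
    print(f"mean: {statistics.mean(data):.3f} ms")
    print(f"p95:  {percentile(data, 95):.3f} ms")
    print(f"p99:  {percentile(data, 99):.3f} ms")
```

Comparing the mean with p95/p99 on the same samples shows how far the tail sits from the average, which is the gap the "Tracking only average latency" mistake hides.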

Practical conclusion

Choosing between RAM, SSD, and HDD is not a binary decision: you design a multi-tier storage path. Define the latency SLO and workload profile first, model capacity second, and optimize cost through tiering and lifecycle policies third.
