This chapter matters because it is not really about two kinds of memory, but about dramatic jumps in latency, capacity, and durability across the hierarchy.
In real engineering work, it helps you design hot and cold paths deliberately: what belongs in memory, what moves to SSD or HDD, and how those choices reshape APIs, caches, and user experience.
In interviews and design discussions, it turns storage trade-offs into something concrete: latency cliffs, durability, and infrastructure budget.
Practical value of this chapter
- Memory hierarchy: guides hot/cold path design using realistic latency and cost differences.
- Data placement: clarifies what should live in RAM vs SSD/HDD and how that affects API behavior.
- Load stability: shows how to avoid degradation as the working set grows and cache pressure increases.
- Interview trade-offs: strengthens discussion of speed, durability, and infrastructure-budget compromises.
Source: Random-access memory (RAM structure and key properties of volatile memory).
RAM and persistent storage are different levels of one hierarchy: RAM gives minimal latency for hot data, while SSD and HDD provide capacity and durability. In system design, you almost always need a balance between latency, capacity, and cost.
RAM vs persistent storage
How RAM works
- DRAM is organized into cell matrices addressed by the memory controller.
- Access is random, but reads within one row are usually faster because of the row buffer.
- Memory requires periodic refresh, which introduces background latency.
- CPU operates through L1/L2/L3 caches, hiding part of RAM latency.
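The locality effects above can be glimpsed even from user space. The sketch below compares a sequential traversal with a shuffled one over the same data; absolute timings are noisy and interpreter-dominated in Python, so treat the printed numbers as a qualitative illustration, not a benchmark.

```python
import random
import time

def traverse(data, order):
    """Sum elements of `data`, visiting indices in the given order."""
    total = 0
    for i in order:
        total += data[i]
    return total

n = 1_000_000
data = list(range(n))
sequential = list(range(n))   # walks memory roughly in allocation order
shuffled = sequential[:]
random.shuffle(shuffled)      # defeats row-buffer and cache-line locality

t0 = time.perf_counter()
s1 = traverse(data, sequential)
t1 = time.perf_counter()
s2 = traverse(data, shuffled)
t2 = time.perf_counter()

# Same work, same result; only the access pattern differs.
assert s1 == s2
print(f"sequential: {t1 - t0:.3f}s, random: {t2 - t1:.3f}s")
```

On most machines the shuffled pass is measurably slower, which is the row-buffer/cache effect leaking through even a high-level runtime.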
How persistent storage works
- Storage is block-based: reads and writes happen in pages and blocks.
- HDD adds mechanical seek, SSD works with NAND pages and erase blocks.
- Performance depends on I/O queues, concurrency, and block size.
- File systems and DBMSs build journals, indexes, and compaction strategies on top.
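Block granularity has a simple consequence: you pay for whole blocks, not bytes. A minimal sketch (the 4 KiB block size is an assumption; real devices expose anything from 512 B to 16 KiB, and Python's buffered I/O hides the device itself):

```python
import os
import tempfile

BLOCK = 4096  # assumed page/block size for illustration

def read_in_blocks(path, block_size=BLOCK):
    """Read a file block by block, the granularity storage actually serves."""
    blocks = []
    with open(path, "rb") as f:
        while chunk := f.read(block_size):
            blocks.append(chunk)
    return blocks

# A 10 000-byte object still costs three 4 KiB block reads on a block device.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 10_000)
    path = tmp.name

blocks = read_in_blocks(path)
print(len(blocks), [len(b) for b in blocks])
os.unlink(path)
```

This rounding-up to block boundaries is why small random reads are disproportionately expensive and why file systems and DBMSs batch, journal, and compact.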
RAM
- Volatile: data is lost when power is removed.
- Very low access latency and high read/write speed.
- Best for hot data, indexes, and caches.
- More expensive per GB than persistent storage.
Persistent storage
- Persistent: data survives power loss.
- Access latency is orders of magnitude higher than RAM.
- Available capacity is much larger.
- Lower storage cost per GB.
Dynamic visualization: capacity and tiering calculator
This simplified model estimates how much data lands in RAM/SSD/HDD after replication and compression.
Example inputs:
- Daily ingest: 600 GB/day
- Retention: 30 days
- Replication: x3
- Compression: 2.0x
- Hot data: 20%
- SSD share of warm data: 60%
- Index overhead: 35%
Capacity calculator output:
- Total data: 26.37 TB
- RAM for hot set: 3780 GB
- SSD capacity: 12.66 TB
- HDD capacity: 8.44 TB
- Hot-data coverage by RAM tier: 70%
- Estimated monthly cost: $13,781 (~$165,372/year)
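The tiering model behind these numbers can be reconstructed as a short function. The formulas are inferred from the figures above (TB here means 1024 GB); the index-overhead input does not appear to affect the displayed outputs, so it is omitted from this sketch.

```python
def tier_capacity(ingest_gb_day, retention_days, replication, compression,
                  hot_share, ssd_share_of_warm, ram_coverage):
    """Estimate RAM/SSD/HDD footprint for a tiered store (1 TB = 1024 GB)."""
    total_gb = ingest_gb_day * retention_days * replication / compression
    hot_gb = total_gb * hot_share          # candidate for the RAM tier
    warm_gb = total_gb - hot_gb            # split between SSD and HDD
    return {
        "total_tb": round(total_gb / 1024, 2),
        "ram_gb": round(hot_gb * ram_coverage),
        "ssd_tb": round(warm_gb * ssd_share_of_warm / 1024, 2),
        "hdd_tb": round(warm_gb * (1 - ssd_share_of_warm) / 1024, 2),
    }

print(tier_capacity(600, 30, 3, 2.0, hot_share=0.20,
                    ssd_share_of_warm=0.60, ram_coverage=0.70))
```

With the inputs above this reproduces 26.37 TB total, 3780 GB of RAM, 12.66 TB of SSD, and 8.44 TB of HDD.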
Source: Solid-state drive (SSD fundamentals and why SSD latency is much lower than HDD latency).
Dynamic visualization: latency calculator
Change load parameters and see how cache-hit, random I/O, and queue depth impact p95 latency for each storage layer.
Example inputs:
- Cache hit ratio: 85%
- Random I/O: 70%
- Queue depth: 16
- Target IOPS: 50,000
- Latency SLO (p95): 15 ms
Results per tier:
- RAM: p95 0.10 us (device-only 0.10 us), load pressure 0.20x, within SLO
- NVMe SSD: p95 46 us (device-only 306 us), load pressure 0.20x, within SLO
- SATA SSD: p95 313 us (device-only 2.08 ms), load pressure 0.71x, within SLO
- HDD: p95 1084.1 ms (device-only 7227.2 ms), load pressure 142.86x, outside SLO
Recommended primary tier: NVMe SSD
Current parameters fit the target SLO for the selected cache-hit profile.
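The load-pressure column is the key signal here: it is the ratio of demanded IOPS to what the device can sustain. A sketch of that ratio, with IOPS ceilings that are illustrative assumptions chosen to reproduce the pressure figures above (not vendor specs):

```python
# Illustrative per-device IOPS ceilings (assumptions, not measurements)
DEVICE_IOPS = {"NVMe SSD": 250_000, "SATA SSD": 70_000, "HDD": 350}

def load_pressure(target_iops, device):
    """Demanded IOPS over sustainable IOPS.

    Above 1.0 the queue grows without bound and tail latency explodes,
    which is why the HDD row lands far outside the SLO."""
    return round(target_iops / DEVICE_IOPS[device], 2)

for dev in DEVICE_IOPS:
    p = load_pressure(50_000, dev)
    print(f"{dev}: {p}x{' (saturated)' if p > 1 else ''}")
```

At 50,000 target IOPS this yields 0.2x for NVMe, 0.71x for SATA SSD, and 142.86x for HDD: the mechanical tier is not merely slower, it is saturated by orders of magnitude.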
Workload-aware selection calculator
Pick a workload profile and tune the performance/cost balance. The calculator proposes a target RAM/SSD/HDD split.
Example inputs:
- Cost priority: 45% (0 = performance-first, 100 = cost-first)
- Yearly growth: +35%
- Target p99: 6 ms
- Target IOPS: 120,000
Recommended tier split
Projected ingest: 540 GB/day -> projected storage envelope: 23.73 TB
Risk: validate growth and queue depth to avoid hidden peak-time degradation.
How to choose under load
- Use aggressive RAM caching for hot key-space and indexes.
- Use SSD as the primary operational storage layer for read/write paths.
- Cold tier share is small: this profile prioritizes latency over storage cost.
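The growth projection above can be sketched with the same capacity model used earlier. The 400 GB/day baseline is an assumption: it is the value consistent with the projected 540 GB/day at +35% yearly growth, and the retention/replication/compression defaults mirror the earlier example.

```python
def projected_envelope_tb(ingest_gb_day, yearly_growth, retention_days=30,
                          replication=3, compression=2.0):
    """Ingest and storage envelope after one year of growth (1 TB = 1024 GB)."""
    projected_ingest = ingest_gb_day * (1 + yearly_growth)
    total_gb = projected_ingest * retention_days * replication / compression
    return round(projected_ingest), round(total_gb / 1024, 2)

# Assumed 400 GB/day baseline with +35% yearly growth
print(projected_envelope_tb(400, 0.35))
```

This reproduces the 540 GB/day ingest and the 23.73 TB storage envelope shown above, which is why capacity planning should always model growth, not just the current ingest rate.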
Source: Hard disk drive (mechanical disk characteristics and random/sequential access behavior).
HDD vs SSD in practical architecture
APIs with heavy random-access patterns and transactional systems usually require SSD for random I/O. HDD remains valuable for archives, historical reads, and long-term retention where latency is not critical.
HDD
- Mechanical components with rotating platters.
- High random I/O latency because of seek time.
- Large capacity and low price per TB.
- Good for cold data, backups, and archives.
SSD
- Flash memory with no moving parts.
- Low latency and high IOPS relative to HDD.
- Much better random access behavior.
- Finite write endurance, so lifecycle policy and monitoring matter.
Common mistakes
- Keeping the whole dataset on SSD: without tiering, cost grows quickly while cold data still occupies the expensive fast tier.
- Ignoring the working set: if the hot working set does not fit in RAM, p95/p99 latency can degrade sharply.
- Tracking only average latency: for storage, tail behavior (p95/p99 and queue-driven degradation under load) matters most.
- No lifecycle policy: without automatic RAM -> SSD -> HDD transitions, storage cost and latency become unpredictable.
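A lifecycle policy does not have to be elaborate to be useful. A minimal age-based sketch (the tier names follow this chapter; the one-hour and seven-day thresholds are illustrative assumptions, and real policies also weigh access frequency and object size):

```python
from datetime import datetime, timedelta

# Illustrative age thresholds per tier; None marks the catch-all cold tier
TIERS = [
    ("RAM", timedelta(hours=1)),
    ("SSD", timedelta(days=7)),
    ("HDD", None),
]

def place(last_access, now):
    """Pick a storage tier from the age of the last access."""
    age = now - last_access
    for tier, limit in TIERS:
        if limit is None or age <= limit:
            return tier

now = datetime(2024, 1, 8)
assert place(now - timedelta(minutes=5), now) == "RAM"
assert place(now - timedelta(days=2), now) == "SSD"
assert place(now - timedelta(days=30), now) == "HDD"
```

Even this crude rule makes RAM -> SSD -> HDD transitions automatic and auditable, which is what keeps cost and latency predictable as data ages.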
Practical recommendations
- Tiered storage by access pattern: keep hot keys and indexes in RAM, operational data on SSD, and long-tail history on HDD or object storage.
- Capacity planning with headroom: plan for replication, compression, index overhead, and yearly traffic growth in one model.
- SLO-first storage choice: set p99 latency and IOPS targets first, then choose RAM/SSD/HDD, and only then optimize cost.
- Storage observability: monitor cache hit ratio, queue depth, fsync latency, and saturation for each storage tier.
Practical conclusion
Choosing between RAM, SSD, and HDD is not a binary decision. You design a multi-tier storage path. Define latency SLO and workload profile first, model capacity second, and optimize cost through tiering and lifecycle policies third.
Related chapters
- Why foundational knowledge matters - explains how memory and disk constraints shape architecture decisions.
- Structured Computer Organization (short summary) - provides hardware context: memory hierarchy, buses, and CPU-I/O interaction.
- Operating system: overview - extends this topic with page cache, I/O scheduling, and kernel-space latency effects.
- Why understand storage systems - broadens the topic to replication, indexing, and long-term storage strategy.
- Database Selection Framework - helps choose the right DB and storage pattern for a specific workload.
- Performance Engineering - adds practical methods for latency/throughput profiling in the I/O path.
- CPU and GPU: overview and differences - connects compute decisions with memory bandwidth and data movement cost.
