A content delivery network is more than a cache placed closer to the user. It is a system shaped by traffic geography, cache hierarchy, content freshness, and origin protection under global load.
The chapter connects DNS routing, points of presence, an intermediate shielding layer, and cache lifetime policy into one design where response speed constantly competes with freshness.
For interviews and engineering discussions, this case quickly shows whether you can think beyond a single region, price the cost of a cache miss, and protect the origin under heavy traffic.
Traffic geography
You need to control where a user request actually lands through routing, nearest-PoP selection, and safe fallback paths when a region degrades.
Cache hierarchy
The edge cache, shielding layer, and origin store should behave like one chain rather than a set of unrelated nodes.
Content freshness
Decide in advance where TTL is enough, where purge is necessary, and where versioned URLs are the only safe option.
Origin protection
During cache misses and burst traffic, you need to cap fan-in to the origin, coalesce identical requests, and define a clear degraded mode.
A content delivery network (CDN) is a geographically distributed layer of servers that caches and serves content from the nearest point of presence (PoP). For modern web products, it reduces latency and removes a large share of traffic from origin servers.
Source
Acing the System Design Interview
A detailed breakdown of CDN architecture, cache invalidation, and the main design trade-offs.
What a CDN solves
- Lower latency: content is served from the nearest edge location
- Less origin pressure: a large share of requests ends before it reaches the origin layer
- Scalability: traffic can be spread across regions and points of presence
- Fault tolerance: traffic can be rerouted when one location fails
- Attack absorption: distributed infrastructure handles spikes and hostile traffic better than a single origin
Functional requirements
Core capabilities
- Static content caching
- Geographic traffic routing
- Cache invalidation and refresh
- Origin failover
Extended capabilities
- Dynamic content acceleration
- Edge-side computation
- Secure connection termination
- Request and response transformation
Non-functional requirements
| Requirement | Target value | Rationale |
|---|---|---|
| Latency | < 50 ms (p99) | The user should not wait for the page to begin loading |
| Cache hit ratio | > 95% | Keep origin pressure as low as possible |
| Availability | 99.99% | The CDN sits on the critical external path |
| Throughput | Tbps+ | The system must handle global traffic and burst load |
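A rough back-of-the-envelope sketch shows why the hit-ratio target matters for origin sizing. The function name and the traffic figure are illustrative assumptions, not measurements.

```python
def origin_rps(total_rps: float, hit_ratio: float) -> float:
    """Requests per second that miss the cache and reach the origin."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return total_rps * (1.0 - hit_ratio)


# Hypothetical load: 1,000,000 requests per second arriving at the edge.
for ratio in (0.90, 0.95, 0.99):
    print(f"hit ratio {ratio:.0%}: ~{origin_rps(1_000_000, ratio):,.0f} rps to origin")
```

Under this simple model, moving from a 90% to a 95% hit ratio halves the traffic the origin has to absorb.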
CDN architecture
System components
1. DNS-based routing
The entry point is DNS. In a global setup it is often combined with GeoDNS and Anycast so users land on the nearest point of presence with the lowest practical latency.
2. Edge nodes at each PoP
These nodes accept user traffic, serve content from local cache, and only forward requests deeper into the stack when necessary.
3. Intermediate origin shield
This layer aggregates cache misses from many PoPs and prevents the origin from receiving a flood of identical requests.
4. Origin server
The server or object store that is only accessed when content is not found in the intermediate caches.
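The routing decision in component 1 can be approximated as "pick the closest healthy PoP". The sketch below uses great-circle distance as a stand-in for latency; the PoP names, coordinates, and the `pick_pop` helper are hypothetical, and real GeoDNS or Anycast setups rely on resolver location, BGP, and measured latency rather than pure geometry.

```python
import math

# Hypothetical PoPs with approximate (latitude, longitude).
POPS = {
    "fra": (50.11, 8.68),
    "nyc": (40.71, -74.01),
    "sin": (1.35, 103.82),
}


def haversine_km(a, b) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pick_pop(client, healthy) -> str:
    """Return the nearest PoP that is currently marked healthy."""
    candidates = [name for name in POPS if name in healthy]
    if not candidates:
        raise RuntimeError("no healthy PoP available")
    return min(candidates, key=lambda name: haversine_km(client, POPS[name]))


paris = (48.86, 2.35)
print(pick_pop(paris, healthy={"fra", "nyc", "sin"}))  # nearest PoP
print(pick_pop(paris, healthy={"nyc", "sin"}))         # fallback when fra is degraded
```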
CDN request path
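As a minimal sketch of the path, each cache tier can be modeled as a store that falls back to the next tier on a miss and keeps a copy on the way back. The `CacheTier` class, the PoP names, and `origin_fetch` are illustrative assumptions, not a specific provider's design.

```python
from typing import Callable, Dict, Tuple


class CacheTier:
    """One cache layer (edge or shield) that falls back to the next layer on a miss."""

    def __init__(self, name: str, fetch_upstream: Callable[[str], Tuple[bytes, str]]):
        self.name = name
        self._store: Dict[str, bytes] = {}
        self._fetch_upstream = fetch_upstream

    def get(self, key: str) -> Tuple[bytes, str]:
        """Return (body, served_by). On a miss, fill from upstream and store locally."""
        if key in self._store:
            return self._store[key], self.name
        body, served_by = self._fetch_upstream(key)
        self._store[key] = body
        return body, served_by


def origin_fetch(key: str) -> Tuple[bytes, str]:
    # Hypothetical origin: in practice an HTTP server or object store.
    return f"content for {key}".encode(), "origin"


shield = CacheTier("shield", origin_fetch)
edge_fra = CacheTier("edge-fra", shield.get)
edge_ams = CacheTier("edge-ams", shield.get)

print(edge_fra.get("/logo.png")[1])  # first request travels all the way to the origin
print(edge_ams.get("/logo.png")[1])  # a different PoP is served by the shield
print(edge_fra.get("/logo.png")[1])  # a repeat request is served by the edge itself
```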
Preloaded vs on-demand caching
If the content set is known in advance and critical for the first load, it can be pushed out to the edge ahead of time. If the library changes constantly and is mostly user-generated content, on-demand lazy caching is usually a better fit. The trade-off is that the first user still pays the cold-start cost.
Preloaded content
Content is distributed to edge nodes before the first user request ever arrives.
Advantages:
- No first-request penalty
- Predictable performance
- Strong control over distribution timing
Limitations:
- Requires explicit rollout control
- Rare content still consumes edge capacity
- Cross-region synchronization is harder
On-demand caching
Content is cached only after the first real request reaches the edge node.
Advantages:
- Adapts automatically to real demand
- Uses storage more efficiently
- Is simpler to operate
Limitations:
- The first request pays for the cache miss
- Misses increase origin pressure
- Latency is less predictable
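The difference between the two models can be sketched with a toy edge cache that supports both fills. The `EdgeCache` class and the sample paths are assumptions made for illustration.

```python
class EdgeCache:
    """Toy edge cache that supports both push (preload) and pull (on-demand) fills."""

    def __init__(self, origin: dict):
        self._origin = origin      # stand-in for the real origin store
        self._store = {}
        self.origin_fetches = 0    # how many times the origin was contacted

    def _fetch_from_origin(self, path: str) -> bytes:
        self.origin_fetches += 1
        return self._origin[path]

    def preload(self, paths) -> None:
        """Push model: copy known-critical objects to the edge before any request."""
        for path in paths:
            self._store[path] = self._fetch_from_origin(path)

    def get(self, path: str):
        """Pull model: fill on the first miss. Returns (body, was_hit)."""
        if path in self._store:
            return self._store[path], True
        body = self._fetch_from_origin(path)
        self._store[path] = body
        return body, False


origin = {"/app.js": b"console.log(1)", "/user/42/avatar.png": b"\x89PNG"}
edge = EdgeCache(origin)
edge.preload(["/app.js"])

print(edge.get("/app.js")[1])              # True: preloaded, no cold start
print(edge.get("/user/42/avatar.png")[1])  # False: the first request pays the miss
print(edge.get("/user/42/avatar.png")[1])  # True: cached after the first request
```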
Cache invalidation
Freshness depends on choosing the right TTL, defining a clear purge path, and knowing which objects can be refreshed in the background and which cannot.
Cache Invalidation Strategies
TTL expiry
Content expires automatically after a configured Time-To-Live (TTL).
Advantages
- Simple setup via HTTP headers
- No provider API integration required
- Predictable cache behavior
Drawbacks
- Updates wait until the TTL expires
- Picking the right value is tricky
- No instant invalidation
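TTL expiry is easy to model, which also makes its main drawback visible: the cached copy keeps being served until the timer runs out, regardless of what the origin holds. The `TTLCache` class and the injected clock are assumptions of this sketch, not a particular CDN's implementation.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Minimal TTL cache: an entry is served until its expiry time passes."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def put(self, key: str, body: bytes, ttl_seconds: float) -> None:
        self._entries[key] = (body, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        body, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return body


now = [0.0]  # fake clock for the demo
cache = TTLCache(clock=lambda: now[0])
cache.put("/index.html", b"v1", ttl_seconds=300)

now[0] = 299.0
print(cache.get("/index.html"))  # still served, even if the origin already has v2
now[0] = 300.0
print(cache.get("/index.html"))  # None: the TTL has elapsed, so the next request refetches
```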
Caching strategy
What is worth caching?
| Content type | Cacheability | Recommended TTL |
|---|---|---|
| Static files (JS, CSS) | High | 1 year with versioning |
| Images | High | 1 month to 1 year |
| HTML pages | Medium | 5 minutes to 1 hour |
| Public API responses | Medium | 1 minute to 1 hour |
| Personalized content | Low | Usually do not cache |
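One way to apply the table is to map each content class to a `Cache-Control` value at the origin. The class names and the exact TTLs below are assumptions picked from within the ranges above; `immutable` and `no-store` are standard directives, but whether a given CDN honors each one should be checked against its documentation.

```python
ONE_YEAR = 365 * 24 * 3600  # seconds

# Assumed mapping from content class to Cache-Control header value.
POLICIES = {
    "static": f"public, max-age={ONE_YEAR}, immutable",  # versioned JS/CSS
    "image": f"public, max-age={30 * 24 * 3600}",        # 1 month
    "html": "public, max-age=300",                       # 5 minutes
    "public_api": "public, max-age=60",                  # 1 minute
    "personalized": "private, no-store",                 # keep out of shared caches
}


def cache_control_for(content_class: str) -> str:
    """Return a Cache-Control value; unknown classes default to not caching."""
    return POLICIES.get(content_class, "no-store")


for name in ("static", "html", "personalized", "unknown"):
    print(f"{name}: {cache_control_for(name)}")
```

Defaulting unknown classes to `no-store` is a deliberately conservative choice: an accidental miss is cheaper than accidentally caching private data.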
Cache key design
The cache key decides which content variants are treated as different objects. Mistakes here lead either to cache pollution or a weak hit ratio.
# Simple key (URL only):
cache_key = hash(url)
# Extended key:
cache_key = hash(url + headers["Accept-Encoding"] +
                 headers["Accept-Language"] +
                 query_params["version"])
# Vary tells the CDN which fields belong in the key:
Vary: Accept-Encoding, Accept-Language
Security and origin protection
The external CDN path is usually where both attack traffic and encryption first land, so the design has to account for DDoS defense, TLS termination, strict HTTPS policy, and certificate-status handling.
Attack absorption
- Rate limiting on edge nodes
- Anycast to spread burst traffic
- Traffic scrubbing centers
- Bot and anomaly detection
Connection security
- Terminate secure sessions at the edge
- Shared and dedicated certificates
- Encrypted origin connections
- Strict HTTPS policy (HSTS) and OCSP stapling for certificate status
Access control
- Signed URLs and cookies
- Access tokens
- IP allow-lists
- Geographic restrictions
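Signed URLs are commonly built from an expiry timestamp plus an HMAC over the path. The sketch below is one generic scheme under stated assumptions: the `expires` and `sig` parameter names, the message layout, and the shared `SECRET` are hypothetical, and each CDN provider defines its own format.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"  # assumption: a key shared by the signer and the edge


def _signature(path: str, expires: int) -> str:
    message = f"{path}?expires={expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Append an expiry timestamp and an HMAC signature to the path."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(parts.path, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())


url = sign_url("/videos/intro.mp4", ttl_seconds=60)
print(verify_url(url))                                   # valid while unexpired
print(verify_url(url.replace("intro", "premium")))       # path was tampered with
```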
Origin protection
- Intermediate origin shield
- Request coalescing
- Hidden origin hostname
- Firewall rules limited to CDN IP ranges
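Request coalescing can be sketched as a "single flight" wrapper: the first caller for a key performs the upstream fetch, and concurrent callers for the same key wait for that result instead of issuing their own. The `Coalescer` class and `slow_origin` are illustrative names, and this thread-based version only stands in for what an edge or shield node would do internally.

```python
import threading
import time
from typing import Callable, Dict


class _Call:
    """State shared between the leader and the waiting followers for one key."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class Coalescer:
    """Collapse concurrent requests for the same key into one upstream fetch."""

    def __init__(self, fetch: Callable[[str], bytes]):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._inflight: Dict[str, _Call] = {}

    def get(self, key: str) -> bytes:
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = self._inflight[key] = _Call()
        if leader:
            try:
                call.result = self._fetch(key)
            except Exception as exc:  # propagate the failure to every waiter
                call.error = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()
        else:
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result


calls = []

def slow_origin(key: str) -> bytes:
    calls.append(key)
    time.sleep(0.2)  # simulate a slow origin so the requests overlap
    return b"body"


coalescer = Coalescer(slow_origin)
threads = [threading.Thread(target=coalescer.get, args=("/hot.jpg",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# One origin fetch is expected as long as all threads arrive within the fetch window.
print(len(calls))
```

This wrapper does not cache anything: a request that arrives after the fetch completes triggers a new fetch, so coalescing complements the cache rather than replacing it.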
Metrics and observability
The user path is best understood through TTFB, the percentage of responses served from cache, and the volume of traffic that still reaches the origin layer.
- Cache hit ratio: the share of requests completed without touching the origin
- TTFB: how quickly the user receives the first byte of the response
- Traffic volume: total bytes served and regional traffic spikes
Key alerts:
- Cache hit ratio < 90% → review TTLs and cache key structure
- Origin errors > 1% → inspect the origin or the shielding layer
- TTFB p99 > 100 ms → review routing and the path to origin
- Traffic spike → possible attack or sudden content popularity
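The threshold-based alerts above translate directly into a small check. The `Snapshot` fields and function name are assumptions; the traffic-spike alert is left out because the text gives no numeric threshold for it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    cache_hit_ratio: float    # 0.0 to 1.0
    origin_error_rate: float  # 0.0 to 1.0
    ttfb_p99_ms: float


def evaluate_alerts(s: Snapshot) -> List[str]:
    """Return the follow-up actions suggested by the listed thresholds."""
    alerts = []
    if s.cache_hit_ratio < 0.90:
        alerts.append("review TTLs and cache key structure")
    if s.origin_error_rate > 0.01:
        alerts.append("inspect the origin or the shielding layer")
    if s.ttfb_p99_ms > 100:
        alerts.append("review routing and the path to origin")
    return alerts


print(evaluate_alerts(Snapshot(cache_hit_ratio=0.87, origin_error_rate=0.002, ttfb_p99_ms=140)))
```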
Interview questions
How do you keep cache invalidation consistent?
Use versioned URLs for immutable content, a purge API for urgent updates, and stale-while-revalidate where a short window of stale data is acceptable.
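A common way to get versioned URLs is to embed a content hash in the file name, so every change produces a new URL and the old cached object never needs purging. The `versioned_path` helper and the eight-character digest length are assumptions of this sketch.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_path(path: str, content: bytes, digest_chars: int = 8) -> str:
    """Embed a short content hash in the file name: /static/app.js -> /static/app.<hash>.js."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


v1 = versioned_path("/static/app.js", b"console.log('v1')")
v2 = versioned_path("/static/app.js", b"console.log('v2')")
print(v1)
print(v2)
print(v1 != v2)  # different content yields a different URL, so no purge is needed
```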
How do you protect the origin from a flood of identical misses?
Combine request coalescing, an origin shield, a circuit breaker, and selective cache pre-warming for the hottest objects.
When do you choose preloading over on-demand caching?
Preloading fits a limited set of critical assets. On-demand caching works better for large libraries and long-tail content that is not worth pushing everywhere in advance.
How do you handle dynamic content?
Use ESI, fragment caching, short TTLs with stale-while-revalidate, or edge computing to assemble personalized responses closer to the user.
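The short-TTL plus stale-while-revalidate option boils down to three windows for a cached object's age. The sketch below mirrors the `max-age` and `stale-while-revalidate` semantics of `Cache-Control`; the enum and function names are assumptions.

```python
from enum import Enum


class Freshness(Enum):
    FRESH = "serve from cache"
    STALE_REVALIDATE = "serve the stale copy and refresh in the background"
    EXPIRED = "fetch from upstream before responding"


def classify(age_s: float, max_age_s: float, swr_s: float) -> Freshness:
    """Place an object's age into the fresh, stale-while-revalidate, or expired window."""
    if age_s < max_age_s:
        return Freshness.FRESH
    if age_s < max_age_s + swr_s:
        return Freshness.STALE_REVALIDATE
    return Freshness.EXPIRED


# Hypothetical policy: Cache-Control: max-age=60, stale-while-revalidate=30
for age in (10, 75, 120):
    print(age, classify(age, max_age_s=60, swr_s=30).value)
```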
Key takeaways
1. A CDN is critical for global scale because it cuts latency and removes pressure from the origin layer.
2. Cache invalidation stays hard, so TTLs, versioning, and purge flows should be designed together rather than separately.
3. An origin shield reduces fan-in to the origin and helps the system survive mass cache misses.
4. The choice between preloading and on-demand caching depends on content shape, freshness requirements, and the cost of a miss.
5. Cache hit ratio and TTFB tell you the most about how the CDN changes the user path.
Related chapters
- Acing the System Design Interview (short summary) - provides a clean frame for the case: requirements, scale, architecture, and the main trade-offs.
- Object Storage (S3) - covers the origin layer: object durability, lifecycle rules, and what happens during cache misses.
- Designing Data-Intensive Applications, 2nd Edition (short summary) - strengthens the foundation behind replication, consistency, and distributed trade-offs in a global delivery network.
- Caching strategies: Cache-Aside, Read-Through, Write-Through, Write-Back - extends the discussion of TTL selection, invalidation, and cache hit ratio management on edge nodes.
- Video Feed (YouTube/TikTok) - shows a heavy media workload where geo-distributed delivery and origin protection become critical.
- Domain Name System (DNS) - explains DNS-based routing and nearest-PoP selection through GeoDNS and Anycast.
- System design case studies examples - places the CDN case in broader interview context and makes it easier to compare with other architecture problems.
- Uber/Lyft - adds a global system with hard latency requirements, where regional traffic placement is critical.
- URL Shortener (TinyURL) - covers a related redirect-heavy workload where serving responses closer to the user reduces origin pressure.
