System Design Space

Updated: April 30, 2026 at 7:40 AM

Content Delivery Network (CDN)

medium

Classic case: content delivery network design, regional routing, cache invalidation, origin shielding, and the trade-off between preloading and on-demand caching.

A content delivery network is more than a cache placed closer to the user. It is a system that has to manage traffic geography, cache hierarchy, content freshness, and origin protection under global load.

The chapter connects DNS routing, points of presence, an intermediate shielding layer, and cache lifetime policy into one design where response speed constantly competes with freshness.

For interviews and engineering discussions, this case quickly shows whether you can think beyond a single region, price the cost of a cache miss, and protect the origin under heavy traffic.

Traffic geography

You need to control where a user request actually lands through routing, nearest-PoP selection, and safe fallback paths when a region degrades.

Cache hierarchy

The edge cache, shielding layer, and origin store should behave like one chain rather than a set of unrelated nodes.

Content freshness

Decide in advance where TTL is enough, where purge is necessary, and where versioned URLs are the only safe option.

Origin protection

During cache misses and burst traffic, you need to cap fan-in to the origin, coalesce identical requests, and define a clear degraded mode.

A content delivery network (CDN) is a geographically distributed layer of servers that caches content and serves it from the nearest point of presence (PoP). For modern web products, it reduces latency and removes a large share of traffic from origin servers.

Source

Acing the System Design Interview

A detailed breakdown of CDN architecture, cache invalidation, and the main design trade-offs.

Read the review

What a CDN solves

  • Lower latency: content is served from the nearest edge location
  • Less origin pressure: a large share of requests ends before it reaches the origin layer
  • Scalability: traffic can be spread across regions and points of presence
  • Fault tolerance: traffic can be rerouted when one location fails
  • Attack absorption: distributed infrastructure handles spikes and hostile traffic better than a single origin

Functional requirements

Core capabilities

  • Static content caching
  • Geographic traffic routing
  • Cache invalidation and refresh
  • Origin failover

Extended capabilities

  • Dynamic content acceleration
  • Edge-side computation
  • Secure connection termination
  • Request and response transformation

Non-functional requirements

Requirement | Target value | Rationale
Latency | < 50 ms (p99) | The user should not wait for the page to begin loading
Cache hit ratio | > 95% | Keep origin pressure as low as possible
Availability | 99.99% | The CDN sits on the critical external path
Throughput | Tbps+ | The system must handle global traffic and burst load
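
The availability and hit-ratio targets translate into concrete budgets. The sketch below is plain arithmetic: it converts an availability target into allowed downtime per year and a hit ratio into the request rate that still travels toward the origin. The 1,000,000 requests per second figure is an illustrative assumption, not a number from the requirements.

```python
def downtime_minutes_per_year(availability: float) -> float:
    """Allowed downtime per 365-day year for an availability fraction (0.0 - 1.0)."""
    return (1.0 - availability) * 365 * 24 * 60


def origin_requests_per_second(total_rps: float, hit_ratio: float) -> float:
    """Requests per second that miss the cache and continue toward the origin."""
    return total_rps * (1.0 - hit_ratio)


# 99.99% availability leaves a little under an hour of downtime per year.
budget_minutes = downtime_minutes_per_year(0.9999)

# Assumed load: 1,000,000 requests per second at a 95% hit ratio.
origin_rps = origin_requests_per_second(1_000_000, 0.95)
```

At 99.99% the yearly budget is roughly 52.6 minutes, and at a 95% hit ratio one request in twenty still has to be absorbed by the shield or the origin.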

CDN architecture

System components

1. DNS-based routing

The entry point is DNS. In a global setup it is often combined with GeoDNS and Anycast so users land on the nearest point of presence with the lowest practical latency.
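
The selection logic behind such routing can be sketched as "lowest latency among healthy PoPs". This is a simplified model, not how any particular DNS provider implements it: the `PoP` type, its fields, and the example RTT values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PoP:
    name: str
    rtt_ms: float   # estimated round-trip time from the client's region (assumed input)
    healthy: bool   # result of an external health check (assumed input)


def pick_pop(candidates: List[PoP]) -> PoP:
    """Return the healthy PoP with the lowest RTT; unhealthy PoPs are skipped."""
    healthy = [pop for pop in candidates if pop.healthy]
    if not healthy:
        raise RuntimeError("no healthy PoP available")
    return min(healthy, key=lambda pop: pop.rtt_ms)


pops = [
    PoP("fra", rtt_ms=12, healthy=False),  # nearest, but currently degraded
    PoP("ams", rtt_ms=18, healthy=True),
    PoP("iad", rtt_ms=95, healthy=True),
]
chosen = pick_pop(pops)  # falls back from "fra" to "ams"
```

The fallback path matters as much as the happy path: when the nearest location degrades, traffic should move to the next-best PoP rather than fail.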

2. Edge nodes at each PoP

These nodes accept user traffic, serve content from local cache, and only forward requests deeper into the stack when necessary.

3. Intermediate origin shield

This layer aggregates cache misses from many PoPs and prevents the origin from receiving a flood of identical requests.
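
One way a shield can collapse identical misses is request coalescing, sometimes called single-flight. The sketch below uses `asyncio` and assumes a caller-supplied `fetch` coroutine that talks to the origin; the class name and method are hypothetical, and a real shield would also store the result in its cache.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class SingleFlight:
    """Coalesce concurrent fetches for the same key into one upstream call."""

    def __init__(self) -> None:
        self._inflight: Dict[str, asyncio.Future] = {}

    async def do(self, key: str, fetch: Callable[[], Awaitable[bytes]]) -> bytes:
        existing = self._inflight.get(key)
        if existing is not None:
            # Follower: wait for the leader's result. shield() keeps a cancelled
            # follower from cancelling the future shared with other waiters.
            return await asyncio.shield(existing)

        # Leader: register a future before awaiting so later callers can join.
        future: asyncio.Future = asyncio.get_running_loop().create_future()
        self._inflight[key] = future
        try:
            result = await fetch()
        except asyncio.CancelledError:
            future.cancel()              # propagate cancellation to followers
            raise
        except Exception as exc:
            future.set_exception(exc)    # followers see the same failure
            future.exception()           # mark as retrieved to avoid a log warning
            raise
        else:
            future.set_result(result)
            return result
        finally:
            del self._inflight[key]
```

With this in place, five simultaneous misses for the same object produce one origin request, and all five callers receive the same body.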

4. Origin server

The server or object store that is only accessed when content is not found in the intermediate caches.

CDN request path

User → DNS → Edge (PoP) → (on miss) Shield → (on miss) Origin

  • Edge cache hit: 10-50 ms
  • Shield cache hit: 50-150 ms
  • Origin fetch: 200-500 ms or more
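
These tiers can be combined into a rough expected latency. The sketch below weights each tier by how often a request ends there. The per-tier latencies are the midpoints of the ranges above, and the 95% edge and 80% shield hit ratios are illustrative assumptions.

```python
def expected_latency_ms(edge_hit: float, shield_hit: float,
                        edge_ms: float, shield_ms: float, origin_ms: float) -> float:
    """Average latency across tiers.

    shield_hit is the hit ratio among requests that already missed the edge.
    """
    miss_edge = 1.0 - edge_hit
    return (edge_hit * edge_ms
            + miss_edge * shield_hit * shield_ms
            + miss_edge * (1.0 - shield_hit) * origin_ms)


# Assumed inputs: 95% edge hits, 80% shield hits, midpoint latency per tier.
average_ms = expected_latency_ms(0.95, 0.80, edge_ms=30, shield_ms=100, origin_ms=350)
```

Under these assumptions the average comes out at about 36 ms: 28.5 ms from edge hits, 4 ms from shield hits, and 3.5 ms from origin fetches. The 1% of requests that reach the origin contribute almost as much as all shield hits combined, which is why the cost of a miss deserves explicit pricing.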

Preloaded vs on-demand caching

If the content set is known in advance and critical for the first load, it can be pushed out to the edge ahead of time. If the library changes constantly and is mostly user-generated content, on-demand lazy caching is usually a better fit. The trade-off is that the first user still pays the cold-start cost.

Preloaded content

Content is distributed to edge nodes before the first user request ever arrives.

Advantages:

  • No first-request penalty
  • Predictable performance
  • Strong control over distribution timing

Limitations:

  • Requires explicit rollout control
  • Rare content still consumes edge capacity
  • Cross-region synchronization is harder

Best for: static sites, software distribution, critical frontend assets

On-demand caching

Content is cached only after the first real request reaches the edge node.

Advantages:

  • Adapts automatically to real demand
  • Uses storage more efficiently
  • Is simpler to operate

Limitations:

  • The first request pays for the cache miss
  • Misses increase origin pressure
  • Latency is less predictable

Best for: dynamic sites and large libraries of user-generated content
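
The two modes can live in the same cache. The sketch below is a minimal edge cache with LRU eviction that supports both an explicit `preload` and on-demand fills in `get`. The `EdgeCache` class, the `fetch_origin` callback, and the capacity of two objects are hypothetical and chosen only to make the eviction visible.

```python
from collections import OrderedDict
from typing import Callable, Iterable


class EdgeCache:
    """Tiny LRU cache that can be preloaded or filled on demand."""

    def __init__(self, capacity: int, fetch_origin: Callable[[str], bytes]) -> None:
        self._capacity = capacity
        self._fetch_origin = fetch_origin
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self.misses = 0  # user requests that had to go to the origin

    def _store(self, key: str, body: bytes) -> None:
        self._items[key] = body
        self._items.move_to_end(key)
        while len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used entry

    def preload(self, keys: Iterable[str]) -> None:
        """Push content to the edge before any user asks for it."""
        for key in keys:
            self._store(key, self._fetch_origin(key))

    def get(self, key: str) -> bytes:
        """Serve from cache; on a miss, fetch from the origin and keep a copy."""
        if key in self._items:
            self._items.move_to_end(key)
            return self._items[key]
        self.misses += 1
        body = self._fetch_origin(key)
        self._store(key, body)
        return body
```

A preloaded asset is served without a miss on its first request, while an on-demand object pays one miss and is cached afterwards. With limited capacity, preloaded but rarely used content can still be evicted by real demand.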

Cache invalidation

Freshness depends on choosing the right TTL, defining a clear purge path, and knowing which objects can be refreshed in the background and which cannot.

TTL expiry

Content expires automatically after a configured Time-To-Live (TTL).

Operational complexity: low. Update propagation: delayed.
Advantages
  • Simple setup via HTTP headers
  • No provider API integration required
  • Predictable cache behavior
Drawbacks
  • Updates wait until the TTL expires
  • Picking the right value is tricky
  • No instant invalidation
Use case: Static content with infrequent updates
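
TTL expiry is simple enough to show in a few lines. The sketch below stores an expiry timestamp next to each object and treats expired entries as misses. The `TtlCache` class is hypothetical, and the injectable `clock` exists only so the behavior can be checked without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Cache whose entries expire a fixed number of seconds after being stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, bytes]] = {}

    def put(self, key: str, body: bytes) -> None:
        self._items[key] = (self._clock() + self._ttl, body)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: the caller must refetch
            return None
        return body
```

The drawback listed above is visible in the code: once an object is stored, nothing short of waiting for `expires_at` (or an explicit purge, which this sketch does not implement) will replace it.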

Caching strategy

What is worth caching?

Content type | Cacheability | Recommended TTL
Static files (JS, CSS) | High | 1 year with versioning
Images | High | 1 month to 1 year
HTML pages | Medium | 5 minutes to 1 hour
Public API responses | Medium | 1 minute to 1 hour
Personalized content | Low | Usually do not cache
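
In practice these TTLs are usually expressed as `Cache-Control` response headers. The mapping below is a sketch that picks the lower bound of each range in the table. The content class names and the `cache_control_for` helper are hypothetical, and the exact directives should be tuned per product.

```python
# Content class -> Cache-Control value. max-age is in seconds.
CACHE_POLICIES = {
    "static": "public, max-age=31536000, immutable",  # 1 year, versioned file names
    "image": "public, max-age=2592000",                # 30 days
    "html": "public, max-age=300",                     # 5 minutes
    "public_api": "public, max-age=60",                # 1 minute
    "personalized": "private, no-store",               # keep out of shared caches
}


def cache_control_for(content_class: str) -> str:
    """Return the header value for a content class; unknown classes are not cached."""
    return CACHE_POLICIES.get(content_class, "no-store")
```

Defaulting unknown classes to `no-store` is a deliberately conservative choice: an uncached response costs latency, while a wrongly cached personalized response can leak data.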

Cache key design

The cache key decides which content variants are treated as different objects. Mistakes here lead either to cache pollution or a weak hit ratio.

# Simple key (URL only):
cache_key = hash(url)

# Extended key (joined with a separator so field boundaries stay unambiguous):
cache_key = hash("|".join([url,
                           headers["Accept-Encoding"],
                           headers["Accept-Language"],
                           query_params["version"]]))

# Vary tells the CDN which fields belong in the key:
Vary: Accept-Encoding, Accept-Language
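
A runnable variant of the idea above, as a sketch: the key is built from the URL plus the headers named in `Vary`, and `Accept-Encoding` is normalized first so that equivalent header spellings do not fragment the cache. The function names are hypothetical, header names are assumed to be lowercased already, and quality values such as `q=0` are ignored for brevity.

```python
import hashlib
from typing import Mapping, Sequence


def normalize_encoding(value: str) -> str:
    """Collapse the many Accept-Encoding spellings into a few buckets."""
    offered = {part.split(";")[0].strip().lower() for part in value.split(",")}
    for preferred in ("br", "gzip"):
        if preferred in offered:
            return preferred
    return "identity"


def cache_key(url: str, headers: Mapping[str, str], vary: Sequence[str]) -> str:
    """Stable key from the URL and the request headers listed in Vary."""
    parts = [url]
    for name in sorted(header.lower() for header in vary):
        value = headers.get(name, "")
        if name == "accept-encoding":
            value = normalize_encoding(value)
        parts.append(f"{name}={value}")
    # A newline separator keeps field boundaries unambiguous before hashing.
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

Without normalization, `gzip, deflate, br` and `br;q=1.0, gzip;q=0.8` would produce two cache entries for the same response, which is one of the common ways a hit ratio quietly degrades.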

Security and origin protection

The external CDN path is usually where both attack traffic and encryption first land, so the design has to account for DDoS defense, TLS termination, strict HTTPS policy, and certificate-status handling.

Attack absorption

  • Rate limiting on edge nodes
  • Anycast to spread burst traffic
  • Traffic scrubbing centers
  • Bot and anomaly detection
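
Rate limiting on edge nodes is often built on a token bucket. The source does not name an algorithm, so treat this as one common choice rather than the prescribed one. The `TokenBucket` class and its parameters are hypothetical, and a real edge node would keep one bucket per client or per key.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `burst` requests, refilled at `rate_per_s` tokens per second."""

    def __init__(self, rate_per_s: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_s
        self._capacity = float(burst)
        self._tokens = float(burst)  # start full so an initial burst is allowed
        self._clock = clock
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the caller should be throttled."""
        now = self._clock()
        self._tokens = min(self._capacity,
                           self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

The bucket tolerates short legitimate bursts while capping the sustained rate, which suits traffic that is spiky but not hostile.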

Connection security

  • Terminate TLS sessions at the edge
  • Shared and dedicated certificates
  • Encrypted origin connections
  • Strict HTTPS policy (HSTS) and OCSP stapling

Access control

  • Signed URLs and cookies
  • Access tokens
  • IP allow-lists
  • Geographic restrictions
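
Signed URLs are typically an HMAC over the path and an expiry time, checked at the edge without calling the origin. The sketch below shows the general shape only. Real CDN providers each define their own parameter names and signing format, so `expires`, `sig`, and both function names are assumptions.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def sign_url(path: str, expires_at: int, secret: bytes) -> str:
    """Return the path with an expiry timestamp and an HMAC-SHA256 signature."""
    message = f"{path}?expires={expires_at}".encode("utf-8")
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires_at, 'sig': signature})}"


def verify_url(url: str, now: int, secret: bytes) -> bool:
    """Accept the URL only if the signature matches and it has not expired."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["sig"][0]
    except (KeyError, ValueError, IndexError):
        return False
    message = f"{parts.path}?expires={expires_at}".encode("utf-8")
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    matches = hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
    return matches and now < expires_at
```

Because the expiry is part of the signed message, a client cannot extend access by editing the `expires` parameter.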

Origin protection

  • Intermediate origin shield
  • Request coalescing
  • Hidden origin hostname
  • Firewall rules limited to CDN IP ranges

Metrics and observability

The user path is best understood through TTFB, the percentage of responses served from cache, and the volume of traffic that still reaches the origin layer.

Cache hit ratio

The share of requests completed without touching the origin

TTFB

How quickly the user receives the first byte of the response

Traffic volume

Total bytes served and regional traffic spikes

Key alerts:

  • Cache hit ratio < 90% → review TTLs and cache key structure
  • Origin errors > 1% → inspect the origin or the shielding layer
  • TTFB p99 > 100 ms → review routing and the path to origin
  • Traffic spike → possible attack or sudden content popularity
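
The first three alert rules can be written as simple threshold checks. In the sketch below, the `CdnSnapshot` type and its field names are hypothetical. The traffic-spike rule is left out because it needs a baseline that the list above does not define.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CdnSnapshot:
    cache_hit_ratio: float    # fraction, 0.0 - 1.0
    origin_error_rate: float  # fraction, 0.0 - 1.0
    ttfb_p99_ms: float


def evaluate_alerts(snapshot: CdnSnapshot) -> List[str]:
    """Return the follow-up action for every threshold the snapshot violates."""
    alerts: List[str] = []
    if snapshot.cache_hit_ratio < 0.90:
        alerts.append("cache hit ratio < 90%: review TTLs and cache key structure")
    if snapshot.origin_error_rate > 0.01:
        alerts.append("origin errors > 1%: inspect the origin or the shielding layer")
    if snapshot.ttfb_p99_ms > 100:
        alerts.append("TTFB p99 > 100 ms: review routing and the path to origin")
    return alerts
```

Pairing each threshold with its first diagnostic step keeps the alert actionable for whoever is on call.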

Interview questions

How do you keep cache invalidation consistent?

Use versioned URLs for immutable content, a purge API for urgent updates, and stale-while-revalidate where a short window of stale data is acceptable.
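
Versioned URLs are commonly produced by embedding a content hash in the file name, so any change to the content yields a new URL and the old cached copy never needs purging. The helper below is a sketch. The name `versioned_path` and the 8-character digest length are assumptions.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_path(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    original = PurePosixPath(path)
    return str(original.with_name(f"{original.stem}.{digest}{original.suffix}"))


# Different content produces a different URL for the same logical asset.
old_url = versioned_path("/static/app.js", b"console.log('v1');")
new_url = versioned_path("/static/app.js", b"console.log('v2');")
```

The HTML that references the asset still needs a short TTL or a purge, since it is the document that points clients at the new file name.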

How do you protect the origin from a flood of identical misses?

Combine request coalescing, an origin shield, a circuit breaker, and selective cache pre-warming for the hottest objects.

When do you choose preloading over on-demand caching?

Preloading fits a limited set of critical assets. On-demand caching works better for large libraries and long-tail content that is not worth pushing everywhere in advance.

How do you handle dynamic content?

Use ESI, fragment caching, short TTLs with stale-while-revalidate, or edge computing to assemble personalized responses closer to the user.

Key takeaways

  1. A CDN is critical for global scale because it cuts latency and removes pressure from the origin layer.
  2. Cache invalidation stays hard, so TTLs, versioning, and purge flows should be designed together rather than separately.
  3. An origin shield reduces fan-in to the origin and helps the system survive mass cache misses.
  4. The choice between preloading and on-demand caching depends on content shape, freshness requirements, and the cost of a miss.
  5. Cache hit ratio and TTFB tell you the most about how the CDN changes the user path.
