System Design Space

Updated: April 8, 2026 at 1:35 PM

URL Shortener (TinyURL)

medium

Classic URL shortening case: choosing short-code length, generating unique IDs, caching redirects, and scaling the mapping store.

A URL shortener has a simple outer interface, but underneath it is a case about a very hot read path, short-ID generation, and scaling the mapping store.

The chapter is useful because it forces you to separate the write path from the read path: what to cache, where hot keys appear, and how to keep the ID generator from becoming a new failure point.

For interviews and architecture discussions, it is valuable because it quickly reveals whether you can spot read-write asymmetry, identify the real bottleneck, and avoid premature complexity.

Control Plane

Focus on policy, limits, routing, and stable edge behavior under variable load.

Data Path

Keep latency and throughput predictable while traffic and burst pressure increase.

Failure Modes

Cover fail-open/fail-close behavior, graceful degradation, and safe fallback paths.

Ops Ready

Show monitoring for saturation, retry storms, and practical operational guardrails.

A URL shortener (TinyURL, bit.ly) is a classic system design case. It looks simple because the surface area is tiny: create a short link and resolve it later. Underneath, though, it quickly turns into a discussion about compact IDs, extremely hot read paths, caching, and mapping storage that must scale with traffic.

That is why the case works so well in interviews: even on a small system, you can still discuss latency, throughput, availability, and the cost of mistakes on the most popular user path.

Chapter 8

Alex Xu: URL Shortener

Detailed analysis in the book System Design Interview


Why a URL shortening service matters

Convenience

Short links are easier to remember, share on social platforms, and use in SMS.

Analytics

You can track clicks, user geography, and traffic sources.

Control

You can disable a link, set an expiration time, or protect it with a password.

Requirements

Functional

  • FR1
    Create a short link from a long URL
  • FR2
    Redirect from the short link to the original URL
  • FR3
    Link lifetime (TTL, optional)
  • FR4
    Custom alias for the link (optional)

Non-functional

  • NFR1
    100M new URLs per day
  • NFR2
    10:1 read-to-write ratio → 1B redirects per day
  • NFR3
    Redirect latency < 100 ms
  • NFR4
    99.9% availability

Quick scale estimate

Traffic

  • Write: 100M/day ÷ 86,400 s ≈ 1,160 QPS
  • Read: 1B/day ÷ 86,400 s ≈ 11,600 QPS
  • Peak: ~23,000 read QPS (2x average)

Storage

  • Avg URL size: 500 bytes
  • 100M × 500B = 50GB/day
  • 5 years: 50GB × 365 × 5 ≈ 91TB
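
The estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function name `estimate` and its parameters are illustrative, and it assumes decimal units (1 TB = 10¹² bytes) and a peak factor of 2, as in the list above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(new_urls_per_day, read_write_ratio, avg_url_bytes, years, peak_factor=2):
    """Return back-of-envelope traffic and storage numbers."""
    write_qps = new_urls_per_day / SECONDS_PER_DAY
    read_qps = write_qps * read_write_ratio
    storage_bytes = new_urls_per_day * avg_url_bytes * 365 * years
    return {
        "write_qps": write_qps,
        "read_qps": read_qps,
        "peak_read_qps": read_qps * peak_factor,
        "storage_tb": storage_bytes / 1e12,  # decimal terabytes
    }


numbers = estimate(
    new_urls_per_day=100_000_000,
    read_write_ratio=10,
    avg_url_bytes=500,
    years=5,
)
for name, value in numbers.items():
    print(f"{name}: {value:,.1f}")
```

The exact averages come out slightly below the rounded figures in the list (about 1,157 writes and 11,574 reads per second), which is the usual level of precision for this kind of estimate.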

Length of the short URL

How many characters are needed for a unique identifier? We use base62 (a-z, A-Z, 0-9):

Length | Combinations | Enough for 5 years (~182.5B URLs)?
6 characters | 62⁶ = 56.8B | Not enough
7 characters | 62⁷ = 3.5T | ✓ Enough
8 characters | 62⁸ = 218T | With reserve

Conclusion: 7 base62 characters give 3.5 trillion combinations. At 100M URLs per day, that lasts for about 96 years.
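
The length choice can be checked directly. The sketch below uses an illustrative helper, `min_code_length`, to find the shortest code whose keyspace covers a given number of URLs, and then computes how long 7 characters last at the stated write rate.

```python
def min_code_length(total_urls, alphabet_size=62):
    """Smallest length whose keyspace holds at least total_urls codes."""
    length = 1
    while alphabet_size ** length < total_urls:
        length += 1
    return length


URLS_PER_DAY = 100_000_000
urls_in_5_years = URLS_PER_DAY * 365 * 5  # 182.5 billion

shortest = min_code_length(urls_in_5_years)
years_covered = 62 ** shortest / (URLS_PER_DAY * 365)

print(shortest)              # 7
print(round(years_covered))  # 96
```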

ID generation strategies

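1. Hash + Collision Resolution

Take MD5/SHA-256 of the URL, encode the digest in base62, keep the first 7 characters, and check the database for a collision. (The first 7 characters of a raw hex digest would give only 16⁷ ≈ 268M combinations.)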

✓ Pros:
  • Deterministic (same URL = same hash)
  • No central point of failure
✗ Cons:
  • Every write needs a DB lookup to detect a collision, plus a retry when one occurs
  • Harder to support custom aliases
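
A minimal sketch of this approach follows. It assumes SHA-256, a base62 alphabet ordered `0-9a-zA-Z`, a plain dict standing in for the database, and a simple salt-and-retry rule on collision. None of these details come from the chapter; they are illustrative choices.

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62(n):
    """Encode a non-negative integer in base62."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def short_code_from_hash(url, store, length=7, max_attempts=5):
    """Derive a short code from the URL hash.

    `store` maps short code -> URL and stands in for the database.
    """
    candidate = url
    for attempt in range(max_attempts):
        digest = hashlib.sha256(candidate.encode("utf-8")).digest()
        encoded = base62(int.from_bytes(digest, "big"))
        code = encoded[:length].rjust(length, ALPHABET[0])
        existing = store.get(code)  # the DB lookup every write has to pay for
        if existing is None:
            store[code] = url
            return code
        if existing == url:
            return code  # same URL was shortened before: reuse its code
        candidate = f"{url}#{attempt + 1}"  # collision: salt the input and retry
    raise RuntimeError("no free short code found")


store = {}
print(short_code_from_hash("https://example.com/some/long/path", store))
```

Because the salt is derived from the attempt number, shortening the same URL again follows the same retry sequence and lands on the same code, which keeps the scheme deterministic.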

2. Unique ID Generator + Base62 (recommended)

Generate a unique numeric ID, then convert it to base62.

✓ Pros:
  • Guaranteed unique (no collisions)
  • Simple logic
  • Easy to support custom aliases
✗ Cons:
  • Requires a dedicated ID generator service
  • Identical URLs can give different short URLs
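
The conversion step can be sketched as a pair of functions. The alphabet order (`0-9`, `a-z`, `A-Z`) is an assumption; any fixed ordering of the 62 characters works as long as encoding and decoding agree.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(number):
    """Convert a non-negative integer ID to a base62 string."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode(code):
    """Convert a base62 string back to the integer ID."""
    number = 0
    for ch in code:
        number = number * BASE + INDEX[ch]
    return number


print(encode(125))          # 21
print(encode(62 ** 7 - 1))  # ZZZZZZZ
print(decode("21"))         # 125
```

The largest ID that still fits in 7 characters is 62⁷ − 1, so the ID generator's range decides the actual code length.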

Options for generating IDs

Auto-increment DB

A simple solution with an auto-increment primary key.

⚠️ Becomes a single point of failure and scales poorly

Multi-master DB

Two servers: one generates even IDs and the other generates odd IDs.

✓ Easy to scale at first, but limited by the number of writer nodes
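
The even/odd split generalizes to an offset plus a step equal to the number of writers. The sketch below only models the arithmetic with an illustrative `id_stream` helper; in a real multi-master setup the database's own auto-increment settings would do this.

```python
import itertools


def id_stream(server_offset, server_count):
    """Yield IDs owned by one writer: offset, offset + count, offset + 2*count, ..."""
    return itertools.count(start=server_offset, step=server_count)


odd_ids = id_stream(server_offset=1, server_count=2)
even_ids = id_stream(server_offset=2, server_count=2)

print(list(itertools.islice(odd_ids, 3)))   # [1, 3, 5]
print(list(itertools.islice(even_ids, 3)))  # [2, 4, 6]
```

The step is fixed when the streams are created, which is why adding a third writer later is awkward: the existing streams would have to be renumbered.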

UUID

128-bit unique identifier that any node, including the client, can generate without coordination.

⚠️ Too long at 36 characters, which defeats the point of a short link

Snowflake ID (recommended)

64-bit ID: timestamp + datacenter + machine + sequence number.

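✓ Distributed, time-sortable, and compact compared with a UUID

⚠️ A full 64-bit ID can take up to 11 base62 characters, so a 7-character code needs a smaller ID space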
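
A minimal Snowflake-style generator might look like the sketch below. It assumes the commonly cited layout of a 41-bit millisecond timestamp, 5 datacenter bits, 5 machine bits, and a 12-bit sequence. The epoch value, the class name, and the injectable `clock` parameter are assumptions made for illustration, not part of the chapter.

```python
import threading
import time


class SnowflakeGenerator:
    EPOCH_MS = 1_700_000_000_000  # assumed custom epoch, in milliseconds
    DATACENTER_BITS = 5
    MACHINE_BITS = 5
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, datacenter_id, machine_id, clock=None):
        if not 0 <= datacenter_id < (1 << self.DATACENTER_BITS):
            raise ValueError("datacenter_id out of range")
        if not 0 <= machine_id < (1 << self.MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self._datacenter_id = datacenter_id
        self._machine_id = machine_id
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            machine_shift = self.SEQUENCE_BITS
            datacenter_shift = machine_shift + self.MACHINE_BITS
            timestamp_shift = datacenter_shift + self.DATACENTER_BITS
            return (
                ((now - self.EPOCH_MS) << timestamp_shift)
                | (self._datacenter_id << datacenter_shift)
                | (self._machine_id << machine_shift)
                | self._sequence
            )


generator = SnowflakeGenerator(datacenter_id=1, machine_id=2)
print(generator.next_id())
```

Each machine can issue up to 4,096 IDs per millisecond without talking to any other node, which is what removes the central ID service from the write path.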

Snowflake

Twitter/X: Snowflake ID

Detailed analysis of the ID generation algorithm


High-level architecture

Architecture map

Client (browser / app) → Load Balancer (edge routing) → URL Service (stateless API)

Read path: URL Service → Cache (Redis) → Database (PostgreSQL / Cassandra), which is queried only on a cache miss

Write path: URL Service → ID Generator (Snowflake) → Database (PostgreSQL / Cassandra)

Write Path

  1. The client sends a long URL
  2. The ID generator produces a unique identifier
  3. Convert the ID to base62 and produce the short URL
  4. Store the mapping in the database
  5. Return the short URL to the client
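
The write path can be sketched end to end in a few lines. Everything here is a stand-in: a counter replaces the ID generator, a dict replaces the database, and the class name `UrlShortener` and the domain `https://short.example` are hypothetical.

```python
import itertools

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base62(number):
    """Convert a non-negative integer ID to base62."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


class UrlShortener:
    def __init__(self, base_url="https://short.example"):
        self._base_url = base_url
        self._ids = itertools.count(1)  # stand-in for the ID generator
        self.db = {}                    # stand-in for the mapping store

    def shorten(self, long_url):
        # 1. The client sends a long URL.
        if not long_url.startswith(("http://", "https://")):
            raise ValueError("only http(s) URLs are accepted")
        # 2. The ID generator produces a unique identifier.
        new_id = next(self._ids)
        # 3. Convert the ID to base62.
        code = to_base62(new_id)
        # 4. Store the mapping.
        self.db[code] = long_url
        # 5. Return the short URL.
        return f"{self._base_url}/{code}"


service = UrlShortener()
print(service.shorten("https://example.com/a/very/long/path?with=query"))
```

Note that shortening the same URL twice yields two different codes, which is the trade-off listed for the ID-generator approach.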

Read Path

  1. The client requests the short URL
  2. Check the cache (Redis) first
  3. On a cache miss, query the database
  4. Populate the cache with the mapping
  5. Return HTTP 301/302 with the original URL in the Location header
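
The read path can be sketched as a single function. Plain dicts stand in for Redis and the database here, and the function name `resolve`, the 302 status, and the 404 for unknown codes are assumptions for the example.

```python
def resolve(code, cache, db):
    """Return (status, headers) for a short-code lookup."""
    target = cache.get(code)          # 2. check the cache first
    if target is None:
        target = db.get(code)         # 3. cache miss: query the database
        if target is None:
            return 404, {}            # unknown short code
        cache[code] = target          # 4. populate the cache for later requests
    return 302, {"Location": target}  # 5. redirect to the original URL


db = {"abc1234": "https://example.com/long"}
cache = {}

print(resolve("abc1234", cache, db))  # first call goes to the database
print(resolve("abc1234", cache, db))  # second call is served from the cache
```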

301 vs 302 redirects

301 Moved Permanently

The browser caches the redirect, so later requests may go straight to the target URL.

✓ Less load on the server

✗ Much harder to keep full click analytics on the service side

302 Found (recommended)

Browsers do not cache the redirect by default, so every click still passes through the service.

✓ Full click analytics

✓ You can change the target URL
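
The choice can be reduced to one decision in the handler. In this sketch the helper `redirect_response` and its `need_analytics` flag are hypothetical, and the `Cache-Control: no-store` header on the 302 is an extra precaution assumed here rather than something the chapter prescribes.

```python
from http import HTTPStatus


def redirect_response(target_url, need_analytics=True):
    """Pick the redirect status based on whether every click must reach the service."""
    status = HTTPStatus.FOUND if need_analytics else HTTPStatus.MOVED_PERMANENTLY
    headers = {"Location": target_url}
    if status is HTTPStatus.FOUND:
        # Assumed precaution: tell caches not to store the temporary redirect.
        headers["Cache-Control"] = "no-store"
    return int(status), status.phrase, headers


print(redirect_response("https://example.com/long"))
print(redirect_response("https://example.com/long", need_analytics=False))
```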

Data model

urls table

Column | Type | Description
short_url | VARCHAR(7) | Primary key, base62 encoded
original_url | TEXT | Original long URL
user_id | BIGINT | Link creator (optional)
created_at | TIMESTAMP | Creation date
expires_at | TIMESTAMP | TTL (null = unlimited)
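
The table can be tried out locally with an in-memory SQLite database. This is only a sketch: SQLite is used because it ships with Python, and the `NOT NULL` and `DEFAULT` constraints are assumptions that go beyond the column list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE urls (
        short_url    VARCHAR(7) PRIMARY KEY,   -- base62 code
        original_url TEXT NOT NULL,            -- original long URL
        user_id      BIGINT,                   -- optional link creator
        created_at   TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
        expires_at   TIMESTAMP                 -- NULL means no expiry
    )
    """
)

conn.execute(
    "INSERT INTO urls (short_url, original_url) VALUES (?, ?)",
    ("abc1234", "https://example.com/long"),
)

row = conn.execute(
    "SELECT original_url, expires_at FROM urls WHERE short_url = ?",
    ("abc1234",),
).fetchone()
print(row)  # ('https://example.com/long', None)
```

Every redirect is a point lookup on the primary key, so the read path never needs a secondary index.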

Deep Dive

Database Internals

Indexes, B-trees, and optimization for read-heavy workloads


Caching strategy

A 10:1 read-to-write ratio makes caching critical. Redis keeps the hottest short URLs close to the application.

Strategy

  • Cache-aside: read from the cache first, then fall back to the database on a miss
  • LRU eviction: evict rarely used URLs
  • Write-through: write to the cache immediately when a short link is created
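
The three policies can be combined in a small in-process model. The classes `LRUCache` and `CachedStore` are illustrative stand-ins; in production Redis would provide the eviction policy itself, and the dict would be the real database.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


class CachedStore:
    def __init__(self, cache_capacity):
        self.db = {}  # stand-in for the mapping store
        self.cache = LRUCache(cache_capacity)

    def create(self, code, url):
        """Write-through: store the mapping and warm the cache in one step."""
        self.db[code] = url
        self.cache.put(code, url)

    def lookup(self, code):
        """Cache-aside: try the cache, fall back to the database on a miss."""
        url = self.cache.get(code)
        if url is None:
            url = self.db.get(code)
            if url is not None:
                self.cache.put(code, url)
        return url


store = CachedStore(cache_capacity=2)
store.create("a", "https://example.com/a")
store.create("b", "https://example.com/b")
store.lookup("a")                           # "a" becomes the most recently used
store.create("c", "https://example.com/c")  # evicts "b"
print(store.cache.get("b"))                 # None
print(store.lookup("b"))                    # https://example.com/b
```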

Cache Size

Cache the hottest 20% of daily reads:

20% × 1B reads × 500B avg URL size = 200M × 500B = 100GB

→ Redis cluster with replication

CDN

Content Delivery Network

Geo-distributed caching for global systems


Choosing a database

PostgreSQL

  • ✓ ACID guarantees
  • ✓ Easy to use
  • ✓ Good for moderate traffic
  • ✗ Horizontal scaling is more difficult

Cassandra / DynamoDB (for scale)

  • ✓ Linear horizontal scaling
  • ✓ High availability without a single point of failure
  • ✓ Optimized for write-heavy workloads
  • ✗ Eventually consistent by default (stronger consistency is available per request at a latency cost)

What to emphasize in an interview

In an interview, the point is not just to draw a diagram. You want to make the trade-offs explicit: why this ID strategy fits the case, how caching changes the read path, and where the service gives up flexibility in exchange for speed, simplicity, or lower cost.

What to show clearly

• How base62 works and why 7 characters are enough

• What trade-offs exist between hashing and an ID generator

• Why 301 and 302 affect analytics differently

• Why caching is the main lever in a read-heavy system

Frequent follow-up questions

• How do you handle duplicate URLs?

• How do you support custom aliases?

• How do you delete expired URLs?

• How do you protect the service from abuse?
