Video hosting feed — System Design Space

Video hosting does not break on the player screen. It breaks where the system must absorb huge upload volume, survive heavy processing, and then deliver the same asset to millions of viewers at far lower cost.

The chapter ties together video intake, the transcoding queue, storage for source and processed files, feed publication, and CDN delivery into one working system.

For interviews and engineering discussions, this case is useful because it forces a clean explanation of the heavy write path, the hot read path, feed fan-out strategy, and the cost of media traffic.

Heavy Write Path

Upload intake, validation, and video processing should not slow publication and should not interfere with the viewing path, which follows a very different load profile.

Transcoding Queue

Asynchronous processing is what spreads CPU pressure over time, absorbs upload spikes, and lets workers scale independently.

Hot Read Path

Feed lookup, metadata delivery, and segment streaming must stay fast even when a newly published video turns into a traffic spike.

Media Delivery Cost

The bill is driven not only by disks, but also by network egress, cache warmup, the number of quality variants, and repeated reads from origin.

This public interview from C++ Russia 2022 is a good example of how the hard part of video hosting starts well past the player. The system has to accept user uploads, run transcoding, update feeds quickly, and then deliver the same content to millions of viewers through a CDN.

Interview recording

The full public interview is available on YouTube. The value is not the final diagram but the reasoning behind it: where the candidate clarifies requirements, where they pick a trade-off, and where they cut scope.


Problem framing

Video-hosting feed

Design a service where creators upload videos and viewers watch them in a chronological feed with selectable playback quality.

At this scale, product features quickly run into non-functional constraints: availability, throughput, failure tolerance, and cost control. If these are not addressed up front, the system stalls either on publication speed or on media delivery cost.

Functional

Fast upload experience for content creators.

Publish the video into subscriber feeds after processing is complete.

Let viewers choose playback quality from 360p to 1080p.

Provide a chronological feed with simple ordering by publication time.

Non-functional

Availability: high

Upload and playback should stay available under partial failures.

Scalability: horizontal

Growth in content and audience cannot bottleneck on a single node; capacity is added by adding machines.

Fault tolerance: mandatory

Losing a node must not stop serving traffic or destroy video data.

Cost efficiency: controlled

Compute, storage, and media delivery are the main cost lines; let any of them grow unchecked and the service runs at a loss.

Load estimation (back of the envelope)

DAU (Daily Active Users): 10 million

Views per day: 100 million

Video uploads per day: 1 million (~12 RPS)

Feed requests per day: 50 million (~580 RPS)

Storage growth per day: ~300 TB

Key takeaway: the dominant pressure comes not from raw request count, but from storage and media delivery volume. The real cost drivers are quality variants, caching strategy, and network egress.
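The arithmetic behind these figures can be checked with a short sketch. It assumes load is spread evenly over the day (real traffic has peaks) and uses decimal units; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Average rate over a day; peak load will be higher."""
    return events_per_day / SECONDS_PER_DAY


uploads_per_day = 1_000_000
feed_requests_per_day = 50_000_000
views_per_day = 100_000_000
storage_growth_tb_per_day = 300

upload_rps = per_second(uploads_per_day)      # ~11.6
feed_rps = per_second(feed_requests_per_day)  # ~578.7
view_rps = per_second(views_per_day)          # ~1157.4

# Implied average footprint of one upload across everything stored for it,
# with 1 TB = 1,000,000 MB.
mb_per_upload = storage_growth_tb_per_day * 1_000_000 / uploads_per_day

print(f"uploads: {upload_rps:.1f} RPS")
print(f"feed requests: {feed_rps:.1f} RPS")
print(f"views: {view_rps:.1f} RPS")
print(f"storage per upload: {mb_per_upload:.0f} MB")
```

The request rates are modest for a horizontally scaled service. The roughly 300 MB stored per upload, multiplied by a million uploads a day, is what dominates.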

High-level architecture

The architecture separates the write path, which accepts and processes uploads, from the read path, which assembles feeds, serves metadata, and streams the final video.

Video Hosting: High-Level Map (uploads, processing, feed generation, and playback delivery)

Ingestion and Processing Plane

Creator: video upload
API Gateway: auth and routing
Upload API: source file intake
Processing Queue: transcoding jobs
Video Workers: HLS/DASH variants
Origin Storage: files and segments
Metadata Store: video state and manifest
Feed Store: personalized feeds

Read Serving Plane

Viewer: playback
API Gateway: auth and routing
Feed API: timeline assembly
CDN: edge caching
Origin Storage: fallback source on cache miss

Flow summary

Ingestion and processing: Creator -> gateway -> Upload API (source video intake and validation); Queue -> video workers (transcoding and quality variants)
Storage and serving: Origin + metadata + feeds (segments, manifest, and publication); Viewer -> Feed API -> CDN (main playback path)

The map shows how uploads, processing, metadata, and playback delivery are separated into distinct planes.

Upload and processing can scale independently from the viewing path. That lets the system add workers, cache capacity, and delivery throughput separately.

Write Path and Read Path

Write path at a glance:

Creator (uploads source video) -> Upload API (validation and temporary storage) -> Queue and Workers (transcoding) -> Origin and Metadata (files and playback manifest) -> Feed Update (fan-out and publish event)

Write path: key steps

The creator uploads the source file and receives a `video_id` plus processing status.
The system splits the job into separate transcoding tasks and produces 360p, 480p, 720p, and 1080p variants.
Finished files and segments land in origin storage, while metadata tracks processing state and publication readiness.
Once the video reaches `ready`, the system updates follower feeds and emits a publication event.
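The transition to `ready` can be sketched as a small piece of state tracking. This is a minimal illustration: the `VideoRecord` class, the `processing` state name, and the `mark_variant_done` helper are assumptions, not part of the design discussed in the interview.

```python
from dataclasses import dataclass, field

PROFILES = ("360p", "480p", "720p", "1080p")


@dataclass
class VideoRecord:
    video_id: str
    status: str = "processing"  # assumed lifecycle: processing -> ready
    done: set = field(default_factory=set)


def mark_variant_done(video: VideoRecord, profile: str) -> bool:
    """Record a finished variant.

    Returns True exactly once: on the call that completes the last variant,
    which is the moment the feed update and publish event should fire.
    """
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile}")
    video.done.add(profile)
    if video.status != "ready" and video.done == set(PROFILES):
        video.status = "ready"
        return True
    return False


video = VideoRecord("v1")
for profile in PROFILES:
    if mark_variant_done(video, profile):
        print(f"{video.video_id} is ready, publish to feeds")
```

Returning True only once keeps publication idempotent if a worker reports the same variant twice.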

Read path: key steps

A viewer opens the feed, and the service returns `video_id` entries based on subscriptions and simple ranking signals.
The metadata service returns the HLS/DASH playback manifest and the available quality variants.
CDN usually serves segments from an edge cache close to the user.
On a cache miss, CDN fetches the segment from origin storage and warms the local edge cache.
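The last two steps can be illustrated with a toy edge cache in front of an origin. The LRU eviction policy, the class names, and the capacity are assumptions for the sketch; real CDNs use more elaborate policies.

```python
from collections import OrderedDict


class Origin:
    """Stand-in for origin storage that counts how often it is read."""

    def __init__(self, segments: dict):
        self.segments = dict(segments)
        self.reads = 0

    def fetch(self, key: str) -> bytes:
        self.reads += 1
        return self.segments[key]


class EdgeCache:
    """Tiny LRU cache: serve hits locally, fill from origin on a miss."""

    def __init__(self, origin: Origin, capacity: int):
        self.origin = origin
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, key: str) -> bytes:
        if key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        data = self.origin.fetch(key)     # cache miss: go to origin
        self._cache[key] = data           # warm the edge
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return data


origin = Origin({"v1/720p/seg0.ts": b"...", "v1/720p/seg1.ts": b"..."})
edge = EdgeCache(origin, capacity=100)
for _ in range(1000):
    edge.get("v1/720p/seg0.ts")
print(origin.reads)  # 1
```

A thousand viewer requests for the same segment cost the origin a single read, which is the effect the read path depends on.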

Key components

The critical pieces are the task queue for heavy work, blob storage for intermediate and final files, the origin layer for segment delivery, and the playback manifest in metadata that tells the player how to start streaming.

Upload API

Accepts files from creators, validates limits, and places the source video into temporary storage.

POST /v1/videos → {video_id, upload_url}
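A handler behind this endpoint might look like the sketch below. The size limit, the `upload.example.com` host, and the function name are hypothetical; only the response shape `{video_id, upload_url}` comes from the endpoint above.

```python
import uuid

MAX_UPLOAD_BYTES = 10 * 1024**3  # assumed 10 GiB limit


def create_video(declared_size_bytes: int) -> dict:
    """Validate the declared size and hand back an ID plus a temporary upload URL."""
    if declared_size_bytes <= 0 or declared_size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file size is outside the allowed range")
    video_id = str(uuid.uuid4())
    return {
        "video_id": video_id,
        # Hypothetical URL pointing at temporary storage, not at the API itself.
        "upload_url": f"https://upload.example.com/tmp/{video_id}",
    }


response = create_video(250 * 1024**2)
print(sorted(response))  # ['upload_url', 'video_id']
```

Returning an upload URL rather than accepting the bytes inline keeps large transfers off the API servers.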

Transcoding task queue

RabbitMQ or a similar broker spreads heavy processing across workers. One upload usually expands into several jobs, one per quality profile.

360p

480p

720p

1080p
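The expansion of one upload into per-profile jobs can be sketched as follows. An in-process `queue.Queue` stands in for the broker, and the job payload fields are assumptions.

```python
import json
from queue import Queue

PROFILES = ("360p", "480p", "720p", "1080p")


def enqueue_transcoding_jobs(jobs: Queue, video_id: str, source_uri: str) -> int:
    """Publish one job per quality profile; return how many were queued."""
    for profile in PROFILES:
        payload = {"video_id": video_id, "profile": profile, "source": source_uri}
        jobs.put(json.dumps(payload))
    return len(PROFILES)


jobs = Queue()
count = enqueue_transcoding_jobs(jobs, "v1", "s3://tmp/v1/source.mp4")
print(count, jobs.qsize())  # 4 4
```

Because each profile is its own job, a slow 1080p encode does not block the 360p variant, and workers can pick jobs up independently.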

Video processing workers

These stateless workers read jobs from the queue, run FFmpeg, generate quality variants, and scale horizontally when upload pressure rises.
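As a rough illustration, a worker could turn a job into an FFmpeg invocation like the one below. The codec choices, the 6-second segment length, and the output layout are assumptions; the sketch only builds the argument list and does not run FFmpeg.

```python
HEIGHTS = {"360p": 360, "480p": 480, "720p": 720, "1080p": 1080}


def build_ffmpeg_command(source_path: str, profile: str, out_dir: str) -> list:
    """Build an FFmpeg argument list that produces an HLS rendition for one profile."""
    height = HEIGHTS[profile]
    return [
        "ffmpeg", "-y",
        "-i", source_path,
        "-vf", f"scale=-2:{height}",   # keep aspect ratio, force an even width
        "-c:v", "libx264",
        "-c:a", "aac",
        "-f", "hls",
        "-hls_time", "6",              # assumed target segment length in seconds
        "-hls_playlist_type", "vod",
        f"{out_dir}/{profile}/index.m3u8",
    ]


cmd = build_ffmpeg_command("/tmp/v1/source.mp4", "720p", "/tmp/v1/out")
print(" ".join(cmd))
```

A worker would typically pass this list to `subprocess.run(cmd, check=True)` after creating the output directory, then upload the resulting segments to origin storage.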

Blob storage

An S3-compatible store such as MinIO or Ceph keeps source videos, finished files, and stream segments. In practice, teams often split it into temporary and permanent layers.

Temporary storage / Permanent storage

Feed store

A NoSQL database such as MongoDB (document-oriented) or Cassandra (wide-column) stores precomputed feeds. Data is usually sharded by user_id, while each feed is a time-ordered array of video IDs.
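Sharding by user_id comes down to a stable mapping from a user to a partition. The databases named above handle this internally; the sketch below only illustrates the idea, and the hash choice and shard count are assumptions.

```python
import hashlib


def shard_for(user_id: str, shard_count: int) -> int:
    """Map a user to a shard with a stable hash, so the same user always lands on the same shard."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


print(shard_for("user_456", 16) == shard_for("user_456", 16))  # True
```

Plain modulo placement moves most keys when the shard count changes, which is why production stores tend to use consistent hashing or range-based partitioning instead.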

CDN

CDN keeps popular segments close to viewers and removes pressure from the origin layer. It is essential for low playback startup time and for protecting the storage layer during traffic spikes.

Data models

Video metadata

{
  "id": "uuid",
  "author_id": "user_123",
  "title": "My Video",
  "description": "...",
  "created_at": "2024-03-15T10:00:00Z",
  "status": "ready",
  "versions": {
    "360p": "s3://videos/uuid/360p.mp4",
    "480p": "s3://videos/uuid/480p.mp4",
    "720p": "s3://videos/uuid/720p.mp4",
    "1080p": "s3://videos/uuid/1080p.mp4"
  }
}
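If the variants are published as HLS, the metadata service can derive a master playlist from the available quality levels. The sketch below is illustrative: the resolutions, bitrates, and URL layout are assumptions, while the `#EXTM3U` and `#EXT-X-STREAM-INF` tags are standard HLS.

```python
# profile -> (width, height, assumed peak bandwidth in bits per second)
VARIANTS = {
    "360p": (640, 360, 800_000),
    "480p": (854, 480, 1_400_000),
    "720p": (1280, 720, 2_800_000),
    "1080p": (1920, 1080, 5_000_000),
}


def build_master_playlist(available: list, base_url: str) -> str:
    """Return an HLS master playlist listing one variant stream per available profile."""
    lines = ["#EXTM3U"]
    for profile in available:
        width, height, bandwidth = VARIANTS[profile]
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={width}x{height}")
        lines.append(f"{base_url}/{profile}/index.m3u8")
    return "\n".join(lines) + "\n"


print(build_master_playlist(["360p", "720p"], "https://cdn.example.com/videos/uuid"))
```

The player reads this list, picks a starting quality, and then requests segments for that variant through the CDN.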

User feed

{
  "user_id": "user_456",
  "feed": [
    "video_id_1",
    "video_id_2",
    "video_id_3",
    ...
  ],
  "last_updated": "2024-03-15T12:00:00Z"
}

A precomputed feed stores only video IDs and the latest refresh timestamp, which keeps the read path cheap and predictable.
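Updating such a document on publication can be sketched as below. Newest-first ordering, the length cap, and the `push_to_feed` helper are assumptions made for the example.

```python
from datetime import datetime, timezone

MAX_FEED_LENGTH = 500  # assumed cap so a feed document stays small


def push_to_feed(feed_doc: dict, video_id: str, now: datetime = None) -> dict:
    """Put video_id at the front of the feed, drop duplicates, and trim to the cap."""
    ids = [video_id] + [v for v in feed_doc["feed"] if v != video_id]
    feed_doc["feed"] = ids[:MAX_FEED_LENGTH]
    stamp = now or datetime.now(timezone.utc)
    feed_doc["last_updated"] = stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
    return feed_doc


doc = {
    "user_id": "user_456",
    "feed": ["video_id_1", "video_id_2"],
    "last_updated": "2024-03-15T12:00:00Z",
}
push_to_feed(doc, "video_id_3")
print(doc["feed"])  # ['video_id_3', 'video_id_1', 'video_id_2']
```

Capping the list bounds both document size and read cost; older entries can be rebuilt on demand if a user scrolls that far.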

Feed delivery strategies

The main product trade-off is where to perform feed fan-out: at publish time or at read time.

Fan-out on write (push)

When a video is published, the system immediately writes it into subscriber feeds.

Pro: fast feed reads

Con: expensive publish path for creators with huge audiences

Fan-out on read (pull)

The feed is assembled on demand from recent uploads across subscriptions.

Pro: simple publish flow

Con: slower, costlier feed reads, because every request merges uploads from all subscriptions
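The two strategies can be compared side by side in a toy in-memory model. The dictionaries and function names are illustrative, and the push variant assumes publications arrive in time order.

```python
from collections import defaultdict

followers = {"creator_a": ["u1", "u2"], "creator_b": ["u2"]}
subscriptions = {"u1": ["creator_a"], "u2": ["creator_a", "creator_b"]}

feeds = defaultdict(list)    # push: precomputed feed per user, newest first
uploads = defaultdict(list)  # pull: (published_at, video_id) per creator


def publish(creator: str, video_id: str, published_at: int) -> None:
    uploads[creator].append((published_at, video_id))  # all that pull needs
    for user in followers.get(creator, []):            # the cost push pays per follower
        feeds[user].insert(0, video_id)


def read_push(user: str, limit: int = 20) -> list:
    return feeds[user][:limit]                          # one cheap lookup


def read_pull(user: str, limit: int = 20) -> list:
    items = [item for c in subscriptions.get(user, []) for item in uploads[c]]
    items.sort(reverse=True)                            # merge by publication time
    return [video_id for _, video_id in items[:limit]]


publish("creator_a", "v1", published_at=1)
publish("creator_b", "v2", published_at=2)
publish("creator_a", "v3", published_at=3)
print(read_push("u2"))  # ['v3', 'v2', 'v1']
print(read_pull("u2"))  # ['v3', 'v2', 'v1']
```

Both reads return the same feed. The difference is where the work happens: `publish` loops over every follower in the push model, while `read_pull` loops over every subscription on each request.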

Infrastructure services

L4/L7 Load Balancer

Distributes traffic across public APIs and serving services.

Service Discovery

Helps services find healthy peers through systems like Consul or etcd.

Autoscaling

Adjusts the number of workers based on actual processing pressure.

Monitoring

Collects metrics, alerts, and logs through tools such as Prometheus and Grafana.

Rate Limiting

Protects the platform against abuse and volumetric attacks.

Circuit Breaker

Turns instability into graceful degradation instead of a cascading failure.
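As one example from this list, a circuit breaker can be sketched in a few lines. The thresholds, the fallback mechanism, and the class shape are assumptions; production libraries add per-endpoint state, metrics, and concurrency control.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a while and serve a fallback instead."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()      # open: skip the dependency entirely
            self.opened_at = None      # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0              # success closes the circuit
        return result


def failing_metadata_lookup():
    raise RuntimeError("metadata store is down")


breaker = CircuitBreaker(max_failures=2, reset_after=10.0)
for _ in range(3):
    print(breaker.call(failing_metadata_lookup, lambda: "cached metadata"))
```

After the second failure the breaker opens, so the third call returns the fallback without touching the failing service. For a feed, the fallback could be a slightly stale cached page rather than an error.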

Key takeaways from the interview

Separate writes from reads

Upload and processing grow under one set of constraints, while feed serving and media delivery grow under another. Keep those paths independent.

Push heavy work into queues

Transcoding should never sit on the user request path. Asynchronous processing keeps load manageable and makes worker scaling predictable.

CDN is mandatory for media delivery

With hundreds of terabytes of new content per day, a global cache is the only way to keep delivery cost and origin pressure under control.

Precompute what users read most often

A prepared feed is usually cheaper and faster than rebuilding it from multiple sources every time the app opens.

Additional materials

Related chapters

Content Delivery Network (CDN) - helps explain how global caching and distributed delivery keep playback startup fast under heavy traffic.
System Design Interview: An Insider's Guide (short summary) - provides a classic structure for media-system answers: requirements, sizing, write path, read path, and trade-offs.
Hacking the System Design Interview (short summary) - deepens the trade-off discussion around feed fan-out, transcoding queues, storage tiers, and resilience.
System design case studies examples - places the video-hosting problem in the broader case-study landscape and makes architecture comparisons easier.
Short-Term Preparation for System Design Interviews - shows how to package the solution for an interview, from requirements and sizing to architecture, risks, and system evolution.