System Design Space

Updated: March 2, 2026 at 9:25 AM

Video hosting feed


Public interview at C++ Russia 2022: transcoding, fan-out strategies, CDN and asynchronous processing.

At the C++ Russia 2022 conference we held a public System Design interview: design a video hosting feed, a system similar to YouTube. The case demonstrates asynchronous processing, video transcoding, and building personalized feeds.

Video recording of the interview

The full recording of the public interview is available on YouTube. I recommend watching it to see the iterative design process.

Watch on YouTube

Statement of the problem

Video Hosting Feed

Design an application that lets content creators upload videos and viewers watch them in a chronological feed with the ability to select playback quality.

Functional

Fast video upload experience for content creators.

Publish videos into subscriber feeds after processing is completed.

Playback quality selection from 360p to 1080p.

Chronological feed with simple ordering by publication time.

Non-functional

Availability: high availability

Viewing and upload flows should stay available under partial failures.

Scalability: horizontal growth

The system should scale with growing audience and content volume.

Fault tolerance: failure resilience

Single-node failures should not cause video loss or serving downtime.

Cost efficiency: infrastructure cost control

Architecture should use storage, CDN, and compute resources efficiently.

Load Estimation (Back of the Envelope)

DAU (Daily Active Users): 10 million
Views per day: 100 million
Video uploads per day: 1 million (~12 RPS)
Feed requests per day: 50 million (~580 RPS)
Storage growth per day: ~300 TB

Key Takeaway: the main load is video storage and delivery, not raw request volume. Effective use of a CDN and storage optimization are critical.
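The per-second figures above follow from simple arithmetic; the 300 MB average stored per upload is an assumed number chosen to match the stated ~300 TB/day:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86_400

uploads_per_day = 1_000_000
feed_requests_per_day = 50_000_000

upload_rps = uploads_per_day / SECONDS_PER_DAY        # ≈ 11.6 RPS (~12)
feed_rps = feed_requests_per_day / SECONDS_PER_DAY    # ≈ 579 RPS (~580)

# Assumption: after transcoding to four renditions, an average upload
# occupies ~300 MB in total across all quality profiles.
avg_stored_mb = 300
storage_tb_per_day = uploads_per_day * avg_stored_mb / 1_000_000  # MB -> TB
```

Request rates of tens to hundreds of RPS are trivial; the 300 TB/day of new storage is what dominates the design.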

High-level architecture

The system is divided into two main paths: the Write Path (video upload and processing) and the Read Path (viewing and the feed).

Video Hosting: High-Level Map

ingestion + transcoding + feed + video delivery

Ingestion + Processing Plane

Creator
upload video
API Gateway
auth + routing
Upload API
init upload
Transcode Queue
jobs
Video Workers
HLS/DASH renditions

Storage + Serving Plane

Origin Storage
video files + segments
Metadata DB
video state + manifest
Feed DB
user feeds
Viewer -> Feed API -> CDN
playback path
Overall video-hosting map: ingestion, processing, and read serving are separated into dedicated planes.

The upload/transcoding loop scales independently of read serving, so processing workers and edge CDN capacity can be expanded separately.

Read/Write Flow


1
Creator
upload video
2
Upload API
validate + store temp
3
Queue + Workers
transcode renditions
4
Origin + Metadata
persist files + manifest
5
Feed Update
fan-out + notify
Write path: from upload to feed publication.

Write Path: operational notes

  • The client uploads the original file to the upload tier, after which the API returns a `video_id` and the processing status.
  • The video is split into transcoding tasks; workers generate several quality profiles.
  • Finished renditions and manifests are published to origin storage, and metadata is recorded in the DB.
  • Once the status becomes `ready`, fan-out into feed storage starts, along with notifications to subscribers.
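The write-path steps can be sketched end to end; the in-memory dict and list below are illustrative stand-ins for the Metadata DB and the task broker:

```python
from dataclasses import dataclass, field
import uuid

RENDITIONS = ["360p", "480p", "720p", "1080p"]

@dataclass
class VideoMeta:
    video_id: str
    status: str = "uploaded"
    versions: dict = field(default_factory=dict)

metadata_db = {}       # stand-in for the Metadata DB
transcode_queue = []   # stand-in for the task broker

def init_upload() -> str:
    """Upload API: register the video and enqueue one child task per rendition."""
    video_id = str(uuid.uuid4())
    metadata_db[video_id] = VideoMeta(video_id, status="processing")
    for rendition in RENDITIONS:
        transcode_queue.append((video_id, rendition))
    return video_id

def worker_step() -> None:
    """Video worker: take one task, produce the rendition, publish its path."""
    video_id, rendition = transcode_queue.pop(0)
    meta = metadata_db[video_id]
    meta.versions[rendition] = f"s3://videos/{video_id}/{rendition}.mp4"
    if len(meta.versions) == len(RENDITIONS):
        meta.status = "ready"  # a real system would trigger feed fan-out here

video_id = init_upload()
while transcode_queue:
    worker_step()
```

The video becomes `ready` only after all four renditions are published, which is exactly when fan-out to subscriber feeds may start.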

Read Path: operational notes

  • The viewer requests the feed; the service returns a list of `video_id`s based on subscriptions and personal signals.
  • The Metadata API provides the playback manifest (HLS/DASH) and the available resolutions.
  • The CDN serves segments from edge nodes; hot content is almost always read from cache.
  • On a cache miss, the edge requests the segment from origin storage and immediately warms its local cache.
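The cache-miss behaviour fits in a few lines; the dicts below stand in for origin storage and a single CDN edge node:

```python
origin_storage = {"seg-1": b"segment-bytes"}  # stand-in for origin blob storage
edge_cache = {}                               # stand-in for one CDN edge node

def serve_segment(segment_id: str) -> bytes:
    """Serve from the edge cache; on a miss, fetch from origin and warm the cache."""
    if segment_id in edge_cache:
        return edge_cache[segment_id]      # hot path: served entirely from edge
    data = origin_storage[segment_id]      # cache miss: one trip to origin
    edge_cache[segment_id] = data          # warm the cache for the next viewers
    return data
```

Only the first viewer of a segment in a region pays the origin round-trip; everyone after reads from the edge.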

Key Components

Upload API

Accepts binary files from creators and saves them to temporary storage.

POST /v1/videos → {video_id, upload_url}

Message Queue (Task Broker)

RabbitMQ or equivalent for managing transcoding tasks. A parent task generates 4 child tasks, one for each resolution.

360p
480p
720p
1080p

Video Workers

Stateless workers for transcoding. They read tasks from the queue, process the video (FFmpeg), and save the result to Blob Storage. Easily scaled horizontally.
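A hypothetical helper showing the shape of a per-rendition FFmpeg invocation; real pipelines would add bitrate ladders and segmenting for HLS/DASH:

```python
# Target heights per rendition (widths follow from the aspect ratio).
RENDITION_HEIGHTS = {"360p": 360, "480p": 480, "720p": 720, "1080p": 1080}

def transcode_cmd(src: str, dst: str, rendition: str) -> list:
    """Build the ffmpeg argv for one quality profile (illustrative settings)."""
    height = RENDITION_HEIGHTS[rendition]
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=-2:{height}",   # keep aspect ratio, force even width
        "-c:v", "libx264",             # H.264 video
        "-c:a", "aac",                 # AAC audio
        dst,
    ]
```

A worker would run this command via `subprocess` for its assigned rendition, then upload the output to Blob Storage.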

Blob Storage

S3-compatible storage (MinIO, Ceph) for video files. Divided into temporary (for uploads in progress) and permanent (processed videos).

Temp Storage → Permanent Storage

Feed Database

Document-oriented NoSQL database (MongoDB, Cassandra) for storing pre-computed feeds. Sharded by `user_id`. A feed is an array of video IDs sorted by time.

CDN

Critical for video delivery. Caches popular content on edge servers closer to users, reducing both Blob Storage load and latency.

Data Models

Video Meta

{
  "id": "uuid",
  "author_id": "user_123",
  "title": "My Video",
  "description": "...",
  "created_at": "2024-03-15T10:00:00Z",
  "status": "ready",
  "versions": {
    "360p": "s3://videos/uuid/360p.mp4",
    "480p": "s3://videos/uuid/480p.mp4",
    "720p": "s3://videos/uuid/720p.mp4",
    "1080p": "s3://videos/uuid/1080p.mp4"
  }
}

User Feed

{
  "user_id": "user_456",
  "feed": [
    "video_id_1",
    "video_id_2",
    "video_id_3",
    ...
  ],
  "last_updated": "2024-03-15T12:00:00Z"
}

A pre-calculated feed is an array of video IDs from subscriptions, sorted by time.

Strategies for building a feed (Fan-out)

Fan-out on Write (Push)

When we publish a video, we add it to all subscribers’ feeds.

Fast feed reads
Problem with popular authors (millions of subscribers)

Fan-out on Read (Pull)

When requesting a feed, we collect the latest videos from all subscriptions.

Simple writes
Slow reads with many subscriptions

Hybrid approach

For ordinary authors we use push (fan-out on write); for authors with millions of subscribers, pull at read time. This is the classic trade-off made by Twitter/X.
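A minimal sketch of the hybrid strategy, assuming a 1M-follower cutoff and in-memory dicts in place of real feed storage (both are illustrative):

```python
CELEBRITY_THRESHOLD = 1_000_000  # assumed cutoff; tuned per system in practice

feeds = {}            # user_id -> precomputed feed (push side)
celebrity_posts = {}  # author_id -> recent video ids (pull side)

def publish(author_id, video_id, follower_ids, follower_count):
    """Fan-out on write for ordinary authors; store once for celebrities."""
    if follower_count >= CELEBRITY_THRESHOLD:
        celebrity_posts.setdefault(author_id, []).insert(0, video_id)
    else:
        for uid in follower_ids:  # push into every follower's feed
            feeds.setdefault(uid, []).insert(0, video_id)

def read_feed(user_id, celebrity_subs):
    """Merge the precomputed feed with celebrity posts pulled at read time."""
    merged = list(feeds.get(user_id, []))
    for author in celebrity_subs:  # fan-out on read, bounded by a short list
        merged = list(celebrity_posts.get(author, [])) + merged
    return merged
```

The read-time merge stays cheap because a user follows few celebrities, while the write-time fan-out stays cheap because ordinary authors have few followers.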

Infrastructure services

Load Balancer L4/L7

Traffic distribution between API servers

Service Discovery

Service registration and discovery (Consul, etcd)

Auto-scaling

Automatic scaling of workers based on load

Monitoring

Metrics, alerts, logging (Prometheus, Grafana)

Rate Limiting

Protection against abuse and DDoS
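Rate limiting is often implemented as a token bucket; a minimal sketch (the rate and capacity values are illustrative):

```python
import time

class TokenBucket:
    """Allow `rate` requests/second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Refill tokens for elapsed time, then spend one if available."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In production this state usually lives per client key in a shared store so all API servers enforce the same limit.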

Circuit Breaker

Graceful degradation during failures
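A minimal circuit-breaker sketch, assuming a simple consecutive-failure threshold; real implementations also add a half-open state with a recovery timeout:

```python
class CircuitBreaker:
    """Open the circuit after `threshold` consecutive failures."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.threshold

    def call(self, fn, fallback):
        """Run fn(); on failure or an open circuit, degrade to fallback()."""
        if self.is_open:
            return fallback()       # skip the broken dependency entirely
        try:
            result = fn()
            self.failures = 0       # success resets the failure counter
            return result
        except Exception:
            self.failures += 1
            return fallback()
```

For a feed service, `fallback` might return a cached or slightly stale feed instead of failing the whole page.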

Key takeaways from the interviews

1

Separate Write and Read paths

Asynchronous video processing (Write) and synchronous feed reading (Read) have different requirements and scale differently.

2

Use queues for heavy operations

Transcoding is a resource-intensive operation. A message queue lets you manage the load and scale workers easily.

3

CDN is a must have for media content

With 300 TB of new content per day, the system will not survive without a CDN. Caching on edge reduces load and latency.

4

Precompute where possible.

Pre-calculated feeds are significantly faster than on-the-fly assembly from multiple sources on every request.



© 2026 Alexander Polomodov