Глава 32. Twitter/X — System Design Space

Введение

Проектирование Twitter — это классическая задача на собеседованиях по System Design. Ключевые challenge: доставка твитов миллионам фолловеров, генерация персонализированной ленты в реальном времени и определение трендовых тем. Этот кейс демонстрирует trade-offs между fanout-on-write и fanout-on-read подходами.

Связанный кейс

Chat System

Другой кейс real-time доставки сообщений

Читать обзор

Функциональные требования

Post Tweet — публикация твита (до 280 символов, медиа)
Home Timeline — лента твитов от людей, на которых подписан
User Timeline — все твиты конкретного пользователя
Follow/Unfollow — подписка на пользователей
Like/Retweet — взаимодействие с твитами
Search — поиск по твитам и пользователям
Trending Topics — популярные темы в реальном времени
Notifications — уведомления о mentions, likes, retweets

Нефункциональные требования

Scale: 500M пользователей, 200M DAU
Tweets/day: 500M новых твитов
Read-heavy: соотношение read:write = 1000:1
Timeline latency: <200ms для home timeline
Eventual consistency: твит может появиться с задержкой 5-10 сек
Availability: 99.99% uptime
Celebrity problem: пользователи с 100M+ фолловеров

Traffic estimation: 200M DAU × 100 timeline reads/day = 20B reads/day ≈ 230K QPS (read). 500M tweets/day ≈ 6K QPS (write).

Core Problem: Feed Generation

Главная архитектурная проблема Twitter — как эффективно генерировать home timeline. Есть два фундаментальных подхода: fanout-on-write (push) иfanout-on-read (pull).

Fanout-on-Write (Push)

При публикации твита он сразу записывается в timeline cache каждого фолловера.

✅ Быстрое чтение timeline (уже готов)
✅ Простая логика при чтении
❌ Медленная публикация для celebrities
❌ Много storage (дублирование)
❌ Wasted work для неактивных users

Fanout-on-Read (Pull)

Timeline собирается на лету при запросе: получаем список followees, их последние твиты, merge.

✅ Быстрая публикация (O(1))
✅ Меньше storage
✅ Нет wasted work
❌ Медленное чтение (много запросов)
❌ Сложный merge в реальном времени

Источник

Alex Xu & DDIA

Подробнее о fanout patterns в книгах Alex Xu и DDIA (Chapter 12)

Читать обзор

Hybrid Approach (Twitter's Solution)

Twitter использует гибридный подход: fanout-on-write для обычных пользователей и fanout-on-read для celebrities (users с >10K фолловеров).

Interactive Architecture Map

Переключайте путь, чтобы подсветить fanout-on-write или fanout-on-read

Tweet Ingestion

API + Media Upload

Tweet Service

Store tweet + detect celebrity

Regular Users (fanout-on-write)

Fanout Service

Write to followers timeline cache

Timeline Cache

Redis sorted sets per user

Celebrities (fanout-on-read)

Celebrity Store

Separate tweet store

Merge on Read

Combine with cached feed

Timeline Service

Rank + filter + deliver

Celebrity Detection

Порог ~10,000 фолловеров. При публикации: если followers > threshold, твит НЕ fanout'ится, а хранится в отдельном "celebrity tweets" store. При чтении timeline: merge pre-computed feed + свежие celebrity tweets.

Timeline Cache Architecture

Redis Timeline Structure:

# Каждый user имеет свой timeline в Redis
# Key: timeline:{user_id}
# Value: Sorted Set (score = timestamp, member = tweet_id)

ZADD timeline:12345 1705234567 "tweet_abc123"
ZADD timeline:12345 1705234890 "tweet_def456"

# Получение последних 100 твитов
ZREVRANGE timeline:12345 0 99

# Timeline storage estimation:
# 200M users × 800 tweets × 8 bytes = 1.28 TB
# С replication ×3 = ~4 TB Redis cluster

Timeline Limits

Max 800 tweets в timeline cache
Старые твиты удаляются (ZREMRANGEBYRANK)
Для старых твитов — запрос в DB

Fanout Workers

Async processing через Message Queue
Batch updates для эффективности
Retry logic для failed fanouts

Trend Scoring Formula

# Simplified Twitter Trending Score

def calculate_trend_score(topic, current_window):
    # Current velocity: tweets per minute in last 5 min
    current_count = count_in_window(topic, minutes=5)
    
    # Baseline: average tweets per 5-min over past 7 days
    baseline_count = get_baseline(topic, days=7)
    
    # Velocity ratio: how much faster than normal
    if baseline_count > 0:
        velocity_ratio = current_count / baseline_count
    else:
        velocity_ratio = current_count * 10  # New topic bonus
    
    # Recency weight: exponential decay
    recency_weight = sum(
        tweet.weight * exp(-lambda * (now - tweet.timestamp))
        for tweet in current_window.tweets
    )
    
    # Engagement boost
    engagement_score = (likes + retweets * 2 + replies * 3) / total_tweets
    
    # Final score
    score = velocity_ratio * recency_weight * (1 + log(engagement_score))
    
    # Spam/bot filtering
    if unique_users_ratio < 0.3:  # Too few unique users
        score *= 0.1  # Heavy penalty
    
    return score

Timeline Ranking

Современный Twitter использует ML-based ranking вместо чистого chronological order. Модель предсказывает вероятность engagement для каждого твита.

Ranking Features:

Tweet Features

Age of tweet
Has media (image/video)
Has links
Tweet length
Current engagement rate
Author's average engagement

User Features

Historical interactions with author
Topic interests
Activity patterns (time of day)
Social graph proximity
Device type
Session context

Two-Pass Ranking

Pass 1 (Candidate Generation): Быстрый retrieval ~1000 candidate tweets из timeline cache + celebrity tweets + "In case you missed it".

Pass 2 (Ranking): ML model scores каждого candidate. Top 50 показываются пользователю. Inference <50ms.

Search Architecture

Search Pipeline

Переключайте между indexing и query flow

Indexing Flow

Ingestion

Tweet → tokenize

Realtime Index

Lucene in-memory shards

Cold Index

On-disk segments

Query Flow

Parse Query

Terms + filters

Fanout to Shards

Parallel search

Merge + Rank

Recency + engagement

Index latency ~10s

Search latency p99 < 200ms

Data Model

# Core Tables

tweets {
    tweet_id: UUID (Snowflake ID)
    user_id: UUID
    content: VARCHAR(280)
    media_urls: JSON
    created_at: TIMESTAMP
    reply_to_tweet_id: UUID (nullable)
    retweet_of_id: UUID (nullable)
    quote_tweet_id: UUID (nullable)
    like_count: INT
    retweet_count: INT
    reply_count: INT
}

users {
    user_id: UUID
    username: VARCHAR(15)
    display_name: VARCHAR(50)
    bio: VARCHAR(160)
    follower_count: INT
    following_count: INT
    is_verified: BOOLEAN
    is_celebrity: BOOLEAN  # follower_count > 10K
    created_at: TIMESTAMP
}

follows {
    follower_id: UUID
    followee_id: UUID
    created_at: TIMESTAMP
    PRIMARY KEY (follower_id, followee_id)
}

# Denormalized for read performance
user_timelines (Redis Sorted Set)
    Key: timeline:{user_id}
    Score: tweet_timestamp
    Member: tweet_id

# Celebrity tweets (separate store)
celebrity_tweets (Redis Sorted Set)
    Key: celebrity:{user_id}
    Score: tweet_timestamp
    Member: tweet_id

Snowflake ID Generation

Twitter разработал Snowflake для генерации уникальных, сортируемых ID без централизованной координации.

# Snowflake ID Structure (64-bit)
┌────────────────────────────────────────────────────────────────┐
│ 1 bit │   41 bits timestamp   │ 10 bits │    12 bits          │
│unused │   (milliseconds)      │machine  │   sequence          │
│       │   since epoch         │   ID    │   number            │
└────────────────────────────────────────────────────────────────┘

# Properties:
# - Roughly sortable by time
# - No coordination needed
# - 4096 IDs per millisecond per machine
# - 69 years before overflow

# Example ID: 1605978261000000000
# Timestamp: 2020-11-21 12:31:01 UTC
# Machine: 42
# Sequence: 0

def generate_snowflake_id(machine_id, last_timestamp, sequence):
    timestamp = current_time_ms() - TWITTER_EPOCH
    
    if timestamp == last_timestamp:
        sequence = (sequence + 1) & 0xFFF  # 12 bits
        if sequence == 0:
            timestamp = wait_next_millis(last_timestamp)
    else:
        sequence = 0
    
    id = (timestamp << 22) | (machine_id << 12) | sequence
    return id, timestamp, sequence

High-Level Architecture

Выберите поток, чтобы подсветить ключевые компоненты

Edge & Routing

Clients

Web + Mobile

CDN

Static + media

Load Balancer

Traffic routing

API Gateway

Auth + rate limit

Core Services

Tweet Service

Write tweets

Timeline Service

Feed assembly

Search Service

Query layer

Trending Service

Stream processing

User Service

Profiles + graph

Async + Data

Message Queue

Kafka + fanout

Tweets DB

Primary storage

Timeline Cache

Redis ZSETs

Search Index

Lucene/ES

Trending Cache

Redis sorted sets

Graph DB

Followers graph

Interview Tips

Ключевые trade-offs

Fanout-on-write vs fanout-on-read — объясни hybrid подход
Consistency vs latency — eventual consistency acceptable
Storage vs compute — pre-compute vs on-demand
Accuracy vs performance в trending (Count-Min Sketch)

Частые follow-up вопросы

Как обрабатывать celebrity with 100M followers?
Как масштабировать fanout workers?
Как определить spam trends?
Как реализовать "For You" персонализацию?

Дополнительные ресурсы

Twitter Engineering Blog

Официальный блог с deep dives в архитектуру

DDIA: Chapter 11-12

Stream Processing и Future of Data Systems

Twitter/X

Введение

Функциональные требования

Нефункциональные требования

Core Problem: Feed Generation

Fanout-on-Write (Push)

Fanout-on-Read (Pull)

Hybrid Approach (Twitter's Solution)

Interactive Architecture Map

Celebrity Detection

Timeline Cache Architecture

Redis Timeline Structure:

Timeline Limits

Fanout Workers

Trending Topics

Trending Pipeline

Count-Min Sketch

Sliding Window

Trend Scoring Formula

Timeline Ranking

Ranking Features:

Tweet Features

User Features

Two-Pass Ranking

Search Architecture

Search Pipeline

Data Model

Snowflake ID Generation

High-Level Architecture

High-Level Architecture

Interview Tips

Ключевые trade-offs

Частые follow-up вопросы

Дополнительные ресурсы

Twitter Engineering Blog

DDIA: Chapter 11-12