Introduction
Designing Twitter is a classic System Design interview task. The key challenges: delivering tweets to millions of followers, generating a personalized feed in real time, and identifying trending topics. This case study demonstrates the trade-offs between the fanout-on-write and fanout-on-read approaches.
Functional Requirements
- Post Tweet — publishing a tweet (up to 280 characters, media)
- Home Timeline — a feed of tweets from people you follow
- User Timeline — all tweets of a specific user
- Follow/Unfollow — subscription to users
- Like/Retweet — interaction with tweets
- Search — search by tweets and users
- Trending Topics — popular topics in real time
- Notifications — notifications about mentions, likes, retweets
Non-Functional Requirements
- Scale: 500M users, 200M DAU
- Tweets/day: 500M new tweets
- Read-heavy: read:write ratio = 1000:1
- Timeline latency: <200ms for home timeline
- Eventual consistency: a tweet may appear with a delay of 5-10 seconds
- Availability: 99.99% uptime
- Celebrity problem: users with 100M+ followers
Traffic estimation: 200M DAU × 100 timeline reads/day = 20B reads/day ≈ 230K QPS (read). 500M tweets/day ≈ 6K QPS (write).
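The estimate above can be reproduced as quick back-of-envelope arithmetic (numbers taken directly from the text):

```python
# Back-of-envelope traffic estimation, using the figures stated above.
SECONDS_PER_DAY = 86_400

dau = 200_000_000
reads_per_user_per_day = 100
reads_per_day = dau * reads_per_user_per_day        # 20B reads/day
read_qps = reads_per_day / SECONDS_PER_DAY          # ~231K QPS

tweets_per_day = 500_000_000
write_qps = tweets_per_day / SECONDS_PER_DAY        # ~5.8K QPS

print(f"read QPS ~ {read_qps:,.0f}, write QPS ~ {write_qps:,.0f}")
```

The ~1000:1 gap between read and write QPS is exactly why the design pre-computes timelines on the write path.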
Core Problem: Feed Generation
Twitter's main architectural problem is generating the home timeline efficiently. There are two fundamental approaches: fanout-on-write (push) and fanout-on-read (pull).
Fanout-on-Write (Push)
When a tweet is published, it is immediately written into the timeline cache of every follower.
- ✅ Fast timeline reads (feed is pre-computed)
- ✅ Simple read-path logic
- ❌ Slow publishing for celebrities
- ❌ Lots of storage (duplication)
- ❌ Wasted work for inactive users
Fanout-on-Read (Pull)
The timeline is assembled on the fly at request time: fetch the list of followees, pull their latest tweets, and merge.
- ✅ Fast publication (O(1))
- ✅ Less storage
- ✅ No wasted work
- ❌ Slow reading (many queries)
- ❌ Complex merge in real time
Hybrid Approach (Twitter's Solution)
Twitter uses a hybrid approach: fanout-on-write for ordinary users and fanout-on-read for celebrities (users with >10K followers).
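The hybrid decision can be sketched with in-memory stand-ins for the follow graph, the timeline cache, and the celebrity store (all names here are illustrative, not Twitter's actual services):

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # follower threshold from the text


class SocialGraph:
    """Minimal in-memory stand-in for the follow graph."""

    def __init__(self):
        self.followers = defaultdict(set)  # user -> set of followers

    def follow(self, follower, followee):
        self.followers[followee].add(follower)

    def follower_count(self, user):
        return len(self.followers[user])


def publish_tweet(author, tweet_id, graph, timelines, celebrity_store):
    """Hybrid fanout: push for ordinary users, pull for celebrities."""
    if graph.follower_count(author) > CELEBRITY_THRESHOLD:
        # Celebrity path: store the tweet once; followers pull at read time.
        celebrity_store.setdefault(author, []).append(tweet_id)
    else:
        # Ordinary path: push into each follower's cached timeline.
        for follower in graph.followers[author]:
            timelines.setdefault(follower, []).append(tweet_id)
```

In production the push branch runs asynchronously via fanout workers; the sketch only shows the routing decision.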
Interactive Architecture Map
Tweet Ingestion
API + Media Upload
Tweet Service
Store tweet + detect celebrity
Fanout Service
Write to followers timeline cache
Timeline Cache
Redis sorted sets per user
Celebrity Store
Separate tweet store
Merge on Read
Combine with cached feed
Timeline Service
Rank + filter + deliver
Celebrity Detection
Threshold: ~10,000 followers. When publishing: if the follower count exceeds the threshold, the tweet is NOT fanned out; it is stored in a separate "celebrity tweets" store instead. When reading the timeline: merge the pre-computed feed with the latest celebrity tweets.
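The read-path merge can be sketched as a k-way merge of the cached feed with per-celebrity tweet lists, assuming each entry is a (timestamp, tweet_id) pair (a simplification of the real pipeline):

```python
import heapq


def read_timeline(cached, celebrity_lists, limit=50):
    """Merge the pre-computed timeline with celebrity tweets, newest first.

    cached: (timestamp, tweet_id) pairs from the user's timeline cache.
    celebrity_lists: one (timestamp, tweet_id) list per followed celebrity.
    """
    # Sort every stream descending so heapq.merge can do a k-way merge.
    streams = [sorted(cached, reverse=True)]
    streams += [sorted(lst, reverse=True) for lst in celebrity_lists]
    return list(heapq.merge(*streams, reverse=True))[:limit]
```

Because a user follows few celebrities, the extra merge work at read time stays small.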
Timeline Cache Architecture
Redis Timeline Structure:
# Each user has their own timeline in Redis
# Key: timeline:{user_id}
# Value: Sorted Set (score = timestamp, member = tweet_id)
ZADD timeline:12345 1705234567 "tweet_abc123"
ZADD timeline:12345 1705234890 "tweet_def456"

# Retrieving the latest 100 tweets
ZREVRANGE timeline:12345 0 99

# Timeline storage estimation:
# 200M users × 800 tweets × 8 bytes = 1.28 TB
# With replication ×3 ≈ 4 TB Redis cluster
Timeline Limits
- Max 800 tweets in timeline cache
- Older tweets are trimmed (ZREMRANGEBYRANK)
- Tweets beyond the cache window are fetched from the DB
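The capped-timeline pattern can be illustrated with a small in-memory stand-in for the Redis sorted set (a real deployment would issue ZADD followed by ZREMRANGEBYRANK against Redis; the class below only mimics those semantics):

```python
import bisect

MAX_TIMELINE = 800  # cap from the text


class CappedTimeline:
    """In-memory stand-in for ZADD + ZREMRANGEBYRANK trimming."""

    def __init__(self, cap=MAX_TIMELINE):
        self.cap = cap
        self.entries = []  # kept sorted ascending by (score, member)

    def zadd(self, score, member):
        bisect.insort(self.entries, (score, member))
        # Trim the oldest entries beyond the cap, like
        # ZREMRANGEBYRANK timeline:{user} 0 -(cap+1) in Redis.
        if len(self.entries) > self.cap:
            del self.entries[: len(self.entries) - self.cap]

    def zrevrange(self, start, stop):
        """Newest-first slice; like ZREVRANGE, stop is inclusive."""
        return [m for _, m in reversed(self.entries)][start : stop + 1]
```

Trimming on every write keeps per-user memory bounded regardless of follow counts.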
Fanout Workers
- Async processing via Message Queue
- Batch updates for efficiency
- Retry logic for failed fanouts
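A minimal sketch of a fanout worker with batching and retries, assuming an injected write_batch function standing in for a Redis pipeline call (names, batch size, and retry count are illustrative):

```python
import itertools


def batched(iterable, size):
    """Yield lists of up to `size` items for pipelined writes."""
    it = iter(iterable)
    while batch := list(itertools.islice(it, size)):
        yield batch


def fanout(tweet_id, follower_ids, write_batch, max_retries=3):
    """Write tweet_id to follower timelines in batches, retrying on I/O
    errors. Batches that still fail are returned for dead-lettering."""
    failed = []
    for batch in batched(follower_ids, 1000):
        for _attempt in range(max_retries):
            try:
                write_batch(tweet_id, batch)
                break
            except IOError:
                continue
        else:
            failed.append(batch)  # retries exhausted
    return failed
```

In production this runs as a consumer on the message queue, so a slow fanout never blocks the publish request.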
Trending Topics
Determining trends is a real-time stream processing task. It is necessary to track the frequency of occurrence of hashtags and keywords, taking into account the time window and velocity.
Trending Pipeline
Click a stage to highlight the corresponding part of the pipeline
Tweet Stream
Kafka topic
Extract
Hashtags, entities
Filter
Spam + stop words
Scoring
Sliding window + decay
Trend Cache
Redis sorted sets
Count-Min Sketch
A probabilistic data structure for approximate counting. It saves memory when counting millions of unique hashtags. Trade-off: a small overcount (typically ~1-2%) in exchange for sublinear space and O(1) updates.
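A minimal Count-Min Sketch: each item is hashed into one counter per row, and the estimate is the row-wise minimum (width and depth here are illustrative; estimates can overcount due to collisions but never undercount):

```python
import hashlib


class CountMinSketch:
    """Minimal Count-Min Sketch: depth hash rows × width counters."""

    def __init__(self, width=2048, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Salt the hash with the row number to get independent rows.
        digest = hashlib.blake2b(f"{row}:{item}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # The minimum across rows is the least-collided counter.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))
```

Memory is fixed at width × depth counters no matter how many distinct hashtags flow through the stream.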
Sliding Window
Typically a 5-15 minute window with exponential decay. A hopping window (advancing every minute) gives smooth transitions; tumbling windows are used for hourly aggregates.
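The hopping window can be sketched as a fixed-length deque of per-minute buckets: recording increments the current bucket, and each hop drops the oldest bucket out of the window (the bucket granularity is illustrative):

```python
from collections import deque


class HoppingWindowCounter:
    """Per-minute buckets; the window sums the last `span` buckets.
    hop() advances by one bucket (hop = 1 min, window = 5 min by default)."""

    def __init__(self, span=5):
        self.buckets = deque([0] * span, maxlen=span)

    def record(self, n=1):
        self.buckets[-1] += n  # count into the current minute

    def hop(self):
        # Appending pushes the oldest bucket out (maxlen is fixed).
        self.buckets.append(0)

    def window_count(self):
        return sum(self.buckets)
```

One such counter per candidate hashtag feeds the velocity term of the scoring formula below.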
Trend Scoring Formula
# Simplified Twitter Trending Score
def calculate_trend_score(topic, current_window):
    # Current velocity: tweets in the last 5 minutes
    current_count = count_in_window(topic, minutes=5)

    # Baseline: average tweets per 5-minute window over the past 7 days
    baseline_count = get_baseline(topic, days=7)

    # Velocity ratio: how much faster than normal
    if baseline_count > 0:
        velocity_ratio = current_count / baseline_count
    else:
        velocity_ratio = current_count * 10  # new-topic bonus

    # Recency weight: exponential decay ("lambda" is reserved in Python,
    # so the decay constant is named decay_rate here)
    recency_weight = sum(
        tweet.weight * exp(-decay_rate * (now - tweet.timestamp))
        for tweet in current_window.tweets
    )

    # Engagement boost
    engagement_score = (likes + retweets * 2 + replies * 3) / total_tweets

    # Final score (log1p avoids a domain error when engagement is zero)
    score = velocity_ratio * recency_weight * (1 + log1p(engagement_score))

    # Spam/bot filtering
    if unique_users_ratio < 0.3:  # too few unique users
        score *= 0.1  # heavy penalty

    return score
Timeline Ranking
Modern Twitter uses ML-based ranking instead of pure chronological order. The model predicts the probability of engagement for each tweet.
Ranking Features:
Tweet Features
- Age of tweet
- Has media (image/video)
- Has links
- Tweet length
- Current engagement rate
- Author's average engagement
User Features
- Historical interactions with author
- Topic interests
- Activity patterns (time of day)
- Social graph proximity
- Device type
- Session context
Two-Pass Ranking
Pass 1 (Candidate Generation): quick retrieval of ~1000 candidate tweets from the timeline cache + celebrity tweets + "In case you missed it".
Pass 2 (Ranking): the ML model scores each candidate. The top 50 are shown to the user; inference takes <50ms.
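The two passes can be sketched as cheap candidate merging followed by model scoring, with score_fn standing in for ML inference (names and limits are illustrative):

```python
import heapq
import itertools


def two_pass_rank(sources, score_fn, candidate_limit=1000, top_k=50):
    """Pass 1: cheap candidate generation from several sources
    (timeline cache, celebrity tweets, "missed it" backfill).
    Pass 2: score each candidate and keep only the best top_k."""
    candidates = list(itertools.islice(itertools.chain(*sources),
                                       candidate_limit))
    # heapq.nlargest avoids sorting all candidates to extract the top_k.
    return heapq.nlargest(top_k, candidates, key=score_fn)
```

Separating a cheap wide pass from an expensive narrow pass is what keeps model inference inside the latency budget.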
Search Architecture
Search Pipeline
Indexing Flow
Ingestion
Tweet → tokenize
Realtime Index
Lucene in-memory shards
Cold Index
On-disk segments
Query Flow
Parse Query
Terms + filters
Fanout to Shards
Parallel search
Merge + Rank
Recency + engagement
Data Model
# Core Tables
tweets {
tweet_id: UUID (Snowflake ID)
user_id: UUID
content: VARCHAR(280)
media_urls: JSON
created_at: TIMESTAMP
reply_to_tweet_id: UUID (nullable)
retweet_of_id: UUID (nullable)
quote_tweet_id: UUID (nullable)
like_count: INT
retweet_count: INT
reply_count: INT
}
users {
user_id: UUID
username: VARCHAR(15)
display_name: VARCHAR(50)
bio: VARCHAR(160)
follower_count: INT
following_count: INT
is_verified: BOOLEAN
is_celebrity: BOOLEAN # follower_count > 10K
created_at: TIMESTAMP
}
follows {
follower_id: UUID
followee_id: UUID
created_at: TIMESTAMP
PRIMARY KEY (follower_id, followee_id)
}
# Denormalized for read performance
user_timelines (Redis Sorted Set)
Key: timeline:{user_id}
Score: tweet_timestamp
Member: tweet_id
# Celebrity tweets (separate store)
celebrity_tweets (Redis Sorted Set)
Key: celebrity:{user_id}
Score: tweet_timestamp
Member: tweet_id
Snowflake ID Generation
Twitter developed Snowflake to generate unique, sortable IDs without central coordination.
# Snowflake ID Structure (64-bit)
┌────────────────────────────────────────────────────────────────┐
│ 1 bit │ 41 bits timestamp │ 10 bits │ 12 bits │
│unused │ (milliseconds) │machine │ sequence │
│ │ since epoch │ ID │ number │
└────────────────────────────────────────────────────────────────┘
# Properties:
# - Roughly sortable by time
# - No coordination needed
# - 4096 IDs per millisecond per machine
# - 69 years before overflow
# Illustrative example ID: 1605978261000000000
# Timestamp: 2020-11-21 12:31:01 UTC
# Machine: 42
# Sequence: 0
def generate_snowflake_id(machine_id, last_timestamp, sequence):
    timestamp = current_time_ms() - TWITTER_EPOCH
    if timestamp == last_timestamp:
        sequence = (sequence + 1) & 0xFFF  # 12 bits
        if sequence == 0:
            # Sequence exhausted for this millisecond: spin to the next one
            timestamp = wait_next_millis(last_timestamp)
    else:
        sequence = 0
    id = (timestamp << 22) | (machine_id << 12) | sequence
    return id, timestamp, sequence
High-Level Architecture
Edge & Routing
Clients
Web + Mobile
CDN
Static + media
Load Balancer
Traffic routing
API Gateway
Auth + rate limit
Core Services
Tweet Service
Write tweets
Timeline Service
Feed assembly
Search Service
Query layer
Trending Service
Stream processing
User Service
Profiles + graph
Async + Data
Message Queue
Kafka + fanout
Tweets DB
Primary storage
Timeline Cache
Redis ZSETs
Search Index
Lucene/ES
Trending Cache
Redis sorted sets
Graph DB
Followers graph
Interview Tips
Key trade-offs
- Fanout-on-write vs fanout-on-read — explain the hybrid approach
- Consistency vs latency — eventual consistency acceptable
- Storage vs compute — pre-compute vs on-demand
- Accuracy vs performance in trending (Count-Min Sketch)
Frequent follow-up questions
- How to handle a celebrity with 100M followers?
- How to scale fanout workers?
- How to identify spam trends?
- How to implement "For You" personalization?
