System Design Space

Updated: March 2, 2026 at 9:12 AM

Search System (Google/Elasticsearch)

Difficulty: mid

Classic task: inverted index, web crawling, ranking (BM25, PageRank), distributed search.

Introduction

Designing a search engine is one of the most complex tasks in a System Design interview. This task covers the entire spectrum: from data collection (crawling) to indexing, ranking results and scaling to petabytes of information.

Key Components of a Search Engine

  • Web Crawler — collects and crawls web pages
  • Indexer — builds the inverted index
  • Ranker — ranks search results

Functional Requirements

Core Features

  • Full-text search of billions of documents
  • Autocompletion and query suggestions
  • Faceted search and filtering
  • Relevance-based ranking of results
  • Spell correction and synonym processing

Additional features

  • Search by images and videos
  • Personalization of results
  • Localization and language models
  • Real-time indexing of new pages

Non-functional requirements

  • <200ms — P99 latency for search queries
  • 10B+ — indexed documents
  • 100K QPS — peak request load
  • 99.99% — availability

Web Crawling

Further reading

Alex Xu: Web Crawler — a detailed walkthrough of crawler architecture in the book System Design Interview.

A web crawler is an automated system for fetching and parsing web pages. At Google's scale, this means processing billions of URLs every day.

Crawler Architecture

Key components by stage:

  • URL Frontier: Priority Queue, Politeness Manager, Freshness Scorer
  • Fetching: Worker Pool, DNS Resolver, robots.txt Cache
  • Processing: Parser (HTML), Duplicate Detector, URL Extractor

URL Frontier

A queue of URLs to crawl, with prioritization and per-domain load control:

  • Priority Queue — important pages first
  • Politeness — no more than N requests/sec per domain
  • Freshness — how often pages are re-crawled

Duplicate Detection

Duplicate detection to save resources:

  • URL Deduplication — a Bloom filter over seen URLs
  • Content Hash — SimHash/MinHash
  • Near-Duplicates — LSH for similar pages
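A Bloom filter for URL deduplication can be sketched in a few lines. This is a toy sketch: the bit-array size, hash count, and SHA-256-based hashing are illustrative choices, not what any particular crawler uses.

```python
import hashlib

class BloomFilter:
    """Space-efficient set membership: false positives possible, false negatives impossible."""
    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k independent positions by salting the hash input.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

seen = BloomFilter()
seen.add("https://example.com/a")
print("https://example.com/a" in seen)  # True
print("https://example.com/b" in seen)  # False (no false negative is possible)
```

The crawler checks the filter before enqueuing a URL; a "maybe seen" answer can be confirmed against durable storage, trading a rare extra lookup for a large memory saving.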

💡 Politeness and robots.txt

Every crawler must comply with robots.txt and avoid overloading servers. Google uses a separate queue per domain with rate limiting. A typical interval between requests to a single domain is 1-10 seconds.
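The per-domain rate limiting described above can be sketched as a minimal politeness manager. Class and parameter names here are hypothetical; a real crawler would integrate this with the URL frontier and robots.txt rules.

```python
from urllib.parse import urlparse

class PolitenessManager:
    """Enforce a minimum delay between requests to the same domain."""
    def __init__(self, min_interval_sec=2.0):
        self.min_interval = min_interval_sec
        self.last_fetch = {}  # domain -> timestamp of the last request

    def wait_time(self, url: str, now: float) -> float:
        """How long a worker must wait before fetching this URL."""
        domain = urlparse(url).netloc
        last = self.last_fetch.get(domain)
        if last is None:
            return 0.0
        return max(0.0, self.min_interval - (now - last))

    def record_fetch(self, url: str, now: float):
        self.last_fetch[urlparse(url).netloc] = now

pm = PolitenessManager(min_interval_sec=2.0)
pm.record_fetch("https://example.com/page1", now=100.0)
print(pm.wait_time("https://example.com/page2", now=101.0))  # 1.0 (same domain)
print(pm.wait_time("https://other.org/", now=101.0))         # 0.0 (different domain)
```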

Inverted Index

Further reading

Database Internals — a deep dive into data structures for indexing.

The Inverted Index is the fundamental data structure for full-text search. Instead of storing “document → words,” it stores “word → documents.”

Inverted Index Structure

Forward Index (traditional):
┌─────────┬──────────────────────────────────┐
│ Doc ID  │ Terms                            │
├─────────┼──────────────────────────────────┤
│ doc_1   │ "system", "design", "interview"  │
│ doc_2   │ "design", "patterns", "book"     │
│ doc_3   │ "system", "architecture", "book" │
└─────────┴──────────────────────────────────┘

Inverted Index:
┌────────────────┬───────────────────────────────────────┐
│ Term           │ Posting List                          │
├────────────────┼───────────────────────────────────────┤
│ "system"       │ [(doc_1, pos:[0]), (doc_3, pos:[0])]  │
│ "design"       │ [(doc_1, pos:[1]), (doc_2, pos:[0])]  │
│ "interview"    │ [(doc_1, pos:[2])]                    │
│ "patterns"     │ [(doc_2, pos:[1])]                    │
│ "book"         │ [(doc_2, pos:[2]), (doc_3, pos:[2])]  │
│ "architecture" │ [(doc_3, pos:[1])]                    │
└────────────────┴───────────────────────────────────────┘

Posting List Entry:
┌──────────────────────────────────────────────────┐
│ doc_id: uint64                                   │
│ term_frequency: uint16                           │
│ positions: [uint32, ...]   // for phrase queries │
│ field_lengths: {title: 5, body: 1000, ...}       │
└──────────────────────────────────────────────────┘
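Building an index like the one above from the three example documents might look like this. A minimal sketch: it ignores normalization, compression, and on-disk layout.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to a posting list of (doc_id, positions)."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        positions = defaultdict(list)
        for pos, term in enumerate(text.lower().split()):
            positions[term].append(pos)
        for term, pos_list in positions.items():
            index[term].append((doc_id, pos_list))
    return dict(index)

docs = {
    "doc_1": "system design interview",
    "doc_2": "design patterns book",
    "doc_3": "system architecture book",
}
index = build_inverted_index(docs)
print(index["system"])  # [('doc_1', [0]), ('doc_3', [0])]
```

The positions stored per posting are what make phrase queries possible: "system design" matches only documents where the two terms occur at adjacent positions.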

Index Components

  • Dictionary (Lexicon)

    The set of all unique terms, kept in memory

  • Posting Lists

    Lists of documents for each term, stored on disk

  • Skip Lists

    Speed up the intersection of posting lists

Storage optimizations

  • Delta Encoding

    Storing doc_id differences instead of absolute values

  • Variable Byte Encoding

    Compress small numbers into fewer bytes

  • Block Compression

    PFOR, Simple9, Roaring Bitmaps
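Delta encoding and variable-byte encoding compose naturally: gaps between sorted doc IDs are small numbers, and small numbers compress into few bytes. A minimal sketch (real indexes layer block schemes such as PFOR on top):

```python
def delta_encode(doc_ids):
    """Store gaps between sorted doc IDs instead of absolute values."""
    return [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]

def vbyte_encode(n: int) -> bytes:
    """7 payload bits per byte, least-significant group first;
    the high bit marks the final byte of a number."""
    out = bytearray()
    while True:
        if n < 128:
            out.append(n | 0x80)
            return bytes(out)
        out.append(n & 0x7F)
        n >>= 7

def vbyte_decode(data: bytes):
    nums, n, shift = [], 0, 0
    for b in data:
        n |= (b & 0x7F) << shift
        if b & 0x80:            # final byte of this number
            nums.append(n)
            n, shift = 0, 0
        else:
            shift += 7
    return nums

gaps = delta_encode([1000, 1003, 1100, 2000])
print(gaps)  # [1000, 3, 97, 900]
encoded = b"".join(vbyte_encode(g) for g in gaps)
print(len(encoded))  # 6 bytes instead of 16 with fixed uint32s
print(vbyte_decode(encoded))  # [1000, 3, 97, 900]
```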

Text Processing Pipeline

Example stage: tokenization

Input:  "The System Design Interview is AMAZING!"
Output: ["The","System","Design","Interview","is","AMAZING!"]

Tokenization splits text into individual tokens; subsequent stages lowercase, strip punctuation, and remove stopwords.
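A minimal version of such a pipeline, with a tiny illustrative stopword list (stemming omitted; the regex-based tokenizer and the stopword set are simplifications):

```python
import re

STOPWORDS = {"the", "is", "a", "an", "of"}

def process(text: str):
    """Tokenize -> lowercase -> strip punctuation -> drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

print(process("The System Design Interview is AMAZING!"))
# ['system', 'design', 'interview', 'amazing']
```

The same pipeline must run on both documents at index time and queries at search time, otherwise terms will never match.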

Query Processing

Request processing includes parsing, expansion, and efficient traversal of posting lists.

Query Processing Flow

Example stage: Query Parser — extracts tokens, operators, and query fields.

Input:    "system design patterns"
Tokens:   ["system","design","patterns"]
Operator: AND
Fields:   title?, body?

Latency budget: parser + expansion < 20ms, retrieval < 50ms, ranking < 100ms.

⚡ Optimization: WAND Algorithm

WAND (Weighted AND) is an algorithm for efficient Top-K retrieval without fully scanning all documents:

  • Each term has an upper-bound score
  • Documents that cannot reach the Top-K threshold are skipped
  • Up to 10x speedup compared to the naive approach

Ranking and Relevance

Ranking is a key component that determines search quality. Modern systems use a multi-level approach: from simple formulas to ML models.

BM25 Formula

Basic formula for text ranking:

BM25(D, Q) = Σ IDF(qi) × 
  (f(qi, D) × (k1 + 1)) / 
  (f(qi, D) + k1 × (1 - b + b × |D|/avgdl))

Where:
  f(qi, D) = term frequency in document
  |D| = document length
  avgdl = average document length
  k1 = 1.2 (term saturation)
  b = 0.75 (length normalization)

BM25 is an improved version of TF-IDF, adding term-frequency saturation and document-length normalization.
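The formula translates into code directly. A toy sketch: it computes IDF over an in-memory corpus, and the smoothed IDF variant log(1 + (N - df + 0.5)/(df + 0.5)) is one common choice (an assumption here, not mandated by the text above).

```python
import math

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a query with BM25."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    score = 0.0
    for q in query_terms:
        df = sum(1 for d in corpus if q in d)           # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)  # smoothed IDF
        f = doc_terms.count(q)                           # f(qi, D)
        denom = f + k1 * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * f * (k1 + 1) / denom
    return score

corpus = [
    ["system", "design", "interview"],
    ["design", "patterns", "book"],
    ["system", "architecture", "book"],
]
# Document containing both query terms outranks one containing only "design".
print(bm25_score(["system", "design"], corpus[0], corpus))
print(bm25_score(["system", "design"], corpus[1], corpus))
```

Note how k1 caps the benefit of repeating a term (saturation) and b penalizes long documents relative to avgdl.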

PageRank (simplified)

Assessing page authority based on incoming links:

PR(A) = (1-d) + d × Σ PR(Ti)/C(Ti)

Where:
  d = damping factor (0.85)
  Ti = pages linking to A
  C(Ti) = outbound links from Ti

Iterative computation:
  1. Initialize all PR = 1/N
  2. Iterate until convergence
  3. Normalize results

Classic Google algorithm for web ranking
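The iterative computation above can be sketched as follows. This sketch uses the normalized variant (1-d)/N so scores sum to 1; the in-text formula uses the unnormalized constant (1-d), but both produce the same ranking. Dangling nodes (pages with no outlinks) are ignored for brevity.

```python
def pagerank(links, d=0.85, iters=50):
    """links: page -> list of outbound link targets. Returns PR per page."""
    pages = set(links) | {t for outs in links.values() for t in outs}
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}          # 1. initialize all PR = 1/N
    for _ in range(iters):                    # 2. iterate until convergence
        new = {p: (1 - d) / n for p in pages}
        for p, outs in links.items():
            if outs:
                share = pr[p] / len(outs)     # PR(Ti) / C(Ti)
                for t in outs:
                    new[t] += d * share
        pr = new
    return pr

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
pr = pagerank(graph)
print(sorted(pr, key=pr.get, reverse=True))  # ['C', 'A', 'B']
```

C ranks highest because it receives links from both A and B, and A benefits from C's authority in turn.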

Multi-Stage Ranking Pipeline

Example stage — Stage 1: Candidate Generation

  • Method: inverted index lookup + BM25
  • Output: ~10,000 candidates
  • Latency budget: ≈ 50ms
  • Highlights: BM25 scoring, Top-10K retrieval, low latency

Ranking at a glance:

  • 100+ — ranking signals in Google
  • BERT — neural ranking models
  • LTR — Learning to Rank

Distributed Architecture

Further reading

DDIA: Partitioning — data partitioning for scaling.

A search engine the size of Google requires a distributed architecture with index sharding and replication for fault tolerance.

Sharding Strategies

Document-based Sharding

Shard 1: docs 1-1M
Shard 2: docs 1M-2M
Shard 3: docs 2M-3M
...

Query → all shards → merge results

Pros:
+ Simplicity
+ Easy to add documents
Cons:
- Every query hits all shards

Term-based Sharding

Shard 1: terms A-F
Shard 2: terms G-M
Shard 3: terms N-Z
...

Query → only the relevant shards

Pros:
+ Fewer shards per query
Cons:
- Hotspots (popular terms)
- Adding documents is complex

Distributed Query Processing

Components:

  • Load Balancer — distributes incoming traffic
  • Query Router — parsing + fanout
  • Shards 1-3 — each with replicas A/B/C
  • Aggregator — merge + re-rank

Flow:

1. The router parses the query
2. Parallel fanout to all shards
3. Merge Top-K + final re-rank

Latency Budget

Router: < 10ms, shard fanout: < 100ms, merge: < 20ms
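The aggregator's merge step can be sketched as a Top-K merge over per-shard hit lists. A toy sketch: shard contents, scores, and doc IDs are made up.

```python
import heapq

def merge_top_k(shard_results, k=3):
    """Merge per-shard (score, doc_id) hit lists into a global Top-K,
    sorted by descending score."""
    return heapq.nlargest(k, (hit for shard in shard_results for hit in shard))

shard1 = [(0.92, "doc_17"), (0.71, "doc_3")]
shard2 = [(0.88, "doc_42"), (0.65, "doc_9")]
shard3 = [(0.95, "doc_5")]
print(merge_top_k([shard1, shard2, shard3], k=3))
# [(0.95, 'doc_5'), (0.92, 'doc_17'), (0.88, 'doc_42')]
```

Each shard only needs to return its local Top-K, so the aggregator merges at most shards × K entries rather than full result sets.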

💡 Elasticsearch Architecture

Elasticsearch uses document-based sharding with automatic routing:

shard = hash(document_id) % number_of_primary_shards

• Primary shards — handle writes and reads
• Replica shards — read-only, used for failover
• Typical configuration: 5 primaries + 2 replicas each = 15 shards total
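The routing formula can be illustrated as follows. Note this is a sketch: MD5 is used for brevity, whereas Elasticsearch actually hashes the routing value with murmur3.

```python
import hashlib

def route_to_shard(document_id: str, num_primary_shards: int = 5) -> int:
    """Deterministic shard choice: the same document always lands on
    the same primary shard, which makes GET-by-id a single-shard lookup."""
    # MD5 for illustration only; Elasticsearch uses murmur3 internally.
    h = int(hashlib.md5(document_id.encode()).hexdigest(), 16)
    return h % num_primary_shards

print(route_to_shard("user-123"), route_to_shard("user-123"))  # identical
```

This is also why number_of_primary_shards cannot be changed after index creation: every stored document's shard would move.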

Real-time Indexing

For searching news and social networks, indexing speed is critical—seconds from publication to appearance in search.

Tiered Index Architecture

Example tier — Real-time Tier (active):

  • Role: in-memory index (hot data)
  • Storage: RAM
  • Freshness: seconds
  • Focus: writes + freshness
  • Latency: < 1s to searchable

Query Fanout

Query → all tiers in parallel → merge → deduplicate → return

Write Path

  1. The document is sent to Kafka
  2. An indexer consumer processes it
  3. It is written to an in-memory buffer
  4. Refresh → searchable
  5. Flush → durable on disk
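The refresh/flush semantics of steps 3-5 can be modeled with a toy in-memory index (Kafka omitted; class and method names are illustrative, not any real library's API):

```python
class NRTIndex:
    """Toy near-real-time index: writes buffer in memory,
    refresh() makes them searchable, flush() makes them durable."""
    def __init__(self):
        self.buffer = []    # accepted writes, not yet searchable
        self.segments = []  # searchable in-memory segments
        self.on_disk = []   # durable after flush

    def write(self, doc: str):
        self.buffer.append(doc)

    def refresh(self):
        """Open a new searchable segment (Elasticsearch: every refresh_interval)."""
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        """Persist searchable segments durably."""
        for seg in self.segments:
            self.on_disk.extend(seg)
        self.segments.clear()

    def search(self, term: str):
        hits = [d for seg in self.segments for d in seg if term in d]
        return hits + [d for d in self.on_disk if term in d]

idx = NRTIndex()
idx.write("breaking news: system design")
print(idx.search("news"))  # [] -- written but not yet refreshed
idx.refresh()
print(idx.search("news"))  # ['breaking news: system design']
```

The gap between write and refresh is exactly the "near" in near-real-time search.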

Elasticsearch Refresh

  • refresh_interval: 1s (default)
  • Creates a new segment in memory
  • Near real-time search
  • Trade-off: refresh cost vs freshness

Autocomplete and Typeahead

Autocompletion requires even lower latency (<50ms) and is built on prefix trees (tries) and precomputed popular queries.

Trie-based Autocomplete

Example prefix: "sys"

  • Trie path: root → s → sy → sys
  • Prefix length: 3 chars
  • Latency target: < 50ms
  • Note: top results are stored in the prefix node

Top Suggestions

  • system design
  • system
  • system administrator
  • system metrics

Top-K Cache

Trie nodes store top suggestions (Top-10) to avoid traversing the whole subtree.

Optimizations

  • Prefix caching — cache popular prefixes
  • Compressed Trie — merge single-child nodes
  • Top-K precomputation — store the Top-10 in each node
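The Top-K precomputation can be sketched as a trie whose nodes cache their most popular completions. A toy sketch: the queries and popularity counts are made up, and a production system would batch-build the trie rather than sort on every insert.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.top_k = []  # cached (popularity, query) pairs for this prefix

class AutocompleteTrie:
    def __init__(self, k=10):
        self.root = TrieNode()
        self.k = k

    def insert(self, query: str, popularity: int):
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            # Maintain the Top-K cache along the whole prefix path.
            node.top_k.append((popularity, query))
            node.top_k.sort(reverse=True)
            del node.top_k[self.k:]

    def suggest(self, prefix: str):
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        return [q for _, q in node.top_k]  # no subtree traversal needed

trie = AutocompleteTrie(k=3)
for q, pop in [("system design", 900), ("system", 700),
               ("system administrator", 400), ("system metrics", 300)]:
    trie.insert(q, pop)
print(trie.suggest("sys"))  # ['system design', 'system', 'system administrator']
```

Reading the answer off the prefix node, rather than walking the subtree, is what keeps the lookup within the <50ms budget.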

Data Collection

  • Query logs → aggregation over time
  • Filtering of offensive queries
  • Personalization (user history)

Interview tips

What must be discussed

  • Inverted index vs forward index — trade-offs
  • BM25/TF-IDF — understanding the formulas
  • Sharding strategy — document vs term
  • Latency budget allocation
  • Near real-time vs batch indexing

Frequent follow-up questions

  • How to process phrase queries?
  • How to implement fuzzy search?
  • How to add personalization?
  • How to deal with spam in results?
  • How to measure search quality?

Key Findings

1. Inverted Index — the fundamental data structure for efficient full-text search
2. Multi-stage Ranking — a cascade from simple formulas to heavy ML models balances latency and quality
3. Document Sharding — the preferred strategy for most systems; easier to scale
4. Tiered Storage — real-time, recent, and historical tiers keep content fresh
5. Politeness in Crawling — mandatory compliance with robots.txt and rate limiting


© 2026 Alexander Polomodov