Source
Wikipedia: Elasticsearch
Project history, core architecture, and Elasticsearch’s role in the search ecosystem.
Official site
Elastic: Elasticsearch
Product docs, platform capabilities, and operational guidance.
Elasticsearch is a distributed search and analytics engine built on Apache Lucene. In system design it usually serves as a dedicated search layer over a transactional database. This enables fast full-text queries and filtering, but it requires deliberate index design, a plan for keeping the index consistent with the database, and attention to operational cost.
History and context
Project launch
Elasticsearch is created as a distributed REST engine built on top of Apache Lucene.
ELK ecosystem emerges
The search layer, log pipelines, and visualization tools start being used together as one stack.
Enterprise adoption grows
The platform becomes widely used for log analytics, observability, and product search.
Licensing and cloud evolution
Managed offerings and operational practices for high-load clusters continue to evolve.
Core architecture elements
Index -> shard -> replica
An index is split into primary shards, and replicas are added for fault tolerance. This is the foundation of horizontal scaling.
Cluster and node roles
The cluster coordinates query execution, data placement, and shard rebalancing across nodes.
Near real-time search
Data is not instantly searchable: a newly indexed document becomes visible only after the next refresh (every second by default), which creates a trade-off between how fresh search results are and indexing throughput.
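The refresh behaviour can be illustrated with a toy model. This is a minimal sketch, not Elasticsearch's implementation: the `ToyShard` class, its two lists, and the substring match are all hypothetical simplifications chosen to show why a document is invisible between indexing and refresh.

```python
class ToyShard:
    """Toy model of near real-time visibility (hypothetical, not Elasticsearch code)."""

    def __init__(self):
        self._buffer = []      # documents indexed but not yet searchable
        self._searchable = []  # documents visible to search after a refresh

    def index(self, doc):
        # Writes land in the buffer first; search cannot see them yet.
        self._buffer.append(doc)

    def refresh(self):
        # A refresh moves buffered documents into the searchable set.
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self, term):
        # Naive substring match stands in for a real inverted-index lookup.
        return [d for d in self._searchable if term in d["name"].lower()]


shard = ToyShard()
shard.index({"id": 1, "name": "Red running shoes"})
before = shard.search("shoes")  # document is still in the buffer
shard.refresh()
after = shard.search("shoes")   # document is now searchable
```

Refreshing more often shortens the window in which `before` is empty, at the cost of doing the refresh work more frequently.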
Relevance and ranking
Lucene scoring models (including BM25) help rank results and tune search quality.
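To make the ranking idea concrete, here is a sketch of the textbook BM25 score for a single query term. The function name and parameters are illustrative, and Lucene's actual implementation differs in details; `k1=1.2` and `b=0.75` are the commonly used defaults.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 score of one term in one document (illustrative sketch).

    tf          - how many times the term occurs in the document
    doc_len     - document length in terms
    avg_doc_len - average document length in the index
    n_docs      - total number of documents
    doc_freq    - number of documents containing the term
    """
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: longer-than-average documents are penalized.
    norm = 1 - b + b * doc_len / avg_doc_len
    # Term frequency saturates: each extra occurrence adds less.
    return idf * tf * (k1 + 1) / (tf + k1 * norm)


short_doc = bm25_term_score(tf=2, doc_len=50, avg_doc_len=100, n_docs=1000, doc_freq=10)
long_doc = bm25_term_score(tf=2, doc_len=400, avg_doc_len=100, n_docs=1000, doc_freq=10)
```

With the same term frequency, the shorter document scores higher, which is the length normalization that `b` controls.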
High-Level Architecture
The diagram below shows a baseline Elasticsearch setup in a product system: a dedicated search layer, indexing pipeline, and a cluster with primary and replica shards.
System view
Elasticsearch is usually deployed as a dedicated search and analytics layer over a transactional source of truth.
Read / Write Path through components
The interactive flow below shows how documents move into the index and how queries travel through coordinator and shards before returning ranked results.
Write path
- A service writes the event to the source of truth (typically an OLTP database).
- Through CDC/outbox or an ingestion pipeline, the document reaches the indexer.
- Elasticsearch routes the document to a primary shard and then replicates it to that shard's replicas.
- After refresh, the document becomes visible in search results (near real-time).
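The shard placement step in the write path can be sketched as a hash of the routing key (the document ID by default) modulo the number of primary shards. This is a simplified illustration: `zlib.crc32` is a stand-in for Elasticsearch's own hash function, and the helper names are hypothetical.

```python
import zlib


def pick_primary_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a primary shard number (simplified sketch)."""
    # Stand-in hash: Elasticsearch uses its own hash function, not CRC32.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def total_shard_copies(num_primary_shards: int, num_replicas: int) -> int:
    """Each primary shard has num_replicas additional copies."""
    return num_primary_shards * (1 + num_replicas)


# The same key always maps to the same shard for a fixed shard count.
shard_for_doc = pick_primary_shard("product-42", 3)
copies = total_shard_copies(3, 1)  # 3 primaries, each with 1 replica
```

Because the shard number depends on the primary shard count, changing that count would send existing keys to different shards, which is one reason the number of primary shards is chosen when the index is created.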
When to choose Elasticsearch
Good fit
- Full-text product search (catalogs, articles, documentation).
- Observability scenarios: search over logs, events, and traces.
- Use cases that need flexible filters, aggregations, and ranking.
- Read-heavy systems that require a fast search experience.
Avoid when
- As the only source of truth for critical transactional data.
- OLTP workloads with frequent point updates/deletes and strict ACID expectations.
- Systems without full-text search needs where SQL/cache is enough.
- Teams not ready for cluster and index operational overhead.
Practice: DDL and DML
Below are practical API examples often discussed in System Design interviews: from index/mapping management to document writes and search queries.
DDL and DML examples in Elasticsearch
DDL controls indices and mappings, while DML operates on documents and search queries.
DDL in Elasticsearch is about schema and index lifecycle operations: creating indices, tuning shards/replicas, and evolving mappings.
Create an index with settings and mapping
Define shard/replica layout and field types before indexing data.
PUT /products-v1
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "price": { "type": "float" },
      "category": { "type": "keyword" },
      "created_at": { "type": "date" }
    }
  }
}
Add new fields to existing mapping
You can extend a mapping with new fields. The type of an existing field generally cannot be changed in place; that requires reindexing into a new index.
PUT /products-v1/_mapping
{
  "properties": {
    "brand": { "type": "keyword" },
    "is_active": { "type": "boolean" }
  }
}
Alias switch for zero-downtime reindex rollout
Blue/green pattern: move the products alias from products-v1 to products-v2 atomically, so clients that query the alias never see a gap.
POST /_aliases
{
  "actions": [
    { "remove": { "index": "products-v1", "alias": "products" } },
    { "add": { "index": "products-v2", "alias": "products" } }
  ]
}
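In application code the alias switch is often wrapped in a small helper so the rollout can be scripted. The sketch below only builds the request with the Python standard library and does not send it; the function names and the `http://localhost:9200` base URL are assumptions for illustration, not part of the original example.

```python
import json
import urllib.request


def alias_swap_body(alias: str, old_index: str, new_index: str) -> dict:
    """Build the _aliases body that moves an alias in one atomic call."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def build_alias_swap_request(base_url: str, alias: str, old_index: str, new_index: str):
    """Prepare (but do not send) the POST /_aliases request."""
    payload = json.dumps(alias_swap_body(alias, old_index, new_index)).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_aliases",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local cluster address; replace with a real endpoint before sending.
request = build_alias_swap_request(
    "http://localhost:9200", "products", "products-v1", "products-v2"
)
```

Passing `request` to `urllib.request.urlopen` would perform the switch. Keeping both actions in one body is what makes the change atomic from the client's point of view.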