Elasticsearch: search engine and architecture

Elasticsearch is best understood not as just another database, but as a separate search layer with its own indexing model, update lag, and operational risks.

In real work, this chapter helps you think early about sharding, index templates, rollover, reindex workflows, and near-real-time behavior so search does not become fragile magic on top of the main store.

In interviews and architecture discussions, it is especially valuable when you need to explain why a system needs a dedicated search layer and why it cannot be merged painlessly with the transactional source of truth.

Practical value of this chapter

Search boundary

Keep the search index separate from the transactional source of truth: Elasticsearch accelerates discovery but is not the system of record.

Index lifecycle

Plan index templates, rollover, retention policy, and reindex workflows before launch.

Relevance and latency

Align analyzers, ranking strategy, and caching with UX expectations and query traffic patterns.

Interview framing

Explain why a dedicated search tier is introduced and what consistency risks come with it.

Decision frame and editorial focus

Chapter focus

search index architecture, relevance model, and near-real-time reads

Workload profile

Start from the specialized query: analytics, search, time series, graph traversal, vector retrieval, or monitoring metrics.

Good fit

The choice is justified when the index or storage model directly matches product behavior and relieves the source of truth.

Boundary and risk

The danger is turning a specialized layer into a universal database and losing consistency, freshness, and ownership boundaries.

Connect next

Connect the chapter to the OLTP source, data pipeline, retention/compaction, and read-model architecture.

Source

Wikipedia: Elasticsearch

Project history, baseline architecture, and Elasticsearch’s role in the search ecosystem.

Official site

Elastic: Elasticsearch

Product documentation, platform capabilities, and operational guidance.

When full-text search and filters start dragging down the transactional database, a separate search layer takes that load off. Elasticsearch is a distributed search and analytics engine built on Apache Lucene; in system design it sits next to the primary database, not in place of it. The gain in full-text search and filtering speed is paid for with explicit index design work, search results that can lag behind the source of truth, and the ongoing operational cost of a cluster.

History and context

2010

Project launch

Elasticsearch is created as a distributed REST engine built on top of Apache Lucene.

2012

ELK ecosystem emerges

Search, log pipelines, and visualization tools start being used together as one operating stack.

2018

Enterprise adoption grows

The platform becomes widely used for log analytics, observability, and product search.

January 2021

Moving away from Apache 2.0

Starting with version 7.11, Elastic relicenses Elasticsearch and Kibana from Apache 2.0 to a dual SSPL/Elastic License model to restrict cloud providers selling the engine as a service.

2021

The OpenSearch fork

AWS responds with the Apache 2.0-licensed OpenSearch fork (announced in April, 1.0 released in July); in 2024 the project moves to the Linux Foundation as the OpenSearch Software Foundation.

2024

Open-source option returns

In August 2024, Elastic adds AGPLv3 as a third licensing option for Elasticsearch and Kibana, making the core available under an OSI-approved open-source license again.

Core architecture elements

Index, shards, and replicas

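An index is split into primary shards so data lands on different nodes, while replica shards keep a copy in case a node goes down and can also serve reads. Horizontal scaling rests on exactly this. The number of primary shards is chosen when the index is created, so it is a capacity decision to make up front.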
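A minimal sketch of how a document lands on a shard, assuming the routing key is the document ID and using CRC32 as a stand-in hash (Elasticsearch uses its own Murmur3-based routing; `pick_shard` and `distribution` are hypothetical helper names). It illustrates why the primary shard count cannot simply be changed later: the same keys would map to different shards.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a primary shard number.

    Assumption: CRC32 stands in for the real hash function; only the
    "hash modulo shard count" idea is being illustrated.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def distribution(doc_ids, num_primary_shards):
    """Count how many documents each primary shard would receive."""
    counts = [0] * num_primary_shards
    for doc_id in doc_ids:
        counts[pick_shard(doc_id, num_primary_shards)] += 1
    return counts


doc_ids = [f"product-{i}" for i in range(10_000)]

# Spread of 10,000 documents over 3 primary shards.
counts_3 = distribution(doc_ids, 3)

# How many documents would map to a different shard with 4 primaries.
moved = sum(1 for d in doc_ids if pick_shard(d, 3) != pick_shard(d, 4))

print("per-shard counts:", counts_3)
print("documents that would change shard:", moved)
```

Because a large share of documents would change shard when the modulus changes, growing an index in practice usually means creating a new index and reindexing into it rather than editing the shard count in place.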

Cluster and node roles

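Nodes take on different roles: master-eligible nodes manage cluster state and decide shard allocation, data nodes store shards and execute indexing and search on them, and coordinating nodes route requests and merge results. How those roles are spread decides whether the cluster bottlenecks on one overloaded node.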

Near real-time search

A written document is not searchable right away: it shows up after the next index refresh, which by default happens about once per second (the index.refresh_interval setting). The more frequent the refresh, the fresher the results and the more they cost in indexing throughput. This is a trade-off to make on purpose.
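The refresh behavior can be illustrated with a toy model. This is only a sketch of the visibility rule, not of Lucene internals; `ToyShard` and its methods are hypothetical names.

```python
class ToyShard:
    """Toy model: writes are buffered and become searchable only after refresh()."""

    def __init__(self):
        self._buffer = {}      # accepted writes that search cannot see yet
        self._searchable = {}  # documents visible to search

    def index(self, doc_id, doc):
        # The write is accepted immediately but is not yet searchable.
        self._buffer[doc_id] = doc

    def refresh(self):
        # Make everything written since the last refresh visible to search.
        self._searchable.update(self._buffer)
        self._buffer.clear()

    def search(self, term):
        # Naive whitespace tokenization with lowercasing, for illustration only.
        return sorted(
            doc_id
            for doc_id, doc in self._searchable.items()
            if term in doc["name"].lower().split()
        )


shard = ToyShard()
shard.index("1", {"name": "Red running shoes"})

before = shard.search("shoes")  # the write has not been refreshed yet
shard.refresh()
after = shard.search("shoes")   # now the document is visible

print(before, after)
```

In a product flow this is the window in which a user saves an item and does not yet find it in search; the UI or the write path has to account for it.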

Relevance and ranking

Result order is set by the BM25 formula together with text-analysis settings. Search quality here is not a default — it is something you tune against a specific corpus and the queries people actually run.
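To make the scoring idea concrete, here is a textbook-style BM25 sketch over a tiny in-memory corpus. It assumes whitespace tokenization and the commonly cited defaults k1 = 1.2 and b = 0.75; Lucene's actual implementation differs in details, and `bm25_score` is a hypothetical helper.

```python
import math


def bm25_score(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Score one document against a query with a textbook BM25 formula."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        # Document frequency: in how many documents the term occurs.
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue
        # Rare terms get a higher inverse document frequency.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        # Term frequency saturates with k1 and is normalized by document length via b.
        tf = doc_tokens.count(term)
        length_norm = 1 - b + b * len(doc_tokens) / avg_len
        score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return score


corpus = [
    "red running shoes".split(),
    "blue running jacket".split(),
    "red red wine glass set".split(),
]

scores = [bm25_score(["red", "shoes"], doc, corpus) for doc in corpus]
ranking = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)

print(ranking)
```

The short document that matches both query terms ranks first, and repeating "red" in a longer document helps less than matching the rarer term "shoes". This is the kind of behavior that analyzers, synonyms, and boosts then reshape for a specific corpus.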

Elasticsearch architecture by layers

The diagram shows a baseline Elasticsearch setup in a product system: a dedicated search layer, an indexing pipeline, and a cluster with primary and replica shards. The arrows mark where data is duplicated and where results start to lag behind.

Clients and API

Web/mobile apps, REST API, Query DSL, Kibana

Ingestion layer

CDC / outbox, Beats/Logstash, Bulk API, mapping pipeline

Request coordination

query parsing, routing, scatter/gather, merge results

Index and storage

primary shards, Lucene segments, inverted index, refresh/merge

Replication and resilience

replica shards, shard allocation, failover, read scale

Cluster operations

ILM, snapshots, autoscaling, monitoring

Elasticsearch is usually deployed as a dedicated search and analytics layer over a transactional source of truth.

Search quality

BM25 relevance, analyzers + tokenization, synonyms + boosting

Analytics

aggregations, facets/filters, time-series exploration

Operational trade-offs

near real-time visibility, index design costs, cluster maintenance

Write and read paths through components

Writes and reads take different routes. The interactive flow shows how a document moves into the index and how a query travels through a coordinating node and shards before returning ranked results.
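The read side can be sketched as scatter/gather: each shard returns its local top hits, and the coordinating node merges them into a global top list. This is a simplified model with an assumed scoring function; `search_shard`, `coordinate`, and `count_red` are hypothetical names.

```python
import heapq


def search_shard(shard_docs, score_fn, size):
    """One shard returns its local top `size` hits as (score, doc_id) pairs."""
    hits = [(score_fn(doc), doc_id) for doc_id, doc in shard_docs.items()]
    return heapq.nlargest(size, hits)


def coordinate(shards, score_fn, size):
    """Coordinating node: fan the query out to every shard, then merge the results."""
    per_shard = [search_shard(shard, score_fn, size) for shard in shards]
    merged = heapq.nlargest(size, (hit for hits in per_shard for hit in hits))
    return [doc_id for _, doc_id in merged]


def count_red(doc):
    # Stand-in relevance score: number of occurrences of the word "red".
    return doc["name"].split().count("red")


shards = [
    {"a": {"name": "red shoes"}, "b": {"name": "blue hat"}},
    {"c": {"name": "red red scarf"}, "d": {"name": "green coat"}},
]

top = coordinate(shards, count_red, size=2)
print(top)
```

Each shard has to return `size` candidates even though only `size` survive the merge, which is one reason deep pagination and very high shard counts make queries more expensive.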

Source DB (OLTP state) -> Ingestion (CDC / outbox) -> Primary Shard (index write) -> Replica Shards (replication) -> Refresh (near real-time)

Write path: data moves through ingestion into primary shards, then replicates and becomes searchable after refresh.

Write path

A service writes the event to the source of truth (typically an OLTP database).
Through CDC/outbox or an ingestion pipeline, the document reaches the indexer.
Elasticsearch places the document into a primary shard and replicates it to replicas.
After refresh, the document becomes visible in search results (near real-time).
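The ingestion step above can be sketched as turning outbox rows into a Bulk API request body. The Bulk API expects newline-delimited JSON: an action line, then (for index actions) the document source, with a trailing newline at the end. The outbox row shape (`op`, `id`, `doc`) and the helper name `build_bulk_body` are assumptions made for illustration.

```python
import json


def build_bulk_body(index_name, outbox_rows):
    """Build an NDJSON body for POST /_bulk from hypothetical outbox rows."""
    lines = []
    for row in outbox_rows:
        if row["op"] == "delete":
            # A delete action has no source line.
            lines.append(json.dumps({"delete": {"_index": index_name, "_id": row["id"]}}))
        else:
            # Reusing the source-of-truth primary key as _id makes replays overwrite, not duplicate.
            lines.append(json.dumps({"index": {"_index": index_name, "_id": row["id"]}}))
            lines.append(json.dumps(row["doc"]))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


outbox_rows = [
    {"op": "upsert", "id": "1", "doc": {"name": "Red running shoes", "price": 59.9}},
    {"op": "delete", "id": "2", "doc": None},
]

body = build_bulk_body("products", outbox_rows)
print(body)
```

The body would be sent with the Content-Type application/x-ndjson header. Keying documents by the primary key of the source row keeps the pipeline safe to retry after a partial failure.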

When to choose Elasticsearch

Good fit

Full-text product search (catalogs, articles, documentation).
Observability scenarios: search over logs, events, and traces.
Use cases that need flexible filters, aggregations, and ranking.
Read-heavy systems that require a fast search experience.

Avoid when

As the only source of truth for critical transactional data.
OLTP workloads with frequent point updates/deletes and strict ACID expectations.
Systems without full-text search needs where SQL/cache is enough.
Teams not ready for cluster and index operational overhead.

Practice: DDL and DML

The following API examples often come up in system design interviews, ranging from index and mapping management to document writes and search queries.

DDL and DML examples in Elasticsearch

The terms are borrowed from SQL by analogy: DDL-style calls manage indices and mappings, while DML-style calls write documents and run search queries.

DDL in Elasticsearch is about schema and index lifecycle operations: creating indices, tuning shards/replicas, and evolving mappings.

Create an index with settings and mapping

PUT /products-v1

Define shard/replica layout and field types before indexing data.

PUT /products-v1
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "price": { "type": "float" },
      "category": { "type": "keyword" },
      "created_at": { "type": "date" }
    }
  }
}

Add new fields to existing mapping

PUT /products-v1/_mapping

You can extend a mapping with new fields, but changing the type of an existing field generally requires reindexing into a new index.

PUT /products-v1/_mapping
{
  "properties": {
    "brand": { "type": "keyword" },
    "is_active": { "type": "boolean" }
  }
}

Alias switch for zero-downtime reindex rollout

POST /_aliases

Blue/green pattern: move alias from products-v1 to products-v2 atomically.

POST /_aliases
{
  "actions": [
    { "remove": { "index": "products-v1", "alias": "products" } },
    { "add": { "index": "products-v2", "alias": "products" } }
  ]
}
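The examples above cover the DDL side. For the DML side, the following sketch builds a document write and a search request against the products alias, using the field names from the mapping above. The helper name `build_product_search` and the sample values are assumptions; the requests are only constructed and printed here, not sent to a cluster.

```python
import json


def build_product_search(text, category=None, max_price=None, size=10):
    """Build a Query DSL body: full-text match on name plus optional exact filters."""
    filters = []
    if category is not None:
        # term query on a keyword field: exact match that does not affect the score
        filters.append({"term": {"category": category}})
    if max_price is not None:
        filters.append({"range": {"price": {"lte": max_price}}})
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [{"match": {"name": text}}],  # scored full-text part
                "filter": filters,                    # unscored filter part
            }
        },
    }


# Document write: PUT /products/_doc/1
doc_write = {
    "name": "Red running shoes",
    "price": 59.9,
    "category": "shoes",
    "created_at": "2024-01-15T10:00:00Z",
}

# Search request: POST /products/_search
search_body = build_product_search("running shoes", category="shoes", max_price=100)

print(json.dumps(doc_write, indent=2))
print(json.dumps(search_body, indent=2))
```

Putting exact-match conditions under filter rather than must keeps them out of relevance scoring, which matches the split between the text field name and the keyword field category in the mapping.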

Related chapters

Search System (Google/Elasticsearch) - Practical system design case on ranking, indexing, and scaling a production-grade search platform.
Database Selection Framework - How to decide when a search engine should be introduced as a dedicated layer next to the transactional datastore.
MongoDB: document model, replication, and consistency - Boundary between an operational document store and a full-text search index in real-world architectures.
ClickHouse: analytical DBMS and architecture - Separation of concerns between full-text retrieval and analytical aggregation over event data.
Qdrant: vector database and architecture - Where lexical search hits the wall of synonyms and meaning — and how vector retrieval helps in semantic and AI-oriented discovery scenarios.
Data Pipeline / ETL / ELT Architecture - How to build ingestion and index-synchronization pipelines that keep search data fresh under continuous updates.