
Updated: March 25, 2026 at 2:00 AM

ClickHouse: analytical DBMS and architecture


Column-oriented OLAP DBMS: MergeTree, partitioning, replication, materialized views and high-throughput analytics scenarios.

ClickHouse matters not because it promises fast dashboards, but because of how it is built around real analytical flows: columnar storage, MergeTree, background merges, and heavy read-path workloads.

In engineering work, this chapter helps you design tables, sort keys, partitioning, and materialized views around business questions rather than source-event shape, which is how freshness and ingest cost are actually controlled.

In interviews and architecture discussions, it is especially strong when you need to clearly separate OLAP from OLTP and show why ClickHouse is great for analytics without replacing the write system of record.

Practical value of this chapter

Analytical data model

Design tables and partitioning around real analytical questions, not source-system structure.

Ingestion and merge path

Include batching, sort keys, and background merges in freshness and ingest-latency planning.

Performance economics

Tune storage policy, compression, and materialized views to balance speed and cost.

Interview perspective

Clearly position ClickHouse as OLAP and explain why it is not a direct OLTP replacement.

Official documentation

ClickHouse Docs

Basic concepts: table engines, architecture, SQL and operational practices.


ClickHouse is a columnar OLAP DBMS for analytics over large data volumes. Its strengths are fast aggregate queries over events and logs, high-throughput ingestion, and effective data compression.

History: key milestones

2009

Internal launch in Yandex

ClickHouse emerges as a columnar analytical database for high-load reporting.

2016

Open source release

The project becomes public and begins to develop in the open-source ecosystem.

2021

Spin-off of ClickHouse, Inc.

The team launches a separate company around the project; the focus shifts to the commercial ecosystem and the cloud direction.

2021+

Cloud and ecosystem

Development of managed offerings, object storage integrations and an ecosystem of tools.

Key Architecture Principles

Column-oriented storage

Reading only the necessary columns dramatically reduces I/O for analytical queries.

MergeTree family

Partitions, sorting, background merge processes and replication are the core of most production installations.

Vectorized execution

Queries are executed over batches of column values, which increases throughput and CPU cache efficiency.

Compression + skipping

Columnar compression and data skipping indexes speed up scan-heavy workloads.
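These principles come together in table DDL. Below is a sketch of a hypothetical events table (all names are illustrative, not from this chapter) showing per-column compression, a data-skipping index, partitioning, and a sort key:

```sql
-- Hypothetical events table illustrating the principles above.
CREATE TABLE events
(
    event_time  DateTime,
    tenant_id   UInt32,
    event_type  LowCardinality(String),
    user_id     UInt64,
    payload     String CODEC(ZSTD(3)),  -- columnar compression, tuned per column
    INDEX idx_user user_id TYPE bloom_filter GRANULARITY 4  -- data-skipping index
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_time)               -- coarse pruning and retention unit
ORDER BY (tenant_id, event_type, event_time);   -- sort key drives range scans
```

The ORDER BY clause here is the main lever: queries filtering by tenant and event type can skip most granules without reading them.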

High-Level Architecture

At a high level, ClickHouse separates the client layer, the coordinator layer, and the MergeTree storage/background layer. Writes and reads go through coordinators, while actual storage and computation happen on shard replicas.

Sharded + Replicated

Canonical production profile: shard-key distribution, Keeper-based replication, and parallel scans.

Pros

  • Horizontal scaling as data volume grows.
  • Higher read throughput and better fault tolerance.
  • Flexible balancing of write/read paths across nodes.

Limitations

  • Higher operational complexity.
  • Requires control of replication lag, merge backlog, and shard skew.
Best for: Large BI and product analytics with high throughput.
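This profile is typically expressed as a ReplicatedMergeTree table per shard plus a Distributed table on top. A minimal sketch, assuming a cluster named analytics and standard {shard}/{replica} macros (both assumptions, not from this chapter):

```sql
-- Local replicated table on each shard.
CREATE TABLE events_local ON CLUSTER analytics
(
    event_time DateTime,
    tenant_id  UInt32,
    name       String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events_local', '{replica}')
PARTITION BY toYYYYMM(event_time)
ORDER BY (tenant_id, event_time);

-- Distributed table routing reads and writes by shard key.
CREATE TABLE events_dist ON CLUSTER analytics AS events_local
ENGINE = Distributed(analytics, currentDatabase(), events_local, cityHash64(tenant_id));
```

Hashing the tenant id keeps each tenant's data on one shard, which helps single-tenant queries but makes shard skew worth monitoring.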

Workload Queue

Request      Type   Source  Query
CH-REQ-201   read   BI      dashboard_last_15m
CH-REQ-202   write  Ingest  insert_events_batch
CH-REQ-203   read   ML      feature_slice_by_tenant
CH-REQ-204   write  Ingest  insert_clickstream_batch

Control Plane

Requests are distributed across shards, and replicas provide HA and read scalability.


Replication & Merges


Replication and merge processes keep parts and lag within a controlled range.


Monitor the balance between ingestion throughput, merge backlog, and query latency.

Architecture Checklist

Partition + ORDER BY design
Replication lag SLO
Merge / TTL observability
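The checklist items on lag and merges map directly onto system tables. A sketch of two monitoring queries (column and table names are from standard ClickHouse system tables):

```sql
-- Replication lag per replicated table, in seconds.
SELECT database, table, absolute_delay
FROM system.replicas
ORDER BY absolute_delay DESC;

-- Merge backlog: active part counts per partition.
-- Persistently high counts signal that merges are falling behind inserts.
SELECT table, partition, count() AS active_parts
FROM system.parts
WHERE active
GROUP BY table, partition
ORDER BY active_parts DESC;
```

Alerting on absolute_delay against the replication-lag SLO, and on part counts per partition, covers the first two checklist items.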

Write/Read Path via Components

Below is a step-by-step view of how requests pass through the key components: coordinator, shard replicas, Keeper, and background processes.

Read/Write Path Explorer

How ClickHouse queries move through the coordinator, shard replicas, Keeper, and background processes.

  1. Client: INSERT batch
  2. Insert Coordinator: route by shard key
  3. Replicated MergeTree: create new parts
  4. ClickHouse Keeper: replication metadata
  5. Background Merges: compact + optimize

Write path: an INSERT is routed through the coordinator to shard replicas, creates new parts, and is optimized by background merge processes.

Write path

  1. Client sends INSERT in batches (usually via HTTP/native protocol).
  2. Coordinator routes each batch by shard key to the target replicas.
  3. Data is written as new parts inside MergeTree tables and synchronized via Keeper metadata.
  4. Background merges compact parts, apply TTL/mutations, and optimize storage layout.
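A sketch of step 1 in SQL, using the real async_insert settings so that many small client inserts are buffered server-side into larger parts (the events table and its columns are hypothetical):

```sql
-- Batched insert; the server buffers rows and flushes them as one part.
INSERT INTO events (event_time, tenant_id, event_type, user_id, payload)
SETTINGS async_insert = 1, wait_for_async_insert = 1
VALUES (now(), 42, 'page_view', 1001, '{}');
```

Large client-side batches remain the first choice; async_insert is a fallback when many producers cannot batch themselves, since every small synchronous insert otherwise creates its own part and inflates the merge backlog.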

Data Modeling Practice

  • Design for read and aggregation patterns first, not OLTP normalization.
  • Choose ORDER BY so that common query filters prune as much data as possible.
  • Use time/domain partitioning for manageable retention and scan pruning.
  • Use materialized views to pre-aggregate hot analytical reports.
  • Plan TTL/retention in advance: analytical data volumes grow quickly.
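The materialized-view point can be sketched with an AggregatingMergeTree target that stores partial aggregate states per hour (table and column names are illustrative; the raw events table is assumed to exist):

```sql
-- Target table storing partial aggregate states per hour.
CREATE TABLE events_hourly
(
    hour       DateTime,
    event_type LowCardinality(String),
    views      AggregateFunction(count),
    users      AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(hour)
ORDER BY (event_type, hour);

-- MV populated on every insert into the raw events table.
CREATE MATERIALIZED VIEW events_hourly_mv TO events_hourly AS
SELECT
    toStartOfHour(event_time) AS hour,
    event_type,
    countState() AS views,
    uniqState(user_id) AS users
FROM events
GROUP BY hour, event_type;

-- Dashboards query the small table and finalize states with -Merge combinators.
SELECT hour, event_type, countMerge(views) AS views, uniqMerge(users) AS users
FROM events_hourly
GROUP BY hour, event_type;
```

The dashboard query now scans hourly rollups instead of raw events, which is where most of the "speed vs. cost" balance from the performance-economics section comes from.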

When to choose ClickHouse

Good fit

  • Product analytics, BI dashboards, observability/log analytics.
  • Event and time-series data with very high write throughput.
  • Complex aggregating queries on large historical volumes.
  • Near-real-time data marts for product and analytics teams.

Worth avoiding

  • OLTP scenarios with frequent point updates and short transactions.
  • Workloads where row-level locking and strict transactional semantics are critical.
  • Systems with many small real-time UPDATE/DELETE operations.
  • Use cases where the key task is online serving of individual records.
