System Design Space

Updated: March 2, 2026 at 7:56 AM

ClickHouse: analytical DBMS and architecture


Column-oriented OLAP DBMS: MergeTree, partitioning, replication, materialized views and high-throughput analytics scenarios.

Official documentation

ClickHouse Docs cover the basic concepts: table engines, architecture, SQL, and operational practices.

ClickHouse is a column-oriented OLAP DBMS for analytics over large data volumes. Its strengths are fast aggregate queries over event and log data, high-throughput ingestion, and effective data compression.

History: key milestones

2009

Internal launch in Yandex

ClickHouse starts inside Yandex as a columnar analytical database for high-load reporting.

2016

Open source release

The project becomes public and begins to develop in the open-source ecosystem.

2021

Spin-off of ClickHouse, Inc.

The team launches a separate company around the project; the focus shifts to the commercial ecosystem and the cloud offering.

2021+

Cloud and ecosystem

Development of managed offerings, object storage integrations and an ecosystem of tools.

Key Architecture Principles

Column-oriented storage

Reading only the necessary columns dramatically reduces the I/O for analytical queries.

MergeTree family

Partitions, sorting, background merge processes and replication are the core of most production installations.

Vectorized execution

Queries process data in batches of column values, which increases throughput and CPU cache efficiency.

Compression + skipping

Columnar compression and data skipping indexes speed up scan-heavy workloads.
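The I/O benefit of column-oriented storage can be illustrated with a toy model. The sketch below (a hypothetical schema and sizes, not ClickHouse internals) compares how many bytes a row store and a column store must read to aggregate a single column:

```python
# Toy model of row-oriented vs column-oriented reads (illustrative only,
# not actual ClickHouse storage internals).

ROWS = 1_000_000
COLUMNS = {            # column name -> bytes per value (uncompressed)
    "event_time": 8,
    "user_id": 8,
    "url": 120,
    "duration_ms": 4,
}

def row_store_bytes_scanned() -> int:
    """A row store reads whole rows even if the query needs one column."""
    row_width = sum(COLUMNS.values())
    return ROWS * row_width

def column_store_bytes_scanned(needed: list[str]) -> int:
    """A column store reads only the columns the query references."""
    return ROWS * sum(COLUMNS[name] for name in needed)

# SELECT avg(duration_ms) FROM events -- touches a single 4-byte column.
row_bytes = row_store_bytes_scanned()                    # 140 MB of rows
col_bytes = column_store_bytes_scanned(["duration_ms"])  # 4 MB of one column
print(f"row store: {row_bytes:,} B, column store: {col_bytes:,} B, "
      f"ratio: {row_bytes / col_bytes:.0f}x")
```

On this schema the column store reads 35x less data, before compression and data-skipping indexes add further savings.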

High-Level Architecture

At a high level, ClickHouse separates the client layer, the coordination layer, and the MergeTree storage/background layer. Writes and reads go through coordinators, while actual storage and computation happen on shard replicas.

Sharded + Replicated

Canonical production profile: shard-key distribution, Keeper-based replication, and parallel scans.

Pros

  • Horizontal scaling as data volume grows.
  • Higher read throughput and better fault tolerance.
  • Flexible balancing of write/read paths across nodes.

Limitations

  • Higher operational complexity.
  • Requires monitoring of replication lag, merge backlog, and shard skew.
Best for: Large BI and product analytics with high throughput.
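Shard-key distribution can be sketched in a few lines. The example below is a minimal model of stable hash routing (shard names and the sharding key are hypothetical; in ClickHouse this role is played by the Distributed engine's sharding expression):

```python
# Minimal sketch of shard-key routing: rows with the same sharding key
# always land on the same shard (illustrative; not the Distributed engine).
import hashlib

SHARDS = ["shard_a", "shard_b"]

def shard_for(key: str) -> str:
    """Route a row to a shard by a stable hash of its sharding key."""
    digest = hashlib.md5(key.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[bucket]

# Route an INSERT batch: group rows by target shard.
batch = [{"tenant": "t1"}, {"tenant": "t2"}, {"tenant": "t1"}]
routed: dict[str, list[dict]] = {}
for row in batch:
    routed.setdefault(shard_for(row["tenant"]), []).append(row)

print(routed)
```

Stable hashing keeps tenant data co-located on one shard, which lets per-tenant queries hit a single shard instead of fanning out to the whole cluster.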

Workload Queue

  • CH-REQ-201 | read | BI | dashboard_last_15m
  • CH-REQ-202 | write | Ingest | insert_events_batch
  • CH-REQ-203 | read | ML | feature_slice_by_tenant
  • CH-REQ-204 | write | Ingest | insert_clickstream_batch

Control Plane

Requests are distributed across shards, and replicas provide HA and read scalability.


Example cluster state:

  • Shard A / R1 | primary replica | lag: 1 | parts: 36
  • Shard A / R2 | secondary replica | lag: 2 | parts: 35
  • Shard B / R1 | primary replica | lag: 1 | parts: 34
  • Shard B / R2 | secondary replica | lag: 2 | parts: 33

Replication & Merges


Replication and merge processes keep parts and lag within a controlled range.

Cluster Counters

parts: 138 | avg lag: 1.5

Monitor the balance between ingestion throughput, merge backlog, and query latency.
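A replication-lag SLO check for a cluster like the one above can be sketched as follows (replica names, lag units, and the threshold are illustrative):

```python
# Sketch of a replication-lag SLO check over a small cluster state
# (names and threshold are illustrative, not a real monitoring API).

replicas = {
    "shard_a/r1": 1, "shard_a/r2": 2,
    "shard_b/r1": 1, "shard_b/r2": 2,
}
LAG_SLO = 5  # maximum acceptable lag, arbitrary units

avg_lag = sum(replicas.values()) / len(replicas)
violations = [name for name, lag in replicas.items() if lag > LAG_SLO]
print(f"avg lag: {avg_lag}, SLO violations: {violations}")
# With lags 1/2/1/2 the average is 1.5 and no replica breaches the SLO.
```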

Architecture Checklist

Partition + ORDER BY design
Replication lag SLO
Merge / TTL observability

Write/Read Path via Components

The steps below show how a request moves through the key components: coordinator, shard replicas, Keeper, and background processes.

  1. Client: INSERT batch
  2. Insert Coordinator: route by shard key
  3. Replicated MergeTree: create new parts
  4. ClickHouse Keeper: replication metadata
  5. Background Merges: compact + optimize
Write path: INSERT is routed through coordinator to shard replicas, creates new parts, and is optimized by background merge processes.

Write path

  1. Client sends INSERT batches (usually via the HTTP or native protocol).
  2. Coordinator routes each batch by shard key to the target replicas.
  3. Data is written as new parts inside MergeTree tables and synchronized via Keeper metadata.
  4. Background merges compact parts, apply TTL/mutations, and optimize storage layout.
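The steps above can be modeled with a toy MergeTree: every INSERT batch becomes a new sorted part, and a background merge compacts parts into one. This is a deliberately simplified sketch (real ClickHouse selects merge candidates incrementally rather than merging everything at once):

```python
# Toy model of the MergeTree write path: each INSERT batch creates a new
# immutable part, and a background merge compacts them (illustrative only).
import heapq

parts: list[list[dict]] = []

def insert_batch(rows: list[dict]) -> None:
    """Every INSERT batch becomes its own part, sorted by the ORDER BY key."""
    parts.append(sorted(rows, key=lambda r: r["ts"]))

def background_merge() -> None:
    """Merge all current parts into one larger sorted part."""
    merged = list(heapq.merge(*parts, key=lambda r: r["ts"]))
    parts.clear()
    parts.append(merged)

insert_batch([{"ts": 3}, {"ts": 1}])
insert_batch([{"ts": 2}, {"ts": 4}])
print(len(parts))                    # 2 parts after two inserts
background_merge()
print(len(parts))                    # 1 part after the merge
print([r["ts"] for r in parts[0]])  # [1, 2, 3, 4]
```

This is why batching inserts matters in practice: many tiny INSERTs create many small parts, inflating the merge backlog that the background processes must work off.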

Data Modeling Practice

  • Design for read and aggregation patterns first, not OLTP normalization.
  • Choose the ORDER BY key so that the most common filters prune as much data as possible.
  • Partition by time or domain for manageable retention and efficient scans.
  • Use materialized views to pre-aggregate hot analytical reports.
  • Plan TTL/retention in advance: analytical data volumes grow quickly.
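The ORDER BY advice can be made concrete with a toy model: when rows are sorted by the filter column, a range filter touches only a narrow slice instead of the whole table. The sketch below uses per-row binary search for simplicity (ClickHouse actually uses a sparse primary index over granules, but the pruning effect is the same in spirit):

```python
# Sketch of why a good ORDER BY key prunes data: with rows sorted by the
# filter column, a range predicate bounds the scan via binary search
# (illustrative; ClickHouse uses a sparse index over granules).
import bisect

timestamps = list(range(0, 1_000_000, 10))  # sorted ORDER BY column

def rows_scanned_sorted(lo: int, hi: int) -> int:
    """Binary-search the sorted key to bound the scan to the matching range."""
    left = bisect.bisect_left(timestamps, lo)
    right = bisect.bisect_right(timestamps, hi)
    return right - left

full_scan = len(timestamps)                    # 100,000 rows without the key
pruned = rows_scanned_sorted(500_000, 500_990) # 100 rows with the key
print(f"full scan: {full_scan} rows, pruned: {pruned} rows")
```

If the same query had to filter on a column that is not a prefix of the ORDER BY key, it would degrade back to the full scan, which is why the key is chosen from the dominant filter patterns.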

When to choose ClickHouse

Good fit

  • Product analytics, BI dashboards, observability/log analytics.
  • Event and time-series data with very high write throughput.
  • Complex aggregating queries on large historical volumes.
  • Near-real-time data marts for product and analytics teams.

Worth avoiding

  • OLTP scenarios with frequent point updates and short transactions.
  • Workloads where row-level locking and strict transactional semantics are critical.
  • Systems with many small real-time UPDATE/DELETE operations.
  • Use cases where the key task is online serving of single records.


© 2026 Alexander Polomodov