System Design Space
Knowledge graph

Updated: March 1, 2026 at 9:26 PM

YDB: distributed SQL database and architecture


Distributed SQL DBMS: auto-sharding tablets, ACID/serializable transactions, row/column tables, and shared-nothing horizontal scaling.

Source

Wikipedia: YDB

YDB history (KiWi/KiKiMR -> YDB), release milestones, and general architecture context.


Official docs

YDB Docs: Architecture

Architecture overview: shared-nothing, tablets, auto-sharding, distributed transactions, and storage layer.


YDB (Yet another DataBase) is a distributed SQL database with ACID transactions, automatic sharding, and shared-nothing architecture. In system design, YDB is often chosen for high-load backend systems where strong consistency, key-based horizontal scale, and a unified platform for transactional plus analytical workloads are needed.

History and context

2010: KiWi

Development starts in Yandex internal infrastructure

Yandex starts the KiWi distributed key-value layer as a foundation for scalable backend services.

2012: KiKiMR

Transition toward distributed SQL architecture

KiKiMR evolves toward a tablet-based architecture and actor model that later became the core of YDB.

2016: production rollout

Broad production usage inside Yandex

The YDB-class architecture backs high-load services across the Yandex ecosystem.

April 19, 2022: v22.1.5

Public open-source release

YDB is released as an open-source distributed SQL database with ACID transactions and automatic sharding.

February 6, 2025: v24.3.15.5

The 24.3 server branch stabilizes

The 24.3 branch receives production-oriented stability and operational improvements.

September 15, 2025: v25.1.4.7

New 25.x minor line

The 25.x stream continues to evolve SQL capabilities, performance tuning, and operational features.

Core architecture elements

Tablets and auto-sharding

Table data is distributed across shard tablets that can split/move automatically as load increases.

Serializable transactions

YDB provides ACID transactions with serializable isolation and optimistic concurrency control.

Row + column tables

The same platform supports row-oriented and column-oriented tables for OLTP and analytical workload profiles.

Disaggregated compute/storage

The architecture supports separated compute and storage layers as well as multi-AZ fault-tolerant deployments.

Data model and transaction contour

The section below shows how YDB combines row/column tables, automatic sharding, indexes, and distributed transactions in one architecture.

YDB data model: tables, shards, and transactions

YDB combines a relational model with automatic sharding and distributed transactions for high-load systems.

Why YDB is more than a typical SQL database

  • Every table requires a primary key, and data is physically distributed across shard tablets.
  • Both row-oriented and column-oriented tables are available for OLTP and OLAP profiles.
  • Distributed transactions with serializable isolation and OCC are built in.
  • Topics, CDC, and asynchronous replication let teams build integrated data pipelines.

Row-oriented tables

Core table type for transactional workloads: primary key is mandatory and rows are sorted by key.

Key elements

PRIMARY KEY · DataShard · Point reads · Range scans by key

Typical use cases

  • User/account state
  • Orders/payments
  • Transactional APIs

Example

CREATE TABLE orders (
  tenant_id Uint64,
  order_id Uint64,
  status Utf8,
  amount Uint64,
  PRIMARY KEY (tenant_id, order_id)
);
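
Given this key layout, a short sketch of the two access patterns the primary key enables (the literal values are illustrative, not from the source):

```sql
-- Point read: supplying the full primary key (tenant_id, order_id)
-- lets YDB route the lookup to a single DataShard.
SELECT status, amount
FROM orders
WHERE tenant_id = 42u AND order_id = 1001u;

-- Range scan by key prefix: all orders of one tenant, served as an
-- ordered scan over that tenant's contiguous key range.
SELECT order_id, status
FROM orders
WHERE tenant_id = 42u
ORDER BY order_id;
```

Because rows are sorted by the composite key, the prefix scan stays within one key range rather than touching every shard.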

High-Level Architecture

High-level YDB diagram: client access, SQL/transaction layer, tablet shards, distributed storage, and fault-tolerance mechanics.

Clients and access: gRPC + SDK · Node discovery · YQL / SQL · CLI + API

Query and transaction layer: Parser + optimizer · Serializable tx · OCC · Distributed transaction coordinator

Tablets and shards: DataShard / ColumnShard · SchemeShard · Hive · Auto split / merge

Distributed storage: PDisk / VDisk · DSProxy · Synchronous replication · BLOB storage

Fault tolerance and scaling: Shared-nothing · 3-AZ topology · Automatic balancing · Disaggregated compute/storage

Workload layers: Row + column tables · Topics + CDC · Federated queries · Vector indexes

System view

Distributed SQL · Strong consistency · Transactional + analytical workloads

Data model

Mandatory primary key · Row/column table engines · Hierarchical namespace

Operational trade-offs

Cross-shard tx cost · Key design matters · Cluster operations are non-trivial

Read / Write Path through components

This unified diagram combines the write and read paths, showing how requests move through discovery, the transaction coordinator, shards, and replicated storage.

Read/Write Path Explorer

A walkthrough of transaction and query flow through core YDB cluster components.

  1. Client Tx: UPSERT / UPDATE / REPLACE
  2. Discovery + route: tablet map
  3. Tx coordination: serializable / OCC
  4. Shard apply: DataShard / ColumnShard
  5. Replica commit: DSProxy + storage
Write path: request is routed to target shards, coordinated as a transaction, and acknowledged after replicated storage commit.

Write path

  1. Primary key design determines whether a write stays single-shard or becomes distributed.
  2. YDB uses serializable isolation with optimistic concurrency control; conflicting transactions may fail and require retry.
  3. Cross-shard writes usually cost more latency/resources than single-shard writes.
  4. DDL and DML are not combined in one transaction; schema changes are separate idempotent operations.
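
A minimal sketch of a single-shard write under these rules, assuming the `orders` table defined earlier (the values are illustrative):

```sql
-- Supplying the full primary key (tenant_id, order_id) keeps this
-- UPSERT on one DataShard, avoiding a distributed cross-shard commit.
-- Under OCC, a conflicting concurrent transaction may cause this to
-- fail, in which case the client is expected to retry.
UPSERT INTO orders (tenant_id, order_id, status, amount)
VALUES (42u, 1001u, "paid"u, 1500u);
```

A write that touched rows under several different `tenant_id` values would instead be coordinated across shards, which is the higher-cost path described in point 3.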

When to choose YDB

Good fit

  • High-load transactional services requiring strong consistency and auto-sharding.
  • Systems with continuous growth where shared-nothing horizontal scale is required.
  • Use cases that need both OLTP tables and nearby analytical processing on column tables.
  • Teams ready to invest in key design and distributed SQL operational discipline.

Avoid when

  • Small single-node projects where a simple local DB is enough.
  • Workloads dominated by frequent cross-shard transactions without partition-aware key design.
  • Organizations that cannot support distributed cluster operations.
  • Cases where full-text search or pure OLAP dominates without a transactional core.

Practice: DDL and DML

Below are practical YDB DDL/DML examples: schema/index design, transactional upserts, and key-range query patterns.

DDL and DML examples in YDB

DDL defines schema/partitioning, while DML covers transactional writes and analytical reads.

In YDB, DDL operations (tables, indexes, partitioning) are handled separately from DML transactions and should be idempotent.

Create row table with auto partitioning

CREATE TABLE

Primary key is mandatory; auto partitioning helps scale with data and load growth.

CREATE TABLE orders (
  tenant_id Uint64,
  order_id Uint64,
  status Utf8,
  amount Uint64,
  created_at Timestamp,
  PRIMARY KEY (tenant_id, order_id)
)
WITH (
  AUTO_PARTITIONING_BY_SIZE = ENABLED,
  AUTO_PARTITIONING_MIN_PARTITIONS_COUNT = 8
);

Add global secondary index

ALTER TABLE ... ADD INDEX

Secondary index improves non-key access paths but increases write-side overhead.

ALTER TABLE orders
ADD INDEX idx_status GLOBAL ASYNC ON (status);

Create column table for analytics

CREATE TABLE ... STORE=COLUMN

For OLAP workloads, use column store and hash-based partitioning.

CREATE TABLE events_olap (
  ts Timestamp NOT NULL,
  tenant_id Uint64 NOT NULL,
  event_type Utf8,
  payload Json,
  PRIMARY KEY (ts, tenant_id)
)
PARTITION BY HASH(tenant_id)
WITH (STORE = COLUMN);


© 2026 Alexander Polomodov