Source
Wikipedia: YDB
YDB history (KiWi/KiKiMR -> YDB), release milestones, and general architecture context.
Official docs
YDB Docs: Architecture
Architecture overview: shared-nothing, tablets, auto-sharding, distributed transactions, and storage layer.
YDB (Yet another DataBase) is a distributed SQL database with ACID transactions, automatic sharding, and a shared-nothing architecture. In system design, YDB is often chosen for high-load backend systems that need strong consistency, key-based horizontal scaling, and a single platform for both transactional and analytical workloads.
History and context
Development starts in Yandex internal infrastructure
Yandex starts the KiWi distributed key-value layer as a foundation for scalable backend services.
Transition toward distributed SQL architecture
KiKiMR evolves with a tablet-based architecture and an actor model that later become the core of YDB.
Broad production usage inside Yandex
This architecture is used for high-load services across the Yandex ecosystem.
Public open-source release
YDB is released as an open-source distributed SQL database with ACID transactions and automatic sharding.
24.3 server branch stabilization
The 24.3 branch receives production-oriented stability and operational improvements.
New 25.x minor line
The 25.x line continues to extend SQL capabilities, performance tuning, and operational features.
Core architecture elements
Tablets and auto-sharding
Table data is distributed across shard tablets that can split/move automatically as load increases.
Serializable transactions
YDB provides ACID transactions with serializable isolation and optimistic concurrency control.
Row + column tables
The same platform supports row-oriented and column-oriented tables for OLTP and analytical workload profiles.
Disaggregated compute/storage
Architecture supports separated compute and storage layers plus multi-AZ fault-tolerant deployments.
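The auto-sharding idea above can be illustrated with a minimal Python sketch. It is a toy model, not YDB's implementation: the class name, the row-count threshold, and the "split at the median key" rule are all assumptions chosen for illustration. It only shows how a range-partitioned table can split a shard that has grown too large while keeping key routing consistent.

```python
from bisect import bisect_right, insort


class RangeShardedTable:
    """Toy model: shards own contiguous key ranges and split when too large."""

    def __init__(self, max_rows_per_shard=4):
        self.max_rows = max_rows_per_shard  # hypothetical split threshold
        self.boundaries = []                # sorted lower bounds of shards 1..n
        self.shards = [[]]                  # each shard: sorted list of unique keys

    def shard_index(self, key):
        # The number of boundaries <= key is the index of the owning shard.
        return bisect_right(self.boundaries, key)

    def insert(self, key):
        idx = self.shard_index(key)
        rows = self.shards[idx]
        if key in rows:
            return  # keys behave like primary keys: no duplicates
        insort(rows, key)
        if len(rows) > self.max_rows:
            self._split(idx)

    def _split(self, idx):
        rows = self.shards[idx]
        mid = len(rows) // 2
        left, right = rows[:mid], rows[mid:]
        self.shards[idx:idx + 1] = [left, right]
        # The first key of the right half becomes the new shard's lower bound.
        self.boundaries.insert(idx, right[0])


table = RangeShardedTable(max_rows_per_shard=4)
for key in range(1, 11):
    table.insert(key)
```

After the ten inserts the table holds four shards instead of one, and `shard_index` still routes every key to the shard that stores it. A real system would also split on load and move shards between nodes, which this sketch leaves out.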
Data model and transaction contour
This section shows how YDB combines row/column tables, automatic sharding, indexes, and distributed transactions in one architecture.
YDB data model: tables, shards, and transactions
YDB combines a relational model with automatic sharding and distributed transactions for high-load systems.
Why YDB is more than a typical SQL database
- Every table requires a primary key, and data is physically distributed across shard tablets.
- Both row-oriented and column-oriented tables are available for OLTP and OLAP profiles.
- Distributed transactions with serializable isolation and OCC are built in.
- Topics, CDC, and asynchronous replication let teams build integrated data pipelines.
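To make the link between primary keys and shard placement concrete, here is a small Python sketch. The split points in `BOUNDARIES` and the helper names are hypothetical; the sketch assumes range partitioning over a composite key `(tenant_id, order_id)` and simply checks how many shards a set of keys would touch.

```python
from bisect import bisect_right

# Hypothetical split points: shard i owns keys in [BOUNDARIES[i-1], BOUNDARIES[i]).
BOUNDARIES = [(100, 0), (200, 0)]  # three shards over (tenant_id, order_id)


def shard_for(key):
    """Return the index of the shard that owns the given composite key."""
    return bisect_right(BOUNDARIES, key)


def shards_touched(keys):
    """Return the sorted list of distinct shards a set of keys maps to."""
    return sorted({shard_for(key) for key in keys})


def is_single_shard(keys):
    """True when every key in the transaction lives on the same shard."""
    return len(shards_touched(keys)) == 1
```

Under these assumptions, two orders of the same tenant, such as `(7, 1)` and `(7, 2)`, stay on one shard, while a transaction that touches tenants 7 and 150 spans two shards and would need distributed coordination.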
Row-oriented tables
Core table type for transactional workloads: primary key is mandatory and rows are sorted by key.
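Because rows are sorted by key, all rows that share a key prefix sit next to each other. The Python sketch below illustrates why that makes prefix reads cheap. The sample rows and the `scan_tenant` helper are invented for illustration, and a sorted in-memory list stands in for the table.

```python
from bisect import bisect_left

# Rows kept sorted by the composite primary key (tenant_id, order_id).
ROWS = sorted([
    ((1, 10), "new"),
    ((1, 11), "paid"),
    ((2, 5), "new"),
    ((2, 9), "shipped"),
    ((3, 1), "new"),
])
KEYS = [key for key, _ in ROWS]


def scan_tenant(tenant_id):
    """Return all rows of one tenant as a contiguous slice of the sorted rows."""
    lo = bisect_left(KEYS, (tenant_id,))      # first key with this prefix
    hi = bisect_left(KEYS, (tenant_id + 1,))  # first key of the next tenant
    return ROWS[lo:hi]
```

The lookup costs two binary searches plus a slice, regardless of how many other tenants exist. A filter on a non-key column such as `status` has no such shortcut and would have to examine every row.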
Key elements
Typical use cases
- User/account state
- Orders/payments
- Transactional APIs
Example
CREATE TABLE orders (
    tenant_id Uint64,
    order_id Uint64,
    status Utf8,
    amount Uint64,
    PRIMARY KEY (tenant_id, order_id)
);
High-Level Architecture
High-level YDB diagram: client access, SQL/transaction layer, tablet shards, distributed storage, and fault-tolerance mechanics.
Read / Write Path through components
This unified diagram combines write/read path and shows how requests move through discovery, transaction coordinator, shards, and replicated storage.
Write path
- Primary key design determines whether a write stays single-shard or becomes distributed.
- YDB uses serializable isolation with optimistic concurrency control; conflicting transactions may fail and require retry.
- Cross-shard writes usually cost more latency/resources than single-shard writes.
- DDL and DML are not combined in one transaction; schema changes are separate idempotent operations.
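The retry requirement that comes with optimistic concurrency control can be sketched in Python as follows. This is a single-threaded toy, not the YDB SDK: `VersionedStore`, `ConflictError`, and `run_with_retry` are hypothetical names, and per-key version counters stand in for whatever conflict detection the database uses internally.

```python
class ConflictError(Exception):
    """A row read by the transaction changed before commit."""


class VersionedStore:
    """Toy in-memory store: every key carries a version counter."""

    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def read(self, key):
        return self._rows.get(key, (0, None))

    def commit(self, read_versions, writes):
        # Validation phase: abort if anything we read has moved on.
        for key, seen_version in read_versions.items():
            if self.read(key)[0] != seen_version:
                raise ConflictError(key)
        # Write phase: apply changes and bump versions.
        for key, value in writes.items():
            self._rows[key] = (self.read(key)[0] + 1, value)


def run_with_retry(store, tx_body, max_attempts=5):
    """Run tx_body(read, write) until it commits; return (result, attempts)."""
    for attempt in range(1, max_attempts + 1):
        read_versions, writes = {}, {}

        def read(key):
            version, value = store.read(key)
            read_versions.setdefault(key, version)
            return writes[key] if key in writes else value

        def write(key, value):
            writes[key] = value

        result = tx_body(read, write)
        try:
            store.commit(read_versions, writes)
            return result, attempt
        except ConflictError:
            continue  # retry with freshly read versions
    raise ConflictError(f"gave up after {max_attempts} attempts")
```

The important property is that the transaction body is re-executed from scratch on conflict, so it must be safe to run more than once. Side effects outside the database, such as sending an email, should happen only after a successful commit.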
When to choose YDB
Good fit
- High-load transactional services requiring strong consistency and auto-sharding.
- Systems with continuous growth where shared-nothing horizontal scale is required.
- Use cases that need both OLTP tables and nearby analytical processing on column tables.
- Teams ready to invest in key design and distributed SQL operational discipline.
Avoid when
- Small single-node projects where a simple local DB is enough.
- Workloads dominated by frequent cross-shard transactions without partition-aware key design.
- Organizations that cannot support distributed cluster operations.
- Cases where full-text search or pure OLAP dominates without a transactional core.
Practice: DDL and DML
Below are practical YDB DDL/DML examples: schema/index design, transactional upserts, and key-range query patterns.
DDL and DML examples in YDB
DDL defines schema/partitioning, while DML covers transactional writes and analytical reads.
In YDB, DDL operations (tables, indexes, partitioning) are handled separately from DML transactions and should be idempotent.
Create row table with auto partitioning
CREATE TABLE
Primary key is mandatory; auto partitioning helps scale with data and load growth.
CREATE TABLE orders (
    tenant_id Uint64,
    order_id Uint64,
    status Utf8,
    amount Uint64,
    created_at Timestamp,
    PRIMARY KEY (tenant_id, order_id)
)
WITH (
    AUTO_PARTITIONING_BY_SIZE = ENABLED,
    AUTO_PARTITIONING_MIN_PARTITIONS_COUNT = 8
);
Add global secondary index
ALTER TABLE ... ADD INDEX
Secondary index improves non-key access paths but increases write-side overhead.
ALTER TABLE orders
    ADD INDEX idx_status GLOBAL ASYNC ON (status);
Create column table for analytics
CREATE TABLE ... STORE=COLUMN
For OLAP workloads, use column store and hash-based partitioning.
CREATE TABLE events_olap (
    ts Timestamp NOT NULL,
    tenant_id Uint64 NOT NULL,
    event_type Utf8,
    payload Json,
    PRIMARY KEY (ts, tenant_id)
)
PARTITION BY HASH(tenant_id)
WITH (STORE = COLUMN);
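As a rough illustration of what hash-based partitioning does with `tenant_id`, the Python sketch below maps tenant IDs to a fixed number of partitions. The partition count of 4 and the use of CRC32 are assumptions made for the example; they are not the hash function or layout YDB uses.

```python
import zlib
from collections import Counter


def partition_for(tenant_id, partitions=4):
    """Map a tenant to a partition; CRC32 is a stand-in hash for illustration."""
    digest = zlib.crc32(str(tenant_id).encode("utf-8"))
    return digest % partitions


# Count how many of 1000 hypothetical tenants land on each partition.
counts = Counter(partition_for(tenant_id) for tenant_id in range(1000))
```

The mapping is deterministic, so all events of one tenant go to the same partition, while different tenants spread across partitions independently of the time-ordered `ts` column that leads the primary key.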