YDB matters not only because of its origin inside Yandex, but because it combines distributed SQL with a platform view of multi-tenancy, scaling, and operations.
In real projects, this chapter helps you reason about tenant isolation, quotas, transaction scope, and batched writes as properties of the whole platform rather than local optimizations for one team.
In interviews and architecture discussions, it is most useful when you need to justify YDB through fault tolerance, noisy-neighbor control, and predictable behavior at scale.
Practical value of this chapter
Multi-tenant boundaries
Design tenant isolation and quota strategy so noisy neighbors do not violate platform-level SLA.
Transactional contours
Define transaction scope and batching model for distributed SQL execution behavior.
Storage/compute elasticity
Tie scaling strategy to workload profile and expected traffic spikes.
Interview rationale
Justify YDB through fault tolerance, multi-tenant governance, and operational predictability requirements.
Source
Wikipedia: YDB
YDB history from KiWi and KiKiMR to the public DBMS, release milestones, and general architecture context.
Official docs
YDB Docs: Architecture
Architecture overview: shared-nothing design, YDB tablets, automatic sharding, distributed transactions, and the storage layer.
YDB (Yet another DataBase) is a distributed SQL database with ACID transactions, automatic sharding, and shared-nothing architecture. In system design, YDB is often chosen for high-load backend systems where strong consistency, key-based horizontal scale, and a unified platform for transactional plus analytical workloads are needed.
History and context
Development starts in Yandex internal infrastructure
Yandex starts the KiWi distributed key-value layer as a foundation for services that need horizontal scale.
Transition toward distributed SQL
KiKiMR evolves around tablets and the actor model, which later become part of YDB's core architecture.
Broad production usage inside Yandex
YDB-style architecture is adopted by high-load services across the Yandex ecosystem.
Public open-source release
YDB is released as an open-source distributed SQL DBMS with ACID guarantees and automatic sharding.
24.3 server branch stabilization
The 24.3 branch receives fixes and operational improvements for production clusters.
New 25.x minor line
25.x stream continues SQL capabilities, performance tuning, and operational feature evolution.
Core architecture elements
Tablets and auto-sharding
Table data is distributed across YDB tablets that can split and move automatically as load grows.
Serializable transactions
YDB provides ACID transactions with serializable isolation and optimistic concurrency control.
Row + column tables
The same platform supports row tables for OLTP and column tables for nearby analytical workloads.
Disaggregated compute/storage
Architecture supports separated compute and storage layers plus multi-AZ fault-tolerant deployments.
Data model and transaction contour
The interactive section below shows how YDB combines row and column tables, automatic sharding, indexes, and distributed transactions in one architecture.
YDB data model: tables, shards, and transactions
YDB combines a relational model with automatic sharding and distributed transactions for high-load systems.
Why YDB is more than a typical SQL database
- Every table requires a primary key, and data is physically distributed across shard tablets.
- Both row-oriented and column-oriented tables are available for OLTP and OLAP profiles.
- Distributed transactions with serializable isolation and OCC are built in.
- Topics, CDC, and asynchronous replication let teams build integrated data pipelines.
Row-oriented tables
Core table type for transactional workloads: primary key is mandatory and rows are sorted by key.
Key elements
Typical use cases
- User/account state
- Orders/payments
- Transactional APIs
Example
CREATE TABLE orders (
tenant_id Uint64,
order_id Uint64,
status Utf8,
amount Uint64,
PRIMARY KEY (tenant_id, order_id)
);YDB architecture by layer
The diagram shows client access, SQL and transaction processing, tablets and shards, distributed storage, and fault-tolerance mechanics.
System view
Data model
Operational trade-offs
Read and write paths through components
This unified diagram combines write and read paths and shows how requests move through discovery, transaction coordinator, shards, and replicated storage.
Read/Write Path Explorer
Interactive walkthrough of transaction and query flow through core YDB cluster components.
Write path
- Primary key design determines whether a write stays single-shard or becomes distributed.
- YDB uses serializable isolation with optimistic concurrency control; conflicting transactions may fail and require retry.
- Cross-shard writes usually cost more latency/resources than single-shard writes.
- DDL and DML are not combined in one transaction; schema changes are separate idempotent operations.
When to choose YDB
Good fit
- High-load transactional services that need strong consistency and automatic sharding.
- Systems with continuous data and traffic growth where shared-nothing horizontal scale matters.
- Use cases that need OLTP tables plus nearby analytics on column tables.
- Teams ready to invest in key design and distributed SQL operational discipline.
Avoid when
- Small single-node projects where a simple local DB is enough.
- Workloads dominated by frequent cross-shard transactions without partition-aware key design.
- Organizations that cannot support distributed cluster operations.
- Cases where full-text search or pure OLAP dominates without a transactional core.
Practice: DDL and DML
Below are practical YDB DDL/DML examples: schema/index design, transactional upserts, and key-range query patterns.
DDL and DML examples in YDB
DDL defines schema/partitioning, while DML covers transactional writes and analytical reads.
In YDB, DDL operations (tables, indexes, partitioning) are handled separately from DML transactions and should be idempotent.
Create row table with auto partitioning
CREATE TABLEPrimary key is mandatory; auto partitioning helps scale with data and load growth.
CREATE TABLE orders (
tenant_id Uint64,
order_id Uint64,
status Utf8,
amount Uint64,
created_at Timestamp,
PRIMARY KEY (tenant_id, order_id)
)
WITH (
AUTO_PARTITIONING_BY_SIZE = ENABLED,
AUTO_PARTITIONING_MIN_PARTITIONS_COUNT = 8
);Add global secondary index
ALTER TABLE ... ADD INDEXSecondary index improves non-key access paths but increases write-side overhead.
ALTER TABLE orders
ADD INDEX idx_status GLOBAL ASYNC ON (status);Create column table for analytics
CREATE TABLE ... STORE=COLUMNFor OLAP workloads, use column store and hash-based partitioning.
CREATE TABLE events_olap (
ts Timestamp NOT NULL,
tenant_id Uint64 NOT NULL,
event_type Utf8,
payload Json,
PRIMARY KEY (ts, tenant_id)
)
PARTITION BY HASH(tenant_id)
WITH (STORE = COLUMN);Related chapters
- Database Selection Framework - How to evaluate distributed SQL options and decide when YDB is the better fit for consistency, scale, and operating cost.
- PostgreSQL: history and architecture - Comparison between classic OLTP patterns and YDB's distributed SQL execution model.
- Cassandra: architecture and trade-offs - Distributed SQL versus AP-oriented NoSQL trade-offs across consistency model and query-shape constraints.
- Distributed Transactions (2PC/3PC) - Coordination context for transactional guarantees and latency/availability compromises in distributed write paths.
- ClickHouse: analytical DBMS and architecture - Operational boundary between transactional YDB workloads and dedicated analytical serving layers.
- CockroachDB: distributed SQL database and architecture - Side-by-side view of two distributed SQL platforms across multi-region behavior, guarantees, and operations.
