Brief overview of the T-Bank data platform

This chapter matters because it presents a real evolutionary case rather than an abstract data platform: how a large company moves from a DWH model toward a lakehouse architecture and platform thinking through incremental steps instead of one big leap.

In real engineering work, it helps you see a platform as a sequence of controlled steps: what to centralize, what to leave to domains, how to survive migration debt, and where shared platform capabilities end and local team freedom begins.

In interviews and architecture discussions, it is especially useful when you need to discuss not only the target state, but also the cost of getting there: responsibility boundaries, change management, and accumulated architectural debt.

Practical value of this chapter

Design in practice

Shows data-platform evolution as a sequence of controlled architecture moves.

Decision quality

Provides guidance on when to centralize capabilities versus preserve domain autonomy.

Interview articulation

Strengthens answers with a real production operating-model example.

Risk and trade-offs

Focuses on transformation cost: migration debt, ownership split, and change management.

Source

T-Bank Data Platform

A review of how the platform reached its current system landscape and where its key architectural paths run.

As a company grows, its data architecture follows a familiar path: from classic DWH approaches (Inmon/Kimball) to Data Lake and lakehouse architecture. At the scale of 17k+ users and 144M+ requests per month, a single store no longer holds up across every load profile, so the platform splits into specialized layers with explicit ownership boundaries. This overview walks through that structure layer by layer.

Platform scale

Evolution period

18+ years

A gradual move from classic DWH to Data Lake and then to a lakehouse architecture.

Platform users

17,000+

Engineers, analysts, and product teams work through a shared data platform surface.

Requests per month

144 million+

At this load, scalability and latency stop being background work and sit squarely on the critical path.

Key systems

The end-to-end data lifecycle — ingestion, storage, processing, governance, observability, and security — is spread across specialized systems.

Architectural blocks of the platform

Data ingestion and delivery

Sources of very different nature meet here — OLTP systems, event streams, and downstream data products — and each delivery channel has to be kept reliable on its own.

Data Replication: BODS + Chrono

Two generations of replication live side by side: a legacy batch path and streaming change propagation. The old path cannot be switched off at once while it still carries part of the load.
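
The coexistence described above can be sketched as a per-table routing plan. This is a minimal illustration, not how BODS or Chrono actually work: the table names, the `Path` enum, and the rule "the legacy path can be retired only when it carries no tables" are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Path(Enum):
    LEGACY_BATCH = "legacy_batch"  # stands in for the old batch path
    STREAMING = "streaming"        # stands in for streaming change propagation


@dataclass(frozen=True)
class TableRoute:
    table: str
    path: Path


def plan_routes(tables, migrated):
    """Assign every table to exactly one replication path."""
    return [
        TableRoute(t, Path.STREAMING if t in migrated else Path.LEGACY_BATCH)
        for t in tables
    ]


def legacy_can_be_retired(routes):
    """The legacy path may be switched off only when no table depends on it."""
    return all(r.path is Path.STREAMING for r in routes)


# Hypothetical table names, used only for illustration.
tables = ["accounts", "payments", "cards"]
routes = plan_routes(tables, migrated={"payments"})
print(legacy_can_be_retired(routes))  # False: two tables still use the batch path
```

Keeping the routing explicit makes the remaining migration debt countable: the legacy path stays on until the list of tables assigned to it is empty.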

Event Sourcing: SDP

A Streaming Data Transfer Platform for domain events, shaped by Data Mesh principles — ownership of the stream stays with the domain, not a central team.

Reverse ETL: Spheradian

Returns enriched data to operational systems with latency as low as 100 ms; without that speed, the analytical result never reaches the business flow in time.

Architectural takeaways

At data-platform scale a single universal DBMS does not hold up: different latency and throughput profiles call for specialized storage and compute paths.

Data Contracts and the observability loop stop being optional — without them, a growing number of teams turns the platform into an unmanageable set of integrations where every schema change risks taking down someone else's pipeline.

The gap between analytics and operational products is closed by Reverse ETL: enriched data has to return to business flows quickly, or its value is lost.

Technology alone does not deliver resilience. Organizational responsibility sustains it: clear data owners, incident processes, and shared delivery standards.
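
The Data Contracts takeaway above can be made concrete with a small compatibility check that runs before a schema change ships. This is a sketch under simplifying assumptions: a schema is modeled as a plain mapping from field name to type name, removing a field or changing its type counts as breaking, and adding a field is treated as safe. The function name and the example fields are hypothetical; real contract tooling usually applies richer rules.

```python
def breaking_changes(old_fields, new_fields):
    """List changes in new_fields that would break readers of old_fields.

    Both arguments map field name -> type name (a simplified schema model).
    """
    problems = []
    for name, old_type in old_fields.items():
        if name not in new_fields:
            problems.append(f"removed field: {name}")
        elif new_fields[name] != old_type:
            problems.append(
                f"type changed: {name} ({old_type} -> {new_fields[name]})"
            )
    return problems


# Hypothetical contract versions, used only for illustration.
old = {"user_id": "string", "amount": "decimal"}
new = {"user_id": "string", "amount": "float", "currency": "string"}

issues = breaking_changes(old, new)
print(issues)  # ['type changed: amount (decimal -> float)']
```

Running such a check in the producer's delivery pipeline turns a silent contract break into a visible, reviewable failure.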

Practical checklist

Separate ingestion, storage, processing, and governance as architectural layers with explicit SLAs/SLOs — otherwise you cannot hold any single layer accountable.
For each data product, capture the contract, owner, and backward-compatibility rules for schema changes: a silent contract break costs more than a visible one.
Plan multi-engine execution early: batch, streaming, and exploratory analytics usually need different engines, and locking into one forces a compromise on all of them.
Treat data incident management as a required process from the start, not as a habit adopted after the first major outage: by then the incident has already cost trust in the data.
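
The first checklist item can be sketched as a per-layer SLO table with a named owner for each layer. The layer names come from the checklist; the owners, target values, and observed ratios below are invented for illustration, and a missing measurement is treated as a violation by assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSLO:
    layer: str
    owner: str                   # team accountable for this layer (hypothetical)
    target_success_ratio: float  # e.g. share of successful runs or requests


def violated(slos, observed):
    """Return the layers whose observed ratio is below target or not reported."""
    return [
        s.layer
        for s in slos
        if observed.get(s.layer, 0.0) < s.target_success_ratio
    ]


# Hypothetical targets and measurements, used only for illustration.
slos = [
    LayerSLO("ingestion", "ingestion-team", 0.999),
    LayerSLO("storage", "storage-team", 0.9999),
    LayerSLO("processing", "processing-team", 0.995),
]
observed = {"ingestion": 0.9995, "storage": 0.999}

print(violated(slos, observed))  # ['storage', 'processing']
```

With targets and owners written down per layer, a breach points to one accountable team instead of to the platform as a whole.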

References

Related chapters

Evolution of T-Bank Architecture - How the company's broader architecture evolved and why platform engineering became a discipline of its own.
Data platforms in 2025: interview with Nikolay Golov - Where the real trade-offs in building data platforms sit, and which of them you have to choose deliberately.
Data Pipeline / ETL / ELT Architecture - How ingestion, transformation, and serving paths fit together in data pipelines.
Apache Iceberg: table architecture for data lakes - How open table formats support lakehouse design and controlled data evolution.
Technoshow “Dropped”: episode 1 - A practical incident review for data-platform operations and recovery discipline.