Apache Iceberg: table architecture for data lakes

Iceberg becomes important once a data lake stops being just a pile of files and starts needing tables, snapshots, and database-like discipline.

In real engineering work, this chapter helps you reason about snapshots, manifest files, schema evolution, hidden partitioning, and compaction so the lakehouse layer stays manageable instead of drowning in small files and metadata overhead.

In interviews and architecture discussions, it helps explain why an open table format is its own architectural decision with real costs in metadata scaling, operations, and streaming integration.

Practical value of this chapter

Design in practice

Helps design lakehouse tables with snapshot isolation and schema evolution in mind.

Decision quality

Guides partition/spec evolution, compaction, and metadata-scaling choices.

Interview articulation

Makes read and write paths and open-table-format benefits easier to explain.

Risk and trade-offs

Highlights metadata growth, small-file, and operational-complexity risks.

Connection

Data Pipeline / ETL / ELT Architecture

Baseline context for ingestion, orchestration, data quality, and recovery processes.


Apache Iceberg is an open analytical table format that brings DWH-level manageability to the data lake: atomic commits, schema evolution, time travel, and predictable reads over large tables. In practice it becomes the foundation of the analytical layer wherever streaming and batch paths must run over the same tabular representation and agree on which data counts as current.

Evolution of data approaches

Data Warehouse (1990s)

Nightly ETL, strict schemas, and reports that often lag until the next day.

Data quality stays under control, but every schema change is expensive and slow to ship.

Data Lake (2010s)

Schema-on-read, ELT, and scale on object storage such as S3, GCS, or Azure Blob Storage.

Flexibility goes up, and the price is weak transactionality and consistency: a parallel write can easily corrupt the table.

Lakehouse / Open Table Format

Iceberg adds ACID commits, schema evolution, and time travel on top of the data lake.

It adds metadata and operational discipline, but makes the lakehouse layer much more manageable.

Pains of the classic data lake

Slow object listing and unpredictable query planning when tables contain many files.
No atomicity for parallel writes: overwrites can easily produce race conditions.
Schema evolution is fragile: renaming or dropping columns often breaks compatibility.
Partitioning, reprocessing, and backfills are hard to operate by hand.

Iceberg architectural layers

Shows how the engine reads only relevant files via metadata pruning.


1. Catalog lookup: the engine gets a pointer to the current table metadata file.
2. Read metadata file: the schema, partition spec, and list of available snapshots are loaded.
3. Select snapshot: a consistent snapshot is chosen for query execution (the current one, or an older one for time travel).
4. Load manifest list: the manifest files referenced by the snapshot are resolved.
5. Predicate pruning: only data files whose min/max/null-count statistics can match the query predicate are selected.
6. Scan data files: the engine scans only the selected files and returns the query result.
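
The read path above can be sketched in a few lines of Python. This is a minimal in-memory model, not Iceberg's actual API: the names `DataFile`, `Snapshot`, `TableMetadata`, and `plan_scan` are hypothetical, the catalog and object storage are plain dictionaries, and pruning uses only min/max bounds for a single integer column.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    path: str
    lower: int  # min value of the filter column in this file
    upper: int  # max value of the filter column in this file


@dataclass
class Snapshot:
    snapshot_id: int
    manifests: tuple  # stands in for the manifest list; each manifest is a tuple of DataFile


@dataclass
class TableMetadata:
    current_snapshot_id: int
    snapshots: dict  # snapshot_id -> Snapshot


def plan_scan(catalog, storage, table, lo, hi, snapshot_id=None):
    """Return paths of data files that may contain rows with lo <= value <= hi."""
    metadata = storage[catalog[table]]            # catalog lookup + read metadata file
    sid = metadata.current_snapshot_id if snapshot_id is None else snapshot_id
    snapshot = metadata.snapshots[sid]            # select snapshot
    selected = []
    for manifest in snapshot.manifests:           # load manifest list
        for f in manifest:
            if f.upper >= lo and f.lower <= hi:   # predicate pruning on min/max
                selected.append(f.path)
    return selected                               # a real engine would now scan these files


m1 = (DataFile("a.parquet", 1, 100), DataFile("b.parquet", 101, 200))
m2 = (DataFile("c.parquet", 201, 300),)
meta = TableMetadata(2, {1: Snapshot(1, (m1,)), 2: Snapshot(2, (m1, m2))})
catalog = {"db.events": "metadata/v2.json"}
storage = {"metadata/v2.json": meta}

print(plan_scan(catalog, storage, "db.events", 150, 250))                 # ['b.parquet', 'c.parquet']
print(plan_scan(catalog, storage, "db.events", 150, 250, snapshot_id=1))  # ['b.parquet']
```

The point of the sketch is that no directory listing happens anywhere: every file the engine touches is reached by following pointers from the catalog entry.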

Shows both paths: query path down and commit path up.


Snapshot

Captures a consistent table version for reads and time travel.

Contains: Snapshot ID, commit timestamp, pointer to manifest list.

Why needed: Enables reproducible queries, rollback, and change audits.

Compliance

Data Governance & Compliance

Row-level deletes and lineage are especially important for regulatory requirements.


What exactly does Iceberg solve?

ACID transactions

Every commit writes new metadata files and atomically swaps the table's metadata pointer; concurrent writers rely on optimistic concurrency and retry on conflict.

Safe parallel INSERT/DELETE/MERGE without table corruption.
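
A minimal sketch of the optimistic commit idea, assuming a catalog that offers an atomic compare-and-swap on the table pointer. `InMemoryCatalog`, `commit_append`, and `CommitConflict` are hypothetical names for illustration; a real catalog must make the swap atomic across processes, and a real writer also re-validates its changes against the refreshed metadata before retrying.

```python
import uuid


class CommitConflict(Exception):
    """Raised when the table pointer moved since the writer read it."""


class InMemoryCatalog:
    """Hypothetical single-process catalog: one metadata pointer per table."""

    def __init__(self):
        self._pointers = {}

    def current(self, table):
        return self._pointers.get(table)

    def compare_and_swap(self, table, expected, new):
        # Succeeds only if nobody else committed since `expected` was read.
        if self._pointers.get(table) != expected:
            raise CommitConflict(table)
        self._pointers[table] = new


def commit_append(catalog, storage, table, new_files, max_retries=3):
    """Append files by writing new immutable metadata, then swapping the pointer."""
    for _ in range(max_retries):
        base = catalog.current(table)
        base_files = storage[base] if base is not None else ()
        new_pointer = f"metadata/{uuid.uuid4().hex}.json"
        storage[new_pointer] = base_files + tuple(new_files)  # old metadata is never modified
        try:
            catalog.compare_and_swap(table, base, new_pointer)
            return new_pointer
        except CommitConflict:
            continue  # re-read the latest pointer and rebuild on top of it
    raise CommitConflict(f"gave up on {table} after {max_retries} attempts")


catalog, storage = InMemoryCatalog(), {}
commit_append(catalog, storage, "db.events", ["f1.parquet"])
stale = catalog.current("db.events")
commit_append(catalog, storage, "db.events", ["f2.parquet"])
print(storage[catalog.current("db.events")])  # ('f1.parquet', 'f2.parquet')

try:
    # A writer still holding the old pointer cannot overwrite the newer commit.
    catalog.compare_and_swap("db.events", stale, "metadata/orphan.json")
except CommitConflict:
    print("stale writer rejected")
```

Readers are never blocked in this model: they keep using whichever metadata file they resolved at the start of the query.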

Time travel

Reading by snapshot ID or timestamp.

Reproducible queries, change auditing, and rollback scenarios.
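
Reading "as of" a timestamp amounts to picking the last snapshot committed at or before that moment. The sketch below assumes a snapshot log sorted by commit time; `snapshot_as_of` and the log layout are illustrative, not an Iceberg API.

```python
from bisect import bisect_right


def snapshot_as_of(snapshot_log, timestamp_ms):
    """Return the ID of the latest snapshot committed at or before timestamp_ms.

    snapshot_log: list of (commit_timestamp_ms, snapshot_id), sorted by commit time.
    """
    commit_times = [t for t, _ in snapshot_log]
    i = bisect_right(commit_times, timestamp_ms)
    if i == 0:
        raise ValueError("no snapshot exists at or before the requested time")
    return snapshot_log[i - 1][1]


log = [(1000, 11), (2000, 22), (3000, 33)]
print(snapshot_as_of(log, 2500))  # 22
print(snapshot_as_of(log, 3000))  # 33
```

Rollback works on the same data: the table pointer is moved back to an earlier snapshot ID, and the newer snapshots stay in the log until they are expired.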

Schema evolution

Columns are tracked by unique field IDs recorded in the JSON table metadata, not by name or position.

Adding, renaming, dropping, or reordering columns is a metadata-only change: existing data files are not rewritten.
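
A small sketch of why ID-based resolution makes renames safe. It assumes each data file remembers the field-ID-to-name mapping it was written with; `read_row` and the dictionary layouts are hypothetical simplifications.

```python
def read_row(stored_row, file_schema, table_schema):
    """Project a stored row onto the current table schema by field ID.

    stored_row:   column name at write time -> value
    file_schema:  field ID -> column name when the file was written
    table_schema: field ID -> current column name
    """
    by_id = {fid: stored_row.get(name) for fid, name in file_schema.items()}
    # Columns added after the file was written have no stored value and read as None.
    return {name: by_id.get(fid) for fid, name in table_schema.items()}


old_file_schema = {1: "id", 2: "ts"}
old_row = {"id": 7, "ts": "2024-01-01"}

# Rename ts -> event_time (same ID 2) and add country (new ID 3).
renamed = {1: "id", 2: "event_time", 3: "country"}
print(read_row(old_row, old_file_schema, renamed))
# {'id': 7, 'event_time': '2024-01-01', 'country': None}

# Drop ts, then add a new column that happens to reuse the name (new ID 4).
readded = {1: "id", 4: "ts"}
print(read_row(old_row, old_file_schema, readded))
# {'id': 7, 'ts': None}
```

The second case is the one name-based formats get wrong: a re-added column with an old name would silently resurrect the dropped column's data.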

Hidden partitioning

Partition transforms (bucket/truncate/day) are hidden behind the table abstraction.

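Queries filter on the source column and Iceberg derives the partition filter itself, which gives faster scans and removes the class of mistakes where a query forgets or misuses a partition column.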
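
To illustrate, here is a sketch of two transforms and of pruning driven by a filter on the source column. The `day` transform (days since 1970-01-01) and integer `truncate` (round down to a multiple of the width) follow the definitions in the Iceberg spec; `bucket` is omitted because it depends on a specific hash function. The `prune_by_day` helper and the partition layout are assumptions made for the example.

```python
from datetime import date, datetime, timezone

EPOCH = date(1970, 1, 1)


def day_transform(ts: datetime) -> int:
    """Days since the Unix epoch for a timezone-aware timestamp."""
    return (ts.astimezone(timezone.utc).date() - EPOCH).days


def truncate_transform(value: int, width: int) -> int:
    """Round an integer down to the nearest multiple of width."""
    return value - (value % width)


def prune_by_day(partitions, start: datetime, end: datetime):
    """Keep files whose day partition overlaps the [start, end] range on the source column.

    partitions: day ordinal -> list of file paths (hypothetical layout).
    """
    lo, hi = day_transform(start), day_transform(end)
    return [p for day, files in sorted(partitions.items()) if lo <= day <= hi for p in files]


utc = timezone.utc
partitions = {
    day_transform(datetime(2024, 3, 1, 12, tzinfo=utc)): ["d1.parquet"],
    day_transform(datetime(2024, 3, 2, 12, tzinfo=utc)): ["d2.parquet"],
    day_transform(datetime(2024, 3, 3, 12, tzinfo=utc)): ["d3.parquet"],
}

# The query only mentions the timestamp column; the day partition is derived.
print(prune_by_day(partitions,
                   datetime(2024, 3, 2, 0, tzinfo=utc),
                   datetime(2024, 3, 3, 6, tzinfo=utc)))  # ['d2.parquet', 'd3.parquet']
print(truncate_transform(17, 10))  # 10
```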

Row-level deletes

Format version 2 adds delete files: position deletes (data file path plus row position) and equality deletes (rows matching given column values).

GDPR/FZ-152 deletion workflows, upserts, and targeted data corrections.
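
How a reader merges delete files at scan time can be sketched as follows. Iceberg v2 defines position deletes and equality deletes; the sketch models both in memory and deliberately ignores sequence numbers, which in the real format decide which data files a delete applies to. `apply_deletes` and the data layout are hypothetical.

```python
def apply_deletes(data_files, position_deletes, equality_deletes):
    """Return the rows that are still live after applying delete files.

    data_files:       file path -> list of row dicts
    position_deletes: set of (file path, row position) pairs
    equality_deletes: list of dicts; a row matching every key/value of any dict is deleted
    """
    live = []
    for path, rows in data_files.items():
        for pos, row in enumerate(rows):
            if (path, pos) in position_deletes:
                continue  # removed by file path + row position
            if any(all(row.get(k) == v for k, v in d.items()) for d in equality_deletes):
                continue  # removed by column-value match
            live.append(row)
    return live


data = {
    "a.parquet": [{"user_id": 1, "v": "x"}, {"user_id": 2, "v": "y"}],
    "b.parquet": [{"user_id": 3, "v": "z"}, {"user_id": 2, "v": "w"}],
}
position_deletes = {("a.parquet", 0)}
equality_deletes = [{"user_id": 2}]  # e.g. an erasure request for one user

print(apply_deletes(data, position_deletes, equality_deletes))
# [{'user_id': 3, 'v': 'z'}]
```

The data files themselves stay untouched until compaction rewrites them, so physical erasure for a deletion request completes only after that rewrite and the expiry of older snapshots.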

Physical model and deployment

Iceberg is not a separate server: it is an open specification, libraries, and engine integrations.
Data and metadata are stored as regular files in object storage.
The catalog is an external component: Hive Metastore, AWS Glue, JDBC, or REST catalog.
Spark/Flink/Trino/Impala/Hive read the same table using the same metadata format.
Because every file is tracked in metadata, the design avoids the directory renames and mass listings that are slow in object storage.

Tableflow and streaming pipeline

Tableflow (a Confluent feature) automatically materializes Kafka topics into Iceberg tables for near real-time analytics.
Custom ETL code between streaming ingestion and BI/lakehouse layers is reduced.
Data contracts and schema governance are required; otherwise bad input scales quickly.
Tableflow is useful as a bridge between an operational stream and analytical freshness SLAs.

References

Related chapters

Streaming Data - Delivery semantics and stream processing architecture.
Kafka: The Definitive Guide, 2nd Edition (short summary) - Event logs, ingestion, historical replay, and integration with analytical layers.
Data Pipeline / ETL / ELT Architecture - Designing ingestion, transformations, quality checks, and recovery flows.
Database Selection Framework - How to choose between OLTP, OLAP, and lakehouse layers with operational trade-offs in mind.
Data Governance & Compliance - PII, lineage, GDPR/FZ-152 requirements and data lifecycle control.