Reading a book on database internals is not about academic prestige. It matters because, without that layer of understanding, teams make architectural decisions based on surface signals and vendor labels.
In engineering practice, it helps you see how B-trees, LSM trees, transactions, replication, and consensus shape the write path, read amplification, crash recovery, and concurrency behavior of a system.
In interviews and architecture discussions, this material is especially valuable as a differentiator because it lets you explain not only what to choose, but why a mechanism behaves the way it does.
Practical value of this chapter
Storage-engine literacy
Deep B-tree and LSM-tree understanding improves architectural choices for read paths, write paths, and workload behavior.
Isolation intuition
Internal transaction mechanics make isolation-level and concurrency decisions explicit and defensible.
Replication and consensus
Tie replication models directly to availability targets, read freshness, and recovery requirements.
Interview deep dive
Use internals-level explanation as a differentiator: explain not only what to choose, but why it works.
Related chapter
PostgreSQL from the inside
A deep dive into MVCC, WAL, locks, and PostgreSQL indexes by Egor Rogov.
Database Internals
Author: Alex Petrov
Publisher: O'Reilly Media, Inc.
Length: 370 pages
Analysis of Alex Petrov's book on storage engines, B-trees, LSM trees, transactions, replication, consensus, recovery, and physical data layout.
The book connects storage engines, database pages, B-trees, LSM trees, write-ahead logs, two-phase locking, replicated state machines, and physical I/O into one engineering picture: how low-level storage, transactions, and distributed protocols become the observable behavior of a real DBMS.
Detailed analysis of the first part from Alexander and the Code of Architecture club
Part I: storage engines
B-trees and their variants
Data structures
Disk optimizations
Key insight: B-trees work well for reads and in-place updates, which makes them a natural fit for OLTP workloads. PostgreSQL and MySQL InnoDB use B+ trees for indexes.
Detailed analysis of the LSM-tree chapter from Alexander and the Code of Architecture club
LSM trees
Components
Compaction
Reading optimizations
Key insight: LSM trees optimize writes through sequential I/O, but they rely on compaction to keep read cost under control. Cassandra, RocksDB, LevelDB, and HBase all use this family of structures.
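The write and read paths described above can be sketched in a few lines. This is a toy model under stated assumptions, not how any of the listed engines is implemented: the class name `TinyLSM`, the `memtable_limit` parameter, and the use of in-memory lists in place of on-disk sorted files are all hypothetical, and deletes (tombstones), Bloom filters, and leveled compaction are left out.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}                  # absorbs every write in memory
        self.runs = []                      # sorted runs, newest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # A real engine would write this run sequentially to a new file.
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, keeping only the newest value per key.
        merged = {}
        for run in reversed(self.runs):     # oldest first, newer values overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)      # memtable is full: flushed as the first run
db.put("a", 3)
db.put("c", 4)      # flushed as a second, newer run
print(db.get("b"))  # found only after probing both runs
db.compact()
print(len(db.runs)) # a single run remains, so reads probe one place
```

Before compaction, a lookup may have to probe every run, which is the read cost that compaction keeps bounded.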
B-tree vs. LSM tree: choosing a storage structure
B-tree architecture
✓ Advantages
- Fast reads: O(log N)
- Efficient range queries
- In-place updates
✗ Drawbacks
- Write amplification
- Random I/O on writes
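To make the O(log N) read cost concrete, here is a rough back-of-the-envelope calculation. It is an idealized sketch: the function name `btree_levels` is hypothetical, every node is assumed to be completely full, and the fanout of 100 is an illustrative figure rather than a measurement from any particular engine.

```python
def btree_levels(num_keys: int, fanout: int) -> int:
    """Smallest number of levels such that fanout ** levels >= num_keys.

    Idealized model: every node is full and has exactly `fanout` children.
    """
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


# One billion keys: a high-fanout tree stays shallow, a binary tree does not.
print(btree_levels(1_000_000_000, 100))  # 5
print(btree_levels(1_000_000_000, 2))    # 30
```

The base of the logarithm is the fanout, which is why page-sized nodes with many keys keep the number of page reads per lookup small.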
Transaction Processing
Concurrency control
Recovery
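As an illustration of how a write-ahead log supports crash recovery, the following is a minimal redo-only sketch. The record shapes `("set", txid, key, value)` and `("commit", txid)` and the function name `recover` are assumptions made for this example; real recovery schemes also deal with checkpoints, undo, and partially written pages.

```python
def recover(log):
    """Redo-only recovery: replay the writes of committed transactions in log order."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    state = {}
    for rec in log:
        if rec[0] == "set" and rec[1] in committed:
            _, _txid, key, value = rec
            state[key] = value
    return state


# Transaction 1 committed before the crash; transaction 2 did not.
log = [
    ("set", 1, "x", 10),
    ("commit", 1),
    ("set", 2, "x", 99),
    ("set", 2, "y", 5),
]
print(recover(log))  # {'x': 10}
```

Writes from the uncommitted transaction are ignored, so the recovered state reflects only work that was durably committed.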
Part II: distributed systems
Detailed analysis of the chapter on replication and partitioning from Alexander and the Code of Architecture club
Replication and partitioning
Replication
Partitioning
Detailed analysis of the chapter on consensus protocols from Alexander and the Code of Architecture club
Consensus protocols
Paxos
- Lamport's classic algorithm
- Phases: Prepare → Promise → Accept → Accepted
- Notoriously difficult to implement correctly
- Multi-Paxos reduces the cost of repeated decisions
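The Prepare, Promise, and Accept steps listed above can be sketched as a single-process simulation of single-decree Paxos. This is a teaching sketch under strong assumptions: the names `Acceptor` and `propose` are hypothetical, messages are plain method calls, and there is no network, failure handling, or persistence.

```python
class Acceptor:
    """Single-decree Paxos acceptor state."""

    def __init__(self):
        self.promised = -1    # highest ballot number promised so far
        self.accepted = None  # (ballot, value) last accepted, if any

    def on_prepare(self, ballot):
        # Phase 1: promise to ignore lower ballots and report any prior acceptance.
        if ballot > self.promised:
            self.promised = ballot
            return ("promise", self.accepted)
        return ("nack", None)

    def on_accept(self, ballot, value):
        # Phase 2: accept unless a higher ballot has already been promised.
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return "accepted"
        return "nack"


def propose(acceptors, ballot, value):
    """Run one proposal round; return the chosen value, or None if the round fails."""
    majority = len(acceptors) // 2 + 1
    replies = [a.on_prepare(ballot) for a in acceptors]
    promises = [prev for kind, prev in replies if kind == "promise"]
    if len(promises) < majority:
        return None
    # Safety rule: adopt the value of the highest-ballot prior acceptance, if any.
    prior = [p for p in promises if p is not None]
    if prior:
        value = max(prior, key=lambda p: p[0])[1]
    acks = sum(a.on_accept(ballot, value) == "accepted" for a in acceptors)
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(3)]
print(propose(acceptors, 1, "A"))  # A
print(propose(acceptors, 2, "B"))  # A: a later proposer must keep the chosen value
print(propose(acceptors, 1, "C"))  # None: the stale ballot is rejected
```

The second call shows the core safety property: once a value is chosen, a proposer with a higher ballot learns it during the prepare phase and re-proposes it instead of its own value.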
Raft
- Consensus with an explicit leader
- Leader election and log replication
- Used in etcd, Consul, and CockroachDB
- Easier to explain and implement than Paxos
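One piece of Raft's log replication that is easy to show in isolation is how a leader decides which entries are committed. The sketch below assumes the rule from the Raft design that an entry is committed once it is stored on a majority of servers and belongs to the leader's current term; the function name `commit_index` and its parameter layout are hypothetical.

```python
def commit_index(match_index, log_terms, current_term, current_commit):
    """Highest log index stored on a majority and written in the current term.

    match_index: highest replicated index per server, leader included.
    log_terms:   log_terms[i - 1] is the term of entry i (indices start at 1).
    """
    majority = len(match_index) // 2 + 1
    # The majority-th largest match index is present on at least a majority.
    candidate = sorted(match_index, reverse=True)[majority - 1]
    # Only entries from the leader's own term are committed by counting replicas.
    for n in range(candidate, current_commit, -1):
        if log_terms[n - 1] == current_term:
            return n
    return current_commit


# Five servers; the leader holds entries 1..7, followers lag behind.
match_index = [7, 7, 5, 4, 2]
log_terms = [1, 1, 2, 2, 3, 3, 3]
print(commit_index(match_index, log_terms, current_term=3, current_commit=3))  # 5
```

Entry 5 is the newest entry present on three of the five servers, so it is the furthest point the leader can safely commit.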
Zab
- ZooKeeper atomic broadcast
- Primary-backup model: a leader broadcasts state updates to followers
- FIFO ordering guarantees
- Optimized for high-throughput ordered broadcast of state updates
Distributed transactions
Atomic commit protocols
Alternative approaches
Low-level details
File formats
Disk I/O optimization
Examples from real DBMSs
PostgreSQL
MySQL InnoDB
RocksDB
Cassandra
MongoDB
CockroachDB
Takeaways and recommendations
Strengths
- Deep explanation of DBMS internals
- Practical comparison of B-trees and LSM trees
- Detailed treatment of consensus protocols
- Examples from real production DBMSs
- Physical disk storage explained clearly
Who should read it?
- Engineers who work with databases below the SQL surface
- Storage-engine and data-infrastructure developers
- People who want to reason about DBMS trade-offs
- Candidates for Staff+ roles on database teams
- Researchers and practitioners in storage systems
Verdict: Database Internals bridges the gap between high-level system design books and academic papers. If Designing Data-Intensive Applications (DDIA) explains which properties a system needs, Petrov shows how those properties emerge from pages, logs, indexes, replication, and commit protocols. It is a strong book for anyone who wants to understand DBMS behavior at the mechanism level.
Related chapters
- PostgreSQL from the inside (short summary) - Connects Petrov's general mechanisms to PostgreSQL's concrete implementation: MVCC, WAL, locks, and indexes.
- Designing Data-Intensive Applications, 2nd Edition (short summary) - Shows how DDIA's system-level ideas map to the internals of storage engines, replication, and consensus.
- Why understand storage systems? - A storage-section map for deciding where database internals should influence the class of storage you choose.
- Database Selection Framework - A selection framework where B-tree, LSM-tree, and write-ahead-log behavior become practical decision criteria.
- Replication and sharding - Operational continuation of the book topics: read/write paths, fault tolerance, rebalancing, and data-scale growth.
