System Design Space
Knowledge graphSettings

Updated: March 25, 2026 at 2:00 AM

PostgreSQL from the inside (short summary)

hard

PostgreSQL looks straightforward right up until you need to reason about MVCC, WAL, locking, and the planner. This book is valuable precisely because it walks into that inner layer without turning it into mysticism.

In real work, it helps you understand snapshot semantics, xmin/xmax, buffer cache, index types, and recovery behavior as practical levers that affect concurrent correctness and operating cost.

In interviews and architecture discussions, this material is especially strong when you need to show that you can do more than write SQL and can troubleshoot PostgreSQL at the system level.

Practical value of this chapter

MVCC visibility model

Understanding xmin/xmax and snapshot semantics helps design safe concurrent business operations.

WAL and checkpoint impact

Model write amplification, recovery time, and durability guarantees when choosing topology and settings.

Planner and index strategy

Connect statistics, planner behavior, and index types to critical query paths before scale hits.

Interview root-cause reasoning

Demonstrate PostgreSQL troubleshooting at internals level, not only at SQL syntax level.

Official source

Postgres Pro

Free electronic version of the book on the Postgres Professional website.

Download for free

PostgreSQL 17 from the Inside

Authors: Egor Rogov
Publisher: DMK Press, 2025
Length: 668 pages

Analysis of Egor Rogov's book: MVCC, buffer cache, WAL, locks, query planner and PostgreSQL index types.

Original

Why read this book?

Understanding the internals of PostgreSQL is critical to:

  • Performance optimizations - knowing how the query planner works helps you write effective SQL
  • Troubleshooting - understanding MVCC and blocking allows you to diagnose concurrency problems
  • System Design interview — deep knowledge of a real DBMS distinguishes the candidate from those who know only theory
  • Architectural solutions — choosing the right index type or partitioning strategy

Related chapter

Database Internals

Book by Alex Petrov about B-Trees, LSM-Trees and general principles of DBMS design.

Читать обзор

Book structure: 5 parts

Part 1: Isolation and multi-versioning

The foundation of PostgreSQL is the MVCC (Multi-Version Concurrency Control) mechanism, which allows readers and writers not to block each other.

Key Concepts

  • Transaction isolation levels
  • Structure of pages and row versions (tuples)
  • Data snapshots
  • Version visibility (visibility rules)

Data cleaning

  • HOT updates (Heap-Only Tuples)
  • Vacuum and Autovacuum processes
  • Freezing transactions
  • Rebuilding tables and indexes

MVCC: Multi-Versioning in PostgreSQL

1 / 4

Initial state

The row was created by transaction 100. xmin=100 means it is visible to all transactions with ID >= 100.

Heap (table storage)

id: 1Alice
xmin:100
xmax:
xmin— transaction ID that created the version
xmax— transaction ID that deleted/updated the version

Transactions (select to check visibility)

Part 2: Buffer cache and log

How PostgreSQL manages memory and provides durability through Write-Ahead Logging (WAL).

Buffer cache

  • Shared buffers and their organization
  • Clock sweep algorithm
  • Buffer pinning and reference counting
  • Local buffers for temporary tables

WAL (Write-Ahead Log)

  • Structure of WAL records
  • Checkpoint and recovery
  • Synchronization modes (fsync)
  • Archiving and replication

Part 3: Locks

Despite MVCC, PostgreSQL uses different types of locks to ensure consistency.

Object locks

  • 8 table locking modes
  • Compatibility Matrix
  • Advisory locks

Row locks

  • FOR UPDATE / FOR SHARE
  • Tuple-level locks
  • Multixact for shared locks

Locks in memory

  • Spinlocks
  • Lightweight locks (LWLocks)
  • Buffer pins

Part 4: Running Queries

From SQL parsing to returning the result - the full cycle of query processing.

Processing stages

  • Parsing → Rewriting → Planning
  • Collection and use of statistics
  • Cost model
  • EXPLAIN ANALYZE in detail

Execution Operations

  • Seq Scan vs Index Scan vs Bitmap Scan
  • Nested Loop / Hash / Merge Join
  • Sorting: quicksort vs external sort
  • Aggregation and grouping

Part 5: Types of Indexes

PostgreSQL supports many types of indexes for different use cases.

Classic

  • B-tree - universal, default
  • Hash - only equality

Specialized

  • GiST — geometry, full text
  • SP-GiST - sparse data
  • GIN - arrays, JSONB, FTS

For analytics

  • BRIN — Block Range Index
  • Ideal for time-series data
  • Minimum index size

Key Concepts for System Design

🔄MVCC and insulation

  • Each transaction sees its own “snapshot” of data
  • Old versions of rows are stored in the same table
  • VACUUM removes “dead” versions
  • Trade-off: high competition vs accumulation of bloat

📝WAL and durability

  • Changes are first written to the log, then to the data
  • In case of failure - recovery from WAL (crash recovery)
  • Basis for replication (streaming replication)
  • Checkpoint balances between recovery time and I/O

🎯Index selection

  • B-tree: ranges, sorting, uniqueness
  • GIN: full text search, JSONB, arrays
  • GiST: geodata, R-tree for coordinates
  • BRIN: huge append-only tables (logs, metrics)

Query optimization

  • The planner chooses a path based on statistics
  • ANALYZE updates table statistics
  • Index-only scan avoids heap access
  • Parallel queries for large tables

Related chapter

Guide to Databases

Tutorial from PostgreSQL - fundamentals of the relational model, SQL and DBMS architecture.

Читать обзор

PostgreSQL vs other DBMS

AspectPostgreSQLMySQL (InnoDB)
MVCCHeap versions, requires VACUUMUndo log, automatic cleaning
IndexesB-tree, Hash, GiST, GIN, BRIN, SP-GiSTB-tree, Full-text, Spatial
ReplicationStreaming (physical), LogicalAsync, Semi-sync, Group Replication
ExtensibilityExtensions, custom types, operatorsPlugins, storage engines

About the author

Egor Rogov is a PostgreSQL expert working at Postgres Professional. Author of courses on PostgreSQL administration and development, active participant in the PostgreSQL community. The book is based on many years of teaching and consulting experience.

Related chapters

Verdict

"PostgreSQL from the inside" is a must-read for everyone who seriously works with PostgreSQL. The book is free, written in Russian and provides a deep understanding of mechanisms that usually remain a “black box”. It is especially valuable for those who are preparing for a System Design interview and want to demonstrate not superficial knowledge, but an understanding of real trade-offs in the design of data storage systems.

Where to find the book

Enable tracking in Settings