Kafka matters not because it is a well-known broker, but because the immutable log changes how services integrate, how streaming is built, and how data can be replayed.
In real engineering work, this book helps design partitioning, retention, consumer groups, delivery semantics, and lag management as parts of one event backbone instead of a pile of unrelated settings.
In interviews, reviews, and architecture conversations, it is especially useful when you need to show how ordering, lag spikes, rebalance behavior, and storage growth affect the reliability of the whole system, not just the messaging layer.
Practical value of this chapter
Design in practice
Provides a practical framework for Kafka as an event backbone at scale.
Decision quality
Improves partitioning, retention, and consumer-group policy decisions by workload.
Interview articulation
Helps explain delivery semantics, replay, and DLQ strategies in production terms.
Risk and trade-offs
Surfaces ordering-break, lag spike, and storage-growth risks.
Source
Post in Book Cube
Original book review from Alexander Polomodov
Kafka: The Definitive Guide, 2nd Edition
Authors: Gwen Shapira, Todd Palino, Rajini Sivaram, Krit Petty
Publisher: O'Reilly Media, Inc.
Length: 485 pages
Distributed stream processing platform: producers, consumers, partitions, replication, delivery semantics and Kafka Streams.
Book editions
Fall 2017 - 11 chapters covering the basics of Kafka, producers, consumers, administration and stream processing.
Late 2021 - expanded edition with an emphasis on cloud deployments and new platform capabilities.
Key Concepts of Kafka
Messages and batches
Basic data units in Kafka. Messages are grouped into batches for efficient transmission over the network.
Topics and partitions
Topics are logical message channels divided into partitions for parallel processing and scaling.
Producers
Clients that write messages to Kafka. Kafka is optimized for writes: high write throughput is a core design goal.
Consumers
Clients reading messages from Kafka. Consumer groups provide parallel processing and fault tolerance.
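The key-to-partition mapping behind these concepts can be sketched in a few lines. This is an illustrative simplification: the real Java client hashes keys with murmur2, while this sketch uses CRC32. The property that matters is determinism — every message with the same key lands in the same partition, preserving per-key ordering.

```python
import zlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition deterministically, so all
    messages with the same key go to the same partition and keep
    their relative order. (Sketch: real Kafka clients use murmur2.)"""
    return zlib.crc32(key) % num_partitions

# Same key always maps to the same partition:
assert choose_partition(b"order-42", 6) == choose_partition(b"order-42", 6)
```

This is also why the choice of key matters for load distribution: a low-cardinality or skewed key funnels most traffic into a few partitions.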
Related topic
Designing Data-Intensive Applications
Chapter 11 takes a deep dive into stream processing.
We recommend
Streaming Data
Architecture of streaming systems: from data collection to data consumption
Book structure (2nd Edition - 14 chapters)
Meet Kafka
Introduction to publish/subscribe messaging, history of creation on LinkedIn, basic concepts: messages, batches, schemas, topics, partitions, producers, consumers, brokers.
Managing Apache Kafka Programmatically (NEW)
AdminClient API: asynchronous interface for managing topics, configurations, consumer groups, cluster metadata. Leader election and reassigning replicas.
Installing Kafka
Installation and configuration of brokers, hardware selection, configuration of ZooKeeper/KRaft. The 2nd edition places more emphasis on cloud deployments.
Kafka Producers
Configuration of producers, serialization (Avro, JSON), partitioners, headers, interceptors, quotas and bandwidth management.
Kafka Consumers
Consumer groups, partition assignment, offset management (auto-commit, sync, async), rebalance listeners, standalone consumers.
Transactions (NEW)
Exactly-once semantics, transactional producer API, read_committed isolation, idempotency and atomic writes.
Kafka Internals (under the hood)
Cluster membership, controller, replication, ISR, request processing, physical storage, log segments and indexes.
Reliable Data Delivery
Delivery guarantees: at-least-once, at-most-once, exactly-once. Configuration of producer (acks, retries), consumer and broker for reliability.
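The producer-side settings this chapter covers can be summarized in a small config sketch. The property names below are the standard Kafka producer configuration keys; the specific values are illustrative and depend on your latency/durability trade-off.

```python
# Illustrative producer settings for reliable (at-least-once or stronger)
# delivery, using standard Kafka producer property names.
reliable_producer_config = {
    "acks": "all",                               # wait for all in-sync replicas
    "retries": 2147483647,                       # retry transient failures
    "enable.idempotence": True,                  # broker dedupes retried sends
    "max.in.flight.requests.per.connection": 5,  # safe when idempotence is on
    "delivery.timeout.ms": 120000,               # overall deadline per send
}
```

With `acks=all` plus idempotence, retries cannot reorder or duplicate messages within a partition, which is the usual starting point before moving to full transactions.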
Securing Kafka (NEW)
SSL/TLS encryption, SASL authentication (GSSAPI, PLAIN, SCRAM, OAUTHBEARER), authorization with ACLs, audit and security in production.
Building Data Pipelines
Kafka Connect: source and sink connectors, standalone and distributed mode, transformations, converters, dead letter queues.
Cross-Cluster Data Mirroring
MirrorMaker 2.0, multi-datacenter architecture (Active-Active, Active-Passive), replication of topics and consumer offsets between clusters.
Administering Kafka
Topic operations, consumer group management, partition reassignment, configuration for production, cluster operations.
Monitoring Kafka
JMX metrics, critical metrics for brokers, producers and consumers. Under-replicated partitions, lag monitoring, monitoring tools.
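Consumer lag, the central metric here, has a simple definition worth making explicit: the broker's latest offset in a partition minus the consumer's committed position. A minimal sketch (function names are my own, not a Kafka API):

```python
def consumer_lag(log_end_offset: int, committed_offset: int) -> int:
    """How many messages the consumer still has to read in one
    partition: broker's latest offset minus committed position."""
    return max(0, log_end_offset - committed_offset)

def max_group_lag(end_offsets: dict, committed: dict) -> int:
    """Worst-case lag across all partitions of a consumer group;
    a partition with no committed offset counts from zero."""
    return max(consumer_lag(end_offsets[p], committed.get(p, 0))
               for p in end_offsets)
```

Alerting is usually on the maximum per-partition lag (or its growth rate), since an average can hide one stuck partition.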
Stream Processing
Kafka Streams API: stateless and stateful operations, windowing, joins, KTables vs KStreams, exactly-once processing, testing.
New in 2nd edition
- ▸AdminClient API — software cluster management
- ▸Transactions — exactly-once semantics and atomic operations
- ▸Security — SSL/TLS, SASL, ACLs for production
- ▸MirrorMaker 2.0 — improved cross-cluster replication
- ▸KRaft — mention of a new mode without ZooKeeper
Message delivery semantics
At-most-once
The message is delivered no more than once; data loss is possible. Suitable for metrics and logs.
At-least-once
The message is delivered at least once; duplicates are possible. This is Kafka's default mode.
Exactly-once
The message is delivered exactly once. Requires idempotent producer and transactional API.
Kafka cluster architecture
Partition replication
Leader accepts writes, followers replicate
Key Takeaways for System Design
- ▸Partitioning is the key to horizontal scaling. The choice of partition key determines the load distribution.
- ▸Replication provides fault tolerance. ISR (In-Sync Replicas) guarantees consistency.
- ▸Consumer groups allow processing to scale out. The number of active consumers is capped by the number of partitions; extra consumers sit idle.
- ▸Retention policy determines how long data is stored. Kafka can act as a log store.
- ▸Kafka Connect simplifies integration with external systems without writing code (source and sink connectors).
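The consumer-to-partition cap in the takeaways above falls out of how partitions are assigned. A simplified round-robin sketch (not Kafka's exact RoundRobinAssignor, which also accounts for per-topic subscriptions) shows why a consumer beyond the partition count receives nothing:

```python
def round_robin_assign(partitions: list, consumers: list) -> dict:
    """Distribute partitions across consumers in round-robin order.
    With more consumers than partitions, the surplus consumers
    end up with an empty assignment and sit idle."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# 4 partitions, 5 consumers: the fifth consumer gets nothing.
a = round_robin_assign([0, 1, 2, 3], ["c1", "c2", "c3", "c4", "c5"])
assert a["c5"] == []
```

This is why adding consumers past the partition count buys no extra parallelism; to scale further you must add partitions, which in turn changes key-to-partition mappings.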
Related chapters
- Streaming Data (short summary) - End-to-end stream architecture perspective from ingestion to consumers and windowed processing.
- Designing Data-Intensive Applications (short summary) - Foundational model of replication, consistency, and stream processing behind Kafka's design trade-offs.
- Distributed message queue - Practical case study on ordering, throughput, durability, and behavior under failure conditions.
- Event-driven architecture: Event Sourcing, CQRS, Saga - Architectural context where Kafka is often used as the transport backbone for event-driven workflows.
- Kappa architecture: a stream-first alternative to Lambda - Single processing path model where Kafka log serves as source-of-truth for online and replay workloads.
- Data pipeline / ETL / ELT architecture - How Kafka fits into production data platforms across ingestion, orchestration, data quality, and operations.
- Enterprise Integration Patterns (short summary) - Integration pattern language for designing robust producer/consumer and routing interactions.
- Big Data: Principles and best practices of scalable realtime data systems (short summary) - Strategic context for realtime data systems where Kafka frequently becomes a central platform component.
- Google Global Network: evolution and architecture principles for the AI era - Network context for cross-region replication and high-throughput stream transport at global scale.
- Google TPU: architecture evolution and impact on ML systems - AI workload context where Kafka-style logs and streams feed data and ML pipelines.
