System Design Space

Updated: March 25, 2026 at 2:00 AM

Cassandra: The Definitive Guide (short summary)

hard

A Cassandra book becomes especially useful once you stop seeing it as just an AP label from a CAP diagram and start reasoning through its real write, storage, and failure model.

In engineering practice, it helps connect tunable consistency, partitioning, LSM storage, and query modeling to concrete product demands where availability and linear write growth matter more than universal query flexibility.

In interviews and architecture discussions, this chapter is strong because it lets you state Cassandra's limits honestly: it solves some classes of problems extremely well, but needs complementary patterns where ad-hoc queries or strict consistency are expected.

Practical value of this chapter

AP-first mindset

Use Cassandra where availability and linear write scaling under network partitions are primary requirements.

Data model by query

Design tables from business query paths; partition-key quality directly controls latency and load balance.

Operational discipline

Treat repair, compaction tuning, and tombstone monitoring as mandatory parts of architecture operations.

Interview limitations

Call out constraints clearly: ad-hoc querying and strict consistency often need complementary patterns.

Original

Telegram: book_cube

Original post with analysis of Cassandra.

Go to website

Cassandra: The Definitive Guide, 3rd Edition

Authors: Jeff Carpenter, Eben Hewitt
Publisher: O'Reilly Media, Inc.
Length: 426 pages

Wide Column Store: Bigtable + Dynamo architecture, tunable consistency, LSM-Tree, Gossip Protocol. AP (CAP) and PA/EL (PACELC) classification.

Apache Cassandra is one of the best-known NoSQL databases. It combines the data model of Google Bigtable with the distributed architecture of Amazon Dynamo. Let's look at its architecture, its consistency model, and its place in the CAP/PACELC classification.

Origin: Bigtable + Dynamo

Google Bigtable

From Bigtable, Cassandra took:

  • Wide Column data model
  • LSM-Tree for storage (MemTable → SSTable)
  • Append-only writes to the commit log

Amazon Dynamo

From Dynamo, Cassandra inherited:

  • Consistent Hashing for distribution
  • Gossip Protocol for node discovery and failure detection
  • Tunable consistency (chosen per request)

Architecture visualization

Ring Topology

[Figure: token ring with nodes A to F covering the token range 0-100]

Consistent Hashing

Replication Factor = 3

Each key is stored on 3 nodes: the primary node that owns the key's token and the next 2 nodes clockwise on the ring.
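
The placement rule above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's implementation: the token space is 0..99 to mirror the diagram (real clusters use a much larger token range and a different partitioner hash), MD5 stands in only as a stable hash, and the names `token_for`, `replicas_for`, and the node tokens are hypothetical. It also ignores rack and datacenter awareness.

```python
import bisect
import hashlib

RING_SIZE = 100  # toy token space 0..99, mirroring the 0-100 ring in the diagram


def token_for(key: str) -> int:
    """Map a key to a token. MD5 is only a stable demo hash (an assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def replicas_for(token: int, ring: list[tuple[int, str]], rf: int = 3) -> list[str]:
    """Return the primary node plus the next rf-1 nodes clockwise.

    `ring` is a list of (node_token, node_name) sorted by token. A node owns
    the range that ends at its own token, so the primary is the first node
    whose token is >= the key's token, wrapping around past the last node.
    """
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical six-node ring with evenly spaced tokens.
ring = [(0, "A"), (17, "B"), (33, "C"), (50, "D"), (67, "E"), (83, "F")]

print(replicas_for(40, ring))  # ['D', 'E', 'F']
print(replicas_for(90, ring))  # ['A', 'B', 'C'] (wraps around the ring)
print(replicas_for(token_for("user:42"), ring))
```

Because only the neighbours of a token range are affected when a node joins or leaves, this layout keeps data movement local, which is the main reason consistent hashing is used here.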

Write Path

  1. Client -> any node (coordinator)
  2. Coordinator computes hash(key) -> token
  3. Token -> primary node + RF-1 replicas
  4. Parallel write to all replicas
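
A minimal sketch of step 4, assuming the replica set has already been resolved from the token. The `Replica` class, `coordinate_write`, and `required_acks` are hypothetical names for illustration; a real coordinator returns as soon as enough acknowledgements arrive, while this toy version waits for every replica and then counts.

```python
from concurrent.futures import ThreadPoolExecutor


class Replica:
    """Hypothetical in-memory stand-in for a replica node."""

    def __init__(self, name: str, up: bool = True):
        self.name = name
        self.up = up
        self.data: dict[str, str] = {}

    def write(self, key: str, value: str) -> bool:
        if not self.up:
            return False  # simulate an unreachable node
        self.data[key] = value
        return True


def coordinate_write(replicas: list[Replica], key: str, value: str,
                     required_acks: int) -> bool:
    """Send the write to all replicas in parallel and count acknowledgements.

    The write succeeds if at least `required_acks` replicas acknowledged it.
    """
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        acks = sum(pool.map(lambda r: r.write(key, value), replicas))
    return acks >= required_acks


replicas = [Replica("D"), Replica("E"), Replica("F", up=False)]
print(coordinate_write(replicas, "user:42", "alice", required_acks=2))  # True
print(coordinate_write(replicas, "user:42", "alice", required_acks=3))  # False
```

The same write reaches every live replica in both calls; only the number of acknowledgements the coordinator demands changes, which is what a consistency level controls.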

Related chapter

CAP theorem

Fundamental limitation of distributed systems: C, A, P.

Read the review

Cassandra and the CAP theorem

AP

Availability + Partition Tolerance

Cassandra: an AP system by default

When the network is partitioned, Cassandra prefers to remain available, even if that means returning potentially stale data. This makes it a good fit for scenarios where availability is more critical than strict consistency.

Tunable Consistency

Distinctive feature: Cassandra lets you tune the consistency level for each request. With QUORUM or ALL, its behavior moves closer to CP.

Related chapter

PACELC theorem

An extension of CAP: trade-offs during normal operation.

Read the review

Cassandra and PACELC

PA/EL

Partition → Availability, Else → Latency

Speed over consistency

if P (Partition)

Selects Availability — continues to serve requests, even if some of the replicas are unavailable.

else E (Normal)

Selects Low Latency — does not wait for confirmation from all replicas.

This explains why Cassandra defaults to eventual consistency even when the network is healthy: not out of fear of partitions, but to keep latency minimal.

Related chapter

Consistency patterns and idempotency

Practical consistency models, idempotent operations and trade-offs.

Read the review

Consistency levels

Cassandra offers a range of consistency levels, from maximum speed to strong consistency:

  • ANY (minimum latency): the write is considered successful even if it was stored only as a hint for hinted handoff. Applies to writes only.
  • ONE: a response from a single replica is enough. Fast, but reads may return stale data.
  • QUORUM (recommended): requires a response from a majority of replicas, floor(RF/2) + 1. Balances speed and consistency.
  • ALL: requires a response from all replicas. Maximum consistency, minimum availability.
  • SERIAL (LWT): linearizable consistency via Paxos, used for Lightweight Transactions (compare-and-set operations).

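Strong consistency formula: if W + R > RF, where W is the number of replicas that must acknowledge a write, R is the number of replicas that must respond to a read, and RF is the replication factor, then every read set overlaps every write set in at least one replica, so a read is guaranteed to see the latest acknowledged write.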
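
The arithmetic is easy to check by brute force. The sketch below uses hypothetical helper names (`quorum`, `is_strongly_consistent`, `always_overlap`) and models replicas as plain integers; it confirms that the W + R > RF condition is exactly the condition under which every possible write set shares a replica with every possible read set.

```python
from itertools import combinations


def quorum(rf: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return rf // 2 + 1


def is_strongly_consistent(w: int, r: int, rf: int) -> bool:
    """The rule from the text: read and write sets must overlap."""
    return w + r > rf


def always_overlap(w: int, r: int, rf: int) -> bool:
    """Brute force: does every W-sized set intersect every R-sized set?"""
    nodes = range(rf)
    return all(
        set(ws) & set(rs)
        for ws in combinations(nodes, w)
        for rs in combinations(nodes, r)
    )


rf = 3
q = quorum(rf)
print(q)                                  # 2
print(is_strongly_consistent(q, q, rf))   # True: QUORUM writes + QUORUM reads
print(is_strongly_consistent(1, 1, rf))   # False: ONE + ONE may miss the write
print(is_strongly_consistent(rf, 1, rf))  # True: ALL writes + ONE reads
print(always_overlap(q, q, rf))           # True
```

With RF = 3, QUORUM on both sides tolerates one unavailable replica while still satisfying the rule, which is why it is the usual recommendation.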

Related chapter

Replication and sharding

How to choose a sharding strategy, keep shards balanced, and scale writes and reads.

Read the review

Architecture

LSM-Tree Storage

1. Writes are appended to the Commit Log (durability)

2. Data accumulates in the MemTable (in memory)

3. The MemTable is periodically flushed to an SSTable (immutable, on disk)

4. Background Compaction merges SSTables
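
The four steps can be illustrated with a toy store. This is a sketch under simplifying assumptions, not Cassandra's storage engine: `TinyLSM` and its methods are hypothetical, everything lives in memory (the "commit log" and "SSTables" are Python lists), and deletes, tombstones, and indexes are left out.

```python
class TinyLSM:
    """Toy LSM store: commit log -> memtable -> immutable SSTables -> compaction."""

    def __init__(self, memtable_limit: int = 2):
        self.commit_log: list[tuple[str, str]] = []      # append-only stand-in for a log file
        self.memtable: dict[str, str] = {}               # in-memory, mutable
        self.sstables: list[list[tuple[str, str]]] = []  # oldest first, each sorted by key
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: str) -> None:
        self.commit_log.append((key, value))  # step 1: log first, for durability
        self.memtable[key] = value            # step 2: then update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Step 3: write the memtable out as a sorted, immutable table."""
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs replay

    def get(self, key: str):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest table wins
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self) -> None:
        """Step 4: merge all tables, keeping only the newest value per key."""
        merged: dict[str, str] = {}
        for table in self.sstables:  # oldest first, so newer values overwrite
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
for key, value in [("a", "1"), ("b", "2"), ("a", "3"), ("c", "4")]:
    db.put(key, value)

print(len(db.sstables))  # 2
print(db.get("a"))       # 3
db.compact()
print(db.sstables)       # [[('a', '3'), ('b', '2'), ('c', '4')]]
```

Writes never modify an existing table, so they stay sequential; the cost is shifted to reads, which may have to check several tables until compaction merges them.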

Data distribution

Consistent Hashing - key → token → node

Virtual Nodes (vnodes) - uniform distribution

NetworkTopologyStrategy — rack/DC awareness

Replication Factor — number of copies

Gossip Protocol

• Peer-to-peer node discovery

• Phi Accrual Failure Detector

• No single point of failure

• Each node exchanges state with 1-3 other nodes every second
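
A small simulation shows how quickly such exchanges spread state. It is a sketch under assumptions: `gossip_round` and `rounds_until_converged` are hypothetical names, each node's view is just a map of node name to highest heartbeat seen, peers are drawn from a known membership list, and failure detection is not modelled.

```python
import random


def gossip_round(views: dict[str, dict[str, int]], rng: random.Random,
                 fanout: int = 2) -> None:
    """One round: every node bumps its heartbeat and syncs with `fanout` random peers."""
    names = list(views)
    for node in names:
        views[node][node] += 1  # advance this node's own heartbeat version
        candidates = [n for n in names if n != node]
        for peer in rng.sample(candidates, k=min(fanout, len(candidates))):
            # Push-pull exchange: both sides keep the highest heartbeat seen per node.
            known = set(views[node]) | set(views[peer])
            merged = {n: max(views[node].get(n, 0), views[peer].get(n, 0)) for n in known}
            views[node] = dict(merged)
            views[peer] = dict(merged)


def rounds_until_converged(node_names: list[str], seed: int = 0,
                           max_rounds: int = 100):
    """Return how many rounds pass until every node has heard of every other node."""
    rng = random.Random(seed)
    views = {n: {n: 0} for n in node_names}  # initially each node knows only itself
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == len(node_names) for view in views.values()):
            return round_no
    return None


print(rounds_until_converged(list("ABCDEF")))
```

No node plays a special role in the exchange, which is what removes the single point of failure; the number of rounds needed grows slowly with cluster size.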

Fault tolerance

Hinted Handoff - temporary storage of writes destined for unavailable nodes

Read Repair - stale replicas are corrected during reads

Anti-Entropy Repair - background synchronization between replicas

Merkle Trees - efficient comparison of replica data
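
The efficiency of Merkle trees comes from comparing hashes top-down and descending only into subtrees that differ. Below is a minimal sketch, assuming each leaf is a byte string summarizing one token range, both replicas use the same number of leaves, and that number is a power of two. The names `build_tree` and `diff_leaves` are hypothetical, and SHA-256 is an arbitrary choice for the demo.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves: list[bytes]) -> list[list[bytes]]:
    """Return hash levels from the leaves (level 0) up to the root (last level).

    Assumes len(leaves) is a power of two.
    """
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def diff_leaves(a: list[list[bytes]], b: list[list[bytes]]) -> list[int]:
    """Return indexes of leaves that differ, skipping subtrees whose hashes match."""

    def walk(depth: int, index: int) -> list[int]:
        if a[depth][index] == b[depth][index]:
            return []  # identical subtree, nothing below needs checking
        if depth == 0:
            return [index]
        return walk(depth - 1, 2 * index) + walk(depth - 1, 2 * index + 1)

    return walk(len(a) - 1, 0)


# Hypothetical per-range summaries held by two replicas; range 2 has diverged.
replica_1 = [b"r0:v1", b"r1:v1", b"r2:v1", b"r3:v1"]
replica_2 = [b"r0:v1", b"r1:v1", b"r2:v2", b"r3:v1"]

print(diff_leaves(build_tree(replica_1), build_tree(replica_2)))  # [2]
print(diff_leaves(build_tree(replica_1), build_tree(replica_1)))  # []
```

When replicas agree, a single root comparison is enough; when they differ, only the mismatched ranges need to be streamed between nodes.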

When to use Cassandra

✓ Good fit

  • Time-series data (IoT, metrics, logs)
  • High write throughput
  • Geographically distributed systems
  • When availability is more important than consistency
  • Messaging and activity feeds

✗ Not the best choice

  • ACID transactions across records
  • Complex JOINs and analytics
  • Frequently changing data schemas
  • Small amounts of data (<100GB)
  • Strict read-latency requirements

History

A detailed history of Cassandra can be found in the chapter "Cassandra: architecture and trade-offs"; the link leads directly to the timeline.

Today, Cassandra is used in large production systems (Netflix, Apple, Uber, etc.) as the foundation for write-heavy and geo-distributed workloads.

Key Findings

  1. Cassandra is an AP system under CAP and PA/EL under PACELC: it prioritizes availability and low latency.
  2. Tunable consistency lets you balance speed and consistency on a per-request basis.
  3. The architecture has no master node: all nodes are peers, coordinated through the Gossip Protocol.
  4. The LSM-Tree provides high write throughput through sequential disk writes.
  5. It is ideal for write-heavy workloads, time-series data, and geographically distributed systems.

Related chapters

  • CAP theorem - Core framework for understanding why Cassandra prioritizes availability under partitions and how that affects system-level SLAs.
  • PACELC theorem - Extension of CAP that explains Cassandra latency-versus-consistency trade-offs even in healthy network conditions.
  • Cassandra: architecture and trade-offs - Practical overview of Cassandra evolution, architecture constraints, and production operating patterns.
  • Replication and sharding - Operational guidance for replication factor, data placement, and rebalancing in large Cassandra deployments.
  • Jepsen and consistency models - How fault-injection testing validates distributed guarantees and what it means for real Cassandra consistency behavior.
  • Database Selection Framework - Selection framework to justify Cassandra for write-heavy and geo-distributed workloads in modern architectures.
