Telegram: book_cube
Original post with analysis of Cassandra.
Cassandra: The Definitive Guide, 3rd Edition
Authors: Jeff Carpenter, Eben Hewitt
Publisher: O'Reilly Media, Inc.
Length: 426 pages
Wide Column Store: Bigtable + Dynamo architecture, tunable consistency, LSM-Tree, Gossip Protocol. AP (CAP) and PA/EL (PACELC) classification.
Apache Cassandra is one of the best-known NoSQL databases. It combines the data model of Google Bigtable with the distributed architecture of Amazon Dynamo. Let's look at its architecture, its consistency model, and its place in the CAP/PACELC classification.
Origin: Bigtable + Dynamo
Google Bigtable
From Bigtable, Cassandra took:
- Wide Column data model
- LSM-Tree for storage (MemTable → SSTable)
- Append-only writes to the commit log
Amazon Dynamo
From Dynamo, Cassandra inherited:
- Consistent Hashing for distribution
- Gossip Protocol for node discovery and failure detection
- Tunable consistency (set per request)
Architecture visualization
Ring Topology
Consistent Hashing
Replication Factor = 3
Each key is stored on 3 nodes: primary node and the next 2 clockwise nodes.
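The placement rule above can be sketched in a few lines. This is a minimal illustration, not Cassandra's implementation: the `Ring` class and `token` helper are hypothetical names, MD5 stands in for the real partitioner, and each node gets a single token (no vnodes).

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative stand-in)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class Ring:
    """Toy consistent-hashing ring: one token per node, no vnodes."""

    def __init__(self, nodes, rf=3):
        self.rf = rf
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key):
        # Primary = first node whose token is >= the key's token (wrapping
        # around the ring); the remaining replicas are the next nodes clockwise.
        size = len(self.entries)
        start = bisect.bisect_left(self.tokens, token(key)) % size
        count = min(self.rf, size)
        return [self.entries[(start + i) % size][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"], rf=3)
owners = ring.replicas("user:42")  # primary first, then 2 clockwise neighbours
```

Adding or removing a node only changes ownership for keys adjacent to that node's token, which is the main reason to hash onto a ring rather than use `hash(key) % node_count`.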
Write Path
- Client -> any node (coordinator)
- Coordinator computes hash(key) -> token
- Token -> primary node + RF-1 replicas
- Parallel write to all replicas
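The last step of the write path can be sketched as a coordinator that fans the write out to every replica and counts acknowledgments. The `Replica` class and `coordinate_write` function are hypothetical names for illustration; unlike a real coordinator, this sketch waits for every reply before deciding and does not store hints for failed replicas.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class Replica:
    """In-memory stand-in for a node that may be down."""

    def __init__(self, name, up=True):
        self.name, self.up, self.data = name, up, {}

    def write(self, key, value):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value
        return self.name


def coordinate_write(replicas, key, value, required_acks):
    """Send the write to all replicas in parallel and report whether at
    least `required_acks` of them acknowledged it."""
    acks = []
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(r.write, key, value) for r in replicas]
        for fut in as_completed(futures):
            try:
                acks.append(fut.result())
            except ConnectionError:
                pass  # a real coordinator would keep a hint for this replica
    return len(acks) >= required_acks, sorted(acks)


replicas = [Replica("n1"), Replica("n2", up=False), Replica("n3")]
ok, acked = coordinate_write(replicas, "k", "v", required_acks=2)
```

With one of three replicas down, the write still succeeds when two acknowledgments are required, and fails if all three are required.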
Related chapter
CAP theorem
Fundamental limitation of distributed systems: C, A, P.
Cassandra and the CAP theorem
Availability + Partition Tolerance
Cassandra is an AP system by default
When the network is partitioned, Cassandra prefers to remain available, even if that means returning potentially stale data. This makes it a good fit for scenarios where availability matters more than strict consistency.
Tunable Consistency
Distinctive feature: Cassandra lets you tune the consistency level per request. With QUORUM or ALL, its behavior moves closer to CP.
Related chapter
PACELC theorem
Extending CAP: Tradeoffs in Normal Mode.
Cassandra and PACELC
Partition → Availability, Else → Latency
Speed over consistency
Under partition (P → A): chooses availability and keeps serving requests even if some replicas are unavailable.
Otherwise (E → L): chooses low latency and does not wait for acknowledgment from all replicas.
This explains why Cassandra uses eventual consistency even when the network is healthy: not out of fear of partitions, but to keep latency low.
Related chapter
Consistency patterns and idempotency
Practical consistency models, idempotent operations and trade-offs.
Consistency levels
Cassandra offers a range of consistency levels, from maximum speed to strong consistency:
- ANY: the write is considered successful even if it was only stored as a hint (hinted handoff). Applies to writes only.
- ONE: a single replica response is enough. Fast, but reads may return stale data.
- QUORUM: requires responses from a majority of replicas (floor(RF/2) + 1). Balances speed and consistency.
- ALL: requires responses from all replicas. Strongest consistency, lowest availability.
- SERIAL: linearizable consistency via Paxos, used for lightweight transactions (compare-and-set operations).
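Strong consistency formula: if W + R > RF, where W is the number of replicas that must acknowledge a write, R is the number that must respond to a read, and RF is the replication factor, then every read overlaps the latest acknowledged write in at least one replica, so the read can return the most recently written version.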
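The arithmetic behind the formula is easy to check in code. A minimal sketch, assuming only the levels ONE, QUORUM, and ALL; the function names are hypothetical and not part of any Cassandra driver.

```python
def quorum(rf: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return rf // 2 + 1


def replicas_required(level: str, rf: int) -> int:
    """Number of replica responses each consistency level waits for."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def reads_see_latest_write(write_level: str, read_level: str, rf: int) -> bool:
    """True when W + R > RF, i.e. read and write replica sets must overlap."""
    w = replicas_required(write_level, rf)
    r = replicas_required(read_level, rf)
    return w + r > rf
```

For RF = 3, QUORUM writes plus QUORUM reads give 2 + 2 > 3, so the sets overlap. ONE plus ONE gives 1 + 1 = 2, which does not exceed 3, so a read may miss the latest write.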
Related chapter
Replication and sharding
How to choose a sharding strategy, keep shards balanced, and scale writes and reads.
Architecture
LSM-Tree Storage
1. Writes are appended to the commit log (durability)
2. Data accumulates in the MemTable (in memory)
3. The MemTable is periodically flushed to an SSTable (immutable, on disk)
4. Background compaction merges SSTables
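The four steps above can be sketched as a toy in-memory model. `ToyLSM` is a hypothetical class for illustration only: it keeps everything in Python lists and dicts, omits tombstones and commit-log truncation, and flushes on a size threshold rather than on a schedule.

```python
class ToyLSM:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in-memory buffer of recent writes
        self.sstables = []     # immutable sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.commit_log.append((key, value))  # step 1: log for durability
        self.memtable[key] = value            # step 2: buffer in memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Step 3: freeze the memtable into an immutable, sorted run.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest run wins
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self):
        # Step 4: merge all runs; iterating oldest first lets newer values
        # overwrite older ones.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))]


db = ToyLSM(memtable_limit=2)
for k, v in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    db.put(k, v)
```

After these four writes there are two SSTables, and the key `a` appears in both. Reads resolve it by checking the newest run first, and compaction collapses the duplicates into a single run.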
Data distribution
• Consistent Hashing - key → token → node
• Virtual Nodes (vnodes) - uniform distribution
• NetworkTopologyStrategy — rack/DC awareness
• Replication Factor — number of copies
Gossip Protocol
• Peer-to-peer node discovery
• Phi Accrual Failure Detector
• No single point of failure
• Each node gossips every second with 1-3 peers
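The idea behind a phi accrual failure detector can be sketched as follows. Instead of a fixed timeout, it outputs a suspicion level phi that grows the longer a heartbeat is overdue relative to the intervals observed so far. This sketch assumes exponentially distributed heartbeat intervals, a common simplification; the class name and the threshold of 8 are illustrative assumptions, not a description of Cassandra's exact code.

```python
import math


class PhiAccrualDetector:
    def __init__(self, threshold=8.0):
        self.threshold = threshold
        self.intervals = []         # observed gaps between heartbeats
        self.last_heartbeat = None

    def heartbeat(self, now):
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_heartbeat
        # Exponential model: P(silence >= elapsed) = exp(-elapsed / mean),
        # so phi = -log10(P) = elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))

    def is_suspect(self, now):
        return self.phi(now) > self.threshold


detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):   # heartbeats arrive once per second
    detector.heartbeat(t)
```

With one-second heartbeats, half a second of silence yields a phi well below the threshold, while roughly 27 seconds of silence pushes phi above 8 and the node is treated as down.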
Fault tolerance
• Hinted Handoff - temporary storage of writes destined for unavailable nodes
• Read Repair - stale replicas are corrected during reads
• Anti-Entropy Repair - background synchronization of replicas
• Merkle Trees - efficient comparison of data between replicas
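To illustrate why Merkle trees make replica comparison efficient, here is a minimal sketch that compares two replicas by hash tree and descends only into subtrees whose hashes differ. It assumes both replicas hold the same power-of-two number of leaf digests; the function names are hypothetical, and in a real repair the leaves would cover token ranges rather than individual rows.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(rows):
    """Return tree levels from leaves (index 0) up to the root (last)."""
    level = [_h(r) for r in rows]
    levels = [level]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(rows_a, rows_b):
    """Indices of leaves that differ, found by walking the trees top-down."""
    la, lb = merkle_levels(rows_a), merkle_levels(rows_b)
    suspects = [0]  # start at the root
    for depth in range(len(la) - 1, 0, -1):
        children = []
        for i in suspects:
            if la[depth][i] != lb[depth][i]:
                children += [2 * i, 2 * i + 1]  # descend only where hashes differ
        suspects = children
    return [i for i in suspects if la[0][i] != lb[0][i]]


replica_a = [b"r0", b"r1", b"r2", b"r3"]
replica_b = [b"r0", b"r1", b"r2-stale", b"r3"]
out_of_sync = differing_leaves(replica_a, replica_b)
```

When the root hashes match, the comparison stops after a single check, so in-sync replicas exchange almost no data.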
When to use Cassandra
✓ Good fit
- Time-series data (IoT, metrics, logs)
- High write throughput
- Geographically distributed systems
- When availability is more important than consistency
- Messaging and activity feeds
✗ Not the best choice
- ACID transactions across multiple records
- Complex JOINs and analytics
- Frequently changing data schemas
- Small amounts of data (<100GB)
- Strict read-latency requirements
History
A detailed history of Cassandra, including a timeline, can be found in the chapter "Cassandra: architecture and compromises".
Today, Cassandra runs in large production systems (Netflix, Apple, Uber, and others) as a foundation for write-heavy and geo-distributed workloads.
Key Findings
1. Cassandra is an AP system under CAP and PA/EL under PACELC: it prioritizes availability and low latency
2. Tunable consistency lets you balance speed and consistency on a per-request basis
3. Masterless architecture: all nodes are peers that exchange cluster state via the gossip protocol
4. The LSM-tree provides high write throughput through sequential disk writes
5. Well suited to write-heavy workloads, time-series data, and geographically distributed systems
