System Design Space

Updated: March 2, 2026 at 4:19 PM

Cassandra: architecture and trade-offs

History of Apache Cassandra, masterless architecture, tunable consistency and LSM-like storage.

Source

Apache Cassandra

History, architecture and features of Apache Cassandra.

Apache Cassandra is a distributed wide-column database that combines ideas from Amazon's Dynamo (partitioning and replication) and Google's Bigtable (data model and storage). It is designed for scalability and high availability, and its consistency level can be tuned to the system's requirements.

Cassandra specifics

Wide-column store

Data is organized into keyspaces and tables that are designed around known query patterns.

Masterless architecture

All nodes are equal, eliminating a single point of failure and increasing availability.

AP + tunable consistency

In CAP terms the system favors availability and partition tolerance, and the consistency level is configurable per operation.
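
To make "tunable consistency" concrete, here is a minimal sketch of the usual overlap rule: a read is guaranteed to touch at least one replica that acknowledged the latest write when read acks + write acks > replication factor. The function names are hypothetical and the check is a simplification that ignores failures, repair and timestamps.

```python
def quorum(rf: int) -> int:
    """Number of replicas that form a majority for replication factor rf."""
    return rf // 2 + 1


def read_overlaps_write(rf: int, write_acks: int, read_acks: int) -> bool:
    """True when any read replica set must intersect any write replica set."""
    return write_acks + read_acks > rf


rf = 3
# QUORUM writes + QUORUM reads: 2 + 2 > 3, so the replica sets always overlap.
quorum_overlap = read_overlaps_write(rf, quorum(rf), quorum(rf))
# ONE write + ONE read: 1 + 1 <= 3, so a read may hit a replica that missed the write.
one_overlap = read_overlaps_write(rf, 1, 1)
print(quorum_overlap, one_overlap)
```

Lower levels such as ONE trade this overlap guarantee for lower latency and higher availability.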

Limitations and compromises

  • No joins, and only limited support for ad-hoc queries.
  • The query model requires a schema designed up front around the expected reads.
  • Best suited to large data volumes and write-heavy workloads; smaller systems gain little from it.

Architecture visualization

Ring Topology

[Diagram: token ring with nodes A-F, token range 0-100]

Consistent Hashing

Replication Factor = 3

Each key is stored on 3 nodes: primary node and the next 2 clockwise nodes.
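
The placement rule above can be sketched as follows. This is an illustration under stated assumptions, not Cassandra's implementation: the token positions for nodes A-F are made up, the ring is 0-99, and MD5 stands in for the Murmur3 partitioner that Cassandra uses by default.

```python
import bisect
import hashlib

# Hypothetical token assignment for six nodes on a 0-99 ring (assumption).
RING = [(0, "A"), (17, "B"), (34, "C"), (50, "D"), (67, "E"), (84, "F")]
TOKENS = [token for token, _ in RING]


def token_for(key: str) -> int:
    """Map a key to a ring position; MD5 is a stand-in hash for this sketch."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def replicas_for_token(token: int, rf: int = 3) -> list[str]:
    """Primary = first node whose token is >= the key's token (wrapping around),
    followed by the next rf - 1 nodes clockwise."""
    start = bisect.bisect_left(TOKENS, token) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(rf)]


def replicas_for(key: str, rf: int = 3) -> list[str]:
    return replicas_for_token(token_for(key), rf)


print(replicas_for_token(40))  # token 40 falls to node D, then E and F
print(replicas_for_token(90))  # past the last token, so it wraps to A, B, C
```

Walking clockwise like this matches the simple placement described here; rack-aware or multi-datacenter placement would need extra rules.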

Write Path

  1. Client -> any node (coordinator)
  2. Coordinator computes hash(key) -> token
  3. Token -> primary node + RF-1 replicas
  4. Parallel write to all replicas
[Diagram legend: primary node, replica nodes, gossip protocol]
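
A rough sketch of step 4 of the write path: the coordinator sends the write to every replica and reports success once enough of them acknowledge. The `Replica` class and `coordinate_write` function are hypothetical; the loop is sequential where a real coordinator works in parallel, and mechanisms such as hinted handoff and timeouts are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)

    def write(self, key, value) -> bool:
        """Store the value and acknowledge, or fail if the node is down."""
        if not self.up:
            return False
        self.data[key] = value
        return True


def coordinate_write(replicas, key, value, required_acks: int) -> bool:
    """Send the write to all replicas; succeed if enough of them acknowledge."""
    acks = sum(1 for replica in replicas if replica.write(key, value))
    return acks >= required_acks


replicas = [Replica("A"), Replica("B", up=False), Replica("C")]
ok_quorum = coordinate_write(replicas, "user:42", "alice", required_acks=2)
ok_all = coordinate_write(replicas, "user:42", "alice", required_acks=3)
print(ok_quorum, ok_all)
```

With one of three replicas down, a write that needs two acknowledgements still succeeds, while one that needs all three fails.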

History: key milestones

2008

Facebook -> open source

Cassandra was created at Facebook and released as open source in 2008.

2009

Apache Incubator

The project moved to the Apache Incubator and began to develop as an open-source initiative.

2010

Top-level project

Apache Cassandra became a top-level Apache project.

2011

1.0: first stable major release

Cassandra established itself as a production-ready, standalone distributed DBMS.

2013

2.0: LWT and development of CQL

Lightweight transactions (CAS/Paxos) and noticeable improvements to the query model appear.

2015

3.0: major storage update

Major internal changes to the storage layer and performance improvements.

2021

4.0: Focus on stability

A release with a focus on reliability, predictability and operational maturity.

2024

5.0: SAI and vector search

New major release with Storage-Attached Indexes and capabilities for modern search/AI workloads.

2025

IBM and DataStax

IBM announced its acquisition of DataStax, which strengthens enterprise backing for the Cassandra ecosystem.

Cassandra architecture by layers

The architecture is layered: a coordinator routes requests, replication spreads data across nodes, and LSM-like storage persists it through a commit log, memtables and SSTables.

  • Clients and CQL: CQL, drivers, protocol
  • Routing and partitioning: partitioned row store, dynamic columns, keyspace / table
  • Replication and consistency: AP system, tunable consistency, masterless, multi-DC
  • Storage (LSM): commit log, memtable, SSTable, compaction, tombstones
  • OS + hardware: disk, CPU/RAM, network
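
The storage layer's parts (commit log, memtable, SSTable, compaction, tombstones) can be illustrated with a toy, single-node sketch. `TinyLSM` is a hypothetical class for explanation only: everything lives in memory, the flush threshold counts keys rather than bytes, and compaction drops tombstones immediately, whereas Cassandra keeps them for a grace period.

```python
TOMBSTONE = object()  # marker written in place of a deleted value


class TinyLSM:
    def __init__(self, memtable_limit: int = 2):
        self.commit_log = []   # append-only log of writes not yet flushed
        self.memtable = {}     # mutable in-memory table
        self.sstables = []     # immutable key-sorted tables, newest last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.commit_log.append((key, value))  # log first, then apply in memory
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        self.put(key, TOMBSTONE)  # a delete is just another write

    def flush(self):
        if self.memtable:
            self.sstables.append({k: self.memtable[k] for k in sorted(self.memtable)})
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs the log

    def get(self, key):
        # Check the newest data first: memtable, then SSTables from newest to oldest.
        for table in [self.memtable, *reversed(self.sstables)]:
            if key in table:
                value = table[key]
                return None if value is TOMBSTONE else value
        return None

    def compact(self):
        merged = {}
        for table in self.sstables:  # oldest first, so newer values overwrite
            merged.update(table)
        live = {k: merged[k] for k in sorted(merged) if merged[k] is not TOMBSTONE}
        self.sstables = [live] if live else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # second key triggers a flush to the first SSTable
db.put("a", 3)   # newer value for "a" sits in the memtable
db.delete("b")   # tombstone fills the memtable and triggers a second flush
print(db.get("a"), db.get("b"), len(db.sstables))
```

Writes never modify an existing SSTable, which is what keeps the write path sequential and fast; the cost is paid later by reads that check several tables and by compaction.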

Cluster architecture

  • All nodes are equal
  • No single point of failure
  • Linear scaling

Data model

  • Keyspace -> Table -> Row
  • Flexible columns
  • Denormalization
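
As a rough illustration of this data model, the sketch below keeps rows grouped by a partition key and sorted by a clustering key, and denormalizes by writing the same event into one table per query. The `Table` class, the table names and the `record_message` helper are hypothetical and exist only for this example; they are not a Cassandra API.

```python
import bisect
from collections import defaultdict


class Table:
    """Rows grouped by partition key and kept sorted by clustering key."""

    def __init__(self):
        # partition key -> list of (clustering key, columns), sorted by clustering key
        self.partitions = defaultdict(list)

    def insert(self, partition_key, clustering_key, **columns):
        rows = self.partitions[partition_key]
        keys = [ck for ck, _ in rows]
        i = bisect.bisect_left(keys, clustering_key)
        if i < len(rows) and rows[i][0] == clustering_key:
            rows[i][1].update(columns)  # same primary key: upsert the columns
        else:
            rows.insert(i, (clustering_key, dict(columns)))

    def select(self, partition_key):
        """Read one partition; rows come back in clustering order."""
        return list(self.partitions.get(partition_key, []))


# Denormalization: one table per query pattern, both written on every event.
messages_by_user = Table()
messages_by_room = Table()


def record_message(user, room, ts, text):
    messages_by_user.insert(user, ts, room=room, text=text)
    messages_by_room.insert(room, ts, user=user, text=text)


record_message("alice", "general", 2, "second")
record_message("alice", "general", 1, "first")
record_message("bob", "general", 3, "third")
print([ts for ts, _ in messages_by_user.select("alice")])
print([ts for ts, _ in messages_by_room.select("general")])
```

Each read touches a single partition and needs no join, at the price of writing the data more than once.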

DDL vs DML: how a request is processed

DDL operates on keyspace and table schemas; DML operates on data. Below are the basic steps for both types of requests.

How a request flows through Cassandra

Comparing the execution chain for DDL (schema) and DML (data)

1. Node accepts request

Any cluster node can accept a DML request.

Data operations

  • DML works with data and indexes without changing schema.
  • Write path is optimized for high write throughput.
  • Consistency level defines write acknowledgement behavior.
Write-optimized path, LSM storage, tunable consistency

Why choose Cassandra

  • Linear scaling when adding nodes.
  • High availability without a single point of failure.
  • Good write performance thanks to LSM-like storage.
  • Flexible consistency settings for different scenarios.

© 2026 Alexander Polomodov