Source
MongoDB
History of MongoDB and key features, including transactions and consistency issues.
MongoDB is a document-oriented NoSQL database that has evolved from “fast JSON storage” into a system with replication, sharding, and multi-document ACID transactions. This chapter covers its history and how its consistency guarantees have changed.
History: key milestones
Start of development
In 2007, 10gen begins developing MongoDB as a component of its planned PaaS platform.
Release and open source
In 2009, the company shifts its focus to an open-source development model with commercial support.
MongoDB Inc.
In 2013, 10gen is renamed MongoDB Inc.
Atlas
In 2016, MongoDB Atlas, the managed database-as-a-service (DBaaS) offering, launches and goes on to become the primary way to consume the product.
IPO
In 2017, MongoDB goes public (ticker MDB).
4.0: transactions + snapshot
In 2018, version 4.0 introduces multi-document ACID transactions and the snapshot read concern.
5.0: w:majority by default
According to Wikipedia, version 5.0 (2021) raises the default write concern to majority (w: majority).
MongoDB architecture (modern versions)
MongoDB has a client and driver layer, a routing and query layer, and a replication and sharding layer on top of the storage engine.
Sharded cluster components
mongos (router)
Routes requests to target shards based on cluster metadata.
Config servers
Store cluster metadata and sharding state.
Shards (replica sets)
Each shard is deployed as a replica set.
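The routing role of mongos can be illustrated with a toy model. The sketch below assumes ranged sharding on a numeric shard key; the chunk boundaries, the shard names, and the `route` and `route_range` helpers are all hypothetical and only stand in for the metadata that config servers hold. It is not how mongos is implemented internally.

```python
from bisect import bisect_right

# Hypothetical chunk map, standing in for metadata kept on the config servers.
# Each entry is (inclusive lower bound of the shard key, owning shard); a chunk
# extends up to, but not including, the next entry's lower bound.
CHUNKS = [
    (float("-inf"), "shardA"),
    (1000, "shardB"),
    (5000, "shardC"),
]


def route(shard_key_value):
    """Return the shard owning the chunk that contains shard_key_value."""
    lower_bounds = [lower for lower, _ in CHUNKS]
    index = bisect_right(lower_bounds, shard_key_value) - 1
    return CHUNKS[index][1]


def route_range(low, high):
    """Return the shards to contact for a query on low <= key < high."""
    shards = []
    for i, (lower, shard) in enumerate(CHUNKS):
        upper = CHUNKS[i + 1][0] if i + 1 < len(CHUNKS) else float("inf")
        # A chunk is targeted when its key range overlaps [low, high).
        if lower < high and upper > low and shard not in shards:
            shards.append(shard)
    return shards


if __name__ == "__main__":
    print(route(42))               # single-shard (targeted) operation
    print(route_range(900, 1100))  # range that spans two chunks
```

In this model, a query that includes the shard key touches only the shards whose chunks overlap the requested range, while a query without the shard key would have to be sent to every shard.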
Typical deployment modes
Standalone
Single mongod without sharding or replication.
Replica set
One primary plus one or more secondaries, kept in sync by replaying the primary's oplog (operations log).
Sharded cluster
mongos + config servers + multiple shard replica sets.
DDL vs DML: the request path
DDL changes the structure of collections and indexes, DML works with documents. Below are the execution chains for both types of requests.
How a request flows through MongoDB
Comparing the execution chain for DDL (schema) and DML (data)
1. Client command
A CRUD request arrives through the driver.
Data operations
- DML works with documents and indexes without changing schema.
- The main load falls on the cache, index maintenance, and journaling.
- Read and write concerns define the latency-versus-durability trade-off.
Consistency in MongoDB: what you can configure
In a distributed database, “consistency” is not a single switch but a set of settings and trade-offs. Wikipedia emphasizes read/write concerns and the introduction of transactions as the key mechanisms.
Replication and sharding
MongoDB supports replication and sharding, so every read and write involves a trade-off shaped by network latency and possible node failures.
Read/Write concern
MongoDB uses read concern and write concern levels to control how fresh reads are and how durable a write must be before it is acknowledged.
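The latency-versus-durability trade-off behind write concern can be made concrete with a little arithmetic. The sketch below is a simplified model that assumes every replica set member is a data-bearing voting member (no arbiters); the `majority` and `is_acknowledged` helpers are hypothetical names, not part of any driver.

```python
def majority(voting_members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return voting_members // 2 + 1


def is_acknowledged(acks: int, voting_members: int, w="majority") -> bool:
    """Return True once enough members have confirmed the write.

    `w` is either the string "majority" or a number of members, mirroring
    the two common forms of the write concern `w` option.
    """
    required = majority(voting_members) if w == "majority" else int(w)
    return acks >= required


if __name__ == "__main__":
    # Three-member replica set where only the primary has applied the write.
    print(is_acknowledged(acks=1, voting_members=3, w=1))
    print(is_acknowledged(acks=1, voting_members=3, w="majority"))
```

With `w: 1` the client gets a reply as soon as the primary applies the write, which is faster but leaves a window in which a failover can roll the write back. With `w: majority` the reply waits for replication to a majority, which costs latency but means the write survives the loss of a minority of members.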
ACID transactions
Since version 4.0, multi-document ACID transactions are supported, which brings the model closer to classic RDBMS scenarios.
How models and guarantees have changed over time
Safer defaults
- Wikipedia notes that the default write concern was raised to majority (w:majority), which reduces the risk of losing acknowledged writes during failures.
- For strict scenarios, it is important to choose read/write concern levels deliberately and to understand their impact on latency and availability.
What MongoDB guarantees today (in general)
- Supports replication and sharding, as well as multi-document ACID transactions (since 4.0).
- Read concern / write concern levels let you choose a trade-off between speed and safety.
- Wikipedia notes that in 5.0 the default write concern was raised to majority (w:majority).
Practical lesson for system design: When choosing MongoDB, it is important to agree in advance what guarantees the product needs (latency vs safety), and check that the configuration and drivers actually set the expected read/write levels.
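One way to act on this lesson is to check the connection string before deployment. The sketch below assumes the levels are set through the standard connection string options `w` and `readConcernLevel`; the `concern_options` and `find_mismatches` helpers and the example hosts are hypothetical. It inspects only the URI, so levels overridden in application code (per client, collection, or operation) would need a separate check.

```python
from urllib.parse import parse_qs


def concern_options(uri: str) -> dict:
    """Extract query options from a MongoDB connection string.

    Option names are lowercased because connection string option names
    are treated as case-insensitive.
    """
    query = uri.partition("?")[2]
    return {key.lower(): values[-1] for key, values in parse_qs(query).items()}


def find_mismatches(uri: str, expected: dict) -> dict:
    """Return {option: (wanted, found)} for every option that differs."""
    actual = concern_options(uri)
    mismatches = {}
    for name, wanted in expected.items():
        found = actual.get(name.lower())
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Hypothetical connection string for illustration only.
    uri = "mongodb://db1.example.net:27017/app?replicaSet=rs0&w=1"
    expected = {"w": "majority", "readConcernLevel": "majority"}
    for option, (wanted, found) in find_mismatches(uri, expected).items():
        print(f"{option}: expected {wanted!r}, found {found!r}")
```

A check like this could run in CI against each environment's configuration, so that a weaker setting is caught before it reaches production.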
