System Design Space

Updated: March 2, 2026 at 7:29 PM

Prometheus: history and architecture


Prometheus from a system design perspective: timeline, layered architecture, write/read flow, and a practical DDL-like/DML-like model for monitoring workloads.

Source

Prometheus docs

Official overview of Prometheus architecture: pull scraping, TSDB, PromQL, and rules.

Visit the site

Prometheus is a purpose-built time-series monitoring system that combines pull-based metric collection, its own TSDB engine, and PromQL query semantics. Within the TSDB landscape, it is the canonical choice for infrastructure monitoring and SLO-driven operations.

History: key milestones

2012

Born at SoundCloud

Prometheus started as an internal monitoring engine for cloud-native workloads.

2015

Open source and early ecosystem

The project became open source and quickly gained adoption in Kubernetes environments.

2016

CNCF incubating stage

Prometheus joined CNCF and established a neutral governance model.

2018

CNCF graduated

It reached graduated status and became an industry standard for infrastructure monitoring.

2023

Prometheus 2.x as the production baseline

Remote write/read patterns, operator workflows, and cardinality tuning practices matured.

2024+

Evolution toward scalable TSDB profiles

Long-term storage, federation, and hybrid Prometheus topologies became standard patterns.

Prometheus specifics

Pull-based collection

Prometheus scrapes targets itself, which simplifies topology control and endpoint health management.
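Each scrape fetches a target's metrics in the Prometheus text exposition format. A minimal, simplified parser for that format can be sketched in Python; it skips `# HELP`/`# TYPE` comments and ignores timestamps, escaping, and histogram conventions that a real scraper must handle.

```python
import re

# Simplified parser for the Prometheus text exposition format:
# lines look like  metric_name{label="value",...} 1234.5
# Real scrapers also handle timestamps, escaping, and metric families;
# this sketch does not.
LINE_RE = re.compile(r'^(\w+)(?:\{(.*)\})?\s+([0-9.eE+-]+)$')

def parse_exposition(text):
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blanks and HELP/TYPE comments
        m = LINE_RE.match(line)
        if not m:
            continue
        name, raw_labels, value = m.groups()
        labels = {}
        if raw_labels:
            for pair in raw_labels.split(','):
                k, v = pair.split('=', 1)
                labels[k] = v.strip('"')
        samples.append((name, labels, float(value)))
    return samples
```

Given a scraped body such as `http_requests_total{method="get",code="200"} 1027`, this yields the metric name, a label map, and a float sample value, which is the shape the ingest path works with.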

TSDB with WAL + blocks

Metrics follow WAL -> head -> compacted blocks, giving a predictable storage lifecycle.
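The WAL -> head -> blocks lifecycle can be sketched as a toy model. The `ToyTSDB` class below is illustrative only, not the real TSDB API: real Prometheus uses memory-mapped chunks, background compaction, and checkpointed WAL truncation, but the ordering of the steps is the same.

```python
# Toy model of the Prometheus storage lifecycle (illustrative only):
# samples go to the WAL first, live in an in-memory head, get cut into
# fixed-range blocks, and are dropped once past retention.
BLOCK_RANGE_MS = 2 * 60 * 60 * 1000   # 2h, the default TSDB block range

class ToyTSDB:
    def __init__(self, retention_ms):
        self.retention_ms = retention_ms
        self.wal = []      # durable append log
        self.head = []     # in-memory, most recent samples
        self.blocks = []   # list of (min_ts, max_ts, samples)

    def append(self, ts, value):
        self.wal.append((ts, value))  # WAL first, for crash durability
        self.head.append((ts, value))
        self._maybe_cut_block(ts)

    def _maybe_cut_block(self, now_ts):
        # Cut the head into an immutable block once it spans a full range.
        if self.head and now_ts - self.head[0][0] >= BLOCK_RANGE_MS:
            block = self.head
            self.blocks.append((block[0][0], block[-1][0], block))
            self.head = []
            self.wal = []  # WAL can be truncated after a successful cut
        # Retention: drop blocks whose newest sample is too old.
        self.blocks = [b for b in self.blocks
                       if now_ts - b[1] < self.retention_ms]
```

The key property this models is predictability: writes always hit the WAL and head, and disk blocks only change through block cuts, compaction, and retention.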

PromQL as the query language

PromQL is optimized for time-series vectors, label-based aggregation, and time-window analysis.
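To make the window-analysis point concrete, here is a sketch of what `rate()` computes over a range vector: the per-second increase of a counter within a time window. This is deliberately simplified; real PromQL also detects counter resets and extrapolates to the window boundaries.

```python
def simple_rate(samples, window_end, window_s):
    """Per-second increase over [window_end - window_s, window_end].

    `samples` is a time-sorted list of (unix_ts, counter_value) pairs.
    Simplified vs. real PromQL rate(): no counter-reset detection and
    no extrapolation to the window edges.
    """
    start = window_end - window_s
    in_window = [(t, v) for t, v in samples if start <= t <= window_end]
    if len(in_window) < 2:
        return None  # PromQL emits no value with fewer than two points
    first_t, first_v = in_window[0]
    last_t, last_v = in_window[-1]
    return (last_v - first_v) / (last_t - first_t)
```

For a counter that grows from 0 to 60 over 30 seconds, `simple_rate` returns 2.0, matching the intuition behind `rate(http_requests_total[30s])`.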

Rule-driven alerting

Recording/alerting rules plus Alertmanager integration create a controlled incident response loop.
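The `for:` clause in alerting rules is what turns raw threshold breaches into a controlled loop: a condition must hold continuously for a duration before the alert fires. A minimal sketch of that state machine (the `ToyAlertRule` class is hypothetical; real Prometheus drives this from rule-group evaluation intervals and hands firing alerts to Alertmanager):

```python
# Toy evaluation of an alerting rule with a `for:` duration: the
# condition must hold for `for_s` seconds before the alert moves
# from "pending" to "firing".
class ToyAlertRule:
    def __init__(self, threshold, for_s):
        self.threshold = threshold
        self.for_s = for_s
        self.pending_since = None

    def evaluate(self, now_ts, value):
        if value <= self.threshold:
            self.pending_since = None  # condition cleared, reset
            return "inactive"
        if self.pending_since is None:
            self.pending_since = now_ts
        if now_ts - self.pending_since >= self.for_s:
            return "firing"
        return "pending"
```

The pending state is what suppresses flapping: a brief spike above the threshold never reaches Alertmanager.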

Prometheus architecture by layers

At a high level, Prometheus can be read as a pipeline: ingest -> TSDB head/WAL -> block storage -> PromQL/query engine -> rules/alerts -> external integrations.

  • Ingestion layer: service discovery, scrape loops, relabeling, remote write ingest
  • TSDB head + WAL: in-memory head, sample appends, label index, WAL segments
  • Block storage: 2h blocks, compaction, retention, TSDB snapshots
  • PromQL query engine: parser, instant/range evaluation, vector matching, aggregation
  • Rules and alerting: recording rules, alerting rules, rule groups, Alertmanager
  • Integrations and operations: Grafana, federation, remote write/read, HA pairs
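Relabeling in the ingestion layer is what shapes this pipeline before samples ever reach the TSDB. A sketch of two common relabel actions, `keep` and `replace`, on a target's label set (it mirrors the spirit of `relabel_configs`, not the exact configuration schema):

```python
import re

# Sketch of two common relabel actions on a target's label set:
# `keep` drops the target unless the source label matches the regex;
# `replace` writes a captured group into a target label.
def relabel(labels, rules):
    labels = dict(labels)
    for rule in rules:
        src = labels.get(rule["source_label"], "")
        m = re.fullmatch(rule["regex"], src)
        if rule["action"] == "keep":
            if not m:
                return None  # target dropped from the scrape pool
        elif rule["action"] == "replace" and m:
            labels[rule["target_label"]] = m.expand(rule["replacement"])
    return labels
```

A typical use is keeping only production namespaces from Kubernetes service discovery while deriving a clean `team` label from the namespace name.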

Key features

Prometheus is optimized for monitoring workloads: pull ingest, WAL/block-based TSDB, PromQL, and rule-driven alerting.

Pull model

HTTP scrape · Target health · Discovery-driven topology

Label model

Metric + labels · High-cardinality risk · PromQL selectors

Data lifecycle

WAL -> Head -> Blocks · Compaction · Retention/TTL via flags
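The high-cardinality risk noted above has a simple arithmetic shape: the number of active series for a metric is bounded by the product of the value counts of its labels. A back-of-the-envelope estimator (the counts below are made-up illustration values):

```python
from math import prod

# Back-of-the-envelope series-count estimate for one metric name:
# active series <= product of the number of values each label takes.
# This is an upper bound; real counts depend on which label
# combinations actually occur in the wild.
def estimate_series(label_cardinalities):
    return prod(label_cardinalities.values())
```

With 5 methods, 10 status codes, and 200 instances that is at most 10,000 series; adding a single unbounded label such as a user ID with 10,000 values multiplies it to 100 million, which is why per-request identifiers do not belong in metric labels.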

DDL vs DML: Prometheus model

Prometheus does not implement SQL DDL/DML literally. For system-design analysis, it is useful to separate DDL-like operations (scrape/rule topology updates) from DML-like operations (metric sample flow and PromQL read execution).

How the DDL/DML model works in Prometheus

DDL-like: scrape/rule topology updates. DML-like: sample flow and PromQL reads.

  1. Scrape / ingest: the scraper or remote write ingest receives new metric samples.
  2. WAL append: samples are appended to the WAL for durability before deeper processing.
  3. Head update: the TSDB head updates series state and the label index for fresh data.
  4. PromQL execution: the query engine reads the head and historical blocks, then aggregates the results.
  5. Compaction + retention: background compaction merges blocks, and retention removes expired data.

Data and query path

  • The DML-like path covers ingest, storage, and PromQL read execution.
  • Fresh data lives in head, historical data in compacted blocks.
  • Label cardinality has a direct impact on cost and query latency.

Source

InfluxDB docs

Reference context for an alternative TSDB profile.

Visit the site

Prometheus vs InfluxDB

Core approach

Prometheus: Pull-scrape model with tight integration into monitoring workflows.

InfluxDB: Strong focus on ingest APIs and time-series storage for broad telemetry scenarios.

Query language

Prometheus: PromQL focused on metrics, labels, and alert-oriented analysis.

InfluxDB: InfluxQL/Flux depending on version and data-processing profile.

Typical production profile

Prometheus: Infrastructure monitoring, SLOs, alerting, and Kubernetes observability.

InfluxDB: Monitoring + IoT + telemetry where flexible ingest and retention policies are key.

Operating model

Prometheus: Often combined with federation/remote storage for long-term retention.

InfluxDB: Often deployed as a standalone TSDB layer or as a managed service.

Why Prometheus is often chosen for monitoring

Practical interpretation for system-design workloads:

  • Prometheus became the cloud-native monitoring standard due to its simple pull model and strong Kubernetes ecosystem fit.
  • PromQL and rules create one unified path for observability and alerting without a separate DSL.
  • The WAL -> head -> blocks lifecycle makes write/read behavior predictable in production.
  • Integration with Alertmanager, Grafana, and remote storage supports growth from single-node to scalable topologies.


© 2026 Alexander Polomodov