System Design Space

Updated: March 25, 2026 at 2:00 AM

Prometheus: history and architecture


Prometheus from a system design perspective: timeline, layered architecture, write/read flow, and a practical DDL-like/DML-like model for monitoring workloads.

Prometheus matters not only as a de facto standard, but also as an unusually legible monitoring architecture in which both the strengths and the limits of the pull model are easy to see.

In real operations, this chapter helps you think about jobs, targets, service discovery, recording rules, and alerting as one connected system instead of a pile of unrelated YAML fragments.

In interviews and engineering discussions, it is especially useful when you need to explain why Prometheus is great for baseline cloud-native monitoring yet does not always solve long retention or global scale on its own.

Practical value of this chapter

Scrape topology

Design jobs, targets, and service discovery to prevent monitoring blind spots and duplicate time series.
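
The jobs/targets/discovery triad can be sketched as a minimal scrape configuration. This is an illustrative fragment, not a complete prometheus.yml; the `checkout-api` app label and job name are placeholders:

```yaml
scrape_configs:
  - job_name: "checkout-api"     # logical job; becomes the `job` label on every series
    scrape_interval: 15s
    kubernetes_sd_configs:
      - role: pod                # discover pods as candidate targets
    relabel_configs:
      # Keep only pods carrying the expected app label, so unrelated pods
      # are not scraped blindly (a common source of duplicate time series).
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: checkout-api
        action: keep
```

Relabeling is where blind spots and duplicates are usually fixed: `keep`/`drop` actions decide which discovered endpoints become real targets.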

Rules and alert pipeline

Separate recording and alerting rules to stabilize dashboard latency and improve alert quality.
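
As a sketch of that separation, a recording rule precomputes the expensive aggregation that dashboards read, so alerting and panels consume a cheap precomputed series. The rule name follows the common `level:metric:operation` convention and is illustrative:

```yaml
groups:
  - name: checkout-recording
    interval: 30s               # evaluation cadence for this rule group
    rules:
      - record: service:http_requests:rate5m
        expr: sum by (service) (rate(http_requests_total[5m]))
```

Dashboards then query `service:http_requests:rate5m` instead of re-running the raw `rate(...)` aggregation on every refresh.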

Remote write boundary

Define local Prometheus versus long-term storage responsibilities by SLA and cost constraints.
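
One way to express that boundary is a `remote_write` block: the local Prometheus keeps a short retention window for fast queries and alerting, while samples are streamed to long-term storage. The endpoint URL is a placeholder and the tuning values are illustrative:

```yaml
remote_write:
  - url: "https://long-term-store.example.com/api/v1/write"  # placeholder endpoint
    queue_config:
      max_samples_per_send: 2000   # batch size per outgoing request
      max_shards: 50               # upper bound on parallel senders
```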

Interview articulation

Explain pull-model trade-offs and when an additional aggregation layer becomes necessary.

Source

Prometheus docs

Official overview of Prometheus architecture: pull scraping, TSDB, PromQL, and rules.


Prometheus is a purpose-built time-series system for monitoring that combines pull-based metric collection, its own TSDB engine, and PromQL query semantics. On the TSDB map it represents the canonical choice for infrastructure monitoring and SLO-driven operations.

History: key milestones

2012

Born at SoundCloud

Prometheus started as an internal monitoring engine for cloud-native workloads.

2015

Open source and early ecosystem

The project became open source and quickly gained adoption in Kubernetes environments.

2016

CNCF incubating stage

Prometheus joined CNCF and established a neutral governance model.

2018

CNCF graduated

It reached graduated status and became an industry standard for infrastructure monitoring.

2023

Prometheus 2.x as the production baseline

Remote write/read patterns, operator workflows, and cardinality tuning practices matured.

2024+

Evolution toward scalable TSDB profiles

Long-term storage, federation, and hybrid Prometheus topologies became standard patterns.

Prometheus specifics

Pull-based collection

Prometheus scrapes targets itself, which simplifies topology control and endpoint health management.

TSDB with WAL + blocks

Metrics follow WAL -> head -> compacted blocks, giving a predictable storage lifecycle.

PromQL as the query language

PromQL is optimized for time-series vectors, label-based aggregation, and time-window analysis.
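
A small example of those semantics: label-based aggregation plus vector matching in a single ratio, the per-service 5xx error share. Metric and label names are the conventional HTTP instrumentation names and are illustrative:

```promql
sum by (service) (rate(http_requests_total{status=~"5.."}[5m]))
  /
sum by (service) (rate(http_requests_total[5m]))
```

Both operands are aggregated down to the same `service` label set, so the division matches samples one-to-one by label.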

Rule-driven alerting

Recording/alerting rules plus Alertmanager integration create a controlled incident response loop.
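
A minimal alerting rule illustrating that loop; the threshold, labels, and annotations are illustrative, and routing, grouping, and deduplication then happen in Alertmanager:

```yaml
groups:
  - name: checkout-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{service="checkout-api",status=~"5.."}[5m]))
            / sum(rate(http_requests_total{service="checkout-api"}[5m])) > 0.01
        for: 10m                 # condition must hold for 10m before firing
        labels:
          severity: page
        annotations:
          summary: "checkout-api 5xx ratio above 1%"
```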

Prometheus architecture by layers

At a high level, Prometheus can be read as a pipeline: ingest -> TSDB head/WAL -> block storage -> PromQL/query engine -> rules/alerts -> external integrations.

  • Ingestion layer: service discovery, scrape loops, relabeling, remote write ingest
  • TSDB head + WAL: in-memory head, sample appends, label index, WAL segments
  • Block storage: 2h blocks, compaction, retention, TSDB snapshots
  • PromQL query engine: parser, instant/range evaluation, vector matching, aggregation
  • Rules and alerting: recording rules, alerting rules, rule groups, Alertmanager
  • Integrations and operations: Grafana, federation, remote write/read, HA pairs

Key features

Prometheus is optimized for monitoring workloads: pull ingest, WAL/block-based TSDB, PromQL, and rule-driven alerting.

Pull model

HTTP scrape, target health, discovery-driven topology.

Label model

Metric + labels, high-cardinality risk, PromQL selectors.

Data lifecycle

WAL -> head -> blocks, compaction, retention/TTL via flags.
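
Retention in that lifecycle is driven by server flags rather than per-series TTLs. A typical invocation sketch (paths and values are illustrative):

```shell
# Time- and size-based retention; whichever limit is hit first wins.
prometheus \
  --storage.tsdb.path=/var/lib/prometheus \
  --storage.tsdb.retention.time=15d \
  --storage.tsdb.retention.size=200GB
```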

DDL vs DML: Prometheus model

Prometheus does not implement SQL DDL/DML literally. For system-design analysis, it is useful to separate DDL-like operations (scrape/rule topology updates) from DML-like operations (metric sample flow and PromQL read execution).

How the DDL/DML model works in Prometheus

DDL-like: scrape/rule topology updates. DML-like: sample flow and PromQL reads.
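
A compact way to see the two planes side by side: a rule-file change is DDL-like, since it alters the processing topology and takes effect on configuration reload, while the query reading the resulting series is DML-like. Rule and label names are illustrative:

```yaml
# DDL-like: add a recording rule, then reload configuration
# (e.g. POST /-/reload when --web.enable-lifecycle is on).
groups:
  - name: checkout
    rules:
      - record: service:http_requests:rate5m
        expr: sum by (service) (rate(http_requests_total[5m]))

# DML-like: a PromQL read over the series the rule now produces:
#   service:http_requests:rate5m{service="checkout-api"}
```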

1. Scrape / ingest

The scraper or remote write ingest receives new metric samples.

2. WAL append

Samples are appended to the WAL for durability before deeper processing.

3. Head update

The TSDB head updates series state and the label index for fresh data.

4. PromQL execution

The query engine reads the head and historical blocks, then aggregates results.

5. Compaction + retention

Background compaction merges blocks and retention removes expired data.

Data and query path

  • The DML-like path covers ingest, storage, and PromQL read execution.
  • Fresh data lives in head, historical data in compacted blocks.
  • Label cardinality has a direct impact on cost and query latency.
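
Cardinality on that path can be inspected directly. Two common checks; the second enumerates every series, so it is expensive and best run ad hoc rather than on a dashboard:

```promql
prometheus_tsdb_head_series                        # total active series in the head

topk(10, count by (__name__) ({__name__=~".+"}))   # heaviest metric names by series count
```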

Source

InfluxDB docs

Reference context for an alternative TSDB profile.


Prometheus vs InfluxDB

Core approach

Prometheus: Pull-scrape model with tight integration into monitoring workflows.

InfluxDB: Strong focus on ingest APIs and time-series storage for broad telemetry scenarios.

Query language

Prometheus: PromQL focused on metrics, labels, and alert-oriented analysis.

InfluxDB: InfluxQL/Flux depending on version and data-processing profile.

Typical production profile

Prometheus: Infrastructure monitoring, SLOs, alerting, and Kubernetes observability.

InfluxDB: Monitoring + IoT + telemetry where flexible ingest and retention policies are key.

Operating model

Prometheus: Often combined with federation/remote storage for long-term retention.

InfluxDB: Often deployed as a standalone TSDB layer or as a managed service.

Why Prometheus is often chosen for monitoring

Practical interpretation for system-design workloads:

  • Prometheus became the cloud-native monitoring standard due to its simple pull model and strong Kubernetes ecosystem fit.
  • PromQL and rules create one unified path for observability and alerting without a separate DSL.
  • The WAL -> head -> blocks lifecycle makes write/read behavior predictable in production.
  • Integration with Alertmanager, Grafana, and remote storage supports growth from single-node to scalable topologies.
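
The federation piece of that growth path is itself just a scrape job: a global Prometheus pulls pre-aggregated series from leaf instances via `/federate`. The targets and the `job:.*` match pattern are illustrative:

```yaml
scrape_configs:
  - job_name: "federate"
    honor_labels: true            # keep the leaf instance's original labels
    metrics_path: "/federate"
    params:
      "match[]":
        - '{__name__=~"job:.*"}'  # pull only aggregated recording-rule series
    static_configs:
      - targets: ["prometheus-a:9090", "prometheus-b:9090"]
```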

PromQL query examples

A compact cheat sheet for common production tasks: load, latency, error control, and SLO monitoring.

Throughput (RPS) by service

Estimate incoming traffic for `checkout-api` over the last 5 minutes.

sum(rate(http_requests_total{service="checkout-api"}[5m]))

A baseline signal for incoming traffic speed, usually one of the first RED dashboard panels.

P95 latency

Track latency degradation in the user-facing request path.

histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{service="checkout-api"}[5m])) by (le))

For p95/p99, use histogram `*_bucket` data instead of averages to capture tail behavior.

Error rate (%)

Measure the share of 5xx responses relative to total service traffic.

100 * (sum(rate(http_requests_total{service="checkout-api",status=~"5.."}[5m])) / sum(rate(http_requests_total{service="checkout-api"}[5m])))

A common base metric for SLO-aligned alerting and burn-rate rules.

CPU saturation by pod

Find pods that are approaching CPU limits.

sum(rate(container_cpu_usage_seconds_total{namespace="prod",pod=~"checkout-api-.*"}[5m])) by (pod)

Useful during autoscaling tuning and hotspot diagnosis per pod replica.

Top-k by memory usage

Quickly isolate the heaviest pods by working set memory.

topk(5, container_memory_working_set_bytes{namespace="prod",pod=~"checkout-api-.*"})

Helpful for OOMKill investigations and requests/limits right-sizing.

Error budget burn rate

Approximate error-budget spending speed for a 99.9% SLO.

(sum(rate(http_requests_total{service="checkout-api",status=~"5.."}[5m])) / sum(rate(http_requests_total{service="checkout-api"}[5m]))) / (1 - 0.999)

Values significantly above `1` indicate the service is burning budget faster than allowed.
