Primary source: Prometheus TSDB — storage fundamentals for metrics: local blocks, retention, compaction, and remote write.
Time-series databases (TSDBs) are easiest to evaluate across four axes: data model, storage engine, workload profile, and operating model. This chapter extends the Database Selection Framework and helps you decide when a purpose-built TSDB is the right choice versus an SQL or columnar engine used as a time-series platform.
TSDB selection map: 4 axes
1. Data model and query language
Native TSDB, SQL extension on top of RDBMS, TSDB layer over distributed storage, or columnar OLAP used as TSDB.
This axis defines how easy ad-hoc analytics, joins, and BI integration will be.
2. Storage design
Append-only/LSM, time partitioning, row vs column layout, built-in compression and downsampling.
It directly affects write throughput, storage cost, and speed of long-window aggregations.
3. Primary workload
Monitoring, IoT telemetry, financial time series, logs/metrics for product analytics.
Different workloads require different trade-offs across latency, retention, cardinality, and query flexibility.
4. Operating model
Self-hosted, managed cloud, or hybrid with multiple storage tiers.
It affects TCO, SRE requirements, and delivery speed for observability and analytics capabilities.
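The four axes above can be captured as a simple requirements checklist before comparing engines. A minimal sketch; all field names and example values here are illustrative, not taken from any specific tool:

```python
from dataclasses import dataclass

@dataclass
class TsdbRequirements:
    # Axis 1: data model and query language
    query_language: str          # e.g. "PromQL", "SQL", "custom"
    needs_joins: bool
    # Axis 2: storage design
    ingest_points_per_sec: int
    needs_downsampling: bool
    # Axis 3: primary workload
    workload: str                # "monitoring", "iot", "finance", "analytics"
    # Axis 4: operating model
    managed_cloud: bool

# Example: an IoT platform with SQL analytics and self-hosted operations.
reqs = TsdbRequirements(
    query_language="SQL", needs_joins=True,
    ingest_points_per_sec=200_000, needs_downsampling=True,
    workload="iot", managed_cloud=False,
)
print(reqs)
```

Filling this in per candidate system makes the trade-offs in the families below explicit rather than implicit.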
Main TSDB families
1. Purpose-built TSDB systems
Engines designed from day one for time-series workloads and high write ingestion.
Typical characteristics
- Append-only write path optimized for high ingest throughput.
- Time/value compression, retention/TTL, and downsampling out of the box.
- Time-bucket aggregations and window functions as a first-class use case.
Typical products
InfluxDB, Prometheus, VictoriaMetrics, M3, Thanos, Graphite/Whisper
When to use
- Infrastructure monitoring and application metrics.
- IoT telemetry with relatively simple label schemas.
- Workloads where write path and retention matter more than complex joins.
Trade-offs
- Support for complex relational analytics is often limited.
- High-cardinality labels can increase storage and query cost quickly.
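The cardinality risk is multiplicative: the worst-case active series count is the product of per-label cardinalities, so one stray high-cardinality label can blow up storage. A minimal sketch with illustrative numbers:

```python
from math import prod

def estimate_series(label_cardinalities: dict[str, int]) -> int:
    """Worst-case active series count: the product of per-label cardinalities."""
    return prod(label_cardinalities.values())

# A modest-looking label schema...
modest = {"host": 500, "region": 5, "service": 40}
# ...versus the same schema with one unbounded label (e.g. a user ID).
risky = dict(modest, user_id=10_000)

print(estimate_series(modest))  # 100_000 series
print(estimate_series(risky))   # 1_000_000_000 series
```

This is why most native TSDBs document label-design guidelines: the write path scales with points/sec, but index and query cost scale with series count.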
2. TSDB as an extension of relational DB
A path where time-series capabilities are added to an SQL engine (for example, PostgreSQL).
Typical characteristics
- SQL as the main query language and easy BI integration.
- Time partitioning, hypertables/continuous aggregates, and space partitioning.
- Joins with OLTP tables and one data model for metrics plus domain entities.
Typical products
TimescaleDB, PipelineDB-like patterns, kdb+
When to use
- You need complex SQL and frequent joins with transactional data.
- Your team is already strong in the PostgreSQL ecosystem.
- You want fewer technologies in the platform stack.
Trade-offs
- Peak ingest is often lower than in highly specialized TSDB engines.
- At very large scale, tuning complexity grows quickly.
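The core query pattern this family optimizes is time-bucketed aggregation (e.g. TimescaleDB's time_bucket plus an aggregate, often precomputed as a continuous aggregate). A pure-Python analogue of what that computes, as a sketch:

```python
from collections import defaultdict

def time_bucket_avg(points, bucket_seconds):
    """Average (ts, value) points into fixed-width time buckets,
    mirroring what SQL time_bucket(...) + AVG() computes."""
    sums, counts = defaultdict(float), defaultdict(int)
    for ts, value in points:
        bucket = ts - ts % bucket_seconds  # align timestamp down to bucket start
        sums[bucket] += value
        counts[bucket] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}

points = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0)]
print(time_bucket_avg(points, 60))  # {0: 2.0, 60: 6.0}
```

In the SQL-extension path this runs inside the engine over partitioned tables, which is what makes joins with OLTP data cheap relative to exporting metrics elsewhere.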
3. TSDB on top of distributed storage systems
A time-series layer built over HBase/Cassandra/Bigtable-like storage for extreme scale.
Typical characteristics
- Horizontal scale to very large volumes with long retention.
- Fault tolerance and replication inherited from the underlying KV/column-store layer.
- Usually a multi-component architecture with explicit capacity planning.
Typical products
OpenTSDB (HBase), KairosDB/Cassandra patterns, custom TSDB schemas
When to use
- Telecom/cloud/SaaS environments with trillions of points and long retention.
- You need linear scale-out as cluster size grows.
- The team is ready for distributed-system operational complexity.
Trade-offs
- Higher administration and tuning complexity.
- Longer path from idea to production due to many moving parts.
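These layers typically work by packing time series into wide rows keyed by metric, a coarse base timestamp, and tags, so one storage row holds a window of samples. A readable sketch of an OpenTSDB-style row key; real OpenTSDB uses fixed-width binary UIDs and an hourly base timestamp, strings are used here only for clarity:

```python
def row_key(metric: str, ts: int, tags: dict[str, str],
            base_seconds: int = 3600) -> str:
    """OpenTSDB-style row key sketch: metric, hour-aligned base timestamp,
    then sorted tag pairs so identical series always share a row prefix."""
    base_ts = ts - ts % base_seconds
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"{metric}:{base_ts}:{tag_part}"

print(row_key("sys.cpu.user", 1_700_003_000, {"host": "web1", "dc": "eu"}))
```

The prefix ordering is what makes range scans over one series efficient on HBase/Cassandra-like storage, and it is also why schema design here is explicit capacity-planning work.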
4. Columnar analytical DBs used as TSDB
General-purpose columnar systems that frequently act as metrics/log/time-series storage in practice.
Typical characteristics
- Strong columnar compression and fast aggregations on long time windows.
- Flexible ad-hoc SQL and BI-friendly querying over event data.
- A practical balance between ingest throughput and analytical flexibility.
Typical products
ClickHouse, Apache Druid, Apache Pinot, MPP DWH (Vertica, etc.)
When to use
- Unified logs + metrics + BI analytics workflows.
- You need advanced slicing, retention/cohort analysis, and product reporting.
- Data and product teams need flexible ad-hoc analytics.
Trade-offs
- Not always a full replacement for monitoring-native TSDB alerting.
- Low-latency operational monitoring often still needs a dedicated layer.
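The columnar advantage on long windows comes from scanning only the columns a query touches instead of whole rows. A toy illustration of the layout difference (not any engine's actual storage format):

```python
# Row layout: each record carries every field.
rows = [{"ts": i, "value": float(i), "host": "web1"} for i in range(1_000)]

# Column layout: one contiguous array per field.
cols = {
    "ts": list(range(1_000)),
    "value": [float(i) for i in range(1_000)],
    "host": ["web1"] * 1_000,
}

# A long-window aggregate over the row layout reads every field of every row...
row_sum = sum(r["value"] for r in rows)
# ...while the column layout reads a single array (and compresses it well).
col_sum = sum(cols["value"])

assert row_sum == col_sum
print(col_sum)  # 499500.0
```

Add run-length or delta compression per column and the scan-cost gap widens further, which is why these engines dominate long-window analytical queries.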
Fast scenario-based selection
Infrastructure monitoring
Native TSDB (Prometheus/VictoriaMetrics) + long-term storage layer
Strong ecosystem for alerting, SLO tracking, and dashboard-driven operations.
IoT telemetry and device events
Native TSDB or SQL extension (TimescaleDB), depending on analytics depth
Key factors: ingest rate, tag cardinality, and retention/downsampling policies.
Financial and market time series
SQL/columnar path (kdb+, ClickHouse, TimescaleDB)
You typically need precise aggregations, window functions, and complex analytics.
Logs/metrics with BI and ad-hoc analytics
Columnar analytics (ClickHouse/Druid/Pinot)
This path usually wins on SQL flexibility and large-scale analytical scans.
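The scenario table above can be condensed into a rough first-pass routing function. The workload keys and return strings are illustrative shorthand for the recommendations in this chapter, not a substitute for evaluating against your own SLOs:

```python
def first_pass_family(workload: str) -> str:
    """Map a workload type to the TSDB family worth evaluating first."""
    routing = {
        "infrastructure_monitoring":
            "native TSDB (Prometheus/VictoriaMetrics) + long-term storage",
        "iot_telemetry":
            "native TSDB or SQL extension (TimescaleDB)",
        "financial_series":
            "SQL/columnar (kdb+, ClickHouse, TimescaleDB)",
        "logs_bi_analytics":
            "columnar analytics (ClickHouse/Druid/Pinot)",
    }
    return routing.get(workload, "run the four-axis evaluation first")

print(first_pass_family("iot_telemetry"))
```

Treat the result as a starting shortlist; the recommendations below still apply before committing.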
Self-hosted vs managed cloud
Self-hosted
- Maximum control over storage layout, indexing, and upgrade strategy.
- Good fit when you already have mature SRE/DBA practices and custom operations requirements.
- TCO strongly depends on team expertise and automation quality.
Managed cloud
- Faster time-to-value with lower operations overhead and a predictable provider SLA.
- Good fit when delivery speed matters more than deep platform customization.
- Model egress, retention tiers, and lock-in risk early.
Recommendations for an initial selection
Start from the workload profile: ingest/sec, cardinality, retention horizon, and query patterns.
Separate operational monitoring (real-time alerting) from product analytics (deep exploratory queries).
Define write/read SLOs and validate them on realistic datasets, not synthetic benchmarks only.
If you need flexible SQL and joins with business data, evaluate PostgreSQL/columnar TSDB paths first.
Plan downsampling and tiered storage early; this is the main lever for long-term TSDB cost control.
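To see why downsampling is the main cost lever, compare stored points per series with and without rollup tiers. A back-of-the-envelope model; the retention windows and resolutions below are illustrative, not defaults of any engine:

```python
def points_per_series(tiers: list[tuple[int, int]]) -> float:
    """Points retained per series under tiered downsampling.

    `tiers` is a list of (retention_days, resolution_seconds) pairs,
    ordered from raw to coarsest; each tier covers only its own window.
    """
    return sum(days * 86_400 / resolution_s for days, resolution_s in tiers)

raw_only = points_per_series([(365, 10)])           # one year of raw 10s samples
tiered = points_per_series([(30, 10), (335, 300)])  # 30d raw, then 5-minute rollups

print(f"{raw_only / tiered:.1f}x")  # ≈8.9x fewer stored points
```

Multiply by series count to get fleet-wide figures; at high cardinality this ratio is the difference between a viable retention policy and a cost spike.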
Common TSDB selection mistakes
Trying to use one engine for both operational monitoring and deep BI analytics without clear workload separation.
Choosing by peak ingest numbers only while ignoring query and storage cost behavior.
Ignoring high-cardinality label risk until production incidents happen.
Postponing retention/downsampling strategy until infrastructure cost spikes.
Evaluating engines without considering operating model (self-hosted vs managed cloud).
