A top-products dashboard looks like a simple report only until the business asks to see leading products almost immediately after each event and still reconcile those numbers with finance.
The chapter breaks down how to separate the fast metric path from historical recompute, where to keep metric marts, and how to avoid turning every new filter into an expensive scan over raw data.
For interviews and engineering discussions, this case is useful because it quickly shifts the conversation to metric freshness, query cost, formula versions, and the boundary between operational dashboards and exact reporting.
Metric Freshness
The business needs more than a number: it needs to know how recent that number is, which path produced it, and whether a follow-up recompute is still pending.
Historical Recompute
Schema fixes, late data, and formula changes almost always require a separate recompute path that does not damage the fast user-facing flow.
Query Cost
The risk is not only latency but also spend: without guardrails, one heavy analytical query quickly turns into a habit shared across the whole team.
Formula Versions
If the system does not keep the metric together with its calculation version, the dashboard loses comparability over time and stops being a trustworthy decision surface.
Top Products Dashboard stops being a simple report the moment the business wants near-real-time rankings, clear freshness guarantees, and numbers that still reconcile with finance. The real design problem is an analytics serving tier that separates the fast user-facing path from heavier recompute and correction workflows without losing trust in the metrics.
Source
Acing the System Design Interview
Chapter 17: Design Top Products Dashboard, focused on metric marts, historical recompute, and reconciliation.
Where this pattern matters
- E-commerce BI: top sellers by revenue, margin, and conversion with near-real-time visibility.
- Marketplaces: SKU rankings by region, channel, and campaign.
- Operational dashboards: a shared metric mart for category managers, marketing, and leadership.
- Promotion analytics: quickly separating campaign effects from assortment or pricing changes.
Functional requirements
Core dashboard API
GET /top-products- top-N for the selected metricGET /kpi-breakdown- slice by dimension and segmentGET /freshness- last update time and formula versionPOST /reconcile- recompute and correct a chosen range
Product and analytics features
- Filters by timeframe, region, category, channel, and customer segment
- Multiple ranking formulas: revenue, margin, conversion, orders
- Access control for sensitive marts and export policies
- Metric explainability: source, formula version, and last recompute time
A strong design also supports drill-down from the top list into the reasons behind a metric shift, while relying on pre-aggregated data instead of pushing every read back to raw transactions.
Non-functional requirements
| Requirement | Target | Why it matters |
|---|---|---|
| Dashboard response latency (p95) | < 300 ms | Users should experience the dashboard as an interactive tool, not a heavy report. |
| Data freshness | 1-5 minutes | Category and campaign decisions often depend on fresh rather than end-of-day numbers. |
| Drift versus finance reports | < 0.2% | The business needs a clear line between operational analytics and finance-grade reporting. |
| Availability | 99.9% | The dashboard becomes a shared daily tool for several teams. |
| Historical recompute impact | No degradation of the online path | Backfills must not break the interactive path or its SLA. |
High-Level Architecture
Theory
ETL/ELT architecture
Ingest, transformation, and serving layers for analytics platforms and metric marts.
High-Level Architecture
event intake -> aggregates -> serving tier -> reconciliationThis topology separates event intake, dashboard serving, and the reconciliation and recompute control loop.
The architecture separates event intake from dashboard serving and from background reconciliation so the fast path runs on ready-to-read aggregates while historical correction happens on dedicated views and recompute workflows.
Write and Read Paths
Write and Read Paths
How metrics land in the serving tier and how the dashboard reads aggregates under load.
Events are validated, aggregated, and committed into the serving tier together with freshness and formula metadata.
Event Intake
Layer 1orders and clicks stream
Product events enter through the ingest API and are validated against contracts.
Stream Bus
Layer 2partitioned log
Events are published to the stream bus and distributed across aggregation jobs.
Aggregation Jobs
Layer 3windowed compute
Windowed aggregates are computed for the main dimension combinations.
Serving Store
Layer 4OLAP views
Top-N and KPI rows are stored in tables prepared for dashboard filters.
Freshness Metadata
Layer 5watermark and version
Watermark, formula version, and source metadata are stored for explainability.
Write path checkpoints
- •Schema contracts are validated at ingress to avoid polluted aggregates.
- •Aggregation jobs should be idempotent to support safe retries.
- •Watermark and formula version must be stored with KPIs for explainability.
Metric consistency and resilience
Deeper dive
ClickHouse overview
Columnar storage, materialized views, and patterns for analytical query performance.
Two truth paths
Reliable dashboards usually separate the fast operational path from the more exact historical correction path:
dashboard_metrics = stream_aggregates(events) finance_metrics = batch_recompute(raw_events)
- Stream path provides the quick response and freshness the dashboard promises.
- Batch path gives you accurate historical recompute and finance alignment.
- Reconciliation explains why numbers may temporarily differ and when that gap is closed.
Operational controls
- Data contracts: prevent silent metric breakage when event schemas change.
- Freshness watermark: must be shown in the UI so users understand how recent the numbers are.
- Query guardrails: keep expensive filters and wide time ranges from exhausting the budget.
- Backfill throttling: historical recompute should live in a separate lane from the online path.
Risks and common mistakes
- Querying OLTP directly: BI on top of the production transactional database quickly turns into a shared source of degradation.
- Hidden freshness: users see a number but cannot tell how old it is or where it came from.
- Formula drift: changing the calculation without versioning destroys comparability over time.
- Unbounded ad-hoc analysis: without cost controls, exploratory queries can dominate the platform bill.
- No reconciliation path: if you cannot recompute and explain differences, trust in the dashboard disappears.
What to cover in an interview
- Where you draw the line between a fast product dashboard and finance-grade reporting.
- How KPI formulas are versioned and how recomputes stay reproducible.
- Which SLOs and signals you monitor: response latency, freshness lag, drift, and query cost.
- How the system behaves under late events, schema changes, and large historical recomputes.
Related chapters
- A/B Testing platform - A neighboring analytics case where event quality and aggregate correctness directly shape product decisions.
- ClickHouse overview - A columnar OLAP engine for fast dashboard queries, metric marts, and materialized views.
- ETL/ELT architecture - Foundational patterns for ingest, transformation, and serving in analytics platforms.
- Streaming Data - Streaming foundations for freshness, watermarks, and handling late events in operational analytics.
- Kafka - Patterns for the event backbone, partitioning, and controlled consumer lag.
- FinOps and analytics cost - How to control the cost of exploratory queries, historical recomputes, and long-term retention.
