Top Products Dashboard — System Design Space

A top-products dashboard looks like a simple report only until the business asks to see leading products almost immediately after each event and still reconcile those numbers with finance.

The chapter breaks down how to separate the fast metric path from historical recompute, where to keep metric marts, and how to avoid turning every new filter into an expensive scan over raw data.

For interviews and engineering discussions, this case is useful because it quickly shifts the conversation to metric freshness, query cost, formula versions, and the boundary between operational dashboards and exact reporting.

Metric Freshness

The business needs more than a number: it needs to know how recent that number is, which path produced it, and whether a follow-up recompute is still pending.

Historical Recompute

Schema fixes, late data, and formula changes almost always require a separate recompute path that does not damage the fast user-facing flow.

Query Cost

The risk is not only latency but also spend: without guardrails, one heavy analytical query quickly turns into a habit shared across the whole team.

Formula Versions

If the system does not keep the metric together with its calculation version, the dashboard loses comparability over time and stops being a trustworthy decision surface.

Top Products Dashboard looks like a simple report right up to the moment the business wants to see top sellers almost immediately after the event, read the current freshness of the data, and make decisions on key KPI values. From that point it is an analytics serving problem for metrics: the fast user-facing path and the heavy recomputes have to be pulled apart, or you lose either response speed or reproducibility of the numbers — and with it trust in the mart.

Source

Acing the System Design Interview

Chapter 17: Design Top Products Dashboard, focused on metric marts, historical recompute, and reconciliation.

Читать обзор

Where this pattern matters

E-commerce BI: top sellers by revenue, margin, and conversion with near-real-time visibility.
Marketplaces: SKU rankings by region, channel, and campaign.
Operational dashboards: a shared metric mart for category managers, marketing, and leadership.
Promotion analytics: quickly separating campaign effects from assortment or pricing changes.

Functional requirements

Core dashboard API

GET /top-products - top-N for the selected metric
GET /kpi-breakdown - slice by dimension and segment
GET /freshness - last update time and formula version
POST /reconcile - recompute and correct a chosen range

Product and analytics features

Filters by timeframe, region, category, channel, and customer segment
Multiple ranking formulas: revenue, margin, conversion, orders
Access control for sensitive marts and export policies
Metric explainability: source, formula version, and last recompute time

A strong design also supports drill-down from the top list into the reasons behind a metric shift, while relying on pre-aggregated data instead of pushing every read back to raw transactions.

Non-functional requirements

Requirement	Target	Why it matters
Dashboard response latency (p95)	< 300 ms	Users should experience the dashboard as an interactive tool, not a heavy report.
Data freshness	1-5 minutes	Category and campaign decisions often depend on fresh rather than end-of-day numbers.
Drift versus finance reports	< 0.2%	The business needs a clear line between operational analytics and finance-grade reporting.
Availability	99.9%	The dashboard becomes a shared daily tool for several teams.
Historical recompute impact	No degradation of the online path	Backfills must not break the interactive path or its SLA.

High-Level Architecture

Theory

ETL/ELT architecture

Ingest, transformation, and serving layers for analytics platforms and metric marts.

Читать обзор

High-Level Architecture

event intake -> aggregates -> serving tier -> reconciliation

Ingestion plane

Data Sources

orders, clicks, events

Ingest API

contracts and validation

Stream Bus

Kafka / PubSub

Aggregation and serving plane

Aggregation Jobs

windows and views

Serving Store

OLAP aggregates

Query Cache

hot filters

Dashboard API

read endpoint

Control plane

Query Planner

cost guardrails

Raw Lake

immutable source

Warehouse

historical truth

Reconcile Job

drift and recompute

Ingestion plane

Sources -> API -> stream

contracts and event flow

Aggregation and serving plane

Aggregation -> serving store

windows, top-N, and KPIs

Planner -> cache -> API

hot filters and reads

Control plane

Raw -> history -> reconcile

recompute and corrections

This topology separates event intake, dashboard serving, and the reconciliation and recompute control loop.

The architecture separates event intake from dashboard serving and from background reconciliation so the fast path runs on ready-to-read aggregates while historical correction happens on dedicated views and recompute workflows.

Write and Read Paths

How metrics land in the serving tier and how the dashboard reads aggregates under load.

Events are validated, aggregated, and committed into the serving tier together with freshness and formula metadata.

Metric consistency and resilience

Deeper dive

ClickHouse overview

Columnar storage, materialized views, and patterns for analytical query performance.

Читать обзор

Two truth paths

One path will not satisfy both requirements at once, so keep the fast operational path and the more exact historical correction path explicitly apart:

dashboard_metrics = stream_aggregates(events)
finance_metrics   = batch_recompute(raw_events)

Stream path provides the quick response and freshness the dashboard promises.
Batch path gives you accurate historical recompute and finance alignment.
Reconciliation explains why numbers may temporarily differ and when that gap is closed.

Operational controls

Data contracts: prevent silent metric breakage when event schemas change.
Freshness watermark: must be shown in the UI so users understand how recent the numbers are.
Query guardrails: keep expensive filters and wide time ranges from exhausting the budget.
Backfill throttling: historical recompute should live in a separate lane from the online path.

Risks and common mistakes

Querying OLTP directly: BI on top of the production transactional database quickly turns into a shared source of degradation.
Hidden freshness: users see a number but cannot tell how old it is or where it came from.
Formula drift: changing the calculation without versioning destroys comparability over time.
Unbounded ad-hoc analysis: without cost controls, exploratory queries can dominate the platform bill.
No reconciliation path: if you cannot recompute and explain differences, trust in the dashboard disappears.

What to cover in an interview

Where you draw the line between a fast product dashboard and finance-grade reporting.
How KPI formulas are versioned and how recomputes stay reproducible.
Which SLOs and signals you monitor: response latency, freshness lag, drift, and query cost.
How the system behaves under late events, schema changes, and large historical recomputes.

References

Uber Engineering — Restaurant Manager: the UberEATS analytics dashboard built on Apache Pinot (Uber Blog, 2017)Apache Pinot — documentation: real-time OLAP for user-facing analytics with sub-second latency (Apache Software Foundation)ClickHouse — Materialized Views: pre-aggregating metrics at insert time (ClickHouse Docs)Martin Kleppmann — Designing Data-Intensive Applications: batch and stream processing, reconciliation (O'Reilly)

Related chapters

A/B Testing platform - A neighboring analytics case where event quality and aggregate correctness directly shape product decisions.
ClickHouse overview - What keeps the dashboard's fast path fast: a columnar engine and materialized views instead of reads against raw transactions.
ETL/ELT architecture - Ingest, transformation, and serving layers — the skeleton this case uses to split the online path from historical recompute.
Streaming Data - Streaming foundations for freshness, watermarks, and handling late events in operational analytics.
Kafka - Patterns for the event backbone, partitioning, and controlled consumer lag.
FinOps and analytics cost - How to control the cost of exploratory queries, historical recomputes, and long-term retention.