System Design Space

Updated: March 25, 2026 at 5:53 PM

Ad Click Event Aggregator

Difficulty: medium

Classic task: stream ingestion, dedupe, windowed aggregations, freshness SLA and billing accuracy.

Click aggregation is a streaming problem where volume is enormous, events arrive out of order, and accounting mistakes translate directly into financial and analytical risk.

The chapter breaks down ingest, deduplication, windowed aggregation, late events, hot partitions, and serving near-real-time metrics.

For interviews and engineering discussions, this case is useful because it brings the conversation to the trade-off between throughput, correctness, and billing trust.

Pipeline Thinking

Ingestion, partitioning, deduplication, and stage latency drive system behavior.

Serving Layer

Index and cache-locality decisions directly shape user-facing query latency.

Consistency Window

Explicitly define where eventual consistency is acceptable and where it is not.

Cost vs Freshness

Balance update frequency with compute/storage cost and operational complexity.

Acing SDI

Practice task from chapter 11

Ad click event aggregator: dedupe, windowing, and consistent analytics outputs.


Ad Click Event Aggregator tests your ability to design a streaming system where speed, correctness, and metric explainability all matter at once. It is a common interview case at the boundary of data platform and product analytics.

Functional requirements

  • Ingest ad click/impression/conversion events.
  • Deduplicate events for billing correctness.
  • Build minute/hour/day window aggregates.
  • Serve realtime dashboards and batch reports.

Non-functional requirements

  • Stable operation under campaign traffic bursts.
  • Bounded latency for near-realtime analytics.
  • Clear data freshness and lineage visibility.
  • Controlled storage and recomputation costs.

High-Level Architecture

Theory

Streaming Data

Windowing, watermarks, late events, reprocessing, and realtime/batch trade-offs.



stream ingest + window aggregation + reconciliation

This topology combines ingest flow, window aggregation, and a reconciliation/backfill control loop for billing correctness.

  • Event Sources (SDK/trackers) → Collector API (validate + enrich) → Event Bus (Kafka/PubSub)
  • Event Bus → Dedupe/Normalize (idempotency key) → Window Aggregator (minute/hour/day)
  • Window Aggregator → Hot Aggregate Store (ClickHouse/Pinot) → Dashboard API (query/filters)
  • Event Bus → Raw Event Lake (immutable log) → Batch Backfill Job (historical replay)
  • Reconciliation Job (online vs billing) compares hot aggregates against billing-grade recomputation.

The architecture separates ingest, realtime serving, and reliability control loops with batch reconciliation. This keeps dashboard latency predictable while preserving billing correctness.
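The reconciliation loop above can be sketched as a small job that compares per-campaign online counts against billing-grade totals recomputed from the raw event lake. This is an illustrative sketch, not the chapter's implementation: the names `online_counts`, `billing_counts`, and the `TOLERANCE` value are assumptions.

```python
# Hypothetical reconciliation sketch: flag campaigns whose near-realtime
# aggregate drifts from the billing-grade recount beyond a relative tolerance.
TOLERANCE = 0.001  # assumed: 0.1% relative mismatch allowed before alerting

def reconcile(online_counts: dict, billing_counts: dict) -> list:
    """Return campaign ids whose online count drifts beyond tolerance."""
    mismatched = []
    for campaign, billed in billing_counts.items():
        online = online_counts.get(campaign, 0)
        if billed == 0:
            if online != 0:
                mismatched.append(campaign)
            continue
        if abs(online - billed) / billed > TOLERANCE:
            mismatched.append(campaign)
    return mismatched

print(reconcile({"c1": 1000, "c2": 980}, {"c1": 1000, "c2": 1000}))  # ['c2']
```

Flagged campaigns would then be corrected by replaying the immutable raw log rather than by mutating aggregates in place.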

Write/Read Paths


How events are written into aggregates and how dashboards read metrics under load.

Write path: ingest accepts events, runs deduplication/windowing, and updates serving aggregates for near-realtime analytics.

  • Layer 1 - Event Sources (SDK / trackers / pixels): clicks, impressions, and conversions are sent to ingest endpoints.
  • Layer 2 - Collector API (validate + enrich): schema validation, enrichment, and idempotency key generation.
  • Layer 3 - Stream + Dedupe (Kafka/PubSub + state): the stream processor applies dedupe, ordering, and late-event handling.
  • Layer 4 - Window Aggregator (minute / hour / day): windowed aggregates are computed and written into serving storage.
  • Layer 5 - Serving Store (ClickHouse/Pinot): aggregate storage optimized for fast analytical reads.

Write path checkpoints

  • Ingress idempotency protects billing from double counting.
  • Window aggregation builds minute/hour/day views while handling late events.
  • Immutable raw storage remains the source of truth for replay and reconciliation.
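The minute/hour/day aggregation step can be illustrated with a minimal tumbling-window sketch. This is a toy in-memory version, assuming epoch-second timestamps and `(ad_id, ts)` event pairs; a production aggregator would keep this state in a stream processor.

```python
from collections import defaultdict

def minute_bucket(ts: float) -> int:
    """Floor an epoch-seconds timestamp to the start of its minute window."""
    return int(ts) // 60 * 60

def aggregate_clicks(events):
    """events: iterable of (ad_id, ts) pairs -> {(ad_id, window_start): count}."""
    counts = defaultdict(int)
    for ad_id, ts in events:
        counts[(ad_id, minute_bucket(ts))] += 1
    return dict(counts)

events = [("ad1", 0), ("ad1", 59), ("ad1", 60), ("ad2", 61)]
print(aggregate_clicks(events))
# {('ad1', 0): 2, ('ad1', 60): 1, ('ad2', 60): 1}
```

Hour and day views follow the same pattern with coarser bucket functions, or by rolling up the minute-level output.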

Data and deduplication

  • Idempotency key such as ad_id + user_id + ts_bucket.
  • Late events handled via watermarks and grace periods.
  • Schema evolution with strict versioning and backward compatibility.
  • Aggregate correction through reprocessing over immutable raw data.
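The idempotency-key and watermark bullets above can be combined into one hedged sketch. The key derivation (`ad_id + user_id + ts_bucket`) follows the list; the `WATERMARK_GRACE` value and the unbounded in-memory `seen` set are simplifying assumptions (real systems use TTL'd state and route too-late events to backfill).

```python
import hashlib

WATERMARK_GRACE = 120  # assumed: seconds of lateness tolerated on the hot path

def idempotency_key(ad_id: str, user_id: str, ts: float) -> str:
    """Derive the key from ad_id + user_id + ts_bucket (minute granularity)."""
    bucket = int(ts) // 60
    return hashlib.sha256(f"{ad_id}:{user_id}:{bucket}".encode()).hexdigest()

class Deduper:
    def __init__(self):
        self.seen = set()       # toy stand-in for TTL'd dedupe state
        self.watermark = 0.0    # highest event time observed so far

    def accept(self, ad_id: str, user_id: str, ts: float) -> bool:
        """True if the event should flow to aggregation; False if dropped."""
        if ts < self.watermark - WATERMARK_GRACE:
            return False  # too late: route to batch backfill, not the hot path
        self.watermark = max(self.watermark, ts)
        key = idempotency_key(ad_id, user_id, ts)
        if key in self.seen:
            return False  # duplicate within the same bucket
        self.seen.add(key)
        return True
```

Dropped-late events are not lost: they land in the raw event lake and surface through the backfill/reconciliation loop.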

SLO and operational metrics

  • Data freshness (p95 end-to-end lag).
  • Duplicate rate and window completeness.
  • Reprocessing duration and backfill cost.
  • Mismatch between online dashboard and billing reports.
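The first metric in the list, p95 end-to-end lag, is simple to compute from lag samples (serving-visible time minus event time). A minimal nearest-rank sketch, assuming lag samples in seconds:

```python
import math

def p95_lag(lags):
    """Nearest-rank 95th percentile of end-to-end lag samples (seconds)."""
    s = sorted(lags)
    return s[math.ceil(0.95 * len(s)) - 1]

print(p95_lag(list(range(1, 101))))  # 95
```

In practice this would be computed over a sliding window by the observability stack rather than over raw sample lists.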

Questions to clarify in interview

  • Required billing precision: near-exact or acceptable tolerance.
  • Dashboard freshness SLA and what lag is considered critical.
  • Need for drill-down into raw events and retention duration.
  • Auditability and legal/compliance constraints for event history.

Related chapters

  • Event-Driven Architecture - Event contracts, pipeline choreography, and integration patterns for ad analytics systems.
  • Streaming Systems - Windowing, watermarks, late-event handling, and replay-safe stream processing foundations.
  • Kafka - Partitioned log and consumer-group model behind large-scale clickstream ingestion.
  • ClickHouse overview - OLAP serving layer for near-real-time aggregates and analytical query workloads.
  • A/B Testing platform - Adjacent experimentation case where event quality directly affects statistical validity.
  • Payment System - Parallel discussion of correctness, idempotency, and reconciliation in critical pipelines.
