System Design Space

Updated: February 21, 2026 at 8:00 PM

A/B Testing platform

Difficulty: mid

Experiment system design: hash-based assignment, layer architecture, event streaming and statistical analysis.

Problem Statement

A/B testing is one of the most important decision-making tools at product companies. Design a platform that can run experiments on millions of users with minimal impact on the performance of the main product.

Functional Requirements

Experiment management

  • Creating experiments with variants (Control/Treatment)
  • Defining the target audience (targeting)
  • Setting the experiment duration
  • Defining the metrics to measure

Variant assignment

  • Random distribution of users across variants
  • Consistent assignment (one user = one variant)
  • Traffic splitting support (1%, 5%, 50%...)
  • Progressive rollout (gradual increase in traffic)

Data collection

  • Logging user actions
  • Linking events to an experiment variant
  • Aggregation of metrics (CTR, conversion, retention)

Analysis of results

  • Calculation of statistical significance (p-value)
  • Confidence intervals
  • Visualization of results in real time

Non-functional requirements

Critical

Low latency

Variant assignment should take <10 ms so it does not impact UX

Important

Consistency

The user should see the same variant for the entire duration of the experiment

Scale

Scalability

Process billions of events per day without performance degradation

High-level architecture

Main Components

Experiment Management Service

CRUD operations for experiments, targeting rules, variant configuration
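As an illustration, an experiment record managed by this service might look like the following sketch (field names are hypothetical, derived from the functional requirements above):

```python
from dataclasses import dataclass, field

@dataclass
class Experiment:
    experiment_id: str
    variants: list[str]                               # e.g. ["Control", "Treatment"]
    traffic_pct: int                                  # share of eligible users enrolled, 0-100
    targeting: dict = field(default_factory=dict)     # e.g. {"country": ["US"], "platform": ["ios"]}
    metrics: list[str] = field(default_factory=list)  # e.g. ["ctr", "conversion", "retention"]
    start_ts: float = 0.0                             # experiment window, Unix epoch seconds
    end_ts: float = 0.0
```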

Variant Assignment Service

Fast determination of the variant for a user (critical path)

Event Ingestion Pipeline

Collection and processing of events linked to experiments

Analysis Engine

Statistical analysis, p-value calculation, confidence intervals
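For a conversion metric, the core of such an engine is a two-proportion z-test. A simplified sketch (real engines add much more, e.g. variance-reduction techniques):

```python
from math import erfc, sqrt

def analyze(conv_c: int, n_c: int, conv_t: int, n_t: int):
    """Two-sided two-proportion z-test: returns (uplift, p_value, 95% CI)."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    # Pooled standard error under H0 (no difference) for the z statistic.
    p_pool = (conv_c + conv_t) / (n_c + n_t)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se_pooled
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value from the normal tail
    # 95% confidence interval for the absolute uplift (unpooled SE).
    se = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    uplift = p_t - p_c
    return uplift, p_value, (uplift - 1.96 * se, uplift + 1.96 * se)
```

For example, 10% vs 11% conversion at 10,000 users per variant yields a p-value of about 0.02, so the uplift is significant at the usual 0.05 level.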

🧪

The architecture is built around two paths:
Hot Path — variant assignment
Cold Path — data collection and analysis

Visualization via C4 Model

Below, the A/B platform system is decomposed into C4 levels: first the external context, then the platform containers, and finally the detailing of the critical variant assignment container. For more on the approach itself, see the C4 Model chapter.

L1 — System Context

Who interacts with the platform and what external systems are involved in the loop.

[C4 context diagram: a User interacts with the Product App (Web/Mobile), which calls Client Services; Client Services get a variant from the A/B Platform (Assignment + Experiment Config). The platform's Event Pipeline + Analytics sends events to the Data Platform (DWH/BI), and results are surfaced in a Dashboard for Product/DS.]

Randomization algorithms

A high-quality randomization algorithm must guarantee: no bias, consistency, and independence between experiments.

Hash and Partition (HP)

Recommended
variant = StableHash(UserID + ":" + ExperimentID) % 100  # stable hash (e.g. MD5/MurmurHash); the delimiter avoids key collisions
if variant < 50: return "Control"
else: return "Treatment"
  • Does not require state storage
  • Deterministic: the same input always yields the same variant
  • Independence between experiments, since ExperimentID salts the hash
  • Easily scalable

Pseudorandom with Caching (PwC)

Alternative

Generating a random number and then caching the result for the user.

  • Server-side: database storage
  • Client-side: storage in cookies
  • Requires additional storage
  • Potential consistency issues when cookies are cleared
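A minimal server-side sketch of this approach, with an in-memory dict standing in for the real storage (a DB table server-side, or a cookie client-side):

```python
import random

# Stand-in for persistent storage; in production this would be a database
# row keyed by (user_id, experiment_id) or a client-side cookie.
_assignment_cache: dict[tuple[str, str], str] = {}

def assign_variant_cached(user_id: str, experiment_id: str) -> str:
    """Draw a variant at random on first contact, then always return the cached one."""
    key = (user_id, experiment_id)
    if key not in _assignment_cache:
        _assignment_cache[key] = random.choice(["Control", "Treatment"])
    return _assignment_cache[key]
```

If the storage is lost (e.g. cleared cookies), the user may be re-randomized, which is exactly the consistency risk noted above.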

Variant assignment methods

Server-side Assignment

  • More secure (logic hidden)
  • Ability to test backend logic
  • Requires fast service or embedded library
  • Additional network hop

Client-side Assignment

  • Faster for UI changes
  • Lightweight SDK
  • The configuration is loaded at startup
  • Logic is visible to users

Optimization: Configuration Push

Experiment configurations are pushed to edge nodes or to the Redis cache to minimize latency. The SDK on the client receives the configuration via CDN and executes the hash-based assignment locally.

Data Pipeline

📱 Client Events → 📨 Kafka → Flink/Spark → 🗄️ ClickHouse → 📊 Dashboard

Ingestion

Events are sent to Kafka for high-throughput processing. Each event contains user_id, experiment_id, variant, timestamp and payload.
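The event shape described above can be sketched as follows (field names beyond those listed in the text, and the keying choice, are illustrative assumptions):

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class ExperimentEvent:
    user_id: str
    experiment_id: str
    variant: str                                 # "Control" or "Treatment"
    timestamp: float                             # Unix epoch seconds
    payload: dict = field(default_factory=dict)  # event-specific data, e.g. {"action": "click"}

def serialize(event: ExperimentEvent) -> tuple[bytes, bytes]:
    """Produce a (key, value) pair for Kafka: keying by user_id keeps each
    user's events in one partition, preserving their per-user order."""
    return event.user_id.encode(), json.dumps(asdict(event)).encode()
```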

Processing

Stream processing (Flink) for real-time metrics or batch (Spark) for complex aggregations and statistical analysis.

Storage & Reporting

OLAP database (ClickHouse, Pinot) for fast analytical queries. Dashboard with real-time update of results.

Parallel experiments (Layers)

How do you run several experiments simultaneously without mutual influence? The solution is the concept of layers.

Problem

Experiment A tests the search algorithm, Experiment B tests the button color. If the user ends up in both Treatments, how can we understand what influenced the conversion?

Solution: Domains/Layers

Experiments are grouped by domain (UI, Backend, Algorithm). Within a domain, experiments are mutually exclusive; across domains they are independent.

Layer Architecture Example

Layer: UI → [Button Color Test, Layout Test] (mutually exclusive)
Layer: Search → [Ranking Algorithm Test] (independent)
Layer: Recommendations → [ML Model A/B Test] (independent)
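Layered assignment can be sketched as follows. The layer config mirrors the example above; the hashing scheme is an assumption:

```python
import hashlib

# Layer -> experiments that must be mutually exclusive within it.
LAYERS = {
    "ui": ["button_color_test", "layout_test"],
    "search": ["ranking_algorithm_test"],
    "recommendations": ["ml_model_ab_test"],
}

def _bucket(key: str, mod: int) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % mod

def experiments_for_user(user_id: str) -> dict[str, str]:
    """Each layer salts the hash with its own name: within a layer a user lands
    in exactly one experiment, while layers stay independent of each other."""
    return {
        layer: exps[_bucket(f"{user_id}:{layer}", len(exps))]
        for layer, exps in LAYERS.items()
    }
```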

Common mistakes

Sample Ratio Mismatch (SRM)

A 50/50 split turns out to be 52/48 in reality. Causes: bot traffic, redirect issues, client-side bugs. Always check for SRM before analyzing the results.
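A standard SRM check is a chi-square goodness-of-fit test on the observed counts; a sketch for the two-variant case:

```python
from math import erfc, sqrt

def srm_p_value(n_control: int, n_treatment: int, expected_ratio: float = 0.5) -> float:
    """Chi-square goodness-of-fit test with 1 degree of freedom.
    A p-value below ~0.001 is the usual red flag for an SRM."""
    total = n_control + n_treatment
    expected_c = total * expected_ratio
    expected_t = total - expected_c
    chi2 = ((n_control - expected_c) ** 2 / expected_c
            + (n_treatment - expected_t) ** 2 / expected_t)
    # For 1 df the chi-square survival function reduces to erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))
```

For the 52/48 example above, `srm_p_value(5200, 4800)` is well below 0.001, so the split should be investigated before any metric analysis.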

Peeking Problem

Repeatedly checking results and stopping the experiment as soon as significance appears inflates the false-positive rate. Solution: sequential testing or a fixed sample size committed to in advance.

Network Effects

In social products, users influence each other, which biases results. Cluster-based randomization instead of user-based randomization can help.

Multiple Testing

Analyzing many metrics increases the likelihood of false positives. Use the Bonferroni correction or select a single primary metric.
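The Bonferroni correction itself is a one-liner: divide the significance level by the number of metrics tested.

```python
def bonferroni_significant(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """A metric counts as significant only if its p-value clears alpha / m,
    where m is the number of metrics tested simultaneously."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```

With three metrics the per-metric threshold drops from 0.05 to about 0.0167, so a p-value of 0.03 that would pass a single test no longer does.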

Key Findings

Hash-based assignment — the preferred method for consistent, stateless variant assignment

Configuration push — configurations on edge nodes or in Redis for minimal latency

Layer architecture — isolation of experiments across domains for parallel running

Event streaming — Kafka + Flink/Spark for processing billions of events

Statistical rigor — SRM checks, proper sample size, sequential testing

OLAP for analytics — ClickHouse/Pinot for real-time dashboards and complex queries

This material is based on the public interview “System Design Interview: A/B Testing Platform” and Ron Kohavi's book “Trustworthy Online Controlled Experiments”.



© 2026 Alexander Polomodov