Updated: March 24, 2026 at 5:36 PM

Data Consistency Patterns and Idempotency

medium

How to choose a consistency model and implement idempotency in APIs, event processing and background tasks.

Distributed correctness does not fail only on obvious outages. It also fails on retries, stale reads, and partially completed operations.

The chapter ties together consistency models, read-your-writes, idempotency keys, consumer deduplication, transactional outbox, and saga compensation into one design frame where the key question is which invariants can be relaxed and which ones must survive every retry and replay.

In system design interviews, this framing lets you discuss correctness in terms of concrete safeguards instead of hiding behind the marketing phrase "exactly-once", which explains almost nothing about failure behavior.

Practical value of this chapter

Consistency level

Select consistency model per use case: strong, bounded staleness, or eventual with compensation strategy.

Idempotent contracts

Design APIs and consumers with idempotency keys, dedupe stores, and explicit retry behavior.

Failure scenarios

Model race conditions, duplicate delivery, and partial commits through explicit failure timelines.

Interview precision

Show how correctness is preserved in distributed systems when exactly-once cannot be guaranteed end to end.

Theory

CAP Theorem

Consistency is always chosen as a trade-off with availability and latency.


Consistency and idempotency patterns let systems safely survive retries, redelivery, and partial failures. Core principle: you cannot rely only on "perfect delivery" guarantees; the system must stay correct under repeated and out-of-order events.
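As a minimal sketch of staying correct under repeated delivery, the consumer below skips any event whose ID it has already processed. The `DedupingConsumer` class, the `event_id` field, and the in-memory set are illustrative assumptions, not part of any specific broker or framework.

```python
from typing import Callable


class DedupingConsumer:
    """Wraps a handler so each event ID triggers the side effect at most once."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self.handler = handler
        # Assumption: an in-memory set stands in for a durable dedupe store.
        self.processed_ids: set = set()

    def consume(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        event_id = event["event_id"]  # hypothetical field carrying stable identity
        if event_id in self.processed_ids:
            return False
        self.handler(event)
        # Mark the event only after the handler succeeds, so a handler failure
        # leaves it unmarked and eligible for retry. A crash between the handler
        # and this line would repeat the side effect, which is why a real system
        # would commit both in one transaction.
        self.processed_ids.add(event_id)
        return True


shipped = []
consumer = DedupingConsumer(lambda e: shipped.append(e["order"]))
deliveries = [
    {"event_id": "e1", "order": "A"},
    {"event_id": "e1", "order": "A"},  # redelivery of the same event
    {"event_id": "e2", "order": "B"},
]
results = [consumer.consume(e) for e in deliveries]
```

Deduplication by ID handles repeats; out-of-order delivery needs an additional ordering signal such as a per-entity sequence number.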

Related

Jepsen consistency models

Empirical analysis of consistency models and real DB behavior under failures.


Consistency Models

Strong consistency

Fits: financial operations, critical invariants, and workflows with high cost of error.

Trade-off: higher latency/cost and lower availability under partition scenarios.

Read-your-writes / session consistency

Fits: user-facing flows where users must immediately see their own updates.

Trade-off: requires session routing/sticky reads and cache validation discipline.
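One way to sketch read-your-writes without pinning every read to the primary is a per-session version token: the session remembers the version of its last write and reads from a replica only once the replica has caught up. The `Store` and `SessionReader` classes and the single-integer version are simplifying assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """A toy key-value node that tracks the last write version it has applied."""
    applied_version: int = 0
    data: dict = field(default_factory=dict)

    def apply(self, key: str, value: str, version: int) -> None:
        self.data[key] = value
        self.applied_version = version


class SessionReader:
    """Routes reads so a session never observes state older than its own writes."""

    def __init__(self, primary: Store, replica: Store) -> None:
        self.primary = primary
        self.replica = replica
        self.last_written_version = 0  # session token (hypothetical name)

    def write(self, key: str, value: str) -> None:
        version = self.primary.applied_version + 1
        self.primary.apply(key, value, version)
        self.last_written_version = version

    def read(self, key: str):
        # Use the replica only if it has caught up to this session's last write.
        if self.replica.applied_version >= self.last_written_version:
            return self.replica.data.get(key)
        return self.primary.data.get(key)
```

Other sessions may still read stale data from the replica; the guarantee covers only the writer's own session.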

Eventual consistency

Fits: catalogs, recommendations, analytical views, and asynchronous integrations.

Trade-off: temporary divergence appears, so UX/business handling policy is required.

Idempotency Patterns


Idempotency Key for synchronous APIs

POST/command operations: payments, order creation, invoice issuance, workflow execution.

How to implement

  • Client sends `Idempotency-Key`; server stores key + request fingerprint + final response.
  • On retry with the same key and payload, return original result instead of creating a new operation.
  • Choose key TTL by business risk (often 24-72 hours for financial operations).

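Risk: if a reused key arrives with a different payload, return a conflict error. Otherwise the mismatch is masked: either the new request is silently answered with the old result, or it executes and creates a hidden duplicate.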
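The steps above can be sketched as a small in-memory handler. Everything named here (`IdempotentHandler`, `IdempotencyConflict`, the `clock` parameter, the response shape) is an assumption for illustration; a production version would need a durable store and an atomic insert of the key so that concurrent retries cannot both execute.

```python
import hashlib
import json
import time
from dataclasses import dataclass


class IdempotencyConflict(Exception):
    """The same key was reused with a different payload."""


@dataclass
class Record:
    fingerprint: str
    response: dict
    expires_at: float


class IdempotentHandler:
    def __init__(self, ttl_seconds: float = 24 * 3600, clock=time.time) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so tests can control time
        self.records = {}   # idempotency key -> Record
        self.executions = 0

    @staticmethod
    def fingerprint(payload: dict) -> str:
        # Canonical JSON so that key order does not change the fingerprint.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, key: str, payload: dict) -> dict:
        now = self.clock()
        fp = self.fingerprint(payload)
        record = self.records.get(key)
        if record is not None and record.expires_at > now:
            if record.fingerprint != fp:
                raise IdempotencyConflict(key)
            return record.response  # replay the stored result
        # First time (or the key expired): run the operation once and store it.
        self.executions += 1
        response = {"status": "created", "operation_id": self.executions}
        self.records[key] = Record(fp, response, now + self.ttl_seconds)
        return response
```

Note that once the TTL elapses the same key executes again, which is why the TTL should be at least as long as the window in which clients may retry.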

Practical guardrails

  • Idempotency protects against duplicate delivery, but does not replace concurrency and invariant controls.
  • For critical commands, store not only processed flag but also canonical response/reason code for replay.
  • Monitor retry hit-rate, dedupe reject rate, and conflict-resolution latency.
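To illustrate the first guardrail: two distinct client intents carry two different idempotency keys, so deduplication lets both through, and the balance invariant still has to be enforced separately. The sketch below uses an optimistic version check for that; `Account`, `VersionConflict`, and `InsufficientFunds` are hypothetical names, and a real system would perform the check as a conditional update in the database.

```python
class VersionConflict(Exception):
    """The account changed since the caller read it."""


class InsufficientFunds(Exception):
    """The withdrawal would break the balance >= 0 invariant."""


class Account:
    def __init__(self, balance_cents: int) -> None:
        self.balance_cents = balance_cents
        self.version = 0

    def withdraw(self, amount_cents: int, expected_version: int) -> None:
        # Compare-and-set: reject writes that are based on a stale read.
        if expected_version != self.version:
            raise VersionConflict(f"expected {expected_version}, found {self.version}")
        # The invariant is checked on current state, independent of idempotency keys.
        if amount_cents > self.balance_cents:
            raise InsufficientFunds(f"balance {self.balance_cents} < {amount_cents}")
        self.balance_cents -= amount_cents
        self.version += 1
```

If two requests both read version 0 of an account holding 100 and each tries to withdraw 80, the first succeeds, the second fails with `VersionConflict`, and its retry against the fresh version fails with `InsufficientFunds`.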

Validation

Testing Distributed Systems

Idempotency must be validated with duplicate/out-of-order scenarios.
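A duplicate/out-of-order check can be sketched as a small harness that replays the same events in every order, each delivered twice, and compares the result with the in-order run. The consumers, the `seq` field, and the `survives_replay` helper are assumptions made for this example; exhaustive permutation is only practical for a handful of events.

```python
import itertools


def versioned_consumer(events):
    """Keep the highest-sequence status per entity; stale and repeated events are no-ops."""
    state = {}
    for event in events:
        current = state.get(event["entity"])
        if current is None or event["seq"] > current["seq"]:
            state[event["entity"]] = {"seq": event["seq"], "status": event["status"]}
    return state


def naive_consumer(events):
    """Overwrite on arrival; correct only if delivery is ordered."""
    state = {}
    for event in events:
        state[event["entity"]] = {"seq": event["seq"], "status": event["status"]}
    return state


def survives_replay(consumer, events):
    """Try every delivery order, with the whole batch delivered twice."""
    expected = consumer(events)
    for order in itertools.permutations(events):
        if consumer(list(order) + list(order)) != expected:
            return False
    return True


events = [
    {"entity": "order-1", "seq": 1, "status": "created"},
    {"entity": "order-1", "seq": 2, "status": "paid"},
    {"entity": "order-1", "seq": 3, "status": "shipped"},
    {"entity": "order-2", "seq": 1, "status": "created"},
]
```

Here `survives_replay(versioned_consumer, events)` holds, while the same check fails for `naive_consumer`, because an old event arriving last overwrites newer state.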


Usage Scenarios

Payment API

Retry after timeout without an idempotent contract often results in double charge.

Failure path

Client (timeout + retry) → Payment API (no idempotency key) → DB (duplicate charge) → User (double debit)

Resilient path

Client (Idempotency-Key) → API (dedupe + unique constraint) → Ledger (single transaction) → API (status replay)

What happens

  • Idempotency key maps repeated retries of one client intent to a single business operation.
  • Even with redelivery, server returns the same result instead of creating a new transaction.
  • Unique constraint and status endpoint close race conditions between retries.

Risk: Key TTL and key scope must match the business time window of the operation.
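The "dedupe + unique constraint" step of the resilient path could look roughly like the sketch below, where the database rather than application code decides which retry wins. An in-memory SQLite database stands in for the ledger, and the table name, columns, and `charge` function are assumptions for illustration.

```python
import sqlite3


def open_ledger() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE charges ("
        " idempotency_key TEXT PRIMARY KEY,"  # uniqueness enforced by the database
        " account TEXT NOT NULL,"
        " amount_cents INTEGER NOT NULL,"
        " status TEXT NOT NULL)"
    )
    return conn


def charge(conn: sqlite3.Connection, key: str, account: str, amount_cents: int) -> dict:
    """Insert the charge once; on a duplicate key, replay the stored status."""
    try:
        with conn:  # commits on success, rolls back if the insert raises
            conn.execute(
                "INSERT INTO charges VALUES (?, ?, ?, ?)",
                (key, account, amount_cents, "captured"),
            )
        created = True
    except sqlite3.IntegrityError:
        # Another attempt with the same key already wrote the row.
        created = False
    row = conn.execute(
        "SELECT account, amount_cents, status FROM charges WHERE idempotency_key = ?",
        (key,),
    ).fetchone()
    return {"created": created, "account": row[0], "amount_cents": row[1], "status": row[2]}
```

Calling `charge` twice with the same key leaves exactly one row in `charges`; the second call reports `created: False` and returns the stored status, which corresponds to the status replay step.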

Scenario must remain correct under retries, redelivery, and out-of-order delivery.

Practical Checklist

For every critical command, duplicate-request behavior is explicitly defined.

Events have stable unique identity and consumer deduplication strategy.

Consistency model is chosen intentionally and reflected in API contracts/documentation.

Reconciliation processes exist to detect and repair divergence.

Team tests retries, redelivery, and out-of-order delivery in integration tests.

Common anti-pattern: assuming once-only delivery is guaranteed by platform and skipping idempotency.
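For the reconciliation item in the checklist, a minimal sketch is a periodic comparison of two views of the same operations, keyed by operation ID. The `reconcile` function, the report keys, and the sample data are hypothetical; repairing the divergence (replay, compensation, manual review) is a separate step that depends on the business rules.

```python
def reconcile(ledger: dict, provider: dict) -> dict:
    """Compare internal ledger amounts with an external provider's records."""
    ledger_ids, provider_ids = set(ledger), set(provider)
    return {
        "missing_in_ledger": sorted(provider_ids - ledger_ids),
        "missing_at_provider": sorted(ledger_ids - provider_ids),
        "amount_mismatch": sorted(
            op for op in ledger_ids & provider_ids if ledger[op] != provider[op]
        ),
    }


# Sample data: operation ID -> amount in cents (illustrative values only).
ledger = {"op-1": 500, "op-2": 1200, "op-3": 700}
provider = {"op-1": 500, "op-2": 1250, "op-4": 300}
report = reconcile(ledger, provider)
```

With this sample data, the report flags `op-4` as missing in the ledger, `op-3` as missing at the provider, and `op-2` as an amount mismatch.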
