A payment system starts with money as an invariant: retries, duplicates, and partial failures are more dangerous here than any pure performance issue.
The case helps connect the ledger, idempotency keys, auth-capture flow, reconciliation, anti-fraud checks, and audit trail into one correctness-first architecture.
For interviews and design reviews, it is useful because it quickly shows whether you talk about correctness first and scaling second.
Safety First
Money safety dominates: idempotency, anti-double-spend controls, and auditability.
Regulatory Constraints
Embed compliance, traceability, and controlled manual procedures by design.
Risk Controls
Fraud scoring, limits, and anomaly controls must be part of the critical path.
Resilience
Preserve transaction correctness even when downstream dependencies degrade.
Source
System Design Interview
Classic approach to payment architecture: orchestration, ledger and integration with PSP.
A payment system is not just an API for charging money. It is a distributed system in which state correctness, idempotency, tolerance of partial failures, and safe handling of sensitive data are critical. Basic principle: exactly-once processing is, in practice, replaced by at-least-once delivery + idempotency + reconciliation.
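The principle above can be illustrated with a minimal sketch of an idempotent charge. Everything named here (`PaymentService`, `FakePSP`, `ChargeResult`, the in-memory dictionary) is a hypothetical stand-in chosen for illustration, not a real PSP API; a real system would keep the key in a durable store with a unique constraint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChargeResult:
    payment_id: str
    amount_minor: int  # amount in minor units, e.g. cents (assumption)
    status: str


class FakePSP:
    """Hypothetical stand-in for a PSP client; counts real charges."""

    def __init__(self) -> None:
        self.calls = 0

    def charge(self, amount_minor: int) -> ChargeResult:
        self.calls += 1
        return ChargeResult(f"pay_{self.calls}", amount_minor, "CAPTURED")


class PaymentService:
    def __init__(self, psp: FakePSP) -> None:
        self._psp = psp
        # Illustration only: production code needs a durable store
        # with a unique constraint on the idempotency key.
        self._results = {}

    def charge(self, idempotency_key: str, amount_minor: int) -> ChargeResult:
        previous = self._results.get(idempotency_key)
        if previous is not None:
            if previous.amount_minor != amount_minor:
                raise ValueError("idempotency key reused with a different amount")
            return previous  # replayed request: return the stored result
        result = self._psp.charge(amount_minor)
        self._results[idempotency_key] = result
        return result
```

A client that retries `charge("k1", 500)` after a timeout gets the stored result back and the PSP is called once. The sketch does not cover a crash between the PSP call and the write of the result; that gap is what reconciliation is for.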
Requirements
Functional
- Creating a payment intent for an order (authorization/capture flow).
- Support for cards and alternative payment methods via PSP.
- Idempotent charge and refund operations (full and partial refunds).
- Webhooks from PSP for statuses: authorized, captured, failed, chargeback.
- Payment history and transparent audit trail for support/finance.
Non-functional
Availability: 99.99%
Payments are a critical revenue path; downtime directly impacts business.
Latency: p95 < 300ms
Checkout should remain responsive and predictable for the user.
Correctness: No double charge
The system must not charge a customer twice when requests are retried or the network fails.
Safety: PCI DSS scope control
We minimize the processing of card data within the platform.
High-Level Architecture
[Diagram: Payment Platform high-level map, with a Sync Plane (checkout path) and an Async Plane (settlement/reconcile path)]
The payment platform is split into a synchronous checkout plane and an asynchronous settlement/reconcile plane.
The key pattern is separating the synchronous checkout path from the asynchronous settle/reconcile path. This improves stability, reduces user-facing latency, and makes financial statuses verifiable.
Critical Flows
Payment Flow Explorer
Scenario explorer for sync checkout and async settlement/reconciliation.
Sync payment path: operational notes
- The Payment API creates an intent and records an idempotency key so that retries are safe.
- Orchestrator manages the INITIATED -> AUTHORIZED -> CAPTURED transitions.
- PSP adapter isolates provider-specific logic from the core domain.
- A synchronous checkout response does not wait for a full settle/reconcile cycle.
- Intermediate statuses are persisted in the payment database for recoverability.
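The orchestrator's transitions can be sketched as a small table-driven state machine. The status names come from the text; the exact edges for FAILED and REFUNDED, and the choice to treat a repeated transition into the current state as a no-op, are assumptions made for this sketch.

```python
from enum import Enum


class Status(Enum):
    INITIATED = "INITIATED"
    AUTHORIZED = "AUTHORIZED"
    CAPTURED = "CAPTURED"
    FAILED = "FAILED"
    REFUNDED = "REFUNDED"


# Allowed transitions; anything not listed is rejected.
# The FAILED and REFUNDED edges are assumptions for this sketch.
ALLOWED = {
    Status.INITIATED: {Status.AUTHORIZED, Status.FAILED},
    Status.AUTHORIZED: {Status.CAPTURED, Status.FAILED},
    Status.CAPTURED: {Status.REFUNDED},
    Status.FAILED: set(),
    Status.REFUNDED: set(),
}


class IllegalTransition(Exception):
    """Raised when a requested status change is not in ALLOWED."""


def transition(current: Status, target: Status) -> Status:
    if current == target:
        return current  # replayed request: treat as a no-op
    if target not in ALLOWED[current]:
        raise IllegalTransition(f"{current.value} -> {target.value}")
    return target
```

With this table, `transition(Status.INITIATED, Status.CAPTURED)` raises `IllegalTransition`, so a capture cannot skip authorization. In a real service the new status would be written to the payment database in the same transaction that checks the old one.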
Async settle/reconcile path: operational notes
- Webhooks are processed as at-least-once events with deduplication.
- Ledger/outbox records financial effects in immutable transactions.
- Reconciliation jobs compare internal statuses against PSP reports.
- Discrepancies (missing capture/refund mismatch) go to the remediation queue.
- Finance and support receive events via outbox without direct connection to the PSP.
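As an illustration of at-least-once webhook handling with deduplication, the sketch below records each event ID and applies the status change in one database transaction. The table names, column names, and the use of in-memory SQLite are assumptions for the example; real PSP payloads and signature checks are out of scope here.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory store (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE payments (payment_id TEXT PRIMARY KEY, status TEXT)")
    return conn


def handle_webhook(conn: sqlite3.Connection, event_id: str,
                   payment_id: str, status: str) -> bool:
    """Apply a PSP event once; return False if it was a duplicate."""
    try:
        with conn:  # one transaction: dedup marker + state change
            # The primary key rejects an event_id that was already processed.
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)",
                (event_id,),
            )
            conn.execute(
                "INSERT OR REPLACE INTO payments (payment_id, status) VALUES (?, ?)",
                (payment_id, status),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: nothing was changed
```

Because the dedup marker and the status update commit together, a redelivered event is rejected without touching the payment row. The sketch does not handle out-of-order events; a strict status state machine would normally guard that.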
Data model (simplified)
payment_transactions
- payment_id (UUID), order_id, customer_id
- amount, currency, status, psp_reference
- idempotency_key, created_at, updated_at
ledger_entries
- entry_id, payment_id, account_id
- direction (debit/credit), amount, currency
- entry_type (auth/capture/refund/chargeback)
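The `ledger_entries` shape above could be modeled roughly as follows. The field names follow the simplified data model; storing amounts as integer minor units, the double-entry balance check, and the account names in the usage note are assumptions added for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: entries are immutable once written
class LedgerEntry:
    entry_id: str
    payment_id: str
    account_id: str
    direction: str  # "debit" or "credit"
    amount_minor: int  # integer minor units (assumption), avoids float rounding
    currency: str
    entry_type: str  # auth / capture / refund / chargeback


def is_balanced(entries: list) -> bool:
    """Check that debits equal credits per (payment_id, currency)."""
    totals = defaultdict(int)
    for entry in entries:
        sign = 1 if entry.direction == "debit" else -1
        totals[(entry.payment_id, entry.currency)] += sign * entry.amount_minor
    return all(total == 0 for total in totals.values())
```

For example, a capture of 1000 minor units recorded as a debit on a hypothetical `customer_receivable` account and a credit on a hypothetical `merchant_payable` account balances; dropping either entry makes `is_balanced` return `False`.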
Reliability and consistency
Mandatory patterns
- Idempotency key for all mutating operations (authorize/capture/refund).
- Transactional outbox for publishing events without loss; delivery is at-least-once, so consumers must deduplicate.
- Retry with exponential backoff + jitter and circuit breaker on PSP calls.
- State machine with strict transitions (INITIATED -> AUTHORIZED -> CAPTURED/FAILED/REFUNDED).
- Daily reconciliation between internal ledger and PSP reports.
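The reconciliation item can be sketched as a comparison of two keyed snapshots. The input shape (a mapping from `psp_reference` to a `(status, amount_minor)` pair) and the discrepancy labels are assumptions for this example; real PSP reports have provider-specific formats.

```python
def reconcile(internal: dict, psp_report: dict) -> list:
    """Compare internal records with a PSP report.

    Both inputs map psp_reference -> (status, amount_minor).
    Returns one discrepancy dict per reference that does not match.
    """
    discrepancies = []
    for ref in sorted(internal.keys() | psp_report.keys()):
        ours = internal.get(ref)
        theirs = psp_report.get(ref)
        if ours is None:
            kind = "missing_internal"  # PSP knows a payment we do not
        elif theirs is None:
            kind = "missing_at_psp"  # we recorded a payment the PSP lacks
        elif ours != theirs:
            kind = "mismatch"  # status or amount differs
        else:
            continue
        discrepancies.append(
            {"psp_reference": ref, "kind": kind, "internal": ours, "psp": theirs}
        )
    return discrepancies
```

Each returned item would then be pushed to a remediation queue for manual or automated follow-up rather than fixed silently.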
Dangerous anti-patterns
- Treating webhooks as the only source of truth, without regular reconciliation.
- Not storing the idempotency key, or giving it too short a lifetime.
- Writing financial status to a single table without an immutable change log.
- Mixing checkout business logic and gateway-specific code in one module.
- Processing PAN/CVV in your own system without strict necessity.
Security and compliance minimum
- Card data tokenization: PAN/CVV should never reach your core services.
- mTLS/service identity between internal services within the payment perimeter.
- Strict RBAC for refund/manual capture operations.
- Continuous audit trail for financial and operational activities.
- PCI DSS scope boundaries: the smaller the in-scope zone, the lower the operational risk.
Reference
Payment Intents
A practical example of stateful payment flow with authorization/capture.
Related chapters
- Rate Limiter - Protect payment APIs from traffic spikes, abuse patterns, and checkout-path overload.
- API Gateway - Entry-point policies, auth, and routing before payment orchestration services.
- Identification/AuthN/AuthZ - Access-control model for charge, refund, and sensitive operational actions.
- Encryption, keys and TLS - Cryptographic protection of payment flows with mTLS and secure key management.
