System Design Space

Updated: March 25, 2026 at 3:00 AM

Distributed Transactions: 2PC and 3PC

hard

Practical analysis of distributed transactions: coordinator, prepare/commit phases, failure modes, blocking trade-offs and alternatives via Saga/outbox.

Distributed transactions become painful exactly where the business wants atomicity but the architecture has already split across multiple services and stores.

In real engineering work, this chapter helps choose between 2PC, 3PC, Saga, and outbox not by diagram aesthetics, but by domain boundaries, acceptable failure behavior, blocking characteristics, and the cost of coordination.

In interviews, reviews, and design conversations, it is especially useful when you need to speak plainly about timeout semantics, partial commit, compensations, and idempotency instead of just saying "distributed transaction".

Practical value of this chapter

Design in practice

Helps choose transaction patterns by domain boundaries and acceptable failure behavior.

Decision quality

Compares 2PC/3PC/Saga by latency, locking impact, and operational complexity.

Interview articulation

Provides a clear narrative for coordinator, participants, commit point, and recovery.

Risk and trade-offs

Makes blocking, partial-commit, timeout, and idempotency trade-offs explicit.

Context

Consistency and idempotency

Distributed transactions are one way to ensure consistency, but not the only one.


Distributed transactions (2PC/3PC) are needed when a business invariant requires a coordinated change across several independent resources. The price is added latency, blocking, and complex recovery logic for partial failures.

When is a distributed transaction needed?

  • One business operation affects several independent resources/services.
  • Temporary inconsistency cannot be accepted for a particular class of operations.
  • A partial commit would create significant financial or regulatory risk.

2PC flow

2PC: two-phase commit

Prepare -> votes -> global decision (commit/abort)

The coordinator collects participant votes and makes one global commit/abort decision for the whole transaction.
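The flow above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not a production protocol: `Participant`, `Coordinator`, and the `log` list are hypothetical names, there is no network or timeout handling, and the list only stands in for a durable log.

```python
from dataclasses import dataclass, field
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


@dataclass
class Participant:
    """Hypothetical in-memory participant; a real one would persist its state."""
    name: str
    can_commit: bool = True
    state: str = "init"

    def prepare(self) -> Vote:
        # Phase 1: promise to commit if asked, or refuse.
        if self.can_commit:
            self.state = "prepared"
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


@dataclass
class Coordinator:
    participants: list[Participant]
    log: list[str] = field(default_factory=list)  # stands in for a durable log

    def run(self) -> str:
        # Phase 1: collect a vote from every participant.
        votes = [p.prepare() for p in self.participants]
        decision = "commit" if all(v is Vote.YES for v in votes) else "abort"
        # Commit point: record the decision before telling anyone.
        self.log.append(decision)
        # Phase 2: broadcast the single global decision.
        for p in self.participants:
            if decision == "commit":
                p.commit()
            else:
                p.abort()
        return decision
```

A single "no" vote is enough to abort everyone, including participants that had already prepared.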

Strengths

  • Simple and easy-to-understand coordination model.
  • Clearly separates preparation from the final decision.

Risks

  • Participants that have voted yes stay blocked, holding their locks, if the coordinator fails before they learn the decision.
  • Highly sensitive to timeout/retry tuning.

Protocol steps (interactive walkthrough, 8 steps)

Coordinator plus 3 participants: Order (participant A), Payment (participant B), Inventory (participant C).

3PC flow

3PC: three-phase commit

CanCommit -> PreCommit -> DoCommit

Adds an intermediate pre-commit phase to reduce the risk of blocking when the coordinator fails.
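One way to see what the extra phase buys is to write the participant side as a state machine. The sketch below uses the textbook timeout rules and assumes no network partition; `State`, `TRANSITIONS`, `ON_TIMEOUT`, and `step` are hypothetical names, and the table is deliberately reduced (it omits a coordinator-initiated abort after PreCommit).

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()
    READY = auto()         # voted yes to CanCommit
    PRECOMMITTED = auto()  # acknowledged PreCommit
    COMMITTED = auto()
    ABORTED = auto()


# Transitions driven by coordinator messages.
TRANSITIONS = {
    (State.INIT, "CanCommit"): State.READY,
    (State.READY, "PreCommit"): State.PRECOMMITTED,
    (State.PRECOMMITTED, "DoCommit"): State.COMMITTED,
    (State.INIT, "Abort"): State.ABORTED,
    (State.READY, "Abort"): State.ABORTED,
}

# What a participant does on its own when the coordinator goes silent.
ON_TIMEOUT = {
    State.INIT: State.ABORTED,
    State.READY: State.ABORTED,           # no PreCommit seen, so abort
    State.PRECOMMITTED: State.COMMITTED,  # PreCommit implies all voted yes
}


def step(state: State, event: str) -> State:
    """Return the next participant state for a message or a timeout."""
    if event == "Timeout":
        # Final states ignore timeouts.
        return ON_TIMEOUT.get(state, state)
    if (state, event) not in TRANSITIONS:
        raise ValueError(f"{event} is not valid in state {state.name}")
    return TRANSITIONS[(state, event)]
```

The point of the sketch is the `ON_TIMEOUT` table: every non-final state has a timeout action, whereas a prepared 2PC participant has none and must wait.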

Strengths

  • Reduces the probability of participants hanging in an uncertain state.
  • Explicitly separates intent from final commit.

Risks

  • More network rounds and a more complex state machine.
  • Requires very careful timeout and recovery tuning.

Protocol steps (interactive walkthrough, 12 steps)

Coordinator plus 3 participants: Order (participant A), Payment (participant B), Inventory (participant C).

Alternative

Event-Driven Architecture

In many scenarios, Saga + outbox gives better balance than global 2PC/3PC.


Trade-offs and alternatives

2PC is simple in concept, but participants that have already voted can stay blocked, holding locks, if the coordinator fails before announcing the decision.

3PC reduces the likelihood of blocking, but adds network rounds and state machine complexity.

Both approaches are sensitive to network partitions, timeout tuning, and the correctness of recovery logic.

In a microservice architecture, a full ACID transaction between services is often too expensive and fragile.

Saga (orchestration/choreography)

Breaks the transaction into local steps with compensating actions instead of a global lockstep commit.
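A minimal orchestration-style sketch of this idea, assuming each step is a pair of plain callables. `run_saga` and the `Step` tuple are hypothetical names, and a real saga would also persist its progress and retry compensations that fail.

```python
from typing import Callable

# (name, action, compensation): all three are illustrative.
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> list[str]:
    """Run local steps in order; on failure, compensate finished steps in reverse."""
    history: list[str] = []
    done: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            history.append(f"{name}:failed")
            # Undo what already happened, most recent first.
            for prev_name, _, compensate in reversed(done):
                compensate()
                history.append(f"{prev_name}:compensated")
            return history
        done.append(step)
        history.append(f"{name}:done")
    return history
```

Unlike 2PC, earlier steps are already committed locally when a later step fails, so other readers can observe the intermediate state until the compensations run.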

Transactional outbox

Writes the state change and the outgoing event in one local database transaction, so event publishing stays consistent with the database without a distributed XA transaction.
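The pattern can be sketched with the standard-library `sqlite3` module. The table names, the `OrderPlaced` event, and the helper functions are assumptions made for illustration; a real relay would run as a separate process and publish to a message broker.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount INTEGER)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT, published INTEGER)"
    )
    return conn


def place_order(conn: sqlite3.Connection, order_id: str, amount: int) -> None:
    # One local transaction covers both the business row and the event row.
    with conn:
        conn.execute("INSERT INTO orders (id, amount) VALUES (?, ?)", (order_id, amount))
        conn.execute(
            "INSERT INTO outbox (payload, published) VALUES (?, 0)",
            (json.dumps({"type": "OrderPlaced", "order_id": order_id}),),
        )


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish pending events, then mark them as sent (at-least-once delivery)."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # may repeat if we crash before the UPDATE
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because the relay can publish the same event twice after a crash, consumers still need to handle duplicates.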

Idempotent commands + reconciliation

Safely repeatable operations and background reconciliation reduce the effects of partial failure.
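As a rough illustration, an idempotent command can be keyed by a caller-supplied ID, and reconciliation can be a periodic comparison of two records of the same facts. `Account`, `debit`, and `reconcile` are hypothetical names, and the in-memory dictionary stands in for a persisted deduplication table.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: int = 0
    # Results of already-applied commands, keyed by command ID.
    applied: dict[str, int] = field(default_factory=dict)

    def debit(self, command_id: str, amount: int) -> int:
        # A retried command returns the stored result instead of debiting again.
        if command_id in self.applied:
            return self.applied[command_id]
        self.balance -= amount
        self.applied[command_id] = self.balance
        return self.balance


def reconcile(expected: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Return IDs whose values differ or that are missing on either side."""
    keys = expected.keys() | actual.keys()
    return sorted(k for k in keys if expected.get(k) != actual.get(k))
```

The IDs returned by `reconcile` are the cases that need an automatic repair or a manual look.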

Domain redesign

Sometimes it is cheaper to change the boundaries of aggregates and remove the cross-service atomic requirement.

Practical checklist

  • It is explicitly defined where strict atomicity is needed and where eventual consistency is acceptable.
  • There is a coordinator recovery strategy and a durable transaction log.
  • Timeout policies are tested against partition and delay scenarios.
  • All participants support idempotent commit/abort processing.
  • There is a business mechanism for compensation and manual resolution of disputed cases.
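The coordinator recovery item can be made concrete with a small replay function. This is a sketch under assumptions: the record names `begin`, `commit`, `abort`, and `end` are hypothetical, and a transaction with no logged decision is aborted on restart, which is safe only because the coordinator logs its decision before sending it.

```python
def recover(log: list[tuple[str, str]]) -> dict[str, str]:
    """Map each unfinished transaction ID to the decision a restarted coordinator resends."""
    actions: dict[str, str] = {}
    for txid, record in log:
        if record == "begin":
            # No decision logged yet: nobody was told to commit, so abort is safe.
            actions.setdefault(txid, "abort")
        elif record in ("commit", "abort"):
            actions[txid] = record
        elif record == "end":
            # Every participant acknowledged; nothing left to resend.
            actions.pop(txid, None)
    return actions
```

Resending a decision is only safe when participants process commit and abort idempotently, which is why that item sits on the same checklist.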

Frequent anti-pattern: introducing 2PC between services without evaluating blocking, the retry model, and recovery cost.
