Context
Consistency and idempotency
Distributed transactions are one way to ensure consistency, but not the only one.
Distributed transactions (2PC/3PC) are needed when a business invariant requires a coordinated change across several independent resources. The price of this choice is added latency, possible blocking, and complex recovery logic for partial failures.
When is a distributed transaction needed?
- One business operation affects several independent resources/services.
- Temporary inconsistency cannot be accepted for a particular class of operations.
- A partial commit would create significant financial or regulatory risk.
2PC flow
2PC: two-phase commit
Prepare -> votes -> global decision (commit/abort)
The coordinator collects participant votes and makes one global commit/abort decision for the whole transaction.
Strengths
- Simple and easy-to-understand coordination model.
- Clearly separates preparation from the final decision.
Risks
- Blocking is possible if the coordinator fails after participants have voted but before they learn the decision.
- Highly sensitive to timeout/retry tuning.
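The prepare -> votes -> global decision flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production coordinator: the `Participant` class, the `Vote` enum and the `two_phase_commit` function are hypothetical names introduced here, and durable logging, timeouts and retries are deliberately left out.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical in-memory participant; a real one would persist its state."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: reserve resources and promise to commit if asked.
        if self.can_commit:
            self.state = "prepared"
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Global decision: commit only if every vote is YES.
    # A real coordinator would write this decision to a durable log
    # before telling anyone about it.
    decision = "commit" if all(v is Vote.YES for v in votes) else "abort"
    # Phase 2: broadcast the same decision to every participant.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision
```

A participant that has returned `Vote.YES` sits in the `prepared` state until phase 2 reaches it. If the coordinator disappears between the two phases, that participant cannot safely decide on its own, which is the blocking risk listed above.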
Protocol Steps
Interactive walkthrough, 8 steps: one coordinator and 3 participants, Order (participant A), Payment (participant B) and Inventory (participant C).
3PC flow
3PC: three-phase commit
CanCommit -> PreCommit -> DoCommit
Adds an intermediate pre-commit phase to reduce the risk of blocking when the coordinator fails.
Strengths
- Reduces the probability of participants hanging in an uncertain state.
- Explicitly separates intent from final commit.
Risks
- More network rounds and a more complex state machine.
- Requires very careful timeout and recovery tuning.
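The extra state that 3PC introduces is easiest to see from the participant's side. The sketch below is a simplified, in-memory state machine following the textbook timeout rule and assuming there is no network partition. `ThreePhaseParticipant` and its method names are hypothetical and chosen only to mirror the CanCommit -> PreCommit -> DoCommit phases.

```python
class ThreePhaseParticipant:
    """Hypothetical participant state machine for 3PC (in memory only)."""

    def __init__(self):
        self.state = "init"

    def can_commit(self, ok=True):
        # Phase 1 (CanCommit): vote; nothing is pre-committed yet.
        self.state = "ready" if ok else "aborted"
        return ok

    def pre_commit(self):
        # Phase 2 (PreCommit): the coordinator reports that everyone voted yes.
        if self.state == "ready":
            self.state = "precommitted"

    def do_commit(self):
        # Phase 3 (DoCommit): final commit.
        if self.state == "precommitted":
            self.state = "committed"

    def abort(self):
        # Abort is allowed from any state that is not already final.
        if self.state in ("init", "ready", "precommitted"):
            self.state = "aborted"

    def on_timeout(self):
        # Textbook timeout rule, assuming no network partition:
        # - before PreCommit nobody can have committed, so abort;
        # - after PreCommit every participant voted yes, so commit.
        if self.state == "ready":
            self.state = "aborted"
        elif self.state == "precommitted":
            self.state = "committed"
        return self.state
```

The timeout rule is what lets a participant make progress without the coordinator. It is also where the careful tuning mentioned above matters, because the rule is only safe under the stated failure assumptions.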
Protocol Steps
Interactive walkthrough, 12 steps: one coordinator and 3 participants, Order (participant A), Payment (participant B) and Inventory (participant C).
Alternative
Event-Driven Architecture
In many scenarios, Saga plus a transactional outbox gives a better balance than global 2PC/3PC.
Trade-offs and alternatives
2PC is simple in concept, but a coordinator failure can leave participants blocked while they hold locks.
3PC reduces the likelihood of blocking, but adds network rounds and state machine complexity.
Both approaches are sensitive to network partitions, timeout tuning and correct recovery logic.
In a microservice architecture, a full ACID transaction between services is often too expensive and fragile.
Saga (orchestration/choreography)
Breaks the transaction into local steps with compensating actions instead of a global lockstep commit.
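As a rough illustration of local steps with compensating actions, here is a minimal orchestration-style sketch. The `run_saga` helper and the order/payment/inventory steps are hypothetical. A real orchestrator would also persist its progress and retry failed compensations.

```python
def run_saga(steps):
    """Run (action, compensation) pairs; undo completed steps on failure."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Compensate in reverse order of completion.
            for undo in reversed(completed):
                undo()
            return "compensated"
        completed.append(compensation)
    return "completed"


log = []


def fail_inventory():
    # Simulated failure of the third local step.
    raise RuntimeError("out of stock")


steps = [
    (lambda: log.append("order created"), lambda: log.append("order cancelled")),
    (lambda: log.append("payment charged"), lambda: log.append("payment refunded")),
    (fail_inventory, lambda: log.append("inventory released")),
]
result = run_saga(steps)
```

In this run the inventory step fails, so the payment is refunded and the order cancelled, in that order. The failed step itself is not compensated because it never completed.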
Transactional outbox
Guarantees consistency of the local database and event publishing without a distributed XA transaction.
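One way to picture the outbox is a local table written in the same database transaction as the business row. The sketch below uses Python's standard `sqlite3` module. The `orders` and `outbox` table layouts, the `create_order` and `relay_once` functions, and the `publish` callback are assumptions made for illustration only.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "payload TEXT, published INTEGER DEFAULT 0)"
)


def create_order(order_id):
    # One local transaction: the order row and its event
    # are committed or rolled back together.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, 'created')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "OrderCreated", "order_id": order_id}),),
        )


def relay_once(publish):
    # A separate relay reads unpublished rows and hands them to the broker.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        # Mark as published only after the publish call returned.
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

If the relay crashes after `publish` but before the `UPDATE`, the event is sent again on the next pass. Delivery is therefore at least once, and consumers need to handle duplicates.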
Idempotent commands + reconciliation
Repeatable operations and background reconciliation reduce the effects of partial failures.
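A small sketch of both ideas, assuming commands carry a client-supplied idempotency key. `PaymentService`, `charge` and `reconcile` are hypothetical names, and the in-memory dictionary stands in for a durable store.

```python
class PaymentService:
    """Hypothetical service that deduplicates commands by idempotency key."""

    def __init__(self):
        self.balance = 100
        self.processed = {}  # idempotency key -> stored result

    def charge(self, key, amount):
        # A retried command with the same key returns the stored result
        # instead of charging a second time.
        if key in self.processed:
            return self.processed[key]
        self.balance -= amount
        result = {"status": "charged", "balance": self.balance}
        self.processed[key] = result
        return result


def reconcile(expected, actual):
    """Return the keys whose recorded state differs between two services."""
    return sorted(
        k for k in expected.keys() | actual.keys()
        if expected.get(k) != actual.get(k)
    )
```

The deduplication makes retries safe after a timeout, and a periodic `reconcile` pass surfaces the records that still disagree so they can be repaired or escalated.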
Domain redesign
Sometimes it is cheaper to change the boundaries of aggregates and remove the cross-service atomic requirement.
Practical checklist
- It is explicitly defined where strict atomicity is needed and where eventual consistency is acceptable.
- There is a coordinator recovery strategy and a durable transaction log.
- Timeout policies are tested for partition and delay scenarios.
- All participants support idempotent commit/abort processing.
- There is a business mechanism for compensation and manual resolution of disputed cases.
A frequent anti-pattern: introducing 2PC between services without evaluating the blocking behavior, retry model and recovery cost.
