Acing SDI
Practice task from chapter 9
Distributed Message Queue as a foundational primitive for async service integration.
A distributed message queue is a core decoupling pattern in modern backend systems. In interviews, you need to show how you handle delivery semantics, ordering, retries, and backpressure under growth and partial failures.
Functional requirements
- Publish/consume APIs for events and background jobs.
- Consumer groups and replay by offset.
- Delivery guarantees (at-least-once as baseline).
- Dead Letter Queue for problematic messages.
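The publish/consume/replay surface implied by these requirements can be sketched as a toy single-partition log. All names here (`InMemoryQueue`, `Message`) are illustrative, not a real broker API:

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Message:
    message_id: str
    key: str
    payload: dict

class InMemoryQueue:
    """Toy single-partition log: append-only storage plus per-group offsets."""
    def __init__(self):
        self.log: List[Message] = []
        self.offsets: Dict[str, int] = {}  # consumer group -> next offset to read

    def publish(self, msg: Message) -> int:
        self.log.append(msg)
        return len(self.log) - 1  # offset of the appended record

    def consume(self, group: str, max_records: int = 10) -> List[Message]:
        start = self.offsets.get(group, 0)
        return self.log[start:start + max_records]

    def commit(self, group: str, offset: int) -> None:
        self.offsets[group] = offset + 1  # next record to read

    def replay(self, group: str, offset: int) -> None:
        self.offsets[group] = offset  # rewind for reprocessing
```

Note that each consumer group tracks its own offset, which is what makes independent replay possible.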
Non-functional requirements
- High throughput under burst traffic.
- Linear scaling through partitions.
- Controlled end-to-end delivery latency.
- Resilience to broker, consumer, and network failures.
Deep dive
Kafka (book summary)
Partitioned log, consumer groups, replication, and key operational trade-offs.
High-Level Architecture
Baseline DMQ setup: broker ingress, partitioned replicated log, and delivery control with consumer groups, retries, and DLQ boundaries.
Architecture Map
Partitioned log + consumer groups + retry/DLQ.
The diagram covers publish flow, consume flow, and the retry/DLQ control loop.
Data Model Map
Queue event structure and placement model inside partitioned log.
Message Envelope
- key: `order:1234`
- payload: `{ status: "created", amount: 9900 }`
- headers
Log Placement
- partitioning: `hash(key) -> topic: orders / partition: 7`
- offsets: `offset: 912334` (append-only)
- retention: 7d / 100GB per partition / compaction
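The `hash(key) -> partition` placement can be sketched as follows. This uses a stable MD5-based hash for illustration, not Kafka's actual murmur2 partitioner:

```python
import hashlib

def choose_partition(key: str, num_partitions: int) -> int:
    """Stable hash so the same key always lands on the same partition,
    which is what preserves per-key ordering (e.g. all events for order:1234)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

Because the mapping is deterministic, all events for one key form a totally ordered sequence within their partition; changing `num_partitions` reshuffles keys, which is why repartitioning is operationally painful.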
Ordering
Guaranteed within a partition, but not across partitions.
Replay
Offset lets consumers resume processing after crashes.
Idempotency
`message_id` helps deduplicate repeated deliveries.
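A minimal sketch of `message_id`-based deduplication on the consumer side (a real system would bound and expire the seen-set rather than grow it forever):

```python
class IdempotentHandler:
    """Dedupe repeated deliveries by message_id, turning at-least-once
    delivery into effectively-once processing from the consumer's view."""
    def __init__(self):
        self.seen = set()
        self.processed = []

    def handle(self, message_id: str, payload: dict) -> bool:
        if message_id in self.seen:
            return False  # duplicate delivery: skip side effects
        self.seen.add(message_id)
        self.processed.append(payload)  # stand-in for the real side effect
        return True
```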
Read / Write Path through components
Walkthrough of the producer write path and the consumer read path, including offset commits and the retry/DLQ fallback.
Write path
- Partition key defines ordering scope and load distribution across partitions.
- Ack policy (leader vs quorum) controls latency vs durability trade-off.
- Producer batching and compression are the main levers for burst throughput.
- Replication lag should be monitored separately from end-to-end consumer lag.
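Producer-side batching can be sketched as a buffer flushed on either a size or a linger threshold. `send_batch` is a hypothetical stand-in for the network call to the broker leader:

```python
import time

class BatchingProducer:
    """Buffer records and flush when either the batch size or the
    linger window is hit; both thresholds trade latency for throughput."""
    def __init__(self, send_batch, max_batch=100, linger_ms=10):
        self.send_batch = send_batch
        self.max_batch = max_batch
        self.linger_s = linger_ms / 1000.0
        self.buffer = []
        self.first_append = None

    def produce(self, record):
        if not self.buffer:
            self.first_append = time.monotonic()
        self.buffer.append(record)
        if len(self.buffer) >= self.max_batch:
            self.flush()

    def poll(self):
        """Called periodically; flushes if the linger window has expired."""
        if self.buffer and time.monotonic() - self.first_append >= self.linger_s:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send_batch(self.buffer)
            self.buffer = []
```

Raising `max_batch`/`linger_ms` improves burst throughput at the cost of per-record latency, which is the trade-off the bullets above describe.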
Delivery semantics
- At-most-once: fewer duplicates, potential loss.
- At-least-once: production baseline, requires idempotent consumers.
- Effectively-once: idempotent producer + consumer-side dedupe.
- Ordering is usually guaranteed per partition, not globally.
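The difference between at-most-once and at-least-once comes down to whether the offset commit happens before or after processing. A minimal sketch, with `commit` and `handle` as injected callbacks:

```python
def process_at_least_once(records, commit, handle):
    """Commit only after handling succeeds: a crash between the two
    re-delivers the record, so duplicates are possible."""
    for offset, payload in records:
        handle(payload)   # side effects happen first
        commit(offset)    # crash before this line => redelivery

def process_at_most_once(records, commit, handle):
    """Commit before handling: a crash after the commit loses the record,
    but no duplicates are possible."""
    for offset, payload in records:
        commit(offset)    # crash after this line => record skipped
        handle(payload)
```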
Operational controls
- Consumer lag as a core SLO signal.
- Backpressure via producer throttling and bounded retries.
- Retention policy balancing storage cost and replay ability.
- Poison-message policy: max retries, quarantine, manual remediation.
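The poison-message policy above can be sketched as bounded retries followed by quarantine. A minimal sketch with the DLQ modeled as a plain list:

```python
def deliver_with_retries(handle, message, max_retries=3, dead_letter=None):
    """Retry a bounded number of times, then quarantine: a poison message
    goes to the DLQ instead of blocking the partition forever."""
    for attempt in range(1, max_retries + 1):
        try:
            handle(message)
            return "ok"
        except Exception:
            continue  # real code: backoff with jitter between attempts
    if dead_letter is not None:
        dead_letter.append(message)  # quarantined for manual remediation
    return "dlq"
```

Capping retries is what prevents a single bad record from turning into a retry storm that starves healthy traffic.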
Common interview mistakes
- Promising global ordering without discussing costs and constraints.
- Ignoring idempotency with at-least-once delivery.
- No strategy for poison messages and retry storms.
- Mixing broker throughput metrics with end-to-end user latency.
