A balancing algorithm is easy to treat as a small implementation detail, even though it often decides how evenly the system handles long requests, hot keys, and pool changes.
The chapter compares Round Robin, Least Connections, and Consistent Hashing through request duration, uneven load, key locality, and the cost of redistribution during degradation.
In interviews, that helps shift the conversation from naming an algorithm to explaining which traffic shape you are protecting and how you would notice the choice has stopped working.
Practical value of this chapter
Traffic Profile
Start with the shape of the traffic: even flow, long requests, hot keys, and the share of stateful work.
Uneven Load
Look at what happens under long requests, spikes, and degraded nodes, not only during calm steady state.
Key Locality
Understand the price of locality: fewer cache misses, but more risk of hot keys and skewed distribution.
Decision Rationale
Explain which traffic profile the algorithm protects and which signals tell you it needs to be revisited.
Reference
Envoy Load Balancing
Practical guide to balancing algorithms and real-world usage scenarios.
A load-balancing algorithm is not a tiny implementation detail. It shapes latency, resilience, and how evenly the pool behaves under real traffic.
Round Robin, Least Connections, and Consistent Hashing solve the same routing problem in different ways: one gives a clean fairness baseline, one adapts to current pressure, and one preserves key-to-node locality.
The real choice depends on workload shape, state affinity, fairness expectations, the share of stateless versus stateful traffic, and the operational cost of pool changes.
Fairness
Round Robin gives a clean fairness baseline when nodes are similar in capacity.
Load Adaptation
Least Connections adapts better when some requests finish quickly and others stay open much longer.
Key Locality
Consistent Hashing reduces cache misses and key migration when the pool changes.
Resilience
An algorithm only works in production when health checks, slow start, and safe draining are part of the picture.
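Slow start, mentioned above, can be illustrated with a minimal sketch. It assumes a simple linear ramp; the function name `slow_start_weight`, the 30-second window, and the 0.1 floor are hypothetical choices for illustration and do not reproduce any particular proxy's formula.

```python
def slow_start_weight(seconds_since_healthy: float,
                      window_seconds: float = 30.0,
                      min_weight: float = 0.1) -> float:
    """Return a traffic weight in [min_weight, 1.0] for a recently added node.

    Assumption: the weight grows linearly over the window. Real balancers
    may use a different curve.
    """
    if window_seconds <= 0:
        return 1.0  # slow start disabled: node takes its full share at once
    progress = seconds_since_healthy / window_seconds
    return max(min_weight, min(1.0, progress))


# A node that just passed its health check gets a small share of traffic,
# then ramps up to a full share by the end of the window.
weights = [slow_start_weight(t) for t in (0, 15, 60)]
```

The floor keeps a new node from being starved completely, so it still receives enough traffic to warm caches and connection pools.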
How to Choose an Algorithm
Step 1: Describe the traffic profile
Capture request duration, burst behavior, key distribution, and the share of stateful operations.
Step 2: Match the algorithm to traffic behavior
Round Robin fits even pools, Least Connections fits uneven request duration, and Consistent Hashing fits key locality and state affinity.
Step 3: Test degradation paths
Simulate node shutdown, overload, and retries, then evaluate p95/p99 and the cost of key redistribution.
Step 4: Add operational guardrails
Configure health policy, slow start, draining, and alerts for saturation and hot keys.
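The cost of key redistribution from the degradation step can be estimated offline. The sketch below is an illustration under stated assumptions: the helper names (`stable_hash`, `build_ring`, `moved_fraction`), the synthetic `user-N` keys, and the choice of 100 virtual nodes per server are all hypothetical. It compares plain modulo hashing with a hash ring when one node out of four is removed.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def modulo_owner(key, nodes):
    """Naive placement: hash modulo pool size."""
    return nodes[stable_hash(key) % len(nodes)]


def build_ring(nodes, vnodes=100):
    """Place `vnodes` points per node on the ring, sorted by hash."""
    ring = sorted((stable_hash(f"{node}#{i}"), node)
                  for node in nodes for i in range(vnodes))
    return [point for point, _ in ring], [node for _, node in ring]


def ring_owner(key, ring):
    """Owner is the first ring point clockwise from the key's hash."""
    points, owners = ring
    index = bisect.bisect_right(points, stable_hash(key)) % len(points)
    return owners[index]


def moved_fraction(keys, before, after):
    """Share of keys whose owner differs between two placement functions."""
    return sum(1 for k in keys if before(k) != after(k)) / len(keys)


keys = [f"user-{i}" for i in range(10_000)]
full_pool = ["s1", "s2", "s3", "s4"]
degraded_pool = ["s1", "s2", "s3"]  # s4 has been shut down

modulo_moved = moved_fraction(
    keys,
    lambda k: modulo_owner(k, full_pool),
    lambda k: modulo_owner(k, degraded_pool),
)
ring_full, ring_degraded = build_ring(full_pool), build_ring(degraded_pool)
ring_moved = moved_fraction(
    keys,
    lambda k: ring_owner(k, ring_full),
    lambda k: ring_owner(k, ring_degraded),
)
print(f"modulo: {modulo_moved:.0%} moved, ring: {ring_moved:.0%} moved")
```

With modulo hashing most keys change owner when the pool shrinks, while on the ring only the keys that belonged to the removed node move. Running the same measurement against production-like keys is one way to put a number on the redistribution cost before choosing an algorithm.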
Core Algorithms and Visual Walkthrough
The same request stream is distributed differently by each algorithm.
Round Robin
Requests are distributed in a cycle: S1 -> S2 -> S3 -> S1.
Pros
- Very simple implementation and predictable behavior.
- Works well for stateless backends with homogeneous nodes.
- Low runtime overhead in the balancer.
Limitations
- Does not account for current load or slow nodes.
- Can hurt latency on heterogeneous server pools.
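The cycle described above can be sketched in a few lines. This is a minimal illustration, assuming a fixed pool and no health checks; the class name `RoundRobinBalancer` is hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed repeating order, ignoring their load."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("server pool must not be empty")
        self._servers = cycle(list(servers))

    def pick(self):
        # No state about the servers is consulted: this is why the
        # algorithm is cheap, and also why it cannot avoid a slow node.
        return next(self._servers)


balancer = RoundRobinBalancer(["S1", "S2", "S3"])
order = [balancer.pick() for _ in range(4)]
print(order)  # ['S1', 'S2', 'S3', 'S1']
```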
Comparing the Algorithms
Compare the algorithms by how they behave under overload, pool churn, and uneven key distribution. For Consistent Hashing, virtual nodes, hot keys, and key skew usually matter more than the headline description of the algorithm itself.
| Algorithm | What It Observes | Behavior During Degradation | Distribution Quality | Complexity | Best fit |
|---|---|---|---|---|---|
| Round Robin | Almost no awareness of current node state | Removes a failed node with minimal extra logic | Solid when traffic is even and nodes are similar | Low | Homogeneous pools of stateless services |
| Least Connections | Tracks the number of active connections | Usually handles long requests and traffic spikes better than Round Robin | Better than Round Robin when request duration varies a lot | Medium | Traffic with mixed short and long requests |
| Consistent Hashing | Routes deterministically by request key | Reassigns only part of the keys when the pool changes | Depends on virtual-node setup and key shape | High | Traffic with state affinity and strong key-locality needs |
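To make the "tracks the number of active connections" row concrete, here is a minimal sketch of a Least Connections picker. It assumes a single-threaded balancer and equal node capacity; the class name and the `acquire`/`release` methods are hypothetical.

```python
class LeastConnectionsBalancer:
    """Routes each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("server pool must not be empty")
        self._active = {server: 0 for server in servers}

    def acquire(self):
        # Ties are broken by pool order, because min() returns the first minimum.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes so the counter reflects live load.
        if self._active[server] > 0:
            self._active[server] -= 1


balancer = LeastConnectionsBalancer(["A", "B", "C"])
first = balancer.acquire()    # all idle, so the first server in the pool
second = balancer.acquire()
third = balancer.acquire()
balancer.release(second)      # the request on the second server finishes early
fourth = balancer.acquire()   # goes back to the server that just freed up
print([first, second, third, fourth])  # ['A', 'B', 'C', 'B']
```

Unlike Round Robin, the fourth request does not return to A, because A still holds an open connection while B has none.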
Practical Guidance
Quick Rules
- Start with Round Robin when the pool is homogeneous and request duration is roughly even.
- Move to Least Connections when short and long requests are mixed in the same flow.
- Use Consistent Hashing when keeping the same key on the same node matters more than perfect evenness.
- Validate the choice on peak load and degraded-node scenarios, not only on average traffic.
Common Mistakes
- Choosing an algorithm without profiling request duration, burst shape, and key skew.
- Using Consistent Hashing without virtual nodes or hot-key monitoring.
- Ignoring health checks, slow start for new instances, and safe connection draining for instances being removed.
- Treating active connections as the only load metric while ignoring CPU, RPS, and p99 latency.
- Mixing sticky and non-sticky routing without a clear fallback policy.
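Hot-key monitoring from the list above can start with something as simple as measuring each key's share in a sample of recent requests. The sketch below assumes that routing keys are available as strings; the function name `hot_keys`, the tenant names, and the thresholds are hypothetical.

```python
from collections import Counter


def hot_keys(request_keys, share_threshold=0.2):
    """Return (key, share) pairs for keys at or above the traffic share threshold."""
    counts = Counter(request_keys)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (key, count / total)
        for key, count in counts.most_common()
        if count / total >= share_threshold
    ]


# Illustrative sample: one tenant produces 6 of 10 requests.
sample = ["tenant-a"] * 6 + ["tenant-b"] * 3 + ["tenant-c"] * 1
flagged = hot_keys(sample, share_threshold=0.5)
print(flagged)
```

A key that dominates the sample will overload whichever node owns it under Consistent Hashing, no matter how many virtual nodes are configured, so this signal is worth alerting on separately from per-node saturation.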
Related chapters
- Load Balancing - provides the L4/L7, health-management, and global-routing context on top of which concrete algorithms are chosen.
- Design principles for scalable systems - explains how traffic-distribution decisions connect to latency, throughput, and the broader system-growth model.
- Service Discovery - shows how to keep target pools up to date, which is required for correct balancing behavior.
- Service Mesh Architecture - extends balancing algorithms to service-to-service traffic with retries and unhealthy-instance policies.
- Caching strategies - helps reason about key locality and hot-key pressure, which are critical for Consistent Hashing.
- Multi-region / Global Systems - adds the regional traffic-distribution layer and failover strategy across data centers.
- API Gateway - demonstrates an applied L7 scenario where balancing algorithms work alongside routing policies.
