System Design Space

Updated: March 2, 2026 at 12:12 AM

Balancing algorithms: Round Robin, Least Connections, Consistent Hashing

mid

A practical analysis of popular load-balancing algorithms, their trade-offs, and recommendations for choosing one for stateless and stateful workloads.

Reference

Envoy Load Balancing

Practical guide to balancing algorithms and real-world usage scenarios.


A balancing algorithm directly affects latency, resilience, and resource efficiency. The same instance pool can behave very differently depending on whether you choose equal rotation, current-load adaptation, or key-based routing.

Core Algorithms and Visualization

Round Robin

Requests are distributed in a cycle: S1 -> S2 -> S3 -> S1.

Pros

  • Very simple implementation and predictable behavior.
  • Works well for stateless backends with homogeneous nodes.
  • Low runtime overhead in the balancer.

Limitations

  • Does not account for current load or slow nodes.
  • Can hurt latency on heterogeneous server pools.
Best fit: Stateless APIs, even traffic, and similar instance capacity.
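The S1 -> S2 -> S3 -> S1 cycle can be sketched in a few lines of Python. This is a minimal illustration, not a production balancer: the class name `RoundRobinBalancer` and its `pick` method are hypothetical, and the sketch assumes a fixed pool with no health checks or weights.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical sketch: hand out servers in a fixed circular order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the pool forever: S1, S2, S3, S1, ...
        self._rotation = cycle(servers)

    def pick(self):
        # The choice ignores load entirely; only position in the cycle matters.
        return next(self._rotation)


balancer = RoundRobinBalancer(["S1", "S2", "S3"])
picks = [balancer.pick() for _ in range(4)]
print(picks)  # ['S1', 'S2', 'S3', 'S1']
```

Because `pick` never looks at server state, a slow node receives exactly the same share of requests as a fast one, which is the limitation listed above.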

Simulation (initial state)

Request queue:

  • REQ-101 (Web): user:42
  • REQ-102 (Mobile): user:77
  • REQ-103 (Partner): tenant:acme
  • REQ-104 (Web): user:42

Load balancer: Round Robin. Each new request is sent to the next server in a circular order.

Servers A, B, and C each start with 0 handled requests and 0 active connections.

Comparison and Trade-offs

| Algorithm | State awareness | Failover behavior | Balancing quality | Complexity | Best fit |
|---|---|---|---|---|---|
| Round Robin | No | Simple node exclusion | Medium | Low | Stateless, homogeneous pool |
| Least Connections | Active connections | Works well for burst + long-lived connections | High | Medium | Mixed latency workload |
| Consistent Hashing | Key-aware routing | Partial remap to neighboring nodes | Depends on virtual nodes | High | Stateful/cache-sensitive traffic |
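To make the "active connections" column concrete, here is a minimal Least Connections sketch. The names `LeastConnectionsBalancer`, `acquire`, and `release` are hypothetical, and the sketch assumes the balancer itself is told when a connection opens and closes. Ties are broken by pool order, which is one possible choice among several.

```python
class LeastConnectionsBalancer:
    """Hypothetical sketch: route to the server with the fewest open connections."""

    def __init__(self, servers):
        # Active connection count per server, in pool order.
        self.active = {server: 0 for server in servers}
        if not self.active:
            raise ValueError("at least one server is required")

    def acquire(self):
        # min() returns the first server with the lowest count,
        # so ties fall back to pool order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the connection closes; never go below zero.
        if self.active[server] > 0:
            self.active[server] -= 1


lb = LeastConnectionsBalancer(["A", "B", "C"])
first_three = [lb.acquire() for _ in range(3)]  # one connection per server
lb.release("B")                                 # B finishes early
next_pick = lb.acquire()                        # B now has the fewest connections
print(first_three, next_pick)  # ['A', 'B', 'C'] B
```

Unlike a fixed rotation, the next pick here depends on which connections have already finished, so a server stuck on long-lived connections naturally receives fewer new ones.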

Selection in Practice

Quick Rules

  • Start with Round Robin for simple stateless APIs.
  • Move to Least Connections when requests are long-lived or vary widely in duration.
  • Use Consistent Hashing when locality and sticky state matter.
  • Validate against the peak traffic profile, not only the average load.
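For the Consistent Hashing rule, the sketch below shows a hash ring with virtual nodes. It is an illustration under stated assumptions: `ConsistentHashRing`, `stable_hash`, the `vnodes=100` default, and the `"node#i"` virtual-node naming are all hypothetical choices, and SHA-256 is used only because it gives a hash that is stable across processes.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Map a string to a 64-bit integer that is stable across processes."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    """Hypothetical sketch: each node owns `vnodes` points on a hash ring."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._nodes = set(nodes)
        self._rebuild()

    def _rebuild(self):
        # Place every virtual node on the ring and keep the points sorted.
        points = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in self._nodes
            for i in range(self._vnodes)
        )
        self._hashes = [h for h, _ in points]
        self._owners = [n for _, n in points]

    def remove(self, node):
        self._nodes.discard(node)
        self._rebuild()

    def lookup(self, key):
        if not self._hashes:
            raise LookupError("ring has no nodes")
        # First point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._owners[idx]


ring = ConsistentHashRing(["A", "B", "C"])
keys = ["user:42", "user:77", "tenant:acme"]
before = {k: ring.lookup(k) for k in keys}
ring.remove("C")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(before, after, moved)
```

Only keys that were owned by the removed node change owner; every other key keeps its server, which is the "partial remap" property. More virtual nodes per server generally smooth out the distribution, at the cost of a larger ring.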

Common Mistakes

  • Choosing an algorithm without profiling traffic shape (request duration, burst, key skew).
  • Using Consistent Hashing without virtual nodes and hot-key monitoring.
  • Ignoring health checks and slow-start for new instances.
  • Treating active connections as the only load metric and ignoring CPU, RPS, and p99 latency.
  • Mixing sticky and non-sticky routing without an explicit fallback policy.
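The slow-start point can be illustrated with a simple weight ramp. This is a sketch of one possible policy, a linear ramp, and not the formula of any particular proxy: the function name `slow_start_weight`, the `min_factor` floor, and the 30-second window in the usage lines are all assumptions made for the example.

```python
def slow_start_weight(full_weight, age_s, window_s, min_factor=0.1):
    """Hypothetical linear slow-start: scale a new instance's weight by its age.

    full_weight: weight the instance gets once fully warmed up
    age_s:       seconds since the instance became healthy
    window_s:    length of the slow-start window in seconds
    min_factor:  floor so a brand-new instance still gets some traffic
    """
    age_s = max(0.0, age_s)
    if window_s <= 0 or age_s >= window_s:
        return float(full_weight)
    factor = max(min_factor, age_s / window_s)
    return full_weight * factor


# A new instance with full weight 100 and a 30-second window:
for age in (0, 15, 30):
    print(age, slow_start_weight(100, age, 30))
```

A weighted balancer would use the returned value in place of the static weight, so a cold instance is not hit with a full share of traffic before its caches and connection pools have warmed up.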

Mini Checklist Before Production

1. Active/passive health checks are enabled.
2. Slow-start and graceful connection draining are configured.
3. Metrics cover p95/p99, backend saturation, and retry-storm rate.
4. Failover is tested for both hard shutdown and node degradation.
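As a small aid for item 3, the p95/p99 values can be computed from raw latency samples with the nearest-rank method. This is a minimal sketch: the function name `percentile` and the synthetic sample list are hypothetical, and real monitoring systems often use histograms or other approximations instead of sorting raw samples.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    if not samples:
        raise ValueError("no samples")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples)
    # Integer form of ceil(p * n / 100) to avoid float rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Synthetic latencies in milliseconds: 1, 2, ..., 100.
latencies_ms = list(range(1, 101))
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(p95, p99)  # 95 99
```

Tracking these per backend rather than only in aggregate makes it easier to spot a single degraded node that the average would hide.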


© 2026 Alexander Polomodov