System Design Space

Updated: April 21, 2026 at 4:55 PM

WebSocket protocol

Difficulty: medium

HTTP Upgrade, long-lived connections, WebSocket frames, heartbeat and reconnect behavior, and how load balancing shapes realtime channel stability.

This chapter matters because with WebSocket the architecture stops being a series of independent requests and becomes a system of persistent connections, delivery queues, and server-side state.

In practice, it helps you design realtime paths with fan-out, backpressure, reconnect behavior, heartbeat signals, and orderly shutdown in mind, which is where reliability usually breaks first.

In interviews and design discussions, it lets you frame WebSocket not as 'HTTP, only faster,' but as a separate operational class of trade-offs.

Practical value of this chapter

Realtime channel design

Guides persistent-channel architecture with fan-out, backpressure, and server-side state in mind.

Connection lifecycle

Covers reconnect behavior, heartbeat signals, and orderly shutdown for resilient realtime systems.

Scale implications

Explains the impact on load balancing, session stickiness, and infrastructure resource usage.

Interview robustness

Improves realtime case answers with explicit failure handling and ops trade-off reasoning.

RFC

RFC 6455 (WebSocket Protocol)

Core specification for handshake, framing model, connection management, and close codes.


WebSocket is an application protocol used when a system needs to keep one channel open and exchange events in both directions after an initial HTTP Upgrade. It removes the need to issue a new request for every update and turns connection state into a first-class architectural concern.

That persistent channel still depends on TCP, but the design pressure shifts toward full-duplex messaging, heartbeat policies, reconnect behavior, idle timeouts, and how much connection state the server must hold.

Under load, fan-out delivery, backpressure, stickiness, hotspot prevention, DNS resolution, and network latency all shape whether the WebSocket path stays predictable or turns into reconnect storms and unstable user experience.

Key properties of the WebSocket protocol

Switch from HTTP

The session begins with an HTTP/1.1 Upgrade request and then moves into WebSocket frame exchange.

Full-duplex channel

Client and server can send messages independently instead of taking turns in a request-response loop.

Long-lived connection

One channel can stay open for hours, reducing reconnect churn and repeated setup cost.

Liveness control

Ping, Pong, and heartbeat policies help detect broken links quickly and recover in a controlled way.

Backpressure and fan-out

Wide fan-out delivery needs queue control and adaptive send policy for slow or unstable clients.
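
One way to picture queue control for slow clients is a bounded per-client queue that drops the oldest message when full and flags the client for disconnect after repeated overflows. This is a minimal sketch under stated assumptions: `OutboundQueue`, `fan_out`, and the limits are hypothetical names and values chosen for illustration, not part of any WebSocket library.

```python
from collections import deque


class OutboundQueue:
    """Bounded per-client send queue (hypothetical helper, not a library API)."""

    def __init__(self, max_messages, max_overflows=3):
        self.max_messages = max_messages    # assumed per-client queue limit
        self.max_overflows = max_overflows  # assumed tolerance before disconnect
        self.messages = deque()
        self.overflows = 0

    def offer(self, message):
        """Queue a message; return False once the client should be disconnected."""
        if len(self.messages) >= self.max_messages:
            # Queue is full: drop the oldest message to make room for the newest.
            self.messages.popleft()
            self.overflows += 1
        self.messages.append(message)
        return self.overflows < self.max_overflows


def fan_out(event, queues):
    """Offer one event to every client queue; return ids of clients to disconnect."""
    return [client_id for client_id, q in queues.items() if not q.offer(event)]
```

Dropping the oldest message suits state-style updates where only the latest value matters; streams that need every message would more likely disconnect the slow client and let it resync.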

How a WebSocket frame is structured

The frame layout defines message type, payload length, and masking rules in client-server exchange.

WebSocket frame (simplified)

Header of at least 2 bytes, followed by the payload

Field                                  Size
FIN/RSV/Opcode                         8 bits
MASK + Payload len                     8 bits
Extended length (optional)             16 or 64 bits
Masking key (client -> server only)    32 bits
Payload data                           variable

Per RFC 6455, a client must mask every frame it sends, and a server must not mask the frames it sends to a client. Payloads of up to 125 bytes fit in the 7-bit length field; a length value of 126 signals a 16-bit extended length, and 127 signals a 64-bit extended length.
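
The layout above can be sketched as a small encoder for a single unfragmented frame. This is an illustrative sketch only: `encode_frame` and `apply_mask` are hypothetical helper names, and the encoder ignores fragmentation, extensions, and control-frame size limits.

```python
import struct

OPCODE_TEXT = 0x1  # opcode for a text frame


def apply_mask(payload, key):
    """XOR each payload byte with the 4-byte masking key, cycling through the key."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


def encode_frame(payload, opcode=OPCODE_TEXT, mask_key=None):
    """Encode one unfragmented frame; pass a 4-byte mask_key for client frames."""
    first = 0x80 | opcode  # FIN=1, RSV1-3=0, then the 4-bit opcode
    mask_bit = 0x80 if mask_key is not None else 0x00
    n = len(payload)
    if n <= 125:
        header = struct.pack("!BB", first, mask_bit | n)
    elif n <= 0xFFFF:
        header = struct.pack("!BBH", first, mask_bit | 126, n)  # 16-bit length
    else:
        header = struct.pack("!BBQ", first, mask_bit | 127, n)  # 64-bit length
    if mask_key is None:
        return header + payload
    return header + mask_key + apply_mask(payload, mask_key)
```

For the text "Hello", an unmasked server frame is `81 05 48 65 6c 6c 6f`, and a client frame masked with key `37 fa 21 3d` is `81 85 37 fa 21 3d 7f 9f 4d 51 58`, matching the single-frame text examples in RFC 6455. A real client would generate a fresh random key for every frame.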

WebSocket connection lifecycle

Connection setup and authentication

The client initiates HTTP Upgrade, and the server applies authentication and access policy before opening the channel.

Steady bidirectional exchange

Once open, both sides exchange message frames, heartbeats, and subscription updates.

Load growth and degradation

As fan-out grows, queue pressure and timeout pressure rise; rate limits and backpressure become critical.

Close and reconnect

The connection closes with code and reason, and the client reconnects with delay and jitter between attempts.
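
The reconnect step with delay and jitter can be sketched as exponential backoff with full jitter. The function name and the default `base` and `cap` values are assumptions for illustration; suitable values depend on the system.

```python
import random


def reconnect_delay(attempt, base=0.5, cap=30.0, rng=random):
    """Return a delay in seconds for the given retry attempt (0-based).

    Full jitter: pick uniformly between 0 and min(cap, base * 2**attempt),
    so clients that disconnected together do not all retry together.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)
```

A client loop would sleep for `reconnect_delay(attempt)` before each retry and reset `attempt` to zero after a connection has stayed healthy for a while.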

How the protocol switches from HTTP to WebSocket

The session starts with HTTP Upgrade and then becomes a persistent full-duplex channel, so both sides can exchange events without re-establishing a request for every update.
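
As a rough illustration of the server side of the upgrade, the sketch below derives the `Sec-WebSocket-Accept` value from the client's `Sec-WebSocket-Key` as RFC 6455 defines it and builds a minimal `101 Switching Protocols` response. The function names are hypothetical, and a real server would also validate the request headers, origin, and authentication before answering.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key):
    """Compute Sec-WebSocket-Accept: base64(SHA-1(key + GUID))."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(sec_websocket_key):
    """Build a minimal 101 response for a valid upgrade request."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_value(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

With the sample key from RFC 6455, `accept_value("dGhlIHNhbXBsZSBub25jZQ==")` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`.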


How WebSocket behaves under load

Interactive walkthrough: seven intervals tracking p95 delivery (ms), backlog (%), and reconnect rate (%) as fan-out and realtime traffic grow. Interval 1 of 7 shows the baseline:

  • Phase: Stable channel
  • Active connections: 18.0k
  • Message stream: 95 kpps
  • p95 delivery: 42 ms
  • Backlog: 18.0%
  • Reconnect rate: 0.4%
  • Mitigation: Baseline heartbeat
  • What is happening: connections are stable, with low-latency delivery and minimal reconnect churn.

Abbreviations

  • kpps (kilo packets/messages per second) — thousands of messages processed per second.
  • p95 delivery — delivery latency threshold under which 95% of messages complete.

What the metrics mean

  • Fan-out — one event broadcast to many active subscribers at once.
  • Backlog — queued outbound messages waiting to be delivered to clients.
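
To make the p95 definition concrete, here is a minimal sketch that computes a percentile from recorded delivery latencies with the nearest-rank method. The function name is hypothetical, and monitoring systems often use interpolation or histogram approximations instead.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]
```

For latencies of 1 through 100 ms, `percentile(samples, 95)` returns 95, meaning 95% of the messages were delivered in 95 ms or less.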

Related chapter

Load Balancing

How to distribute long-lived WebSocket sessions and avoid gateway hotspots.


How network and routing affect WebSocket

L4/L7 balancing behavior

Incorrect balancing of long-lived sessions creates hotspots and uneven gateway resource usage.

Proxy/NAT idle timeouts

Stateful middleboxes can drop idle links, so heartbeat cadence must match infrastructure timeouts.

TLS termination cost

Large numbers of active WebSocket sessions increase edge TLS termination cost and proxy CPU usage.

Loss in unstable mobile networks

Packet loss and flapping connectivity trigger reconnect waves that can overload authentication and session layers.

Delivery path length

More hops between the event producer and the WebSocket gateway increase tail delivery latency for client updates.
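
The idle-timeout point above can be sketched as picking a heartbeat interval safely below the shortest idle timeout on the path, then treating a connection as stale after a few missed intervals. The helper names, the safety factor, and the timeout values in the usage note are assumptions for illustration, not measured or recommended settings.

```python
def heartbeat_interval(idle_timeouts_s, safety_factor=0.5):
    """Pick a ping interval below the shortest idle timeout on the path."""
    if not idle_timeouts_s:
        raise ValueError("at least one idle timeout is required")
    return min(idle_timeouts_s) * safety_factor


def is_stale(last_pong_at, now, interval_s, missed_allowed=2):
    """Treat the link as broken once more than missed_allowed intervals pass without a Pong."""
    return now - last_pong_at > interval_s * missed_allowed
```

With hypothetical idle timeouts of 60 s, 350 s, and 120 s on the path, `heartbeat_interval([60, 350, 120])` gives 30.0 s, so at least one ping crosses every hop well before the tightest timeout fires.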

Where WebSocket is especially useful

  • Chats, collaborative editors, and presence services
  • Live dashboards, alerts, and operational consoles
  • Trading and fintech feeds with frequent updates
  • Interactive gameplay and control paths with fast bidirectional updates
  • WebRTC signaling and control channels

Why this matters in System Design

  • WebSocket changes the load profile: active long-lived connections and outbound queues become first-class capacity metrics.
  • Interactive systems need explicit control over delivery latency, queue growth, and reconnect storms.
  • Balancing strategy and connection state handling directly affect user-visible stability and tail behavior.
  • Well-chosen heartbeat, timeout, and reconnect policies reduce the blast radius of unstable networks.

Common mistakes

Ignoring backpressure and broadcasting blindly, which overloads slow clients and gateway nodes.

Skipping heartbeat and timeout tuning, which turns small network flaps into reconnect storms.

Keeping session state only in a single node's memory, without stickiness or a shared-state fallback.

Using WebSocket as a universal API replacement without clear boundaries between request paths and stream paths.
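
For the stickiness mistake above, one possible approach is rendezvous (highest-random-weight) hashing, which maps a session to the same gateway on every reconnect without a lookup table. This is a sketch of one technique, not the method this chapter prescribes; `pick_gateway` and the gateway names are hypothetical.

```python
import hashlib


def pick_gateway(session_id, gateways):
    """Return the gateway with the highest hash score for this session."""

    def score(gateway):
        digest = hashlib.sha256(f"{gateway}|{session_id}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(gateways, key=score)
```

Removing a gateway only remaps the sessions that were assigned to it, which limits reconnect churn during scale-down. State that must survive the loss of a gateway still needs a shared store.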
