
Updated: April 21, 2026 at 4:55 PM

HTTP protocol

medium

Request-response semantics, HTTP message structure, differences across HTTP/1.1, HTTP/2, and HTTP/3, and how network behavior reshapes web requests.

This chapter is useful because it presents HTTP evolution as a sequence of trade-offs around simplicity, caching, connection cost, and behavior under load.

In real engineering work, it helps you choose between HTTP/1.1, HTTP/2, and HTTP/3 based on traffic shape, reason about idempotency and caching semantics, and avoid confusing API design with transport behavior.

In interviews and design discussions, it gives you a structured language for discussing web-system performance and protocol-level trade-offs rather than only talking about endpoints.

Practical value of this chapter

Protocol to product

Connects HTTP behavior to UX metrics: latency, retries, caching, and API stability.

Version-aware design

Supports HTTP/1.1 vs HTTP/2 vs HTTP/3 decisions by traffic and network profile.

Performance tactics

Applies keep-alive, compression, caching, and multiplexing deliberately instead of by habit.

Interview articulation

Provides clear structure for discussing protocol-level web optimization trade-offs.

RFC

RFC 9110 (HTTP Semantics)

Current HTTP semantics specification: methods, status codes, headers, and cache behavior.

Go to site

HTTP matters not only as the language of the web, but as the contract that defines how clients ask for data and how services answer. Version choice, caching, and retry policy directly shape latency, resilience, and the cost of every request.

In system design, HTTP matters because it is the default request-response contract between clients and services. It is stateless by design, so performance and resilience have to be assembled from caching, token-based identity, database state, and application logic.

Request behavior is shaped by persistent connections, keep-alive, cache control, entity tags, and the chosen timeout and retry policy. Those choices directly influence latency, throughput, and the real processing cost of the path.

Under growth, multiplexing, cache-hit ratio, and head-of-line blocking start to matter immediately. At the same time, the request path often runs through load balancing, a CDN, or an API gateway, so the choice between HTTP/1.1, HTTP/2, and HTTP/3 is really a choice about network conditions and traffic shape.

Core properties of HTTP

Client-server interaction

The client sends an HTTP request, and the server returns a status code, headers, and optionally a body.

Stateless processing

Each request is handled independently, so state is pushed into caches, databases, tokens, and sessions.

Header-driven behavior

Headers define caching, content type, authorization, and security policy.

Intermediary layers

Proxies, CDNs, and API gateways help scale delivery, protect services, and reduce latency.

Version evolution

HTTP/1.1, HTTP/2, and HTTP/3 change runtime behavior under load while keeping the same application semantics.

How an HTTP message is structured

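Every version carries the same logical parts: a request or status line, headers, and a body when one is needed. HTTP/1.1 writes them as text, with a blank line separating headers from the body, as laid out below. HTTP/2 and HTTP/3 carry the same information in binary frames, where pseudo-header fields such as :method, :path, and :status replace the request and status lines.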

HTTP request

Request line

METHOD /path HTTP/version

Headers

Host, Authorization, Content-Type, Cache-Control...

Blank line

Separates headers from the body

Optional body

JSON, HTML, or binary data

HTTP response

Status line

HTTP/version status-code reason

Headers

Content-Type, Cache-Control, ETag, Set-Cookie...

Blank line

Separates headers from the body

Optional body

API payload, HTML page, file, or data stream
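The layout above can be sketched in a few lines of Python. This is a minimal illustration of the HTTP/1.1 text format only, not a production parser: the helper names build_request and parse_response, the host api.example.com, and the sample payload are assumptions made for the example, bodies are treated as text, and repeated header names simply overwrite each other.

```python
CRLF = "\r\n"

def build_request(method, path, headers, body=""):
    """Serialize an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return CRLF.join(lines) + CRLF + CRLF + body

def parse_response(raw):
    """Split an HTTP/1.1 response into (status_code, reason, headers, body)."""
    head, _, body = raw.partition(CRLF + CRLF)   # the blank line ends the headers
    status_line, *header_lines = head.split(CRLF)
    _version, code, *reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(code), (reason[0] if reason else ""), headers, body

# Hypothetical request and response used only for illustration.
request = build_request(
    "GET", "/users/42", {"Host": "api.example.com", "Accept": "application/json"}
)
response = parse_response(
    'HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nETag: "v1"\r\n\r\n{"id": 42}'
)
print(request)
print(response)
```

In practice a client library handles this serialization, but seeing the raw text makes it easier to read packet captures and proxy logs.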

Lifecycle of an HTTP request

Request preparation

The client resolves the name through DNS, chooses an endpoint, opens a connection or reuses an existing one, and builds the method, path, headers, and an optional body.

Traversing the request path

The request travels through a load balancer, proxy, or API gateway before it reaches the service and its dependencies.

Response and connection reuse

The client receives status and data, applies cache rules, and reuses the open connection whenever it can.

What an HTTP exchange looks like

The same model shows up across REST APIs, edge gateways, and most synchronous integrations where a client waits for a concrete answer to a concrete request.

HTTP request ↔ response

HTTP is built around a clear pair: request from the client and response from the server.

Message structure

  • Method (GET, POST, PUT)
  • URI or resource path
  • Headers
  • Request body (optional)
Diagram: Client → REQ → Server
The client sends a request without keeping protocol-level state on the server.

How HTTP behaves under load

Step through how cache-hit rate, error rate, and p95 latency move as traffic grows.

Step: Interval 1 (1 of 6)
Chart series: p95 latency (ms), error rate (%), cache hit (%)

  • Phase: Stable load
  • Load: 2.4k RPS
  • p95 latency: 85 ms
  • Error rate: 0.2%
  • Cache hits: 76.0%
  • Connection reuse: 92.0%

Mitigation: Baseline keep-alive

What is happening: Caching and connection reuse keep latency low and predictable.

Abbreviations

  • RPS (requests per second) — number of HTTP requests served per second.
  • p95 — response time threshold under which 95% of requests complete.

What the metrics mean

  • Connection reuse — share of requests served on existing persistent connections.
  • Cache hit — share of responses returned without expensive backend processing.
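The metrics defined above can be computed from raw samples roughly as follows. This is a sketch under stated assumptions: the function names p95 and share_percent are hypothetical, the percentile uses the nearest-rank method (monitoring systems often interpolate instead), and the sample data is invented to match the 76% and 92% figures shown in the stable-load step.

```python
def p95(latencies_ms):
    """Nearest-rank 95th percentile: the smallest sample that at least
    95% of all samples are less than or equal to."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) using integers
    return ordered[rank - 1]

def share_percent(part, total):
    """Share of `part` in `total`, as a percentage."""
    return 100.0 * part / total if total else 0.0

# Hypothetical sample: 100 latency measurements and 1000 requests.
latencies_ms = list(range(1, 101))
total_requests = 1000
cache_hits = 760
reused_connections = 920

print("p95 latency (ms):", p95(latencies_ms))
print("cache hit (%):", share_percent(cache_hits, total_requests))
print("connection reuse (%):", share_percent(reused_connections, total_requests))
```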

Related chapter

Load Balancing

HTTP traffic almost always crosses L7/L4 balancers and policy layers before it reaches the service.


How network and routing affect HTTP

Connection reuse

Cold connections and extra handshakes raise p95/p99 even when the application itself is fast.
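A rough calculation shows why cold connections hurt tail latency. The sketch below assumes a 50 ms round trip and 20 ms of server time, both invented for illustration. It also assumes the usual full-handshake costs: one round trip for TCP plus one for TLS 1.3, and one combined round trip for QUIC. Session resumption and 0-RTT are ignored.

```python
def time_to_response_ms(rtt_ms, server_ms, setup_round_trips):
    """Connection setup round trips, then one round trip for the request
    and response, plus server processing time."""
    return (setup_round_trips + 1) * rtt_ms + server_ms

RTT_MS = 50      # hypothetical network round-trip time
SERVER_MS = 20   # hypothetical application processing time

# Setup round trips per scenario (assumed full handshakes, no resumption).
scenarios = {
    "reused connection": 0,
    "new TCP + TLS 1.3 connection": 2,
    "new QUIC connection": 1,
}

for name, setup_round_trips in scenarios.items():
    total = time_to_response_ms(RTT_MS, SERVER_MS, setup_round_trips)
    print(f"{name}: {total} ms")
```

Under these assumptions the application time is identical in every case, yet the cold TCP + TLS request takes more than twice as long as the one on a reused connection.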

Cache hit share

As cache misses grow, more traffic reaches dependencies and user-facing latency deteriorates quickly.
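The effect is easy to quantify with the figures from the stable-load step (2.4k RPS, 76% cache hits). The sketch below uses a hypothetical helper, backend_rps, and assumes every cache miss turns into exactly one backend request.

```python
def backend_rps(total_rps, cache_hit_percent):
    """Requests per second that miss the cache and reach the backend."""
    return total_rps * (100 - cache_hit_percent) / 100

TOTAL_RPS = 2400  # 2.4k RPS, as in the stable-load step

for hit_percent in (76, 50):
    print(f"cache hit {hit_percent}% -> backend load {backend_rps(TOTAL_RPS, hit_percent)} RPS")
```

Dropping from 76% to 50% cache hits leaves client traffic unchanged but roughly doubles the load on dependencies, from 576 to 1200 RPS.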

Timeout and retry policy

Overly aggressive timeouts and retries can manufacture overload and spread the incident across dependencies.
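One way to keep retries from manufacturing overload is to bound them and restrict them to requests that are safe to repeat. The sketch below is an illustration, not a recommended policy: the function names, the set of retryable statuses, the three-attempt limit, and the has_idempotency_key flag are all assumptions. The idempotent method list follows RFC 9110, and the delay uses exponential backoff with full jitter.

```python
import random

# Methods defined as idempotent by RFC 9110.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})
# Assumed set of statuses worth retrying; tune per endpoint.
RETRYABLE_STATUSES = frozenset({502, 503, 504})

def should_retry(method, status, attempt, max_attempts=3, has_idempotency_key=False):
    """Decide whether to retry. `attempt` is the number of tries already made."""
    if attempt >= max_attempts:
        return False
    if status not in RETRYABLE_STATUSES:
        return False
    # Non-idempotent methods are retried only when the caller says the
    # request is deduplicated server-side (hypothetical flag).
    return method.upper() in IDEMPOTENT_METHODS or has_idempotency_key

def backoff_delay(attempt, base=0.1, cap=2.0, rng=random.random):
    """Full-jitter exponential backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    return rng() * min(cap, base * 2 ** attempt)

def worst_case_attempts(attempts_per_layer, layers):
    """Upper bound on calls reaching the deepest service when every layer retries."""
    return attempts_per_layer ** layers

print(should_retry("GET", 503, attempt=1))    # idempotent, retryable status
print(should_retry("POST", 503, attempt=1))   # not idempotent, no key
print(worst_case_attempts(3, 3))              # three layers, three attempts each
```

The last line shows the amplification risk: if three layers each make up to three attempts, a single user request can become as many as 27 calls at the bottom of the stack.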

L4 and L7 balancing

The balancer changes latency, request path shape, and client behavior, especially when flows need stickiness.

MTU, loss, and protocol version

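Packet loss and unstable links affect HTTP/2 and HTTP/3 differently, especially in mobile scenarios. With HTTP/2, a lost TCP packet stalls every stream on the connection until it is retransmitted. With HTTP/3, QUIC delivers streams independently, so the loss stalls only the stream it belongs to.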

Source

Evolution of HTTP (MDN)

Key HTTP/1.1, HTTP/2, and HTTP/3 milestones and the engineering trade-offs behind each version.

Go to site

HTTP evolution

Protocol evolution has focused on lower latency and more predictable behavior under heavy request concurrency.

HTTP/1.1

1997/1999

Text protocol over TCP

OSI mapping: Application layer semantics over TCP.

  • Persistent connections and chunked transfer
  • Head-of-line blocking within a single TCP connection: responses come back in request order, so one slow response delays the rest
  • Often needs multiple connections to the same host

HTTP/2

2015

Binary framing and multiplexing

OSI mapping: The same application semantics with more efficient transport over TCP.

  • Multiple streams inside one connection
  • Header compression with HPACK
  • Lower connection setup overhead, since one connection can replace several parallel ones
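Binary framing is what makes multiplexing possible: every HTTP/2 frame starts with a fixed 9-byte header that names the stream it belongs to, so frames from different streams can be interleaved on one connection. The sketch below packs and parses that header following the layout in RFC 9113 (24-bit length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream identifier). The helper names are hypothetical, and the code does not implement HPACK or any other part of the protocol.

```python
import struct

# Frame type and flag values from RFC 9113.
FRAME_DATA = 0x0
FRAME_HEADERS = 0x1
FLAG_END_STREAM = 0x1
FLAG_END_HEADERS = 0x4

def pack_frame_header(length, frame_type, flags, stream_id):
    """Build the 9-byte HTTP/2 frame header."""
    if not 0 <= length < 2 ** 24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2 ** 31:
        raise ValueError("stream_id must fit in 31 bits")
    return length.to_bytes(3, "big") + struct.pack(">BBI", frame_type, flags, stream_id)

def parse_frame_header(data):
    """Parse the first 9 bytes into (length, type, flags, stream_id)."""
    length = int.from_bytes(data[:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", data[3:9])
    return length, frame_type, flags, stream_id & 0x7FFFFFFF  # drop the reserved bit

# Two frames for different streams sharing one connection (payloads omitted).
frames = [
    pack_frame_header(16, FRAME_HEADERS, FLAG_END_HEADERS, stream_id=1),
    pack_frame_header(16, FRAME_HEADERS, FLAG_END_HEADERS | FLAG_END_STREAM, stream_id=3),
]
for frame in frames:
    print(frame.hex(), parse_frame_header(frame))
```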

HTTP/3

2022

HTTP over QUIC (UDP)

OSI mapping: The same application semantics over QUIC, where a lost packet stalls only the stream it belongs to rather than every stream on the connection.

  • Faster connection establishment and recovery
  • Lower transport-level head-of-line impact
  • Better behavior in mobile and unstable networks

Where HTTP matters most

  • Public and internal APIs, including REST and browser-facing gRPC proxies
  • Web applications, SPAs, and SSR frontends
  • BFF and API gateway layers that compose responses from multiple services
  • Synchronous service-to-service calls that need predictable request-response flow
  • Edge access layers for authorization, rate limiting, and observability

Why this matters for system design

  • HTTP defines API contracts and directly affects latency, retries, and request-processing cost.
  • Version choice changes connection behavior, multiplexing, and resilience under packet loss.
  • A sound cache, timeout, and retry strategy reduces blast radius and protects dependencies.
  • HTTP metrics often provide the earliest signal of user-visible degradation.

Common mistakes

Treating HTTP as nearly free and ignoring the cost of connections, caching, and retries in capacity planning.

Using the same timeout and retry policy for every method and endpoint regardless of SLA, idempotency, and business criticality.

Ignoring cache semantics such as ETag and Cache-Control and pushing avoidable load onto services.
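As an illustration of what honoring these semantics looks like on the server side, the sketch below answers a conditional request with 304 Not Modified when the client's If-None-Match value matches the current ETag. It is deliberately simplified: the function names, the hash-based ETag, and the 60-second max-age are assumptions, and the comparison ignores weak validators, the "*" value, and lists of tags that a full implementation would handle.

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(request_headers, body, max_age=60):
    """Return (status, headers, body), skipping the body when the client copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": f"max-age={max_age}"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""      # client copy is still valid
    return 200, headers, body

resource = b'{"id": 42, "name": "example"}'   # hypothetical payload

# First request: full response with validators.
status, headers, body = respond({}, resource)
print(status, headers)

# Revalidation: the client sends back the ETag it stored.
status, _, body = respond({"If-None-Match": headers["ETag"]}, resource)
print(status, len(body))
```

The 304 path still costs a round trip, but it avoids re-sending the body; Cache-Control goes further by letting clients and intermediaries skip the request entirely while the response is fresh.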

Combining 4xx and 5xx into one metric and losing an early signal of server-side degradation.
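Keeping the two classes apart takes very little code. The sketch below counts client and server errors separately from a list of status codes; the function names and the sample traffic are invented for illustration.

```python
from collections import Counter

def status_class(status):
    """Map a status code to a coarse class for metrics."""
    if 400 <= status <= 499:
        return "client_error"
    if 500 <= status <= 599:
        return "server_error"
    return "ok"  # 1xx, 2xx, and 3xx

def error_rates(statuses):
    """Return separate 4xx and 5xx rates as percentages of all responses."""
    total = len(statuses)
    if total == 0:
        return {"client_error_pct": 0.0, "server_error_pct": 0.0}
    counts = Counter(status_class(s) for s in statuses)
    return {
        "client_error_pct": 100.0 * counts["client_error"] / total,
        "server_error_pct": 100.0 * counts["server_error"] / total,
    }

# Hypothetical sample of 100 responses.
sample = [200] * 95 + [404] * 3 + [503] * 2
print(error_rates(sample))
```

A combined metric would report 5% errors for this sample and hide the fact that only the 2% of 5xx responses point at the service itself.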

Related chapters

  • OSI model - shows where HTTP sits at the application layer and how it connects to transport and network behavior.
  • IPv4 and IPv6: evolution of IP addressing - explains how addressing and routing reshape HTTP request paths and latency.
  • TCP protocol - covers the main transport underneath HTTP/1.1 and HTTP/2 and the source of many request delays.
  • UDP protocol - shows the transport foundation for QUIC and HTTP/3, where delay and packet loss matter most.
  • Domain Name System (DNS) - reminds you that every HTTP request starts with name resolution and depends on DNS quality.
  • WebSocket protocol - shows how a long-lived bidirectional channel appears after HTTP Upgrade.
  • Load Balancing - breaks down L7 routing, sticky sessions, and balancing effects on HTTP behavior.
  • Remote call approaches - helps compare application protocols and timeout/retry policy at service boundaries.
  • Why distributed systems and consistency matter - moves the discussion from HTTP mechanics into distributed-architecture trade-offs.
