HTTP/1.1 to HTTP/2: Bottlenecks, Multiplexing, and What Stayed Broken
HTTP/1.1 evolved a set of workarounds — six connections per origin, domain sharding, sprite sheets, request concatenation — to compensate for two structural defects: request-ordered responses on a single connection and plain-text headers repeated on every request. HTTP/2 (RFC 9113, June 2022, obsoleting RFC 7540) replaced the wire format with binary frames carried over independent streams, layered HPACK header compression on top, and consolidated the connection back to one. It did not, and could not, fix TCP’s in-order delivery — which is why a single dropped packet still stalls every stream and why HTTP/3 had to move to QUIC.
This article reconstructs the design pressure behind each HTTP/2 mechanism, surfaces the trade-offs that drove the deprecations of priority and server push, and ends with a current operational checklist.
Mental model
A TCP connection is a single byte-stream the kernel must hand to userspace in order. HTTP/1.1 serialised one request-response per connection, so a slow response blocked everything behind it; the workaround was to open more connections. HTTP/2 cuts the byte-stream into frames tagged with a stream ID, lets frames from different streams interleave on the same connection, and uses a separate, application-aware compression layer (HPACK) for headers. That fixes the application-layer bottleneck. It does not change TCP: a single lost segment still blocks the kernel from delivering anything later, even on streams whose data is sitting in the receive buffer.
Trade-offs at a glance
| Aspect | HTTP/1.1 workaround | HTTP/2 mechanism | Remaining limitation |
|---|---|---|---|
| HOL blocking | ~6 TCP connections per origin | Multiplexed streams over 1 connection | TCP delivers in order — packet loss stalls every stream until retransmit clears |
| Header overhead | Verbose plain-text headers each request | HPACK static + dynamic table + Huffman | Dynamic table state is per-connection; cold connections start uncompressed |
| Resource priority | Implicit by request order | Stream priority tree (RFC 7540 §5.3) — deprecated | Replaced by simpler RFC 9218 Priority header; uneven server adoption |
| Proactive delivery | None | PUSH_PROMISE server push — disabled in browsers | Replaced by 103 Early Hints |
Practical implications
- Consolidate origins. HTTP/2 multiplexing only works when assets share a connection; domain sharding now hurts more than it helps.[^1]
- TCP loss still matters. On lossy networks (mobile, transcontinental), HTTP/2 over TCP can underperform six independent HTTP/1.1 connections that experience independent loss.[^2]
- Don’t rely on HTTP/2 priority. Server adoption is uneven; the actively-evolving signal is the `Priority` header from RFC 9218, used by Chrome 124+ and Firefox 128+.
HTTP/1.1 architectural limitations
HTTP/1.1 was first standardised in 1997 and has been re-issued multiple times — most recently as RFC 9112 (message syntax) and RFC 9110 (semantics) in 2022. Persistent connections (Connection: keep-alive) and pipelining were added early; the structural problems they tried to soften were never fully solved on the wire.
Application-layer head-of-line blocking
A single HTTP/1.1 connection processes requests serially. A 5 MB hero image in flight blocks every later request on that connection regardless of priority:
```
GET /hero.jpg HTTP/1.1      # 5 MB — currently transferring
GET /critical.css HTTP/1.1  # blocked behind hero.jpg
GET /app.js HTTP/1.1        # blocked behind critical.css
```

Pipelining — sending requests back-to-back without waiting for responses — was specified, but never deployed at scale because:
- Responses must come back in request order, so a slow first response still blocks the rest.
- Many proxies and CDNs implement pipelining incorrectly or not at all, breaking the chain mid-flight.
- Retry semantics are ambiguous: when a connection drops mid-pipeline, the client cannot tell which requests were processed.
Connection proliferation
To work around HOL blocking, browsers open multiple TCP connections per origin — typically capped at six in modern browsers (Chrome, Firefox, Safari).[^3] Each extra connection costs:
| Overhead | Cost |
|---|---|
| TCP handshake | 1 RTT (SYN → SYN-ACK → ACK) |
| TLS handshake | 1 RTT in TLS 1.3, 2 RTTs in TLS 1.2 |
| Slow start | Initial congestion window of ~10 segments per RFC 6928 — each connection ramps independently |
| Memory | Per-connection socket state, kernel send/receive buffers |
| Server resources | Per-connection event-loop slot, file descriptor, timer |
Domain sharding — distributing assets across cdn1.example.com, cdn2.example.com, cdn3.example.com — multiplied the per-origin cap. Each shard incurred a fresh DNS lookup, TCP handshake, and TLS handshake, and broke HTTP/2 connection coalescing once the protocol changed.
Text framing and chunked encoding
HTTP/1.1 messages are text-framed: the parser scans for \r\n to find header boundaries and uses Content-Length (when known) or Transfer-Encoding: chunked to find the body boundary. Chunked transfer encoding (RFC 9112 §7.1) lets a server stream a response of unknown length by emitting a sequence of size-prefixed chunks terminated by a 0\r\n\r\n chunk:
```
HTTP/1.1 200 OK
Transfer-Encoding: chunked

7\r\n
Mozilla\r\n
9\r\n
Developer\r\n
0\r\n
\r\n
```

Two structural costs follow from text framing:
- Parsing surface. Whitespace folding, LF-vs-CRLF tolerance, chunk-extension parameters, and trailers create a wide ambiguity surface. Mismatches between front-end and back-end parsers underpin the HTTP request smuggling family of attacks (RFC 9112 §11.2).
- No interleaving. A chunked response still occupies the connection until the terminating chunk arrives. Chunked encoding solves “stream a response of unknown length”; it does not solve HOL blocking.
Header redundancy
HTTP/1.1 headers are uncompressed text repeated on every request:
```
GET /api/users HTTP/1.1
Host: api.example.com
Cookie: session=abc123; tracking=xyz789; preferences=dark-mode
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...
Accept: application/json
Accept-Language: en-US,en;q=0.9
Accept-Encoding: gzip, deflate, br
Cache-Control: no-cache
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
```

Typical real-world headers are 500–2,000 bytes per request.[^4] A page issuing 100 XHR calls with similar headers spends 50–200 KB on metadata — often exceeding TCP’s 10-segment initial congestion window (10 × 1,460 = ~14.6 KB), forcing additional round trips before the first body byte arrives.
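The congestion-window arithmetic is easy to check directly. A back-of-envelope sketch in Python; the 1,000-byte-per-request figure is an assumed midpoint of the 500–2,000-byte range:

```python
# Back-of-envelope: uncompressed header metadata vs TCP's initial flight.
MSS = 1_460                       # typical Ethernet MSS in bytes
INIT_CWND = 10 * MSS              # RFC 6928 initial window, ~14.6 KB
headers_per_request = 1_000       # assumed midpoint of the 500-2,000 B range
metadata = 100 * headers_per_request  # 100 XHR calls worth of headers

assert metadata > INIT_CWND       # metadata alone overflows the first flight
round_trips = -(-metadata // INIT_CWND)  # ceiling division
assert round_trips == 7           # flights needed if cwnd never grew
```

Slow start doubles cwnd each RTT, so a real connection needs fewer than seven flights; the point is that header metadata alone exceeds the very first one.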
HTTP/2 design decisions
RFC 9113 addressed each HTTP/1.1 limitation through binary framing and stream multiplexing, with HPACK plugged in as a separate layer for headers.
Why binary framing
Every HTTP/2 message is carried in frames that begin with a fixed 9-octet header:
```
+-----------------------------------------------+
|                Length (24 bits)               |
+---------------+---------------+---------------+
|   Type (8)    |   Flags (8)   |
+---------------+---------------+---------------+
|R|         Stream Identifier (31 bits)         |
+-----------------------------------------------+
|          Frame Payload (0-16,777,215)         |
+-----------------------------------------------+
```

Design rationale:
- No text-parsing ambiguity. No whitespace folding, no LF-vs-CRLF edge cases, no chunked-encoding length games. Every byte boundary is computable from the length field.
- Constant-time framing. The fixed 9-octet header makes frame boundaries cheap to find without buffering the whole message.
- Type-driven extensibility. New frame types (e.g. `PRIORITY_UPDATE` from RFC 9218) can be added without renegotiating a protocol version; unknown types are ignored per RFC 9113 §5.5.
- Stream-aware routing. The 31-bit stream identifier lets intermediaries route frames per stream rather than per connection.
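Constant-time framing is easy to demonstrate: every field of the header decodes at a fixed offset. A minimal sketch in Python (illustrative, not a full frame parser):

```python
def parse_frame_header(buf: bytes):
    """Decode the fixed 9-octet HTTP/2 frame header (RFC 9113 s4.1)."""
    if len(buf) < 9:
        raise ValueError("need at least 9 octets")
    length = int.from_bytes(buf[0:3], "big")        # 24-bit payload length
    frame_type, flags = buf[3], buf[4]              # one octet each
    # Top bit of the last 4 octets is the reserved R bit; mask it off.
    stream_id = int.from_bytes(buf[5:9], "big") & 0x7FFF_FFFF
    return length, frame_type, flags, stream_id

# Header of an empty SETTINGS frame: length 0, type 0x4, no flags, stream 0.
hdr = (0).to_bytes(3, "big") + bytes([0x04, 0x00]) + (0).to_bytes(4, "big")
assert parse_frame_header(hdr) == (0, 4, 0, 0)
```

No scanning for delimiters, no buffering: nine octets in, four integers out.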
Note
Binary framing is not human-readable; debugging needs `nghttp -nv`, `curl --http2 -v`, or Wireshark’s HTTP/2 dissector. Plan for this when standing up new HTTP/2 services.
Stream multiplexing
A stream is an independent, bidirectional sequence of frames identified by a 31-bit stream ID. Frames from many streams interleave freely on a single connection.
Stream identifier rules (RFC 9113 §5.1.1):
- Client-initiated streams use odd identifiers (1, 3, 5, …).
- Server-initiated streams (`PUSH_PROMISE`) use even identifiers (2, 4, 6, …).
- Stream 0 is reserved for connection-level frames (`SETTINGS`, `PING`, `GOAWAY`, `WINDOW_UPDATE` for the whole connection).
- Identifiers monotonically increase; the maximum is 2³¹ − 1 per connection lifetime, after which the connection must be closed and replaced.
HPACK header compression
HPACK (RFC 7541) replaces HTTP/1.1’s plain-text headers with a stateful compressed encoding.
Why not DEFLATE
SPDY initially used DEFLATE for headers and was promptly broken by the CRIME attack: an attacker who could inject chosen plaintext into requests on the same connection observed compressed-size changes and recovered cookies one character at a time. HPACK’s design (RFC 7541 §7) explicitly forces guesses to match an entire header value before they yield a size change, raising attack cost from O(alphabet × length) to O(alphabet^length) for high-entropy secrets.
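The cost gap is easy to quantify with assumed numbers, here a 16-character secret drawn from a 64-symbol alphabet:

```python
alphabet, length = 64, 16   # assumed: base64-ish cookie, 16 characters

# Per-character oracle (DEFLATE-style): guess one position at a time,
# so total cost grows linearly in the secret length.
per_char_guesses = alphabet * length
assert per_char_guesses == 1_024

# Whole-value oracle (HPACK): a guess only changes the encoded size if the
# entire value matches, so cost grows exponentially.
whole_value_guesses = alphabet ** length
assert whole_value_guesses == 2 ** 96   # infeasible to enumerate
```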
Encoding mechanism
| Component | Size | Purpose |
|---|---|---|
| Static table | 61 entries | Predefined common headers (`:method: GET`, `:status: 200`, `accept`, …)[^5] |
| Dynamic table | Default 4 KiB per direction[^6] | FIFO cache of recently sent headers; size negotiated via `SETTINGS_HEADER_TABLE_SIZE` |
| Huffman code | 5–30 bits per symbol | Static code table tuned on real HTTP traffic (RFC 7541 Appendix B) |
Each dynamic-table entry costs name_length + value_length + 32 octets (RFC 7541 §4.1). The 32-octet constant is a budgeted allowance for implementation overhead (reference counts, pointers); a 4 KiB table typically holds 50–100 real headers.
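The entry-size formula translates directly into code. A quick sketch (the header strings are arbitrary examples):

```python
def hpack_entry_size(name: str, value: str) -> int:
    """Dynamic-table entry size per RFC 7541 s4.1: name + value + 32 octets."""
    return len(name.encode("ascii")) + len(value.encode("ascii")) + 32

entry = hpack_entry_size("cookie", "session=abc123")
assert entry == 6 + 14 + 32   # 52 octets of budget for 20 octets of text

# Capacity of the default 4 KiB table if every entry looked like this one:
assert 4096 // entry == 78    # within the 50-100 ballpark quoted above
```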
The encoder picks one of four representations per header, in roughly this order:

- Indexed (RFC 7541 §6.1) — a single index into the static or dynamic table; the cheapest form, often one octet.
- Literal with incremental indexing (§6.2.1) — sent literally and added to the dynamic table for reuse.
- Literal without indexing (§6.2.2) — sent literally and not cached.
- Literal never indexed (§6.2.3) — sent literally with a flag forbidding any hop from indexing it.
A typical header set of 2–3 KB compresses to 100–300 bytes on the first request and to ~10 bytes on subsequent requests once the dynamic table is warm.[^7]
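Every representation shares one low-level primitive: the prefix-integer encoding of RFC 7541 §5.1, which packs small numbers into the bits left over in the first octet and spills larger ones into continuation octets. A minimal encoder sketch:

```python
def encode_int(value: int, prefix_bits: int) -> bytes:
    """HPACK prefix-integer encoding (RFC 7541 s5.1)."""
    limit = (1 << prefix_bits) - 1       # all-ones prefix means "continued"
    if value < limit:
        return bytes([value])            # fits in the prefix itself
    out = [limit]
    value -= limit
    while value >= 128:                  # 7 bits per continuation octet;
        out.append((value % 128) + 128)  # high bit set = more octets follow
        value //= 128
    out.append(value)
    return bytes(out)

assert encode_int(10, 5) == bytes([10])             # RFC 7541 sC.1.1
assert encode_int(1337, 5) == bytes([31, 154, 10])  # RFC 7541 sC.1.2
```

The two assertions reproduce the worked examples from RFC 7541 Appendix C.1.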
Important
Sensitive headers (Authorization, Cookie, Set-Cookie) should be encoded with the never-indexed literal representation (RFC 7541 §6.2.3). This forbids intermediaries from copying the value into their own dynamic table — important when a downstream proxy is also HTTP/2-terminating and could otherwise leak the secret on a different connection.
Flow control
HTTP/2 implements credit-based flow control at both the connection and stream levels via WINDOW_UPDATE frames:
| Setting | Default | Range |
|---|---|---|
| `SETTINGS_INITIAL_WINDOW_SIZE` | 65,535 bytes (2¹⁶ − 1) | 0 to 2³¹ − 1 |
| Initial connection window | 65,535 bytes (fixed by the spec; not negotiable) | n/a |
| `SETTINGS_MAX_FRAME_SIZE` | 16,384 bytes (2¹⁴) | 2¹⁴ to 2²⁴ − 1 |
Unlike TCP’s transport-layer flow control, HTTP/2’s windows operate between application endpoints, allowing a slow consumer of one stream (e.g. a video chunk decoder behind a backpressured renderer) to throttle that stream alone. Setting a window to 2³¹ − 1 and continuously refilling it effectively disables flow control — useful for high-throughput one-way streaming.
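A sender-side sketch of the credit mechanism (simplified to a single window with no frames; the class and method names are mine, not from any implementation):

```python
class SendWindow:
    """Sender-side flow-control credit for one stream or one connection."""

    MAX = 2**31 - 1  # a window may never exceed 2^31 - 1 (RFC 9113 s6.9.1)

    def __init__(self, initial: int = 65_535):
        self.available = initial

    def send(self, n: int) -> None:
        """Spend credit before emitting n octets of DATA."""
        if n > self.available:
            raise RuntimeError("blocked: wait for WINDOW_UPDATE")
        self.available -= n

    def window_update(self, increment: int) -> None:
        """Receiver grants more credit via a WINDOW_UPDATE frame."""
        if not 1 <= increment <= self.MAX:
            raise ValueError("illegal WINDOW_UPDATE increment")
        if self.available + increment > self.MAX:
            raise RuntimeError("FLOW_CONTROL_ERROR")
        self.available += increment

w = SendWindow()
w.send(16_384)                 # one default-max-size DATA frame
assert w.available == 65_535 - 16_384
w.window_update(16_384)        # receiver consumed it; credit restored
assert w.available == 65_535
```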
Stream priority — specified, then deprecated
RFC 7540 §5.3 defined a dependency tree where each stream had a parent stream and a 1–256 weight. The intent was for the server to use this to schedule frame interleaving.
RFC 9113 §5.3 deprecates the dependency tree because:
- Clients expressed priorities inconsistently across browsers.
- Many servers and CDNs ignored or partially propagated priority signals.
- The protocol forced a complex, stateful tree even when applications wanted a simple urgency/incremental hint.
- Real-world performance gains were marginal compared to the implementation surface.
The replacement, RFC 9218, defines:
- A `Priority` request header with `u=` (urgency, 0–7, default 3) and `i` (incremental, boolean) parameters — usable on any HTTP version.
- A `PRIORITY_UPDATE` frame for HTTP/2 and HTTP/3 to update priority after the request was sent.
Browser support as of early 2026: Chrome / Edge 124+, Firefox 128+, Safari not yet.[^8] The header is sent on both HTTP/2 and HTTP/3 connections — it is not HTTP/3-only — but server- and CDN-side implementation of the scheduling logic is what actually determines whether it changes anything.[^9]
Server push — designed in, dialled out
PUSH_PROMISE (RFC 9113 §6.6) let a server initiate a stream proactively — typically to push CSS or JS the moment the HTML request arrived, saving one round trip.
Why it failed in practice:
- The server rarely knows what the client has cached. Pushing a 200 KB CSS file the client already has is pure waste.
- Pushed bytes compete for bandwidth and the connection-level flow window with explicitly-requested bytes.
- Pushes race with client requests; if the client is faster, the push is cancelled mid-flight (`RST_STREAM`).
- The implementation surface (per-resource decisions, cache awareness, push-cache plumbing in browsers) outpaced the latency gains.
Browser timeline:
| Browser | What changed | When | Source |
|---|---|---|---|
| Chrome 106 | Server push disabled by default | September 2022 (announcement August 2022) | Chrome blog |
| Firefox 132 | `network.http.http2.allow-push` flipped to false by default | October 29, 2024 | Firefox 132 release notes |
| HTTP/3 (any browser) | Server push never shipped in browsers | n/a | RFC 9114 §4.6 defines it; no browser exposes it |
The replacement is 103 Early Hints. The server returns a 103 informational response with Link: …; rel=preload headers while it is still computing the final response, letting the browser fetch critical sub-resources during server think-time without committing bytes the client may already have.
```
HTTP/2 103 Early Hints
Link: </styles/critical.css>; rel=preload; as=style
Link: </scripts/app.js>; rel=preload; as=script

HTTP/2 200 OK
Content-Type: text/html
...
```

TCP head-of-line blocking
HTTP/2 fixed the application layer. The transport layer remained the bottleneck.
TCP guarantees in-order delivery to the application. When segment N is lost, segments N+1, N+2, … sit in the kernel receive buffer and the application sees nothing — even if those segments belong to different HTTP/2 streams whose data is already complete. On networks with sustained packet loss in the 1–2% range, a single HTTP/2 connection can lose to six independent HTTP/1.1 connections that absorb loss in parallel.[^2]
The RFC 9113 introduction states the constraint explicitly: “Note, however, that TCP head-of-line blocking is not addressed by this protocol.” That is the primary motivation for HTTP/3’s adoption of QUIC (RFC 9000), which delivers each stream independently over UDP and recovers loss per stream rather than per connection.
Protocol negotiation
ALPN
Browsers require TLS for HTTP/2 and select the protocol via ALPN (RFC 7301), an extension carried inside the TLS `ClientHello`/`ServerHello`.
Standard protocol identifiers:
- `h2` — HTTP/2 over TLS.
- `http/1.1` — HTTP/1.1.
- `h2c` — HTTP/2 over cleartext (not supported by browsers — see below).
ALPN piggybacks on the TLS handshake, so HTTP/2 negotiation costs zero extra round trips versus an HTTP/1.1-only TLS connection.
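In Python’s standard library, the client side of this negotiation is two calls. A sketch (the server end sets its own protocol list symmetrically; no network I/O shown here):

```python
import ssl

# Offer h2 first, falling back to http/1.1; list order expresses preference.
ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2", "http/1.1"])

# ALPN is available when Python is linked against OpenSSL >= 1.0.2.
assert ssl.HAS_ALPN

# After wrapping a socket and completing the handshake, the server's choice
# is read back with tls_sock.selected_alpn_protocol() -> "h2" or "http/1.1".
```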
HTTP/2 connection preface
Once TLS reports h2, the client sends a fixed 24-octet connection preface followed by a SETTINGS frame:
```
PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n
```

The server’s preface is its own SETTINGS frame, which must be the first frame it sends. The preface string was deliberately chosen to be invalid as an HTTP/1.1 request: an HTTP/1.1-only server that received it would respond with a parse error, preventing protocol confusion if the client mis-detected the server’s capabilities.
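Both properties of the preface — its fixed 24-octet length and its deliberate invalidity as HTTP/1.1 — can be checked in a few lines:

```python
# RFC 9113: the client connection preface, sent verbatim before SETTINGS.
PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

assert len(PREFACE) == 24

# "PRI" is not a registered method and "HTTP/2.0" is not an HTTP/1.1 version
# token, so an HTTP/1.1-only server fails to parse this rather than
# misinterpreting it as a real request.
assert not any(PREFACE.startswith(m) for m in (b"GET ", b"POST ", b"HEAD "))
```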
Cleartext HTTP/2 (h2c)
RFC 7540 §3.2 originally defined an HTTP/1.1 Upgrade: h2c handshake to start HTTP/2 over cleartext TCP:
```
GET / HTTP/1.1
Host: example.com
Connection: Upgrade, HTTP2-Settings
Upgrade: h2c
HTTP2-Settings: <base64url-encoded SETTINGS frame>
```

RFC 9113 §3.1 marks both the h2c upgrade token and the HTTP2-Settings header as obsolete because the mechanism was never widely deployed. Cleartext HTTP/2 in 2026 must instead use prior knowledge (RFC 9113 §3.3): the client opens a TCP connection and sends the 24-octet HTTP/2 connection preface immediately, with no Upgrade dance.
Warning
No browser supports cleartext HTTP/2 in any form. In the public web HTTP/2 always implies TLS. Prior-knowledge h2c is only useful for backend-to-backend traffic (gRPC over h2c on a private network, ingress-to-pod inside a service mesh) where both sides are pre-configured to skip negotiation.
Operational checklist
When standing up or auditing HTTP/2 in production:
- Negotiate via ALPN with TLS 1.3. TLS 1.3 cuts the handshake to 1 RTT and supports 0-RTT resumption; combined with ALPN there is no protocol-upgrade penalty for HTTP/2.
- Consolidate origins. One TLS-terminating origin per logical site beats domain sharding; HTTP/2 connection coalescing only kicks in when names share an IP and certificate.
- Send `103 Early Hints` for critical sub-resources during server think-time. Pair with `Link: …; rel=preload`. Avoid `PUSH_PROMISE`.
- Set `Priority` headers for time-sensitive requests (e.g. `Priority: u=1, i` for above-the-fold images). Recognise that server adoption is uneven — measure, don’t assume.
- Monitor TCP retransmission rate on your edge. Above ~1–2% loss, multiple HTTP/1.1 connections can outperform a single HTTP/2 one. This is also the signal to evaluate HTTP/3.
- Encode sensitive headers as never-indexed. Verify your HTTP/2 server actually emits the never-indexed representation for `Authorization` and `Cookie` headers, not merely “literal with indexing”.
- Cap the dynamic header table at 4 KiB unless you have measured benefit from raising it. Larger tables give more compression but cost memory per connection across thousands of clients.
Appendix
Prerequisites
- TCP three-way handshake and basic congestion control (slow start, cwnd, RTO).
- TLS handshake basics (`ClientHello`, `ServerHello`, ALPN extension).
- HTTP request-response model and the role of intermediaries (proxies, CDNs, load balancers).
Terminology
| Abbreviation | Full form | Meaning |
|---|---|---|
| HOL | Head-of-Line | Blocking condition where one stalled item prevents subsequent items from progressing |
| HPACK | HTTP/2 header compression | Stateful header encoding using static + dynamic tables and Huffman coding |
| ALPN | Application-Layer Protocol Negotiation | TLS extension that selects the application protocol during the handshake |
| RTT | Round-Trip Time | One full sender → receiver → sender packet trip |
| cwnd | Congestion Window | TCP sender’s cap on unacknowledged in-flight bytes |
| h2 | HTTP/2 over TLS | ALPN protocol identifier for encrypted HTTP/2 |
| h2c | HTTP/2 cleartext | ALPN protocol identifier for unencrypted HTTP/2 (backend use only) |
| QUIC | Originally “Quick UDP Internet Connections”; RFC 9000 treats it as a name, not an acronym | Transport protocol underlying HTTP/3; provides per-stream loss recovery over UDP |
Summary
- HTTP/1.1’s serialised request-response model and its ~6-connection per-origin cap forced expensive workarounds — domain sharding, sprite sheets, asset concatenation — without ever fixing application-layer HOL blocking.
- HTTP/2’s binary framing and stream multiplexing remove application-layer HOL on a single TCP+TLS connection. HPACK adds a dedicated, CRIME-resistant compression layer for headers.
- TCP’s in-order delivery is unsolved by HTTP/2 — a single dropped segment stalls every multiplexed stream, motivating HTTP/3 / QUIC.
- The two HTTP/2 features that looked best on paper — the priority dependency tree and `PUSH_PROMISE` server push — were deprecated after deployment showed inconsistent implementation and marginal gains. RFC 9218’s `Priority` header and `103 Early Hints` are the current replacements.
- ALPN folds protocol selection into the TLS handshake at zero extra RTT cost; `h2c` is a cleartext path that browsers never adopted.
References
Standards (primary sources)
- RFC 9113 — HTTP/2 (June 2022) — current HTTP/2 specification, obsoletes RFC 7540.
- RFC 9112 — HTTP/1.1 message syntax and routing (June 2022)
- RFC 9110 — HTTP semantics (June 2022)
- RFC 7541 — HPACK header compression for HTTP/2 (May 2015)
- RFC 9218 — Extensible Prioritization Scheme for HTTP (June 2022)
- RFC 8297 — 103 Early Hints (December 2017)
- RFC 7301 — TLS Application-Layer Protocol Negotiation (July 2014)
- RFC 6928 — Increasing TCP’s Initial Window (April 2013)
- RFC 9114 — HTTP/3 (June 2022)
Browser implementation and rollout
- Chrome — Removing HTTP/2 Server Push
- Firefox 132 release notes — server push deactivated
- MDN — `Priority` header (RFC 9218)
- MDN — `103 Early Hints`
Practitioner and secondary sources
- Cloudflare — HPACK: the silent killer feature of HTTP/2
- Cloudflare — Better HTTP/3 prioritization for a faster web (RFC 9218)
- Ilya Grigorik — High Performance Browser Networking, HTTP/2 chapter
- Evert Pot — HTTP/2 Server Push is dead
Footnotes
[^1]: Ilya Grigorik, High Performance Browser Networking — HTTP/2: connection management, O’Reilly. HTTP/2 connection coalescing breaks down once shards land on different IPs or certificates.

[^2]: IETF Working Group, RFC 9000 — QUIC: A UDP-Based Multiplexed and Secure Transport, §1. Explicitly motivates QUIC by the inability of TCP-based HTTP/2 to recover from per-stream loss independently.

[^3]: Ilya Grigorik, High Performance Browser Networking — Building Blocks of TCP. Six is a de facto convention, not a spec requirement: RFC 2616 §8.1.4 (HTTP/1.1, 1999) recommended at most two connections per server; RFC 7230 §6.4 and its successor RFC 9112 §9.4 dropped the explicit cap, leaving the per-origin limit to client policy. Browsers settled on six.

[^4]: HTTP Archive — Web Almanac, HTTP chapter, reports median HTTP request header sizes well under 1 KiB and median response header sizes around 0.6–1.3 KiB on real-world pages.

[^5]: RFC 7541 Appendix A lists all 61 entries of the HPACK static table — `:authority`, `:method GET`, `:status 200`, common `accept-*` values, etc.

[^6]: RFC 9113 §6.5.2 sets the `SETTINGS_HEADER_TABLE_SIZE` default to 4,096 octets.

[^7]: Cloudflare, HPACK: the silent killer feature of HTTP/2, 2016. Reports compression ratios of 76–96% on real traffic samples.

[^8]: Can I use — `Priority` HTTP header. Chrome/Edge 124+ (April 2024), Firefox 128+ (July 2024), Safari not yet.

[^9]: Cloudflare, Better HTTP/3 prioritization for a faster web, 2022. Confirms the `Priority` header works for HTTP/2 and HTTP/3, but only matters when the server actually implements RFC 9218 scheduling.