Proxying WebSockets: Upgrade Headers, Persistent Connections, and the Load Balancing Problem

By Hex Proxies Engineering Team

WebSockets were designed as an escape hatch from HTTP's request-response model. They carry long-lived bidirectional streams over a single TCP connection, and they power everything from trading dashboards to live sports feeds to collaborative document editors. Proxying them is harder than proxying HTTP because the connection is both persistent and stateful, and because the load balancer in front of the proxy does not get the natural rebalancing opportunity that a fresh HTTP request provides. This post covers how the WebSocket upgrade actually works, what goes wrong when you try to scale it, and concrete patterns for proxying WebSocket traffic in production.

The Upgrade Handshake

A WebSocket connection starts life as an HTTP/1.1 request with specific headers (RFC 6455, Section 4.1):

GET /ws HTTP/1.1
Host: target.example
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Origin: https://target.example

The server, if it accepts the upgrade, responds with status 101 Switching Protocols and a computed accept key:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

The accept key is SHA-1 of the client's Sec-WebSocket-Key concatenated with the fixed GUID 258EAFA5-E914-47DA-95CA-C5AB0DC85B11, base64-encoded. After the 101 response, the TCP connection is no longer speaking HTTP. It is carrying WebSocket frames in both directions until either side sends a close frame or the TCP connection drops.
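The accept-key computation is small enough to check by hand. The sketch below uses only the Python standard library; the function name compute_accept_key is an illustrative choice, not part of any WebSocket library.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the accept-key computation.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept_key(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    # SHA-1 over the key string concatenated with the GUID (not the decoded key bytes).
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Key from the request above; prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    print(compute_accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
```

Running it against the key in the sample request reproduces the accept value in the sample response, which is a quick way to confirm that a proxy has not altered either header in transit.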

What an HTTP Proxy Must Do

An HTTP proxy that wants to forward WebSocket traffic must:

  1. Recognize the Upgrade header and not treat the connection as a normal request.
  2. Forward the upgrade request and the 101 response transparently.
  3. After the 101, splice the client and upstream sockets so raw bytes flow in both directions.
  4. Not apply HTTP-level timeouts to the now-persistent connection.
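Step 1 in the list above can be sketched as a header check. This is a minimal illustration, assuming the request headers have already been parsed into a mapping; the function name is_websocket_upgrade is hypothetical. Both header values are treated as comma-separated token lists and compared case-insensitively, since a client may send something like "Connection: keep-alive, Upgrade".

```python
from typing import Mapping


def is_websocket_upgrade(headers: Mapping[str, str]) -> bool:
    """Return True if the request headers ask for a WebSocket upgrade."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    def tokens(value: str) -> list:
        # Split a comma-separated header value into lowercase tokens.
        return [token.strip().lower() for token in value.split(",")]

    return (
        "websocket" in tokens(normalized.get("upgrade", ""))
        and "upgrade" in tokens(normalized.get("connection", ""))
    )


if __name__ == "__main__":
    print(is_websocket_upgrade({"Upgrade": "websocket", "Connection": "Upgrade"}))
    print(is_websocket_upgrade({"Connection": "keep-alive"}))
```

A proxy that matches the Connection header with a strict string comparison against "Upgrade" will miss requests that list several connection tokens, which is why the sketch tokenizes first.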

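nginx handles this correctly if you configure it explicitly. By default, proxy_pass does not forward the Upgrade header to the upstream: Upgrade and Connection are hop-by-hop headers (RFC 7230, Section 6.1), and nginx does not pass them from the client to the proxied server. You have to set both explicitly and use HTTP/1.1 for the upstream connection: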

location /ws {
    proxy_pass http://backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 3600s;
    proxy_send_timeout 3600s;
}

The timeouts are the settings that bite people. proxy_read_timeout and proxy_send_timeout both default to 60 seconds, so without the overrides above, nginx closes a WebSocket connection on which the backend sends nothing for more than 60 seconds, and the application sees no error explaining why. Applications that use the WebSocket for low-frequency notifications (stock alerts, CI build status, order fills) will see the connection close and reconnect in a loop.
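The same idle-timeout check applies to every hop between client and backend, not only nginx. A minimal sketch of that audit follows; the hop names and timeout values are made-up examples, and hops_at_risk is a hypothetical helper.

```python
from typing import Dict, List


def hops_at_risk(hop_timeouts_s: Dict[str, float], longest_idle_s: float) -> List[str]:
    """Return the hops whose idle timeout does not exceed the longest expected idle period."""
    return sorted(
        name
        for name, timeout_s in hop_timeouts_s.items()
        if timeout_s <= longest_idle_s
    )


if __name__ == "__main__":
    # Hypothetical path: a cloud load balancer in front of nginx with default timeouts.
    timeouts = {"cloud_lb": 3600.0, "nginx": 60.0}
    print(hops_at_risk(timeouts, longest_idle_s=61.0))
```

Any hop the function returns will close idle connections before the application's next message arrives, so either its timeout has to be raised or the idle period shortened.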

SOCKS5 and WebSockets

SOCKS5 (RFC 1928) does not care about HTTP at all. It establishes a TCP tunnel based on a target address and port, and the bytes in that tunnel can be any protocol. WebSocket works through SOCKS5 transparently because SOCKS5 never looks at the Upgrade header. For scraping and automation workloads that need to connect to WebSocket endpoints, a SOCKS5 proxy is often the path of least resistance.

The catch is that SOCKS5 authentication happens once, at connection time. A proxy provider that bills per-request cannot meter a long-lived WebSocket connection the same way. Most providers bill WebSocket traffic by bytes transferred or by connection-hour rather than per request. Verify the billing model before wiring a WebSocket workload to a proxy meant for per-request scraping.
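To show why SOCKS5 is indifferent to the Upgrade header, here is a sketch of the bytes a client sends to open the tunnel under RFC 1928, assuming no authentication and a domain-name target. The function names are illustrative. Nothing in these messages refers to HTTP; the WebSocket handshake only starts after the proxy replies with success.

```python
import struct


def socks5_greeting() -> bytes:
    """Client greeting: version 5, one method offered, 0x00 = no authentication."""
    return b"\x05\x01\x00"


def socks5_connect_request(host: str, port: int) -> bytes:
    """CONNECT request for a domain-name target (address type 0x03)."""
    host_bytes = host.encode("ascii")  # assumes an ASCII hostname
    if not 0 < len(host_bytes) <= 255:
        raise ValueError("hostname must be 1 to 255 bytes")
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3, then length-prefixed name and big-endian port.
    return (
        b"\x05\x01\x00\x03"
        + bytes([len(host_bytes)])
        + host_bytes
        + struct.pack("!H", port)
    )


if __name__ == "__main__":
    print(socks5_greeting().hex())
    print(socks5_connect_request("target.example", 443).hex())
```

A provider that requires username and password authentication would offer method 0x02 in the greeting and add a sub-negotiation step before the CONNECT request; that step is omitted here.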

The Load Balancing Problem

A typical HTTP load balancer picks a backend for each incoming request, whether by round-robin, least connections, or a hash. If a backend goes unhealthy, the next request gets a different backend. For WebSockets, "the next request" might not come for an hour, because the balancing decision is made once, at upgrade time. If you have 10,000 WebSocket connections pinned to backend A and backend A restarts, all 10,000 clients reconnect at roughly the same moment. This is the thundering herd, and it is the defining operational problem of WebSocket infrastructure.

Four mitigation patterns are in use:

  1. Connection draining with grace periods. The backend signals the load balancer that it should no longer receive new connections, waits for existing connections to close naturally, and only then restarts. Works for rolling deployments. Does not help with crashes.
  2. Reconnection jitter. Client libraries add random delay (say, 0 to 30 seconds) before reconnecting after an unexpected disconnect. This spreads the herd across time. Socket.IO does this by default; hand-rolled WebSocket clients usually do not.
  3. Sticky sessions by connection ID, not client IP. Assign each connection a UUID and route reconnection attempts (identified via a query parameter) back to the same backend if it is still healthy. Enables server-side state recovery.
  4. Publish-subscribe fan-out. The WebSocket server subscribes to a shared message bus (Redis pub/sub, NATS, Kafka). State lives in the bus, not in the server process. A backend restart only loses in-flight frames, not application state.
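Reconnection jitter (pattern 2 above) is the easiest of the four to add to a hand-rolled client. The sketch below goes one step beyond a flat random delay: it uses exponential backoff with full jitter, capped at the 30 seconds mentioned above. That is a common variant rather than something any particular library is known to do. The function name and default parameters are assumptions.

```python
import random


def reconnect_delay(attempt: int, base_s: float = 1.0, cap_s: float = 30.0, rng=random) -> float:
    """Return a random delay in seconds before reconnect attempt number `attempt` (0-based)."""
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    exponent = min(max(attempt, 0), 32)
    # The ceiling doubles with each attempt until it reaches the cap.
    ceiling = min(cap_s, base_s * (2 ** exponent))
    # Full jitter: pick uniformly between zero and the ceiling.
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    for attempt in range(6):
        print(attempt, round(reconnect_delay(attempt, rng=rng), 2))
```

Because each client draws its own delay, 10,000 clients dropped at the same instant arrive back spread across the window instead of in one spike, and repeated failures push the window out toward the cap.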

A Concrete Use Case: Live Sports Odds Aggregation

An odds comparison platform needs to receive real-time price updates from 40+ sportsbooks across 12 jurisdictions. Most sportsbooks expose WebSocket endpoints to their web clients, and scraping HTTP endpoints via polling would miss price changes that happen within the poll interval. The platform needs to:

  1. Connect to each sportsbook from an IP in an allowed jurisdiction for that book.
  2. Maintain the connection through normal operation, with automatic reconnect on drop.
  3. Handle the sportsbook's anti-bot layer, which often inspects the WebSocket subprotocol and custom headers during the upgrade.

The practical architecture uses one proxy session per target sportsbook, with residential proxies for jurisdictions that block datacenter traffic and ISP proxies for jurisdictions that do not. The proxy session is sticky for the lifetime of the WebSocket connection; rotating the IP mid-stream would terminate the connection because the target's session affinity is tied to the source IP.

Per-Message Framing and Buffering

WebSocket frames have a 2 to 14 byte header and a payload of up to 2^63 - 1 bytes in theory, though practical implementations enforce a much lower limit, often around 16 MB. The frame header includes a FIN bit, an opcode (continuation, text, binary, close, ping, pong), a mask flag (clients must mask, servers must not, per RFC 6455 Sections 5.1 and 5.3), and a length field that occupies 7 bits, 7+16 bits, or 7+64 bits depending on the payload size.
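The header layout can be made concrete with a small parser. This is a sketch for illustration, assuming the caller has buffered at least the full header; FrameHeader and parse_frame_header are hypothetical names. It follows the RFC 6455 layout: two fixed bytes, an optional 2- or 8-byte extended length, and an optional 4-byte masking key.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    fin: bool
    opcode: int
    masked: bool
    payload_len: int
    header_len: int  # total header bytes, 2 to 14


def parse_frame_header(data: bytes) -> FrameHeader:
    """Parse the header of a single WebSocket frame from the start of `data`."""
    if len(data) < 2:
        raise ValueError("need at least 2 bytes")
    fin = bool(data[0] & 0x80)
    opcode = data[0] & 0x0F
    masked = bool(data[1] & 0x80)
    length = data[1] & 0x7F
    offset = 2
    if length == 126:  # 16-bit extended length follows
        if len(data) < 4:
            raise ValueError("truncated 16-bit length")
        (length,) = struct.unpack("!H", data[2:4])
        offset = 4
    elif length == 127:  # 64-bit extended length follows
        if len(data) < 10:
            raise ValueError("truncated 64-bit length")
        (length,) = struct.unpack("!Q", data[2:10])
        offset = 10
    if masked:  # 4-byte masking key precedes the payload
        offset += 4
    if len(data) < offset:
        raise ValueError("truncated masking key")
    return FrameHeader(fin, opcode, masked, length, offset)


if __name__ == "__main__":
    # Unmasked text frame carrying "Hello".
    print(parse_frame_header(b"\x81\x05Hello"))
```

The 14-byte maximum is the sum of the 2 fixed bytes, an 8-byte extended length, and a 4-byte masking key.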

A proxy that splices bytes at the TCP layer never needs to parse frames. A proxy that wants to inspect or modify traffic has to. Inspecting WebSocket traffic is rare in commercial proxy infrastructure because it breaks the abstraction, but it is common in enterprise security deployments that want to log or filter WebSocket content for DLP purposes.

One subtle frame-level issue: RFC 6455 has no mechanism for negotiating a maximum frame size, so in practice the server must not send a frame larger than the client's buffer, and the client must not send a frame larger than the server's buffer. The same applies to any frame-parsing proxy in between. A misconfigured proxy with a small intermediate buffer can silently drop oversized frames. If your WebSocket application sometimes loses large messages under load, check the proxy's buffer configuration before blaming the application.

Health Checks That Actually Work

You cannot health-check a WebSocket backend by sending it an HTTP GET. The whole point of the service is that it speaks WebSocket. Two patterns work:

  1. Ping frames. The load balancer opens a WebSocket connection to the backend, sends a ping frame, and expects a pong frame back within 100 ms. This verifies the entire upgrade path and the backend's frame handling.
  2. Sidecar HTTP endpoint. The backend exposes /healthz on the same port or an adjacent port, returning a simple 200 OK. The load balancer health-checks the HTTP endpoint and trusts that a healthy HTTP stack implies a healthy WebSocket stack. Simpler to implement, but can miss bugs specific to the WebSocket handling code.
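For the ping-frame pattern, the bytes involved are short. The sketch below builds a masked ping frame, as a client-side health checker must send, and checks an unmasked pong reply. It assumes the upgrade handshake has already completed on the socket, and it leaves the 100 ms deadline to the caller. The function names are illustrative.

```python
import os
from typing import Optional

OPCODE_PING = 0x9
OPCODE_PONG = 0xA


def build_ping_frame(payload: bytes = b"", mask_key: Optional[bytes] = None) -> bytes:
    """Build a client-to-server ping frame (FIN set, payload masked)."""
    if len(payload) > 125:
        raise ValueError("control frame payload is limited to 125 bytes")
    key = os.urandom(4) if mask_key is None else mask_key
    if len(key) != 4:
        raise ValueError("mask key must be 4 bytes")
    # XOR each payload byte with the masking key, cycling every 4 bytes.
    masked = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return bytes([0x80 | OPCODE_PING, 0x80 | len(payload)]) + key + masked


def is_matching_pong(frame: bytes, payload: bytes) -> bool:
    """Return True if `frame` is an unmasked pong that echoes `payload`."""
    return (
        len(frame) >= 2 + len(payload)
        and frame[0] == 0x80 | OPCODE_PONG
        and frame[1] == len(payload)  # mask bit clear, 7-bit length
        and frame[2:2 + len(payload)] == payload
    )


if __name__ == "__main__":
    print(build_ping_frame(b"hi", mask_key=b"\x00\x00\x00\x00").hex())
    print(is_matching_pong(b"\x8a\x02hi", b"hi"))
```

Sending a distinct payload with each ping and requiring the pong to echo it guards against matching a stale pong from an earlier check.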

Conclusion

Proxying WebSockets is a well-understood problem with well-understood solutions, but the defaults in most off-the-shelf tools are wrong for long-lived connections. If you are running a real-time data pipeline through a proxy, verify three things before you go to production: that the Upgrade header actually reaches the upstream, that the timeouts on every hop exceed your longest expected idle period, and that your reconnect strategy does not create a thundering herd when a backend restarts. Get those right and WebSocket over proxy is reliable. Get them wrong and you will be debugging silent disconnects at 3 a.m.