The Problem With HTTP for Real-Time
HTTP is request/response: the client asks, the server answers. If you want real-time updates (chat messages, stock prices, game state), the server cannot push to the client unprompted. The workarounds:
- Polling: the client sends a request every N seconds. Wasteful: most requests return "nothing new," and updates arrive up to N seconds late.
- Long polling: the client sends a request, the server holds it open until there's new data (or timeout). Better, but each message requires a new HTTP request with full headers. Connection churn is high.
- Server-Sent Events (SSE): server pushes events to the client over a single HTTP connection. Unidirectional — the client can't send data back on the same connection.
WebSockets solve all of these: a persistent, full-duplex connection where both sides can send messages at any time with minimal overhead (2-14 bytes per frame, vs. ~200+ bytes per HTTP request).
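The 2-14 byte range comes from the RFC 6455 framing rules: a 2-byte base header, an optional extended payload length (2 or 8 bytes), and a 4-byte masking key on client-to-server frames. A small sketch of the arithmetic:

```python
def frame_header_size(payload_len: int, masked: bool) -> int:
    """Header bytes for a WebSocket frame, per RFC 6455 framing rules."""
    size = 2                      # base: FIN/opcode byte + mask/length byte
    if payload_len > 65535:
        size += 8                 # 64-bit extended payload length
    elif payload_len > 125:
        size += 2                 # 16-bit extended payload length
    if masked:
        size += 4                 # 32-bit masking key (client-to-server frames)
    return size

print(frame_header_size(100, masked=False))     # small server frame: 2 bytes
print(frame_header_size(100_000, masked=True))  # large client frame: 14 bytes
```

So a short server-to-client message costs 2 bytes of framing, and even the worst case (a large masked client frame) costs 14.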
Slack's real-time messaging uses WebSockets. When you type a message, it's sent via WebSocket to the server and broadcast to all connected clients via their WebSocket connections. No polling, no HTTP round trips — just instant delivery.
The WebSocket Handshake
A WebSocket connection starts as a regular HTTP request with an Upgrade header. The client says "I want to upgrade this HTTP connection to WebSocket." The server responds with 101 Switching Protocols. From that point on, the TCP connection is a WebSocket — no more HTTP framing.
// Client request
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
// Server response
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

Figure 1: WebSocket starts with an HTTP upgrade handshake, then becomes a persistent bidirectional channel. Messages have just 2-14 bytes of framing overhead.
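The Sec-WebSocket-Accept value is not arbitrary: RFC 6455 says the server appends the fixed GUID 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 to the client's key, SHA-1 hashes the result, and base64-encodes the digest. This proves the server actually speaks WebSocket rather than blindly echoing headers. Reproducing the exchange above in Python:

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed GUID from RFC 6455

def websocket_accept(client_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```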
Scaling WebSockets
Each WebSocket connection holds a TCP socket open. A single server can handle 100K-1M concurrent connections (limited by file descriptors and memory, not CPU). But connected users are spread across many servers, so a message arriving at one server must somehow reach connections held by the others. Solutions:
- Redis Pub/Sub: each server subscribes to a Redis channel. When a message arrives, it's published to Redis, and every server forwards it to its local connections. Simple, widely used (Socket.IO's Redis adapter).
- NATS: lightweight messaging system built for this use case. Lower latency than Redis Pub/Sub at high message rates.
- Sticky sessions: route all connections from the same user/room to the same server. Eliminates cross-server broadcasting for most cases.
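The Pub/Sub fan-out pattern can be sketched in-process. This is not Redis or any real WebSocket library: the hypothetical `Hub` stands in for a Redis channel, and plain lists stand in for open WebSocket connections, just to show the shape of the broadcast path.

```python
class Hub:
    """Stand-in for a Redis Pub/Sub channel: fans messages out to subscribers."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)

class ChatServer:
    """One WebSocket server holding its own local connections."""
    def __init__(self, hub):
        self.local_connections = []   # stand-ins for open WebSocket objects
        hub.subscribe(self.deliver)   # every server listens on the shared channel

    def deliver(self, message):
        for conn in self.local_connections:
            conn.append(message)      # stand-in for conn.send(message)

hub = Hub()
server_a, server_b = ChatServer(hub), ChatServer(hub)
alice, bob = [], []                   # two "clients", one per server
server_a.local_connections.append(alice)
server_b.local_connections.append(bob)

hub.publish("hello")                  # a message published by either server...
print(alice, bob)                     # ...reaches clients on both
```

The key property is that a server never needs to know about remote connections: it publishes once, and the channel handles delivery to every server's local set.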
WebSocket vs. SSE: When to Use Which
SSE (Server-Sent Events) is simpler if you only need server-to-client push. SSE works over regular HTTP, supports automatic reconnection, and plays nicely with HTTP/2 multiplexing. Use SSE for dashboards, notifications, live feeds. Use WebSockets when you need bidirectional communication — chat, multiplayer games, collaborative editing.
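Part of SSE's simplicity is the wire format: events are plain newline-delimited text on a `text/event-stream` response, terminated by a blank line. A minimal formatter (an illustrative sketch of the EventSource format, not a full server):

```python
def sse_event(data, event=None, event_id=None):
    """Format one Server-Sent Events frame for a text/event-stream response."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")    # lets clients resume via Last-Event-ID
    if event is not None:
        lines.append(f"event: {event}")    # named event type
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")     # multi-line data becomes repeated fields
    return "\n".join(lines) + "\n\n"       # blank line terminates the event

print(sse_event("price=101.5", event="tick", event_id="42"))
```

The `id` field is what powers SSE's automatic reconnection: the browser resends the last seen id in a `Last-Event-ID` header, so the server can replay missed events.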
The Proxy Problem
WebSocket connections can be broken by HTTP proxies, load balancers, and firewalls that don't understand the Upgrade header. Nginx requires explicit WebSocket proxy configuration. AWS ALB supports WebSockets natively but has an idle timeout (default 60 seconds). Corporate proxies sometimes strip the Upgrade header entirely. Socket.IO handles this by falling back to long polling when WebSocket fails.
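For Nginx, the explicit configuration looks roughly like this (a typical sketch; `backend` is a placeholder upstream and the timeout value is illustrative):

```nginx
location /chat {
    proxy_pass http://backend;
    proxy_http_version 1.1;                  # Upgrade requires HTTP/1.1
    proxy_set_header Upgrade $http_upgrade;  # forward the client's Upgrade header
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 3600s;                # keep idle sockets from being cut
}
```

Without the `Upgrade` and `Connection` headers, Nginx silently drops the handshake and the client sees a plain HTTP response instead of 101 Switching Protocols.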
Discord maintains millions of concurrent WebSocket connections. Each user's client holds a persistent WebSocket to receive messages, typing indicators, presence updates, and voice state changes in real time. The connection stays open for the entire session.