How Does MQTT Keep Billions of Devices Talking?

A Protocol Born in an Oil Pipeline

In 1999, Andy Stanford-Clark at IBM and Arlen Nipper at Arcom (now Eurotech) had a problem: monitoring oil pipelines over satellite links. Bandwidth was expensive, connections were unreliable, and the devices had almost no compute power. HTTP was out of the question: too chatty, too heavy. They designed MQTT (originally MQ Telemetry Transport), a publish/subscribe protocol whose fixed header can be as small as 2 bytes per packet.

Twenty-five years later, MQTT is everywhere. AWS IoT Core, Azure IoT Hub, Google Cloud IoT, Tesla's vehicles, Facebook Messenger (until they built their own), smart homes, industrial automation, and an estimated 2.4 billion connected IoT devices. The protocol designed for satellite links turned out to be perfect for the entire Internet of Things.

Facebook used MQTT for Messenger's mobile push notifications because it reduced battery drain by 40% compared to HTTP long polling. The tiny packet overhead and persistent connections made it ideal for mobile devices on flaky cellular networks.

Publish/Subscribe: Not Request/Response

HTTP is request/response: the client asks, the server answers. MQTT is publish/subscribe: clients publish messages to topics, and other clients subscribe to topics they care about. A central broker handles routing.

A temperature sensor publishes to home/livingroom/temperature. Your phone subscribes to home/+/temperature (the + wildcard matches any single level). The broker receives the sensor's message and forwards it to your phone. The sensor does not know or care who is listening. The phone does not know or care who is publishing. This decoupling is powerful.

[Figure: MQTT publish/subscribe architecture. Publishers (Sensor A, Sensor B, Camera) send PUBLISH messages to the MQTT broker, which handles topic routing, session management, and QoS enforcement, then delivers them to subscribers (Phone, Dashboard, Alert Svc).]

Figure 1: MQTT's pub/sub model. Publishers send messages to topics via the broker. The broker routes messages to all clients subscribed to matching topics. Publishers and subscribers never communicate directly.
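The decoupling described above can be sketched with a toy in-process broker. This is only an illustration: ToyBroker and topic_matches are hypothetical names, there is no networking, and only the + wildcard is handled.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, bytes], None]


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Match a topic against a filter, supporting only the '+' wildcard."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    if len(filter_levels) != len(topic_levels):
        return False
    return all(f == "+" or f == t for f, t in zip(filter_levels, topic_levels))


class ToyBroker:
    """Hypothetical in-process stand-in for an MQTT broker (no networking)."""

    def __init__(self) -> None:
        self._subscriptions: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic_filter: str, handler: Handler) -> None:
        self._subscriptions[topic_filter].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver to every matching subscriber; return the delivery count."""
        delivered = 0
        for topic_filter, handlers in self._subscriptions.items():
            if topic_matches(topic_filter, topic):
                for handler in handlers:
                    handler(topic, payload)
                    delivered += 1
        return delivered


broker = ToyBroker()
phone_inbox = []
broker.subscribe("home/+/temperature", lambda t, p: phone_inbox.append((t, p)))

# The sensor publishes without knowing whether anyone is subscribed.
broker.publish("home/livingroom/temperature", b"21.5")
broker.publish("home/livingroom/humidity", b"40")  # no matching subscriber
print(phone_inbox)
```

The publisher only ever names a topic and the subscriber only ever names a filter. Neither holds a reference to the other, which is the whole point of routing through a broker.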

QoS Levels: How Reliable Do You Need It?

MQTT defines three Quality of Service levels that trade reliability for overhead:

  • QoS 0 (At most once): Fire and forget. The message is sent once with no acknowledgment. It might get lost. Use for sensor readings where the next reading is 5 seconds away — missing one is fine.
  • QoS 1 (At least once): The sender retransmits until it gets a PUBACK. The message is guaranteed to arrive but might arrive twice. Use for alerts where duplicates are acceptable.
  • QoS 2 (Exactly once): A four-step handshake (PUBLISH → PUBREC → PUBREL → PUBCOMP) guarantees the message arrives exactly once. Use for billing events or commands where duplicates would cause problems. Expensive: 4 packets per message, where QoS 0 needs 1 and QoS 1 needs 2.

Most IoT deployments use QoS 1. The overhead of QoS 2 (four packets per message, twice what QoS 1 needs) is rarely justified when you can design idempotent message handlers that tolerate duplicates.
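An idempotent handler for QoS 1 traffic might look like the sketch below. It assumes each message carries an application-level event_id field in a JSON payload. That field, the amount_cents field, and the BillingHandler class are illustrative assumptions, not part of MQTT.

```python
import json
from typing import Set


class BillingHandler:
    """Hypothetical consumer that tolerates QoS 1 redelivery."""

    def __init__(self) -> None:
        self._seen_event_ids: Set[str] = set()
        self.total_cents = 0

    def handle(self, payload: bytes) -> bool:
        """Apply the event once; return False if it was a duplicate."""
        event = json.loads(payload)
        event_id = event["event_id"]  # assumed application-level unique ID
        if event_id in self._seen_event_ids:
            return False
        self._seen_event_ids.add(event_id)
        self.total_cents += event["amount_cents"]
        return True


handler = BillingHandler()
message = json.dumps({"event_id": "evt-1001", "amount_cents": 250}).encode()

first = handler.handle(message)
second = handler.handle(message)  # simulated QoS 1 redelivery
print(first, second, handler.total_cents)
```

The sketch deduplicates on an ID inside the payload rather than on the MQTT packet identifier, because packet identifiers are reused once an exchange completes. A real handler would also need to bound or persist the set of seen IDs.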

Why MQTT Beats HTTP for IoT

The numbers tell the story:

  • Minimum packet overhead: MQTT's fixed header can be as small as 2 bytes (a PUBLISH also carries its topic name). A typical HTTP request carries headers running to hundreds of bytes. On a constrained network, this is the difference between working and not working.
  • Persistent connection: MQTT opens one TCP connection and keeps it alive with tiny PINGREQ/PINGRESP packets (2 bytes each). HTTP/1.1 can also reuse a connection with keep-alive, but every request still repeats its full headers, so the per-message overhead stays large.
  • Bidirectional: The broker can push to a client at any time. No polling needed. Plain HTTP needs long polling, SSE, or an upgrade to WebSockets for server push.
  • Battery friendly: Fewer bytes = less radio time = less battery drain. Critical for devices running on coin cells for years.
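To make the byte counts concrete, here is a minimal sketch that encodes a QoS 0 PUBLISH packet by hand, assuming the MQTT 3.1.1 wire layout (MQTT 5.0 adds a properties field that is not shown). The function names are illustrative; a real client library does this for you.

```python
def encode_remaining_length(length: int) -> bytes:
    """Encode the fixed header's Remaining Length as a variable-length integer."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        byte = length % 128
        length //= 128
        if length > 0:
            byte |= 0x80  # continuation bit: more length bytes follow
        out.append(byte)
        if length == 0:
            break
    return bytes(out)


def encode_publish_qos0(topic: str, payload: bytes) -> bytes:
    """Build a PUBLISH packet with QoS 0, no retain, no duplicate flag."""
    topic_bytes = topic.encode("utf-8")
    # Variable header: 2-byte big-endian topic length, then the topic itself.
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    body = variable_header + payload
    # 0x30 = packet type 3 (PUBLISH) in the high nibble, all flags clear.
    fixed_header = bytes([0x30]) + encode_remaining_length(len(body))
    return fixed_header + body


PINGREQ = bytes([0xC0, 0x00])  # packet type 12, remaining length 0

packet = encode_publish_qos0("home/livingroom/temperature", b"21.5")
print(len(packet), packet[:2].hex(), len(PINGREQ))
```

For this 27-byte topic and 4-byte payload the packet is 35 bytes: 2 bytes of fixed header, 2 bytes of topic length, the topic, and the payload. The topic string dominates, which is why short topic names matter on constrained links.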

Topic Hierarchy and Wildcards

MQTT topics are slash-separated strings like a file path: factory/floor3/machine42/temperature. Two wildcards make subscriptions flexible:

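  • + matches exactly one level: factory/+/machine42/temperature matches factory/floor1/machine42/temperature and factory/floor3/machine42/temperature, but not factory/floor3/line2/machine42/temperature.
  • # matches any number of remaining levels and must be the last level of the filter: factory/# matches factory itself and every topic below it.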
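The wildcard rules can be captured in a short matcher. This is a sketch of the matching logic only, not a broker's actual implementation, and topic_matches is an illustrative name.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if a concrete topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # Topics starting with '$' (broker-internal) are not matched by filters
    # that begin with a wildcard.
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False

    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the final level; it matches the parent
            # level and everything beneath it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without '#', the filter and topic must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


print(topic_matches("factory/+/machine42/temperature",
                    "factory/floor3/machine42/temperature"))
print(topic_matches("factory/#", "factory/floor3/machine42/temperature"))
print(topic_matches("factory/+", "factory/floor3/machine42"))
```

The first two calls match and the third does not, since + covers exactly one level.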

Retained Messages and Last Will

Retained messages solve the "I just subscribed, what is the current state?" problem. When a publisher sends a message with the retain flag, the broker stores it. Any new subscriber to that topic immediately receives the last retained message, even if it was published hours ago.
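The retained-message behavior can be simulated in a few lines. RetainingBroker below is a hypothetical in-memory stand-in that matches topics exactly (no wildcards), intended only to show the store-and-replay logic.

```python
from typing import Callable, Dict, List

Handler = Callable[[str, bytes], None]


class RetainingBroker:
    """Hypothetical in-memory broker that keeps one retained message per topic."""

    def __init__(self) -> None:
        self._retained: Dict[str, bytes] = {}
        self._subscribers: Dict[str, List[Handler]] = {}

    def publish(self, topic: str, payload: bytes, retain: bool = False) -> None:
        if retain:
            if payload:
                self._retained[topic] = payload  # replaces any older one
            else:
                # A retained publish with an empty payload clears the slot.
                self._retained.pop(topic, None)
        for handler in self._subscribers.get(topic, []):
            handler(topic, payload)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers.setdefault(topic, []).append(handler)
        # A new subscriber immediately receives the last retained message.
        if topic in self._retained:
            handler(topic, self._retained[topic])


broker = RetainingBroker()
broker.publish("home/livingroom/temperature", b"21.5", retain=True)

late_inbox = []
broker.subscribe("home/livingroom/temperature",
                 lambda t, p: late_inbox.append(p))
print(late_inbox)
```

The subscriber joined after the publish yet still sees the last known value. Only one message is retained per topic, so a newer retained publish overwrites the older one.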

Last Will and Testament (LWT) handles ungraceful disconnects. When a client connects, it registers a "last will" message with the broker. If the client disconnects without sending DISCONNECT (crash, network loss), the broker publishes the last will message. Other clients subscribed to the will topic learn that the device went offline.
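The last-will lifecycle can be sketched the same way. WillBroker and its method names are hypothetical, and a real broker would detect a lost connection through the keep-alive timeout or a closed socket rather than an explicit call.

```python
from typing import Dict, List, Tuple

Will = Tuple[str, bytes]  # (topic, payload)


class WillBroker:
    """Hypothetical in-memory broker tracking one last-will per client."""

    def __init__(self) -> None:
        self._wills: Dict[str, Will] = {}
        self.published: List[Will] = []

    def connect(self, client_id: str, will_topic: str, will_payload: bytes) -> None:
        # The will is registered at connect time and held by the broker.
        self._wills[client_id] = (will_topic, will_payload)

    def disconnect(self, client_id: str) -> None:
        # Clean DISCONNECT: the will is discarded and never published.
        self._wills.pop(client_id, None)

    def connection_lost(self, client_id: str) -> None:
        # Ungraceful loss: the broker publishes the will on the client's behalf.
        will = self._wills.pop(client_id, None)
        if will is not None:
            self.published.append(will)


broker = WillBroker()
broker.connect("pump-7", "devices/pump-7/status", b"offline")
broker.connect("pump-8", "devices/pump-8/status", b"offline")

broker.disconnect("pump-7")        # graceful: no will published
broker.connection_lost("pump-8")   # crash or network loss: will published
print(broker.published)
```

Only pump-8's will appears, because pump-7 said goodbye properly.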

MQTT 5.0: What Changed

MQTT 5.0 (released 2019) added features the IoT community had been begging for:

  • Reason codes: Every acknowledgment now includes why it succeeded or failed (no more guessing).
  • Shared subscriptions: Load-balance messages across multiple subscribers. Critical for scaling message processing.
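  • Message expiry: Messages can have a TTL. The broker discards a message that has not been delivered before its expiry interval runs out, so stale sensor readings never reach late subscribers.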
  • Topic aliases: Replace long topic strings with short integers after the first publish, reducing per-message overhead further.
  • User properties: Attach arbitrary key-value metadata to any packet. Headers, basically.
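Shared subscriptions use filters of the form $share/<group>/<filter>. The sketch below parses that form and distributes messages round-robin across a group. The round-robin policy is an assumption, since the choice of which group member receives each message is left to the broker, and the function and class names are illustrative.

```python
from itertools import cycle
from typing import List, Optional, Tuple


def parse_shared_filter(topic_filter: str) -> Tuple[Optional[str], str]:
    """Split '$share/<group>/<filter>' into (group, filter).

    Returns (None, topic_filter) for an ordinary, non-shared filter.
    """
    if not topic_filter.startswith("$share/"):
        return None, topic_filter
    parts = topic_filter.split("/", 2)
    if len(parts) < 3 or not parts[1] or not parts[2]:
        raise ValueError(f"malformed shared subscription: {topic_filter!r}")
    return parts[1], parts[2]


class SharedGroup:
    """Hypothetical group that hands each message to one member in turn."""

    def __init__(self, members: List[str]) -> None:
        self._members = cycle(members)  # assumed policy: round-robin

    def pick(self) -> str:
        return next(self._members)


group_name, real_filter = parse_shared_filter("$share/workers/factory/#")
workers = SharedGroup(["worker-1", "worker-2"])
assignments = [workers.pick() for _ in range(4)]
print(group_name, real_filter, assignments)
```

Each message goes to exactly one member of the group instead of to every subscriber, which is what lets a pool of consumers split the load.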

References and Further Reading