19  MQTT Advanced Topics

In 60 Seconds

MQTT’s binary packet format uses a compact 2-byte minimum fixed header with variable-length encoding to minimize overhead on constrained networks. This chapter covers packet structure internals, scalable topic hierarchy design patterns, bandwidth optimization techniques, and MQTT 5.0 features like message expiry, topic aliases, and shared subscriptions.

19.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Analyze MQTT Packet Structure: Decode binary MQTT packets field by field and calculate exact byte overhead for given topic lengths and QoS levels
  • Design Topic Hierarchies: Construct scalable, maintainable topic naming conventions that support wildcard queries for large device fleets
  • Calculate Bandwidth Savings: Apply optimization techniques — short topic names, binary payloads, topic aliases — and quantify savings across a device fleet
  • Implement MQTT 5.0 Features: Configure message expiry, topic aliases, and shared subscriptions in code and explain when each feature is appropriate
  • Evaluate Broker Options: Compare Mosquitto, EMQX, HiveMQ, and cloud brokers and justify selection based on device count, message throughput, and cost
  • Diagnose Capacity Requirements: Assess connection counts, RAM needs, and bandwidth demand for a production IoT deployment
Key Concepts

  • Topic: UTF-8 string hierarchy (e.g., sensors/building-A/room-101/temperature) routing messages to subscribers
  • Topic Level: Segment between / separators — each level represents a dimension of the topic hierarchy
  • Single-Level Wildcard (+): Matches exactly one topic level: sensors/+/temperature matches sensors/room1/temperature
  • Multi-Level Wildcard (#): Matches remaining levels: sensors/# matches all topics starting with sensors/
  • Retained Message: Last message stored per topic — new subscribers immediately receive current state on subscription
  • Topic Hierarchy Design: Best practice: device-type/device-id/measurement enables fine-grained subscription filtering
  • $SYS Topics: Reserved broker system topics (e.g., $SYS/broker/clients/connected) publishing broker statistics

19.2 For Beginners: MQTT Advanced Concepts

Beyond basic publish-subscribe, MQTT offers powerful features for building robust IoT systems. Retained messages store the last value so new subscribers get data immediately. Last will messages announce when a device goes offline. These features turn simple messaging into a reliable IoT communication platform.

“I just learned about retained messages and they’re amazing!” exclaimed Sammy the Sensor. “When I publish my temperature with the retain flag, the broker remembers it. So when a new phone app connects at midnight, it instantly sees ‘22 degrees’ instead of waiting until my next reading.”

Lila the LED shared her discovery: “And I set up a Last Will message! When I connect to the broker, I say: ‘If I ever disconnect unexpectedly, tell everyone that Lila is offline.’ So if the power goes out, the monitoring system knows immediately – even though I can’t send messages anymore because I’m off!”

“My favorite,” said Bella the Battery, “is clean session = false. When I go to sleep to save power, the broker holds all messages that arrive while I’m napping. When I wake up, I get everything I missed – like checking your text messages after airplane mode. Nothing gets lost during my power naps!”

Max the Microcontroller summed up: “These aren’t just nice extras: they solve real problems. Retained messages give new subscribers the latest known state right away. Last will detects failures. Persistent sessions handle intermittent connections. That’s why MQTT is so widely used across IoT deployments!”

19.3 Prerequisites

Before diving into this chapter, you should be familiar with:

19.4 MQTT Packet Structure

Understanding MQTT’s binary packet format is essential for protocol debugging and optimization.

19.4.1 Fixed Header (All Packets)

Every MQTT packet begins with a 2-byte minimum fixed header:

Bit Position | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
Byte 1 | MsgType[3] | MsgType[2] | MsgType[1] | MsgType[0] | DUP | QoS[1] | QoS[0] | RETAIN
Byte 2+ | Remaining Length (1-4 bytes, variable-length encoded)

19.4.2 Fixed Header Fields

Field | Size | Description | Values
Message Type | 4 bits | Packet type | 1=CONNECT, 3=PUBLISH, 8=SUBSCRIBE
DUP | 1 bit | Duplicate flag | 0=First, 1=Duplicate
QoS Level | 2 bits | Quality of Service | 00=QoS 0, 01=QoS 1, 10=QoS 2
RETAIN | 1 bit | Retained message | 0=No, 1=Yes
Remaining Length | 1-4 bytes | Remaining packet size | 0-268,435,455 bytes
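
As a minimal sketch of how the fields above map onto bits, the first header byte can be split with shifts and masks. The helper name `decode_first_byte` and the partial `PACKET_TYPES` lookup are hypothetical names chosen for this illustration, and the DUP, QoS, and RETAIN bits are only meaningful for PUBLISH packets.

```python
# Hypothetical helper (not part of any MQTT library): split the first byte
# of an MQTT fixed header into the fields listed in the table above.
PACKET_TYPES = {1: "CONNECT", 2: "CONNACK", 3: "PUBLISH", 4: "PUBACK",
                8: "SUBSCRIBE", 12: "PINGREQ", 14: "DISCONNECT"}


def decode_first_byte(byte: int) -> dict:
    """Decode bits 7-4 (type), 3 (DUP), 2-1 (QoS), and 0 (RETAIN)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value (0-255)")
    packet_type = byte >> 4
    return {
        "type": packet_type,
        "type_name": PACKET_TYPES.get(packet_type, "OTHER"),
        "dup": bool(byte & 0x08),      # bit 3
        "qos": (byte >> 1) & 0x03,     # bits 2-1
        "retain": bool(byte & 0x01),   # bit 0
    }


if __name__ == "__main__":
    print(decode_first_byte(0x32))
```

Decoding by hand first and then checking against a helper like this is a quick way to confirm a Wireshark capture is being read correctly.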

19.4.3 Remaining Length Encoding

MQTT uses variable-length encoding to minimize overhead:

Value Range | Bytes Needed | Example
0 - 127 | 1 byte | 23 -> 0x17
128 - 16,383 | 2 bytes | 200 -> 0xC8 0x01
16,384 - 2,097,151 | 3 bytes | 50,000 -> 0xD0 0x86 0x03
2,097,152 - 268,435,455 | 4 bytes | 3,000,000 -> 0xC0 0x8D 0xB7 0x01
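
The encoding stores 7 bits of the length in each byte and uses the top bit as a "more bytes follow" flag. The sketch below follows that scheme; the function names `encode_remaining_length` and `decode_remaining_length` are chosen for this example and are not taken from any particular library.

```python
from typing import Tuple

MAX_REMAINING_LENGTH = 268_435_455  # largest value that fits in 4 encoded bytes


def encode_remaining_length(value: int) -> bytes:
    """Encode a length as 1-4 bytes: 7 data bits per byte, bit 7 = continuation."""
    if not 0 <= value <= MAX_REMAINING_LENGTH:
        raise ValueError("remaining length out of range")
    encoded = bytearray()
    while True:
        digit = value % 128
        value //= 128
        if value > 0:
            digit |= 0x80  # more bytes follow
        encoded.append(digit)
        if value == 0:
            return bytes(encoded)


def decode_remaining_length(data: bytes) -> Tuple[int, int]:
    """Return (decoded value, number of bytes consumed)."""
    value = 0
    multiplier = 1
    for index, byte in enumerate(data[:4]):
        value += (byte & 0x7F) * multiplier
        if not byte & 0x80:  # continuation bit clear: last byte
            return value, index + 1
        multiplier *= 128
    raise ValueError("malformed remaining length")


if __name__ == "__main__":
    for length in (23, 200, 50_000):
        print(length, "->", encode_remaining_length(length).hex(" "))
```

Because the least significant 7-bit group is sent first, a decoder can stop as soon as it sees a byte with the top bit clear.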

19.4.4 PUBLISH Packet Example

Topic: "sensor/temp"
Payload: "25.5"
QoS: 1

Hex dump:
32 13 00 0B 73 65 6E 73 6F 72 2F 74 65 6D 70 00 01 32 35 2E 35

Breakdown:
32        -> Fixed header: PUBLISH (0011), DUP=0, QoS=01, RETAIN=0
           (0x32 = 0011 0010: bits 7-4 = message type 3, bit 3 = DUP=0, bits 2-1 = QoS 1, bit 0 = RETAIN=0)
13        -> Remaining length: 19 bytes
           (2 topic-length + 11 topic + 2 packet-ID + 4 payload = 19 = 0x13)
00 0B     -> Topic length: 11 bytes
73 65 6E 73 6F 72 2F 74 65 6D 70  -> "sensor/temp" (UTF-8)
00 01     -> Packet ID: 1 (required for QoS 1 acknowledgment)
32 35 2E 35  -> "25.5" (UTF-8 payload)
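
The breakdown above can be reproduced in a few lines. This is a sketch assuming the MQTT 3.1.1 PUBLISH layout (no properties section); `build_publish` is a hypothetical helper written for this example, not a library call.

```python
import struct


def build_publish(topic: str, payload: bytes, qos: int = 0,
                  packet_id: int = 0, retain: bool = False,
                  dup: bool = False) -> bytes:
    """Assemble an MQTT 3.1.1 PUBLISH packet byte by byte."""
    topic_bytes = topic.encode("utf-8")
    # Variable header: 2-byte big-endian topic length, then the topic itself
    body = struct.pack("!H", len(topic_bytes)) + topic_bytes
    if qos > 0:
        body += struct.pack("!H", packet_id)  # packet ID only for QoS 1 and 2
    body += payload

    # Fixed header byte 1: type 3 (PUBLISH) in bits 7-4, then DUP, QoS, RETAIN
    first_byte = (3 << 4) | (int(dup) << 3) | (qos << 1) | int(retain)

    # Fixed header byte 2+: variable-length encoded remaining length
    remaining = len(body)
    length_bytes = bytearray()
    while True:
        digit, remaining = remaining % 128, remaining // 128
        length_bytes.append(digit | 0x80 if remaining else digit)
        if not remaining:
            break
    return bytes([first_byte]) + bytes(length_bytes) + body


if __name__ == "__main__":
    packet = build_publish("sensor/temp", b"25.5", qos=1, packet_id=1)
    print(packet.hex(" ").upper())
```

Changing `qos` to 0 drops the two packet-ID bytes, which is a simple way to see where the per-message overhead of QoS 1 comes from.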

19.4.5 Control Packet Types

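Type | Name | Direction | Purpose
1 | CONNECT | Client->Broker | Connection request
2 | CONNACK | Broker->Client | Connection acknowledgment
3 | PUBLISH | Both | Publish message
4 | PUBACK | Both | QoS 1 acknowledgment
5 | PUBREC | Both | QoS 2 step 1
6 | PUBREL | Both | QoS 2 step 2
7 | PUBCOMP | Both | QoS 2 step 3
8 | SUBSCRIBE | Client->Broker | Subscribe to topics
9 | SUBACK | Broker->Client | Subscribe acknowledgment
10 | UNSUBSCRIBE | Client->Broker | Unsubscribe from topics
11 | UNSUBACK | Broker->Client | Unsubscribe acknowledgment
12 | PINGREQ | Client->Broker | Keep-alive ping
13 | PINGRESP | Broker->Client | Ping response
14 | DISCONNECT | Client->Broker (either direction in MQTT 5.0) | Graceful disconnect
15 | AUTH | Both (MQTT 5.0 only) | Extended authentication exchange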

19.5 Packet Size Optimization

Tips for Battery-Powered Devices
  1. Keep topic names short - h/l/t vs home/living_room/temperature saves 23 bytes
  2. Use binary payloads - 0x19 (1 byte) vs "25" (2 bytes)
  3. Choose appropriate QoS - QoS 0 uses 50% fewer packets than QoS 1 (no PUBACK)
  4. Limit retained messages - Only essential status topics
  5. Increase keep-alive interval - 300s vs 60s = 80% fewer PINGREQ packets

Example savings:

  • Topic: h/b/t (5 bytes) vs home/bedroom/temperature (24 bytes) = 19 bytes saved
  • 19 bytes x 100 messages/day x 365 days = 693,500 bytes, or about 694 KB saved per year per device
  • For 1000 devices = about 694 MB saved annually

Every MQTT message includes the full topic string. Shorter topics save bandwidth:

Message size with topic: $ S_{\text{total}} = S_{\text{fixed}} + S_{\text{topic}} + S_{\text{payload}} $

Where:

  • \(S_{\text{fixed}}\) = 4 bytes (MQTT fixed header + topic length field)
  • \(S_{\text{topic}}\) = topic string length in bytes
  • \(S_{\text{payload}}\) = payload size

Long vs short topic comparison (100 messages/day, 1 year):

Long topic: building/floor3/room305/sensors/temperature (43 bytes), with a 10-byte payload: $ S_{\text{total}} = 4 + 43 + 10 = 57 $

Short topic: b/3/305/t (9 bytes), with the same payload: $ S_{\text{total}} = 4 + 9 + 10 = 23 $

Savings per message: \(57 - 23 = 34\text{ bytes}\) (60% reduction)

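Annual bandwidth savings (1000 sensors @ 100 msg/day): $ 34\text{ bytes} \times 100\text{ msg/day} \times 365\text{ days} \times 1000\text{ sensors} \approx 1.24 \times 10^{9}\text{ bytes} \approx 1.16\text{ GB} $

Energy savings (cellular @ 8 mA TX, 1 byte = 32 μs @ 250 kbps): $ E_{\text{saved}} = 34\text{ bytes} \times 32\text{ μs/byte} \times 8\text{ mA} \approx 8.7\text{ μAs per message} $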

Over a year: \(36,500\text{ msgs} \times 8.7\text{ μAs} = 317.6\text{ mAs} \approx 0.088\text{ mAh}\)

Cellular data cost savings ($0.10/MB): \(1.16\text{ GB} = 1{,}188\text{ MB}\)

\(1{,}188\text{ MB} \times \$0.10\text{/MB} = \$118.80\text{ per year for fleet}\)

Lesson: Topic naming conventions have real operational costs. Short, hierarchical topics save bandwidth, energy, and money at scale.
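
The per-message and fleet-wide arithmetic above can be checked with a short script. This is a sketch under the same simplification as the formula (QoS 0, a 4-byte fixed overhead, no MQTT 5.0 properties); the function names are hypothetical.

```python
FIXED_OVERHEAD_BYTES = 4  # 2-byte fixed header + 2-byte topic length field


def publish_size(topic: str, payload_len: int) -> int:
    """Approximate PUBLISH size in bytes: fixed overhead + topic + payload."""
    return FIXED_OVERHEAD_BYTES + len(topic.encode("utf-8")) + payload_len


def annual_savings_bytes(long_topic: str, short_topic: str, payload_len: int,
                         msgs_per_day: int, devices: int) -> int:
    """Bytes saved per year across the fleet by switching to the short topic."""
    per_message = (publish_size(long_topic, payload_len)
                   - publish_size(short_topic, payload_len))
    return per_message * msgs_per_day * 365 * devices


if __name__ == "__main__":
    long_topic = "building/floor3/room305/sensors/temperature"
    short_topic = "b/3/305/t"
    saved = annual_savings_bytes(long_topic, short_topic, 10, 100, 1000)
    print(f"Saved per year: {saved:,} bytes ({saved / 2**30:.2f} GiB)")
```

Measuring topic length with `len(topic.encode("utf-8"))` rather than counting by hand avoids off-by-one errors from the `/` separators.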

19.6 Worked Example: Smart Building Topic Design

19.7 Designing Topic Hierarchy for Smart Building

Scenario: Design the MQTT topic structure for a 10-story commercial office building. Each floor has 20 rooms with temperature sensors, occupancy detectors, and smart lighting.

19.7.1 Step 1: Identify Requirements

  • Telemetry: Temperature, humidity, occupancy from 200 rooms
  • Commands: Control lights, blinds, HVAC per room
  • Status: Online/offline for 600+ devices
  • Alerts: Fire alarms, security events
  • Access patterns: Dashboard shows all temps, HVAC controls one floor

19.7.2 Step 2: Design Base Structure

Bad approach (flat topics):

sensor_floor1_room101_temp     # No hierarchy, can't use wildcards
sensor_floor1_room101_humidity # 600+ individual subscriptions!

Good approach (hierarchical):

building/floor1/room101/sensors/temperature
building/floor1/room101/sensors/humidity
building/floor1/room101/lights/status
building/floor1/room101/lights/command

19.7.3 Step 3: Apply Naming Conventions

{building}/{floor}/{room}/{device_type}/{measurement_or_action}

Examples:
building/floor03/room305/sensors/temperature    # Telemetry
building/floor03/room305/lights/command         # Control
building/floor03/room305/lights/status          # State
building/floor03/hvac/setpoint                  # Zone control
building/alerts/fire                            # Building-wide

Naming rules:

  1. Use lowercase with no spaces
  2. Use / as separator only
  3. Pad numbers for sorting: floor03, not floor3
  4. End with action type: /temperature, /command, /status
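
One way to keep a fleet consistent with these rules is to build topics through a single helper instead of formatting strings by hand. The sketch below is illustrative only: `room_topic` and its allowed-character rule are assumptions made for this example, not part of any MQTT library.

```python
import re

# Assumed rule for this sketch: lowercase letters, digits, '-' and '_' only
_VALID_LEVEL = re.compile(r"[a-z0-9_-]+")


def room_topic(floor: int, room: int, device_type: str, leaf: str,
               root: str = "building") -> str:
    """Build {building}/{floor}/{room}/{device_type}/{measurement_or_action}."""
    levels = [root, f"floor{floor:02d}", f"room{room}", device_type, leaf]
    for level in levels:
        if not _VALID_LEVEL.fullmatch(level):
            raise ValueError(f"invalid topic level: {level!r}")
    return "/".join(levels)


if __name__ == "__main__":
    print(room_topic(3, 305, "sensors", "temperature"))
    print(room_topic(3, 305, "lights", "command"))
```

Rejecting uppercase letters, spaces, and stray separators at build time catches naming drift before it reaches the broker.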

19.7.4 Step 4: Plan Wildcard Subscriptions

Use Case | Subscription Pattern | Matches
All temps (dashboard) | building/+/+/sensors/temperature | 200 topics
One floor’s sensors | building/floor05/+/sensors/# | 40 topics
One room’s everything | building/floor03/room305/# | ~10 topics
All alerts | building/alerts/# | Fire, security
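
To test a subscription plan before deploying it, the wildcard rules can be modelled in a few lines. This is a simplified sketch of the matching semantics; `topic_matches` is a hypothetical name, and production code would normally rely on the client library or broker for matching.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if `topic` matches `topic_filter` using '+' and '#'."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # Topics starting with '$' (such as $SYS) are not matched by a
    # wildcard in the first level of the filter.
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False

    for index, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the last level; it matches the parent
            # level and everything below it.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    return len(filter_levels) == len(topic_levels)


if __name__ == "__main__":
    dashboard = "building/+/+/sensors/temperature"
    print(topic_matches(dashboard, "building/floor03/room305/sensors/temperature"))
    print(topic_matches(dashboard, "building/floor03/hvac/setpoint"))
```

Running planned filters against a list of real topic names is a cheap way to find gaps, such as zone-level topics that a room-level pattern will not reach.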

19.7.5 Step 5: Handle Edge Cases

# Shared spaces (no room number)
building/floor01/lobby/sensors/occupancy
building/stairwell-a/sensors/smoke

# Building-wide systems
building/hvac/chiller/status
building/elevator/car1/position
building/energy/meter/consumption

# System topics
$SYS/broker/clients/connected
building/$status/gateway/floor03

19.7.6 Result: Topic Hierarchy

Figure 19.1: Smart building MQTT topic hierarchy

Key design decisions:

  1. Physical hierarchy (building/floor/room) enables location-based queries
  2. Device type grouping (sensors, lights) separates telemetry from control
  3. Action suffixes (status, command) distinguish read vs write
  4. Padded numbers (floor03) ensure correct sorting

19.8 Worked Example: Fleet Tracking Topics

Designing Topics for Delivery Truck Fleet

Scenario: 500 delivery trucks with GPS, fuel level, and engine temperature sensors. Dispatch needs individual truck queries and aggregate fleet data.

Step 1: Define topic structure

fleet/{truck_id}/{sensor_type}

Examples:
fleet/truck-001/gps
fleet/truck-001/fuel
fleet/truck-001/temp

Step 2: Enable efficient queries

Query | Subscription | Why It Works
All data from truck-001 | fleet/truck-001/# | Single subscription
All GPS data | fleet/+/gps | Cross-fleet GPS
All data | fleet/# | Fleet dashboard

Step 3: Add metadata topics

fleet/truck-001/status        # online/offline (retained)
fleet/truck-001/location/city # Current city (retained)
fleet/alerts/breakdown        # Fleet-wide alerts

19.9 MQTT 5.0 Features

MQTT 5.0 introduced significant enhancements for enterprise IoT:

19.9.1 Message Expiry (TTL)

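# MQTT 5.0: Message expires after 60 seconds
import paho.mqtt.client as mqtt
from paho.mqtt.packettypes import PacketTypes
from paho.mqtt.properties import Properties

# Assumes an MQTT 5.0 broker listening on localhost:1883
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2, protocol=mqtt.MQTTv5)
client.connect("localhost", 1883)
client.loop_start()  # background network loop handles the QoS 1 PUBACK

publish_properties = Properties(PacketTypes.PUBLISH)
publish_properties.MessageExpiryInterval = 60  # seconds

client.publish(
    "sensors/temperature",
    "25.5",
    qos=1,
    properties=publish_properties
)
# If the broker cannot start delivery to a subscriber within 60 seconds
# (for example, the subscriber is offline), that subscriber's copy is discarded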

Use case: Sensor readings that become stale quickly (GPS, real-time status).
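
On the broker side, expiry amounts to a simple check before a queued message is forwarded: the time spent waiting is subtracted from the original interval, and a message with nothing left is dropped. The sketch below models that rule; the `QueuedMessage` class and `remaining_expiry` function are hypothetical and do not reflect any specific broker's internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedMessage:
    topic: str
    payload: bytes
    expiry_interval: int  # seconds, from the Message Expiry Interval property
    received_at: float    # broker clock reading when the message arrived


def remaining_expiry(message: QueuedMessage, now: float) -> Optional[int]:
    """Seconds of lifetime to forward with the message, or None if expired."""
    waited = int(now - message.received_at)
    remaining = message.expiry_interval - waited
    return remaining if remaining > 0 else None


if __name__ == "__main__":
    reading = QueuedMessage("sensors/temperature", b"25.5", 60, received_at=1000.0)
    print(remaining_expiry(reading, now=1045.0))  # subscriber reconnects after 45 s
    print(remaining_expiry(reading, now=1075.0))  # subscriber reconnects after 75 s
```

Forwarding the reduced interval rather than the original one keeps the total lifetime bounded even when a message passes through several hops.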

19.9.2 Topic Aliases (Bandwidth Optimization)

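# Assumes `client` is a connected paho-mqtt client using protocol=MQTTv5 and that
# the broker's CONNACK advertised a Topic Alias Maximum of at least 1
from paho.mqtt.packettypes import PacketTypes
from paho.mqtt.properties import Properties

# First message: send the full topic and establish alias 1
publish_properties = Properties(PacketTypes.PUBLISH)
publish_properties.TopicAlias = 1

client.publish(
    "building/floor03/room305/sensors/temperature",  # 44 bytes
    "25.5",
    properties=publish_properties
)

# Subsequent messages: empty topic plus the alias
publish_properties = Properties(PacketTypes.PUBLISH)
publish_properties.TopicAlias = 1
client.publish(
    "",  # Empty topic: the 44-byte string is replaced by a 3-byte alias property
    "25.6",
    properties=publish_properties
)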

Savings: 1000 messages/hour with 40-byte topics = about 37 KB/hour saved per device (40 bytes of topic per message, minus the 3-byte Topic Alias property that replaces it).
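
A client has to remember which topics already have an alias and must stay within the limit the broker advertises. The following sketch shows one possible bookkeeping scheme; `TopicAliasTable` is a hypothetical class written for this example, not an API of paho-mqtt or any other library.

```python
from typing import Dict, Optional, Tuple

ALIAS_PROPERTY_BYTES = 3  # property identifier (1 byte) + alias value (2 bytes)


class TopicAliasTable:
    """Sender-side bookkeeping for MQTT 5.0 topic aliases."""

    def __init__(self, alias_maximum: int) -> None:
        self.alias_maximum = alias_maximum  # limit advertised by the broker
        self._aliases: Dict[str, int] = {}

    def prepare(self, topic: str) -> Tuple[str, Optional[int]]:
        """Return (topic string to send, alias to attach or None)."""
        if topic in self._aliases:
            return "", self._aliases[topic]        # alias known: omit topic
        if len(self._aliases) < self.alias_maximum:
            alias = len(self._aliases) + 1         # aliases start at 1
            self._aliases[topic] = alias
            return topic, alias                    # first use: topic + alias
        return topic, None                         # table full: plain publish


def net_saving_per_message(topic: str) -> int:
    """Bytes saved on each aliased message after the first one."""
    return len(topic.encode("utf-8")) - ALIAS_PROPERTY_BYTES


if __name__ == "__main__":
    table = TopicAliasTable(alias_maximum=10)
    topic = "building/floor03/room305/sensors/temperature"
    print(table.prepare(topic))
    print(table.prepare(topic))
    print(net_saving_per_message(topic), "bytes saved per aliased message")
```

Aliases are scoped to a single network connection, so a table like this would need to be reset after every reconnect.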

19.9.3 Shared Subscriptions (Load Balancing)

# Three workers share subscription to same topic
Worker-1: SUBSCRIBE "$share/workers/sensors/temperature"
Worker-2: SUBSCRIBE "$share/workers/sensors/temperature"
Worker-3: SUBSCRIBE "$share/workers/sensors/temperature"

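# Broker delivers each message to exactly one worker in the group
# (the distribution strategy is broker-specific; round-robin is shown here):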
Message 1 -> Worker-1
Message 2 -> Worker-2
Message 3 -> Worker-3
Message 4 -> Worker-1 (cycles)
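
The distribution shown above can be simulated without a broker. This sketch assumes a round-robin strategy, which is one common choice; `parse_shared_filter` and `RoundRobinGroup` are hypothetical names used only for illustration.

```python
from itertools import cycle
from typing import Iterable, Optional, Tuple


def parse_shared_filter(subscription: str) -> Tuple[Optional[str], str]:
    """Split '$share/<group>/<filter>' into (group, filter).

    Non-shared subscriptions are returned as (None, subscription).
    """
    if subscription.startswith("$share/"):
        _, group, topic_filter = subscription.split("/", 2)
        return group, topic_filter
    return None, subscription


class RoundRobinGroup:
    """Hands each message to the next member of a share group in turn."""

    def __init__(self, members: Iterable[str]) -> None:
        self._next_member = cycle(list(members))

    def dispatch(self, message: str) -> Tuple[str, str]:
        return next(self._next_member), message


if __name__ == "__main__":
    print(parse_shared_filter("$share/workers/sensors/temperature"))
    group = RoundRobinGroup(["Worker-1", "Worker-2", "Worker-3"])
    for number in range(1, 5):
        worker, message = group.dispatch(f"Message {number}")
        print(f"{message} -> {worker}")
```

The group name is part of the subscription rather than the topic, so publishers keep publishing to sensors/temperature and need no changes.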

19.9.4 Request/Response Pattern

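import json
from paho.mqtt.packettypes import PacketTypes
from paho.mqtt.properties import Properties

# Assumes `client` is a connected paho-mqtt client using protocol=MQTTv5

# Requester: Send command with response topic
request_props = Properties(PacketTypes.PUBLISH)
request_props.ResponseTopic = "devices/sensor001/response"
request_props.CorrelationData = b"request-123"

client.publish(
    "devices/sensor001/command",
    '{"cmd": "get_config"}',
    properties=request_props
)

# Responder: Reply to specified topic
def on_message(client, userdata, msg):
    cmd = json.loads(msg.payload)
    # Ignore requests that do not name a response topic
    response_topic = getattr(msg.properties, "ResponseTopic", None)
    if cmd.get("cmd") == "get_config" and response_topic:
        response_props = Properties(PacketTypes.PUBLISH)
        # Echo the correlation data so the requester can match reply to request
        correlation = getattr(msg.properties, "CorrelationData", None)
        if correlation is not None:
            response_props.CorrelationData = correlation

        client.publish(
            response_topic,
            '{"interval": 60, "qos": 1}',
            properties=response_props
        )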
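
On the requester side, the correlation data is what ties an incoming response back to the request that caused it. The sketch below shows one way to track outstanding requests; `PendingRequests` is a hypothetical helper, independent of any MQTT client library.

```python
import uuid
from typing import Callable, Dict


class PendingRequests:
    """Tracks outstanding requests keyed by their correlation data."""

    def __init__(self) -> None:
        self._pending: Dict[bytes, Callable[[bytes], None]] = {}

    def new_request(self, on_reply: Callable[[bytes], None]) -> bytes:
        """Register a callback and return fresh correlation data for it."""
        correlation = uuid.uuid4().bytes
        self._pending[correlation] = on_reply
        return correlation

    def handle_response(self, correlation: bytes, payload: bytes) -> bool:
        """Run the matching callback once; return False for unknown replies."""
        callback = self._pending.pop(correlation, None)
        if callback is None:
            return False  # late, duplicate, or unrelated response
        callback(payload)
        return True


if __name__ == "__main__":
    replies = []
    pending = PendingRequests()
    correlation = pending.new_request(replies.append)
    # In a real client, `correlation` would be set as CorrelationData on the
    # request and read back from the properties of the response message.
    pending.handle_response(correlation, b'{"interval": 60, "qos": 1}')
    print(replies)
```

Using a random identifier per request lets several requests share one response topic without their replies being confused.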

19.9.5 Feature Comparison

Feature | MQTT 3.1.1 | MQTT 5.0
Message expiry | Not supported | Built-in TTL
Reason codes | Limited return codes (CONNACK and SUBACK only) | Detailed one-byte reason codes on all acknowledgments
User properties | Encode in payload | Native support
Topic aliases | Not supported | Up to 65,535 aliases
Shared subscriptions | Broker-specific | Standardized
Flow control | Not supported | Built-in (Receive Maximum)

Recommendation: Use MQTT 5.0 for new projects. Fall back to 3.1.1 only for legacy compatibility.

19.10 Broker Selection

19.10.2 Selection Criteria

Device Count | Recommended Broker
< 1,000 devices | Mosquitto (simple, free)
1,000 - 100,000 | EMQX or VerneMQ
> 100,000 | HiveMQ, AWS IoT Core
Multi-cloud | Self-hosted cluster

19.11 Worked Example: MQTT Broker Capacity Planning

Production Sizing: 50,000-Device Smart Metering Platform

Scenario: A utility company is deploying 50,000 smart electricity meters across a metropolitan area. Each meter reports consumption every 15 minutes and must receive firmware updates and tariff schedules. Design the MQTT infrastructure.

Step 1: Calculate message rates

Inbound (meters -> broker):
  50,000 meters x 4 readings/hour = 200,000 messages/hour
  Peak (all meters reporting within a 15-second window): 50,000 / 15 s = 3,333 msg/sec burst

Outbound (broker -> meters):
  Tariff updates: 50,000 meters x 1 update/day = 2,083 msg/hour
  Firmware: 500 meters/night x 200 chunks = 100,000 msg/night (batched)

Total sustained: ~205,000 messages/hour = 57 messages/second average
Peak: 3,333 messages/second (15-minute boundary)
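
These rates are easy to recompute when the fleet size or reporting interval changes. The sketch below reads the 50,000/15 figure as a 15-second burst window, which is an interpretation made for this example, and leaves the nightly firmware batch out of the average; `message_rates` is a hypothetical helper.

```python
def message_rates(meters: int, readings_per_hour: int, burst_window_s: float,
                  tariff_updates_per_day: int = 1) -> dict:
    """Estimate inbound, outbound, average, and burst message rates."""
    inbound_per_hour = meters * readings_per_hour
    tariff_per_hour = meters * tariff_updates_per_day / 24
    return {
        "inbound_per_hour": inbound_per_hour,
        "tariff_per_hour": tariff_per_hour,
        # Average excludes the nightly firmware batch
        "average_per_sec": (inbound_per_hour + tariff_per_hour) / 3600,
        # Worst case: every meter reports inside the same short window
        "peak_per_sec": meters / burst_window_s,
    }


if __name__ == "__main__":
    rates = message_rates(meters=50_000, readings_per_hour=4, burst_window_s=15)
    for name, value in rates.items():
        print(f"{name}: {value:,.0f}")
```

Widening the burst window, for example by adding random jitter to each meter's reporting time, lowers the peak without changing the hourly total.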

Step 2: Size the broker

Resource | Calculation | Requirement
Connections | 50,000 persistent + 50 admin + 10 analytics | 50,060 concurrent
RAM per connection | ~20 KB (session state + subscriptions) | 1.0 GB for sessions
Message queue RAM | QoS 1 requires store-and-forward | 2.0 GB for inflight
Network bandwidth | 3,333 msg/sec x 150 bytes avg = 488 KB/sec peak | 4 Mbps peak
Disk (persistent messages) | 200,000 msg/hr x 150 bytes x 24 hrs | 720 MB/day retention
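
The sizing rows that follow directly from arithmetic can be captured in a small calculator. This is a rough sketch using decimal units (1 GB = 10^6 KB); the 20 KB per connection and 150-byte average message are the planning assumptions from the table, and `size_broker` is a hypothetical function name.

```python
def size_broker(connections: int, kb_per_connection: float,
                peak_msg_per_sec: float, avg_msg_bytes: int,
                msgs_per_hour: int) -> dict:
    """Estimate session RAM, peak bandwidth, and daily disk usage."""
    return {
        "session_ram_gb": connections * kb_per_connection / 1e6,
        "peak_bandwidth_mbps": peak_msg_per_sec * avg_msg_bytes * 8 / 1e6,
        "disk_mb_per_day": msgs_per_hour * avg_msg_bytes * 24 / 1e6,
    }


if __name__ == "__main__":
    sizing = size_broker(
        connections=50_060,
        kb_per_connection=20,
        peak_msg_per_sec=3_333,
        avg_msg_bytes=150,
        msgs_per_hour=200_000,
    )
    for name, value in sizing.items():
        print(f"{name}: {value:.1f}")
```

Inflight queue memory is not modelled here because it depends on the broker implementation and on how long subscribers stay disconnected.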

Step 3: Select broker and topology

Option | Configuration | Monthly Cost | Pros/Cons
EMQX cluster | 3 nodes x 8 vCPU, 16 GB RAM | 1,800 EUR (self-hosted) | Open source, full control, needs DevOps
HiveMQ Cloud | Managed, auto-scaling | 3,200 EUR | Zero ops, SLA guaranteed, vendor lock-in
AWS IoT Core | Serverless, pay-per-message | 4,100 EUR (at 205K msg/hr) | No infrastructure, but 0.08 USD/million messages adds up

Decision: EMQX cluster selected. At 50,000 devices, self-hosted saves 1,400-2,300 EUR/month vs. managed alternatives. Break-even for managed services is below ~15,000 devices where DevOps overhead exceeds subscription cost.

Step 4: Topic structure for operations

utility/{region}/{meter_id}/reading     # QoS 0, every 15 min
utility/{region}/{meter_id}/alert       # QoS 1, tamper/outage events
utility/{region}/{meter_id}/command     # QoS 1, tariff updates
utility/{region}/{meter_id}/firmware    # QoS 1, OTA chunks
utility/{region}/{meter_id}/status      # QoS 0, retained, online/offline

Operations subscriptions:
  utility/north/+/alert    -> NOC dashboard (region-filtered)
  utility/+/+/reading      -> Analytics pipeline (all readings)
  $SYS/broker/#            -> Monitoring (broker health)

Monitoring thresholds:

Metric | Warning | Critical
Message queue depth | > 10,000 | > 50,000
Connection rate | > 500/sec | > 1,000/sec (possible reconnect storm)
Publish latency (p99) | > 100 ms | > 500 ms
Retained message count | > 100,000 | > 200,000
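
Thresholds like these are usually evaluated by a monitoring job that reads broker statistics. The sketch below shows only the classification step; the metric keys and the `classify` function are hypothetical names, and how the values are collected (for example from $SYS topics) is left out.

```python
# (warning, critical) limits taken from the monitoring table above
THRESHOLDS = {
    "queue_depth": (10_000, 50_000),
    "connection_rate_per_sec": (500, 1_000),
    "publish_latency_p99_ms": (100, 500),
    "retained_messages": (100_000, 200_000),
}


def classify(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' for a metric reading."""
    warning, critical = THRESHOLDS[metric]
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"


if __name__ == "__main__":
    print(classify("queue_depth", 12_500))
    print(classify("connection_rate_per_sec", 1_200))
    print(classify("publish_latency_p99_ms", 40))
```

Keeping the limits in one table makes it straightforward to tune them as the deployment grows.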

Concept Relationships

Advanced MQTT features connect to both protocol internals and system architecture:

Advanced Features:

  • MQTT 5.0 Specification - Official standard
  • Shared Subscriptions - Load balancing pattern
  • Message Expiry - Time-to-live for stale data
  • Topic Aliases - Bandwidth optimization

Broker Technologies:

  • Mosquitto - Single-node, learning deployments
  • EMQX - Clustering for 10M+ connections
  • HiveMQ - Enterprise with managed clustering
  • VerneMQ - Distributed Erlang-based broker

System Integration:

  • Message Broker Clustering - HA patterns
  • Load Balancing - Connection distribution
  • Edge Gateway Design - Topic bridging
  • Capacity Planning - Sizing brokers

Prerequisites You Should Know:

  • MQTT packet structure: 2-byte fixed header + variable header + payload
  • Variable-length encoding saves bytes for small messages
  • Broker memory: ~10-20 KB per connection + message queue storage
  • Topic hierarchy depth impacts wildcard matching performance

What This Enables:

  • Optimize bandwidth usage with topic aliases (about 40 bytes saved per message for a long topic)
  • Design scalable topic hierarchies supporting wildcard queries
  • Plan broker capacity: connections, message throughput, memory
  • Select appropriate broker for deployment scale (100K vs 10M devices)

Try It Yourself

Experiment 1: MQTT Packet Structure Analysis

Capture and analyze MQTT packets with Wireshark:

# Install mosquitto broker and clients
sudo apt install mosquitto mosquitto-clients

# Start Wireshark capturing MQTT traffic on the loopback interface
wireshark -i lo -f "tcp port 1883" -k

# In another terminal, publish a message
mosquitto_pub -h localhost -t "test/topic" -m "Hello MQTT" -q 1

What to Observe:

  • Fixed header: 2 bytes (0x32 for PUBLISH QoS 1)
  • Variable header: topic length (2 bytes) + topic (10 bytes) + packet ID (2 bytes)
  • Payload: “Hello MQTT” (10 bytes)
  • Total: 26 bytes (minimal overhead!)

Experiment 2: Topic Hierarchy Performance

Compare subscription setup cost for flat versus hierarchical topics:

import paho.mqtt.client as mqtt
import time

# Flat topics (inefficient)
flat_topics = [f"sensor_{i}_temperature" for i in range(1000)]

# Hierarchical topics (efficient)
hierarchical_topics = [f"building/floor{i//100}/room{i%100}/temp" for i in range(1000)]

# Measure subscription time
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.connect("localhost", 1883)

start = time.perf_counter()  # perf_counter has finer resolution than time.time
for topic in flat_topics:
    client.subscribe(topic, qos=0)
flat_time = time.perf_counter() - start

# Reconnect with a clean session to clear the flat subscriptions
client.disconnect()
client.connect("localhost", 1883)

start = time.perf_counter()
client.subscribe("building/+/+/temp", qos=0)  # Single wildcard subscription!
hierarchical_time = time.perf_counter() - start

print(f"Flat (1000 subscriptions): {flat_time:.3f}s")
print(f"Hierarchical (1 wildcard): {hierarchical_time:.3f}s")
# Guard against a zero reading on very fast machines
print(f"Speedup: {flat_time / max(hierarchical_time, 1e-9):.0f}x faster")

What to Observe:

  • Flat: ~0.5-1.0 seconds for 1,000 subscriptions
  • Hierarchical: ~0.001 seconds for 1 wildcard
  • 500-1000x faster subscription setup!

Experiment 3: MQTT 5.0 Topic Aliases

Measure bandwidth savings with topic aliases (requires MQTT 5.0 broker):

from paho.mqtt.client import Client as MQTTClient, CallbackAPIVersion, MQTTv5
from paho.mqtt.properties import Properties
from paho.mqtt.packettypes import PacketTypes

# Assumes an MQTT 5.0 broker on localhost:1883 that permits topic aliases
client = MQTTClient(CallbackAPIVersion.VERSION2, protocol=MQTTv5)
client.connect("localhost", 1883)

# First message: establish alias
long_topic = "farm/northfield/zone1/row12/plant45/soil/moisture"
props = Properties(PacketTypes.PUBLISH)
props.TopicAlias = 1

client.publish(long_topic, "25.5", properties=props)
print(f"First message: {len(long_topic)} byte topic")

# Subsequent messages: use alias (empty topic)
for i in range(100):
    props = Properties(PacketTypes.PUBLISH)
    props.TopicAlias = 1
    reading = f"{25 + i % 5}.{i % 10}"  # synthetic moisture value
    client.publish("", reading, properties=props)  # Empty topic, uses alias!

# Topic bytes saved over 100 messages (each aliased message still carries
# a 3-byte Topic Alias property)
savings = len(long_topic) * 100
print(f"Saved {savings} topic bytes with topic alias")

What to Observe:

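  • Topic name: 49 bytes
  • 100 messages: saves 49 × 100 = 4,900 topic bytes (about 4,600 bytes net after the 3-byte alias property on each message)
  • Most valuable on constrained or metered links where every byte counts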

Challenge: Broker Capacity Planning

Calculate broker requirements for a smart city deployment:

Given:
- 100,000 streetlights
- Publish every 5 minutes (12 msg/hour each)
- 3 subscribers per message (dashboard, analytics, alerts)
- Average message: 120 bytes

Calculate:
1. Messages per hour
2. Broker fan-out factor
3. Required bandwidth
4. RAM for connections (assume 20 KB per connection)
5. Select appropriate broker (Mosquitto, EMQX, or HiveMQ)

Bonus: Build your own capacity planning calculator!

19.12 Summary

Key takeaways:

  • MQTT packets have minimal overhead (2-byte header minimum)
  • Topic hierarchy enables powerful wildcard queries
  • MQTT 5.0 adds message expiry, topic aliases, and shared subscriptions
  • Choose broker based on scale and feature requirements
  • At 50,000 devices, the self-hosted option in the worked example costs roughly 44-56% less per month than the managed and serverless options

Topic design principles:

  1. Use hierarchical structure for wildcard queries
  2. Physical-then-logical organization
  3. Consistent naming conventions
  4. Plan for future scalability

19.13 What’s Next

Now that you understand MQTT’s advanced features, continue with:

Chapter | Focus | Why Read It
MQTT Practice and Exercises | Hands-on exercises and common pitfalls | Apply packet analysis and topic design in guided scenarios
MQTT Comprehensive Review | Broker internals and message flow | Deepen understanding of how brokers route and store messages at scale
MQTT Labs | ESP32 implementation and real hardware | Build working MQTT clients and integrate sensors with a live broker
MQTT Security | TLS, authentication, and access control | Secure your broker and understand how encryption adds overhead
MQTT QoS Levels | Acknowledgment flows and session state | Understand the packet exchanges underpinning QoS 1 and QoS 2
CoAP Protocol | REST-style IoT protocol over UDP | Compare with MQTT and choose the right protocol for your use case