MQTT offers three QoS levels that trade reliability for overhead: QoS 0 (fire-and-forget, 1 message) is fastest but may lose data, QoS 1 (at-least-once, 2 messages) guarantees delivery but may duplicate, and QoS 2 (exactly-once, 4 messages) prevents both loss and duplication at roughly 3.25x the energy cost of QoS 0. Start with QoS 1 for most IoT applications and use QoS 0 for high-frequency telemetry where the next reading replaces the last.
16.1 Learning Objectives
By the end of this chapter, you will be able to:
Distinguish QoS Levels: Explain the differences between MQTT QoS 0, QoS 1, and QoS 2 and justify when each is appropriate
Calculate Message Overhead: Compute bandwidth, packet counts, and energy costs for each QoS level given real device parameters
Select Appropriate QoS: Evaluate IoT scenarios and apply the correct reliability level based on loss impact, duplicate impact, and battery constraints
Assess Battery Trade-offs: Analyze how QoS level selection affects device power consumption and calculate resulting battery life
Key Concepts
MQTT: Message Queuing Telemetry Transport — pub/sub protocol optimized for constrained IoT devices over unreliable networks
Broker: Central server routing messages from publishers to all matching subscribers by topic pattern
Topic: Hierarchical string (e.g., home/bedroom/temperature) used to route messages to interested subscribers
QoS Level: Quality of Service 0/1/2 trading delivery guarantee for message overhead
Retained Message: Last message on a topic stored by broker for immediate delivery to new subscribers
Last Will and Testament: Pre-configured message published by broker when a client disconnects ungracefully
Persistent Session: Broker stores subscriptions and pending messages allowing clients to resume after disconnection
16.2 For Beginners: MQTT Quality of Service
MQTT QoS (Quality of Service) lets you choose how reliably messages are delivered. QoS 0 is fire-and-forget (fast, no guarantee). QoS 1 guarantees at-least-once delivery. QoS 2 guarantees exactly-once delivery. It is like choosing between a text message, a delivery confirmation email, and a registered letter.
Sensor Squad: Three Levels of Care
“I need all three QoS levels for different things!” said Sammy the Sensor. “My routine temperature readings use QoS 0 – if one gets lost every few minutes, no big deal. The next one is coming soon anyway.”
“But my door lock commands use QoS 1!” said Max the Microcontroller. “When someone says ‘lock the front door,’ that message MUST arrive. QoS 1 means the broker keeps trying until it gets an acknowledgment. Sure, the lock might get the same ‘lock’ command twice, but locking an already-locked door is harmless.”
Lila the LED raised the stakes: “My payment system uses QoS 2 – exactly once. Imagine charging someone’s credit card twice because of a duplicate message! The four-step handshake (PUBLISH, PUBREC, PUBREL, PUBCOMP) is slower, but it guarantees the message arrives exactly once.”
Bella the Battery summed it up: “QoS 0 costs 1 message. QoS 1 costs 2. QoS 2 costs 4. Choose the cheapest level that’s safe for your use case – my battery will thank you!”
16.3 Prerequisites
Before diving into this chapter, you should be familiar with:
Conclusion: QoS 2 uses about 3.25× the energy of QoS 0 per message. For battery devices, use QoS 0 for telemetry and QoS 1 for important events; reserve QoS 2 for truly critical commands.
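The 3.25× figure can be reproduced with a simple per-message model. The sketch below is an illustration under stated assumptions, not a measurement: it assumes the publishing device transmits PUBLISH (plus PUBREL for QoS 2), receives the acknowledgments, spends the same time on every packet, and draws 8 mA when transmitting and 5 mA when receiving (the default values of this chapter's calculator). The names `PACKETS` and `relative_energy` are hypothetical.

```python
# Packets the publishing device sends (tx) and receives (rx) per message.
PACKETS = {
    0: {"tx": 1, "rx": 0},  # PUBLISH
    1: {"tx": 1, "rx": 1},  # PUBLISH, then PUBACK
    2: {"tx": 2, "rx": 2},  # PUBLISH + PUBREL, then PUBREC + PUBCOMP
}


def relative_energy(qos, tx_ma=8.0, rx_ma=5.0):
    """Charge per message relative to QoS 0, assuming equal packet durations."""
    def charge(level):
        counts = PACKETS[level]
        return counts["tx"] * tx_ma + counts["rx"] * rx_ma

    return charge(qos) / charge(0)


for level in (0, 1, 2):
    print(f"QoS {level}: {relative_energy(level):.3f}x the energy of QoS 0")
```

Under these assumptions QoS 1 comes out at 1.625× and QoS 2 at 3.25×. Different radios, packet sizes, or retransmissions will shift the ratios.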
Packet overhead: 4 messages per side (8 total)
Delivery guarantee: Exactly once (no duplicates, guaranteed delivery)
Use case: Critical commands (unlock door, dispense medication)
Why 4 steps?
PUBLISH - “Here’s a message”
PUBREC - “I received it, but haven’t processed yet”
PUBREL - “OK, you can process it now”
PUBCOMP - “Done processing, we’re complete”
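The four steps above can be sketched from the receiver's side to show why retransmissions do not produce duplicates. This is a toy model, not a broker implementation: the class `QoS2Receiver` and its method names are hypothetical, and it follows the variant described above in which the message is handed to the application only when PUBREL arrives.

```python
class QoS2Receiver:
    """Toy model of the receiving side of a QoS 2 exchange."""

    def __init__(self):
        self.pending = {}    # packet_id -> payload held until PUBREL arrives
        self.delivered = []  # payloads handed to the application

    def on_publish(self, packet_id, payload):
        # A retransmitted PUBLISH with a known packet ID is not stored twice.
        self.pending.setdefault(packet_id, payload)
        return ("PUBREC", packet_id)

    def on_pubrel(self, packet_id):
        # Release the message once; a repeated PUBREL finds nothing left to deliver.
        payload = self.pending.pop(packet_id, None)
        if payload is not None:
            self.delivered.append(payload)
        return ("PUBCOMP", packet_id)


rx = QoS2Receiver()
rx.on_publish(1, "unlock door")
rx.on_publish(1, "unlock door")  # sender retried because PUBREC was lost
rx.on_pubrel(1)
rx.on_pubrel(1)                  # sender retried because PUBCOMP was lost
print(rx.delivered)
```

Even with both packets retransmitted, the application sees the command a single time, which is the property the extra two packets pay for.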
16.5 QoS Trade-off Analysis
Tradeoff: QoS Level Selection
Option A: Use QoS 0 (fire-and-forget) for all telemetry data
Option B: Use QoS 1/2 for guaranteed delivery with acknowledgments
Decision Factors:
| Factor | QoS 0 | QoS 1 | QoS 2 |
|---|---|---|---|
| Message overhead | 1 packet | 2 packets | 4 packets |
| Battery impact | Lowest | Medium (+63%) | Highest (+225%) |
| Delivery guarantee | None | At-least-once | Exactly-once |
| Duplicate messages | No | Possible | Never |
| Network traffic | Baseline | +12% | +29% |
| Broker load | Lowest | 2x | 4x |
Choose QoS 0 when:
High-frequency sensor readings where next value supersedes lost ones
Battery life is critical priority
Data redundancy exists (multiple sensors)
Network is generally reliable
Choose QoS 1 when:
Important notifications and alerts
Audit logging where completeness matters
Commands where idempotent execution is safe
Choose QoS 2 when:
Financial transactions or billing events
Security-critical commands
State changes where duplicate execution is dangerous
Default recommendation: QoS 0 for 95%+ of IoT telemetry; QoS 1 for alerts; QoS 2 only when exactly-once is required
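The selection rules above reduce to two questions: can a lost message be tolerated, and is a duplicate harmless? A minimal sketch of that decision follows. The function name `select_qos` and its two boolean inputs are hypothetical simplifications; battery budget and network quality are deliberately left out.

```python
def select_qos(loss_tolerable: bool, duplicates_harmless: bool) -> int:
    """Return the cheapest QoS level that is safe for a message type."""
    if loss_tolerable:
        return 0  # the next reading supersedes a lost one
    if duplicates_harmless:
        return 1  # must arrive; processing it twice is safe (idempotent)
    return 2      # must arrive, and must not be processed twice


print(select_qos(loss_tolerable=True, duplicates_harmless=True))    # periodic telemetry
print(select_qos(loss_tolerable=False, duplicates_harmless=True))   # alert or idempotent command
print(select_qos(loss_tolerable=False, duplicates_harmless=False))  # billing event
```

Checking loss tolerance first keeps the default at the cheapest level, in line with the recommendation above.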
16.6 Interactive Calculators
16.6.1 QoS Energy & Battery Life Calculator
Estimate how each QoS level affects battery life for a coin-cell or battery-powered IoT device.
viewof battCap = Inputs.range([50, 10000], {value: 230, step: 10, label: "Battery capacity (mAh)"})
viewof txCurrent = Inputs.range([1, 200], {value: 8, step: 1, label: "TX current (mA)"})
viewof rxCurrent = Inputs.range([1, 100], {value: 5, step: 1, label: "RX current (mA)"})
viewof txDuration = Inputs.range([5, 500], {value: 50, step: 5, label: "TX duration per packet (ms)"})
viewof msgsPerMin = Inputs.range([0.1, 60], {value: 1, step: 0.1, label: "Messages per minute"})
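For readers without the interactive page, the same inputs can be fed into a rough offline estimate. The sketch below is not the calculator's actual formula, which is not shown in this chapter. It assumes every packet (sent or received) keeps the radio on for the same duration, and it adds a hypothetical sleep current of 0.01 mA between messages; the function name and the `sleep_ma` parameter are illustrative.

```python
def battery_life_hours(qos, capacity_mah=230.0, tx_ma=8.0, rx_ma=5.0,
                       packet_ms=50.0, msgs_per_min=1.0, sleep_ma=0.01):
    """Estimate battery life in hours for one QoS level (simplified model)."""
    # (packets sent, packets received) by the publishing device per message
    tx_packets, rx_packets = {0: (1, 0), 1: (1, 1), 2: (2, 2)}[qos]
    # Charge per message in mA*ms; every packet is assumed to last packet_ms
    charge_ma_ms = (tx_packets * tx_ma + rx_packets * rx_ma) * packet_ms
    # Average radio current: charge per minute spread over 60,000 ms
    radio_ma = charge_ma_ms * msgs_per_min / 60_000.0
    return capacity_mah / (radio_ma + sleep_ma)


for level in (0, 1, 2):
    days = battery_life_hours(level) / 24
    print(f"QoS {level}: about {days:.0f} days")
```

The sleep current matters: the larger it is relative to the radio's average draw, the smaller the battery-life gap between QoS levels becomes.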
The Misconception: Many developers assume that “important” IoT data should always use QoS 2 because it provides the strongest guarantee.
The Reality: QoS 2 has a 4x message overhead compared to QoS 0 and can reduce battery life by 70-80% in battery-powered devices.
Real-World Example - Smart Agriculture Deployment:
A large-scale agricultural IoT deployment initially configured 10,000 soil moisture sensors with QoS 2 for all readings because the data was considered “critical for irrigation decisions.”
Results after 3 months:
Battery life: 4.2 months (expected 12-18 months with QoS 0)
Network congestion: 40% of messages delayed >10 seconds
# Publisher configuration
TOPICS = {
    "farm/soil/moisture": {"qos": 0, "retain": True},     # Latest reading for new subscribers
    "farm/alerts/frost": {"qos": 1, "retain": False},     # Event, not state
    "farm/valves/+/command": {"qos": 2, "retain": True},  # Last command = current state
}
Key Insight: QoS 2 usage is <5% of messages but ensures critical commands never duplicate. QoS 0 for 94% of traffic optimizes battery life without compromising reliability.
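A publisher needs to resolve a concrete topic such as `farm/valves/7/command` against a table like this before sending, because the `+` wildcard is valid in patterns but not in a published topic name. The following sketch shows one way to do that lookup with the standard library only. The helpers `matches` and `settings_for` are hypothetical, handle only the single-level `+` wildcard, and fall back to QoS 0 without retain for unknown topics.

```python
TOPICS = {
    "farm/soil/moisture": {"qos": 0, "retain": True},
    "farm/alerts/frost": {"qos": 1, "retain": False},
    "farm/valves/+/command": {"qos": 2, "retain": True},
}


def matches(pattern, topic):
    """True if topic fits pattern, where '+' stands for exactly one level."""
    p_levels, t_levels = pattern.split("/"), topic.split("/")
    return len(p_levels) == len(t_levels) and all(
        p in ("+", t) for p, t in zip(p_levels, t_levels)
    )


def settings_for(topic):
    """Return the publish settings for a concrete topic name."""
    for pattern, settings in TOPICS.items():
        if matches(pattern, topic):
            return settings
    return {"qos": 0, "retain": False}  # assumed default for unlisted topics


print(settings_for("farm/valves/7/command"))
print(settings_for("farm/soil/moisture"))
```

The returned dictionary can then be passed as the QoS and retain arguments of whatever publish call the client library provides.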
16.8.8 Cost Implications (AWS IoT Core Pricing)
| QoS Level | Monthly Messages | Cost @ $1/million | Annual Cost |
|---|---|---|---|
| QoS 0 | 432,000 | $0.43 | $5.18 |
| QoS 1 | 864,000 | $0.86 | $10.37 |
| QoS 2 | 1,728,000 | $1.73 | $20.74 |
Key insight: QoS 2 costs four times as much as QoS 0 for the same sensor data.
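The table's arithmetic can be checked with a few lines. This sketch takes the 432,000 base messages per month and the $1 per million price from the table, and it follows the table's assumption that every handshake packet counts as a billed message. Actual provider billing rules may differ, so treat the constants as illustrative.

```python
BASE_MESSAGES_PER_MONTH = 432_000         # application messages, from the table
PRICE_PER_MILLION_USD = 1.00              # price used in the table
PACKETS_PER_MESSAGE = {0: 1, 1: 2, 2: 4}  # handshake packets per QoS level


def monthly_cost_usd(qos):
    """Monthly cost if each handshake packet is billed as one message."""
    billed = BASE_MESSAGES_PER_MONTH * PACKETS_PER_MESSAGE[qos]
    return billed * PRICE_PER_MILLION_USD / 1_000_000


for qos in (0, 1, 2):
    monthly = monthly_cost_usd(qos)
    print(f"QoS {qos}: ${monthly:.2f}/month, ${monthly * 12:.2f}/year")
```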
16.10 Session Management
Pitfall: Ignoring Clean Session Implications
The Mistake: Using clean_session=True without understanding that the broker discards all stored state, causing missed messages after reconnection.
Why It Happens: Many examples use clean_session=True for simplicity. When a device reconnects after network loss, it must re-subscribe, and messages published during disconnection are lost.
The Fix: For most IoT applications, use clean_session=False with a persistent client ID:
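# Assumes the paho-mqtt 2.x client, where clean_session is passed to Client(), not connect()
import paho.mqtt.client as mqtt

# BAD: Clean session loses all state on reconnect
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2, clean_session=True)

# GOOD: Persistent session retains subscriptions and queued messages
client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2, client_id="device-ABC123", clean_session=False)
client.connect(broker)  # broker: hostname of your MQTT broker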
With persistent sessions:
Subscriptions survive reconnection
QoS 1/2 messages are queued while client is offline
Session expiry interval (MQTT 5.0) controls how long broker retains state
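The difference between the two session modes can be seen in a small in-memory model. This is a toy sketch, not how any real broker is implemented: the `ToyBroker` class is hypothetical, matches topics by exact string only, ignores session expiry, and treats every published message as QoS 1 or higher.

```python
class ToyBroker:
    """Minimal model of broker-side session state."""

    def __init__(self):
        self.sessions = {}  # client_id -> session dict

    def connect(self, client_id, clean_session):
        # A clean session (or an unknown client) starts with empty state.
        if clean_session or client_id not in self.sessions:
            self.sessions[client_id] = {"subs": set(), "queue": []}
        session = self.sessions[client_id]
        session["online"] = True
        session["clean"] = clean_session
        # Hand over anything queued while the client was away.
        queued, session["queue"] = session["queue"], []
        return queued

    def disconnect(self, client_id):
        session = self.sessions[client_id]
        if session["clean"]:
            del self.sessions[client_id]  # broker discards all state
        else:
            session["online"] = False     # broker keeps subscriptions

    def subscribe(self, client_id, topic):
        self.sessions[client_id]["subs"].add(topic)

    def publish(self, topic, payload):
        for session in self.sessions.values():
            if topic in session["subs"] and not session["online"]:
                session["queue"].append(payload)


def missed_after_outage(clean_session):
    broker = ToyBroker()
    broker.connect("device-ABC123", clean_session)
    broker.subscribe("device-ABC123", "farm/alerts/frost")
    broker.disconnect("device-ABC123")
    broker.publish("farm/alerts/frost", "frost warning")
    return broker.connect("device-ABC123", clean_session)


print(missed_after_outage(clean_session=False))
print(missed_after_outage(clean_session=True))
```

With the persistent session the alert is waiting on reconnect; with the clean session the list is empty and the subscription itself is gone.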
Common Pitfalls
1. Running MQTT Without TLS
Unencrypted MQTT exposes device credentials and sensor data to network eavesdroppers — in a building IoT deployment on shared WiFi, this means any connected device can read all sensor data. Always enable TLS 1.2+ on the broker and generate unique client certificates for each device class.
2. Ignoring Last Will and Testament Configuration
Without LWT, there is no automatic notification when a device disconnects ungracefully — missed timeout alarms and false-healthy device status are common consequences. Configure LWT on every device connection to publish an offline status message, enabling real-time fleet health monitoring.
3. Using a Single MQTT Connection for High-Throughput Publishing
A single MQTT connection serializes all publishes through one TCP socket — at 100 messages/second with QoS 1, TCP backpressure creates queuing latency. Use multiple parallel MQTT connections or partition topics across connection pools for throughput above 1,000 messages/second.
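For the first pitfall, the TLS 1.2+ requirement can be enforced on the client side with Python's standard `ssl` module. The sketch below is one possible setup under assumptions: the helper name `make_tls_context` and its file-path parameters are hypothetical, and the certificate files are whatever your deployment issues per device class.

```python
import ssl


def make_tls_context(ca_file=None, cert_file=None, key_file=None):
    """Build a client-side TLS context that refuses anything below TLS 1.2."""
    # Verifies the broker certificate and hostname by default.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file:
        # Optional client certificate for mutual TLS.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


context = make_tls_context()
print(context.minimum_version)
```

If you use the paho-mqtt client, a context like this can be handed to it through `tls_set_context` before connecting. Other client libraries usually accept an `ssl.SSLContext` in a similar way, but check their documentation.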
16.11 What’s Next
Now that you can distinguish MQTT QoS levels and calculate their overhead: