MQTT QoS levels trade reliability for battery life: QoS 0 (fire-and-forget) uses minimal power but may lose messages, QoS 1 (acknowledged) guarantees delivery but may duplicate, and QoS 2 (exactly-once) prevents both loss and duplication at 3x the battery cost. Most sensor telemetry is replaceable, so QoS 0 is usually correct; reserve QoS 1/2 for critical alerts and commands.
20.1 Quality of Service (QoS) Levels
MQTT provides three Quality of Service levels that control message delivery guarantees. This chapter explores each level’s characteristics, use cases, and impact on battery life and performance.
20.2 Learning Objectives
By the end of this chapter, you will be able to:
Compare QoS Levels: Analyze the trade-offs between QoS 0, 1, and 2 in terms of reliability, latency, and energy cost
Select Appropriate QoS: Evaluate IoT scenarios and justify the choice of QoS level for each
Calculate Battery Impact: Apply the energy cost formulas to calculate battery life for different QoS configurations
Design Reliable Systems: Construct IoT architectures that correctly configure QoS levels and handle delivery limitations
Distinguish Delivery Semantics: Explain the difference between hop-by-hop and end-to-end delivery guarantees
Diagnose QoS Pitfalls: Identify and fix common misconfigurations such as using QoS 2 unnecessarily or ignoring cleanSession behavior
Key Concepts
QoS 0 (At Most Once): Fire-and-forget delivery with no acknowledgment — lowest overhead, message loss possible
QoS 1 (At Least Once): ACK-based delivery ensuring message arrives at least once — duplicate possible if PUBACK lost
Message Ordering: MQTT guarantees ordered delivery within a client session but not across sessions or publishers
20.3 Most Valuable Understanding (MVU)
MQTT QoS levels trade reliability for battery life: QoS 0 (fire-and-forget) uses minimal power but may lose messages, QoS 1 (acknowledged) guarantees delivery but may duplicate, and QoS 2 (exactly once) guarantees single delivery but costs about 3x the battery.
For battery-powered IoT sensors sending data every few seconds, choosing QoS 0 instead of QoS 2 can extend battery life by 3x or more. The key insight is that most sensor telemetry is replaceable (the next reading comes soon anyway), so QoS 0 is usually correct. Reserve QoS 1/2 only for critical commands or events.
Remember: QoS guarantees delivery between publisher-to-broker and broker-to-subscriber independently - it does NOT guarantee end-to-end delivery if the subscriber is offline.
Hey there, future IoT wizard! Let’s learn about MQTT QoS with the Sensor Squad!
Sammy the Sensor needs to send temperature readings to Cloudy the Cloud Server. But Sammy lives far away, and sometimes messages get lost on the journey!
Think of it like sending a letter:
QoS 0 - The Paper Airplane Spell: Sammy folds the message into a paper airplane and throws it toward Cloudy. Maybe it arrives, maybe the wind blows it away. Super fast, barely uses any energy! “Fire and forget!”
QoS 1 - The Carrier Pigeon Spell: Sammy sends a carrier pigeon. The pigeon flies to Cloudy, and when Cloudy gets the message, the pigeon flies back with a “Got it!” note. If Sammy doesn’t get the note, the pigeon goes again! Sometimes Cloudy gets TWO copies, but at least it arrives!
QoS 2 - The Magic Portal Spell: Sammy creates a special magic portal. First, Sammy says “Ready to send!” Cloudy says “Ready to receive!” Then Sammy says “Sending now!” and Cloudy says “Got it exactly once!” Four messages total, but guaranteed perfect delivery!
The Sensor Squad Energy Challenge:
| Spell | Energy Used | Delivery Guarantee | Best For |
| --- | --- | --- | --- |
| Paper Airplane (QoS 0) | 1 battery bar | Maybe arrives | Temperature every minute |
| Carrier Pigeon (QoS 1) | 2 battery bars | Definitely arrives (maybe twice) | “Door opened!” alerts |
| Magic Portal (QoS 2) | 3 battery bars | Arrives exactly once | “Unlock the door!” commands |
Lila the Light says: “I use QoS 0 to tell everyone my brightness level every second. If one message gets lost, another comes right away!”
Max the Motor adds: “But when I get a ‘STOP!’ command, I need QoS 2! If I get ‘STOP’ twice, I might get confused. If I miss ‘STOP’, something might break!”
Fun Fact: With the same battery and the same publish rate, QoS 0 can last about 3 times longer than QoS 2 — because QoS 2 needs 4 times as many radio messages per publish! That can mean months of extra battery life on a real sensor.
Try This: Think about your favorite smart device at home. What kind of messages does it send? Would missing a message matter? Which QoS spell would YOU choose?
For Beginners: Understanding Message Delivery Guarantees
Analogy: QoS is Like Shipping Options
Imagine you’re running an online store that ships packages:
| Shipping Option | MQTT Equivalent | How It Works | When to Use |
| --- | --- | --- | --- |
| Standard Mail | QoS 0 | Drop in mailbox, hope it arrives. No tracking. | Catalogs, flyers (replaceable) |
| Tracked Delivery | QoS 1 | Delivery confirmation. Might accidentally ship twice if tracking glitches. | Most packages (duplicates annoying but OK) |
| Certified with Signature | QoS 2 | Full handshake: “Ready?” “Yes!” “Sending!” “Received!” | Legal documents (exactly once critical) |
Key Terms:
| Term | Definition |
| --- | --- |
| QoS | Quality of Service - the level of delivery guarantee |
| PUBACK | The acknowledgment message for QoS 1 |
| Handshake | Exchange of messages to confirm delivery |
| Idempotent | An operation that produces the same result even if repeated |
| Fire-and-forget | Sending without waiting for confirmation |
| Retained message | A message stored by the broker for new subscribers |
Why does this matter in IoT?
| Challenge | QoS Solution |
| --- | --- |
| Battery life | Use QoS 0 for telemetry to minimize radio time |
| Network reliability | Use QoS 1/2 on unreliable connections |
| Critical commands | Use QoS 2 to prevent duplicate execution |
| High-frequency data | Use QoS 0 - missing one sample is OK |
20.5 QoS Level Summary
| QoS | Name | Guarantee | Use Case |
| --- | --- | --- | --- |
| 0 | At most once | Fire and forget | Frequent sensor readings |
| 1 | At least once | Acknowledged delivery | Important alerts |
| 2 | Exactly once | Guaranteed once | Critical commands |
20.5.1 QoS 0: At Most Once (Fire and Forget)
Figure 20.1: MQTT QoS 0: At most once delivery
20.5.2 QoS 1: At Least Once (Acknowledged)
Figure 20.2: MQTT QoS 1: At least once delivery with acknowledgment
20.5.3 QoS 2: Exactly Once (Four-Way Handshake)
Figure 20.3: MQTT QoS 2: Exactly once delivery with four-way handshake
20.5.4 QoS Comparison Overview
Figure 20.4: Side-by-side comparison of all three QoS levels
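The figures themselves are not reproduced here. As a rough stand-in, the sketch below lists the MQTT control packets exchanged on a single hop (publisher to broker, or broker to subscriber) for each QoS level. The packet names are the standard MQTT ones; the `HOP_PACKETS` table and the two helper functions are illustrative names, not part of any library.

```python
# Control packets exchanged on ONE hop for a single publish, in order.
# "->" is sender to receiver, "<-" is receiver back to sender.
HOP_PACKETS = {
    0: ["-> PUBLISH"],
    1: ["-> PUBLISH", "<- PUBACK"],
    2: ["-> PUBLISH", "<- PUBREC", "-> PUBREL", "<- PUBCOMP"],
}


def packets_per_hop(qos: int) -> int:
    """Number of control packets one hop needs for a single publish."""
    return len(HOP_PACKETS[qos])


def packets_end_to_end(qos: int) -> int:
    """Publisher -> broker plus broker -> subscriber, same QoS on both hops."""
    return 2 * packets_per_hop(qos)


for qos, flow in HOP_PACKETS.items():
    print(f"QoS {qos}: {'  '.join(flow)}  ({packets_end_to_end(qos)} packets over both hops)")
```

Counting both hops gives 2, 4, and 8 packets, which is where the per-level message counts in end-to-end comparisons come from.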
Deep Dive: QoS Implementation Details
Understanding how QoS levels work under the hood helps you make informed design decisions for production IoT systems.
Common Misconception: “QoS 2 ensures the subscriber receives the message.” False! QoS guarantees delivery from publisher to broker and from broker to subscriber independently. If the subscriber is offline, the broker queues the message (with Clean Session=0), but the subscriber may never reconnect. For true end-to-end confirmation, implement application-level acknowledgments.
Hybrid Approach: Use QoS 0 for periodic telemetry + QoS 1 for critical events. This optimizes both battery life and reliability.
Worked Example: Battery Life Optimization for Weather Station
Scenario: You’re deploying 200 solar-powered weather stations across a nature reserve. Each station has a 2000 mAh battery and 5W solar panel. Stations publish 4 metrics (temperature, humidity, wind speed, pressure) and must survive 7 consecutive cloudy days with minimal solar input.
Message rate: 4 metrics x (60 min x 24 hr / 0.5 min) = 11,520 messages/day
Step 1: Measure Current Battery Drain
ESP32 power consumption:
- Deep sleep: 0.01 mA
- Wake + WiFi connect: 120 mA for 2 seconds = 0.067 mAh
- Read 4 sensors: 50 mA for 0.5 seconds = 0.007 mAh
- MQTT QoS 2 publish (4 messages): 120 mA for 1.2 seconds (4 x 300 ms) = 0.040 mAh
- Total per cycle: 0.067 + 0.007 + 0.040 = 0.114 mAh
Daily consumption:
- Wake cycles: 2880 per day (every 30 seconds)
- Battery drain: 2880 x 0.114 = 328 mAh/day
- Battery life: 2000 / 328 = 6.1 days, which falls short of the 7-day requirement
Step 2: Switch from QoS 2 to QoS 0
Weather data is not critical, and missing a few readings is acceptable. Change to QoS 0:
- MQTT QoS 0 publish (4 messages): 120 mA for 0.4 seconds (4 x 100 ms) = 0.013 mAh
- New total per cycle: 0.067 + 0.007 + 0.013 = 0.087 mAh
The publish phase itself shrinks 3x (0.040 to 0.013 mAh), but the fixed WiFi connect cost of 0.067 mAh dominates each cycle, so the overall saving is smaller.
Daily consumption: 2880 x 0.087 = 251 mAh/day
Battery life: 2000 / 251 = 8.0 days, which now meets the requirement.
Step 3: Further Optimization (Publish Interval)
Reduce publish frequency to every 5 minutes (weather doesn’t change that fast):
- Wake cycles: 288 per day
- Daily consumption: 288 x 0.087 = 25 mAh/day
- Battery life: 2000 / 25 = 80 days, a 10x improvement over the 8.0 days at 30-second intervals!
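The arithmetic so far can be reproduced with a few lines of Python. This is a sketch using the current and duration figures quoted in the example; the function names are made up for illustration, and like the worked example it ignores deep-sleep current and solar input.

```python
BATTERY_MAH = 2000.0


def mah(current_ma: float, seconds: float) -> float:
    """Charge drawn by a constant current over a duration, in mAh."""
    return current_ma * seconds / 3600.0


def cycle_mah(publish_seconds: float) -> float:
    """One wake cycle: WiFi connect + sensor read + MQTT publish of 4 messages."""
    return mah(120, 2.0) + mah(50, 0.5) + mah(120, publish_seconds)


def battery_days(publish_seconds: float, cycles_per_day: int) -> float:
    """Days of battery life for a given publish duration and wake rate."""
    return BATTERY_MAH / (cycle_mah(publish_seconds) * cycles_per_day)


qos2_30s = battery_days(1.2, 2880)   # QoS 2, wake every 30 s
qos0_30s = battery_days(0.4, 2880)   # QoS 0, wake every 30 s
qos0_5min = battery_days(0.4, 288)   # QoS 0, wake every 5 min

for label, days in [("QoS 2 @ 30 s", qos2_30s),
                    ("QoS 0 @ 30 s", qos0_30s),
                    ("QoS 0 @ 5 min", qos0_5min)]:
    print(f"{label}: {days:.1f} days")
```

Computed without intermediate rounding, the 5-minute configuration comes out just under 80 days; the 80-day figure comes from rounding the daily drain to 25 mAh.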
Step 4: Add Smart Publishing
Only publish when data changes significantly:
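// RTC_DATA_ATTR keeps these values across ESP32 deep sleep;
// ordinary globals are reset on every wake.
RTC_DATA_ATTR float lastTemp = 0, lastHum = 0, lastWind = 0, lastPres = 0;
RTC_DATA_ATTR uint32_t wakesSinceHeartbeat = 0;

const float TEMP_THRESHOLD = 0.5;              // 0.5 deg C change
const float HUM_THRESHOLD = 2.0;               // 2% change
const uint64_t SLEEP_US = 60ULL * 1000000ULL;  // Sleep 60 seconds
const uint32_t HEARTBEAT_WAKES = 5;            // 5 wakes x 60 s = 5 min max

void loop() {
  float temp = readTemp(), hum = readHum(), wind = readWind(), pres = readPres();

  // Only publish changed metrics
  if (fabsf(temp - lastTemp) >= TEMP_THRESHOLD) {
    client.publish("station/temp", String(temp).c_str(), false);
    lastTemp = temp;
  }
  // Repeat for other sensors...

  // Heartbeat: publish all every 5 minutes regardless.
  // millis() restarts after deep sleep, so count wake cycles instead.
  if (++wakesSinceHeartbeat >= HEARTBEAT_WAKES) {
    publishAll();
    wakesSinceHeartbeat = 0;
  }

  esp_sleep_enable_timer_wakeup(SLEEP_US);
  esp_deep_sleep_start();
}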
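Expected change-based publishes: ~50/day (weather is relatively stable), each costed as a full 0.087 mAh wake cycle:
- Daily consumption: 50 x 0.087 = 4.35 mAh/day
- Battery life: 2000 / 4.35 = 460 days (15 months)!
This is a best-case figure: it leaves out the wake-ups that read sensors without publishing, the 5-minute heartbeat publishes, and deep-sleep current.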
Summary Table:
| Configuration | Messages/Day | Battery Drain (per day) | Battery Life | Battery Life vs Baseline |
| --- | --- | --- | --- | --- |
| QoS 2, 30s | 11,520 | 328 mAh | 6.1 days | Baseline |
| QoS 0, 30s | 11,520 | 251 mAh | 8.0 days | 31% longer |
| QoS 0, 5min | 1,152 | 25 mAh | 80 days | 13x longer |
| QoS 0, on-change | ~50 | 4.35 mAh | 460 days | 75x longer! |
Key Lessons:
QoS 0 vs QoS 2: 31% longer battery life (about 23% less daily drain) for replaceable telemetry
Reduce frequency: 10x improvement by publishing every 5 min instead of 30 sec
Publish on change: 75x improvement by only transmitting meaningful updates
Combined impact: 6 days to 460 days = practical solar-powered deployment
20.7 Battery Life Optimization Tips
Optimizing Battery Through QoS and Publish Strategy
Increase publish interval: Publishing every 5 minutes instead of every minute gives roughly 5x battery life when wake cycles dominate consumption
Use QoS 0 for sensor data: Non-critical readings don’t need guaranteed delivery
Batch messages: Collect multiple readings and publish together
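As a sketch of the batching tip, the class below collects readings in memory and serializes them into a single JSON payload, so one wake-and-connect cycle carries several samples. The `ReadingBatcher` class, the field names, and the batch size are assumptions for illustration; the returned payload would be handed to whatever publish call your MQTT client provides.

```python
import json
from typing import Optional


class ReadingBatcher:
    """Accumulates readings and emits one JSON payload per full batch."""

    def __init__(self, batch_size: int = 5) -> None:
        self.batch_size = batch_size
        self._pending = []  # readings waiting to be sent

    def add(self, timestamp: int, temperature_c: float) -> Optional[str]:
        """Store a reading; return a JSON payload when the batch is full, else None."""
        self._pending.append({"t": timestamp, "temp": temperature_c})
        if len(self._pending) < self.batch_size:
            return None
        payload = json.dumps({"readings": self._pending})
        self._pending = []
        return payload


batcher = ReadingBatcher(batch_size=3)
for minute, temp in enumerate([21.0, 21.1, 21.3]):
    payload = batcher.add(minute * 60, temp)
    if payload is not None:
        print(payload)  # one publish instead of three
```

The trade-off is latency: a reading may wait up to a full batch interval before it is sent, and an unsent batch is lost if the device resets.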
Answer a few questions about your IoT scenario to get a QoS recommendation:
viewof dataType = Inputs.radio(
  [
    "Periodic telemetry (temperature, humidity, etc.)",
    "Event-driven alerts (door open, motion)",
    "Device commands (lock, unlock, dispense)",
    "Financial transactions (payments, billing)"
  ],
  {label: "What type of data are you sending?", value: "Periodic telemetry (temperature, humidity, etc.)"}
)

viewof frequency = Inputs.radio(
  ["High frequency (< 1 minute)", "Medium frequency (1-10 minutes)", "Low frequency (> 10 minutes)", "On-demand only"],
  {label: "How often is data sent?", value: "High frequency (< 1 minute)"}
)

viewof dupTolerance = Inputs.radio(
  ["Duplicates are harmless", "Duplicates are annoying but manageable", "Duplicates cause real problems"],
  {label: "How tolerant is your system of duplicate messages?", value: "Duplicates are harmless"}
)

viewof batteryPowered = Inputs.toggle({label: "Device is battery-powered?", value: true})

viewof networkQuality = Inputs.radio(
  ["Stable (WiFi, Ethernet)", "Variable (cellular, satellite)", "Unreliable (LoRa, remote sites)"],
  {label: "Network quality:", value: "Stable (WiFi, Ethernet)"}
)

recommendation = {
  let score = 0;

  // Data type: more critical data raises the score
  if (dataType.includes("Periodic")) score += 0;
  else if (dataType.includes("Event")) score += 1;
  else if (dataType.includes("commands")) score += 2;
  else score += 3;

  // Frequency: infrequent messages are harder to replace
  if (frequency.includes("High")) score += 0;
  else if (frequency.includes("Medium")) score += 0;
  else if (frequency.includes("Low")) score += 1;
  else score += 1;

  // Duplicate tolerance
  if (dupTolerance.includes("harmless")) score += 0;
  else if (dupTolerance.includes("annoying")) score += 1;
  else score += 3;

  if (batteryPowered) score -= 1;
  if (networkQuality.includes("Unreliable")) score += 1;

  let qos, color, reasoning;
  if (score <= 1) {
    qos = "QoS 0 (Fire-and-forget)";
    color = "#16A085";
    reasoning = "Your data is replaceable, sent frequently, and duplicates are tolerable. QoS 0 maximizes battery life and throughput with minimal overhead.";
  } else if (score <= 4) {
    qos = "QoS 1 (At least once)";
    color = "#E67E22";
    reasoning = "Your messages are important enough to require delivery confirmation but duplicates can be handled. QoS 1 provides a good balance of reliability and efficiency.";
  } else {
    qos = "QoS 2 (Exactly once)";
    color = "#E74C3C";
    reasoning = "Your use case involves critical, non-idempotent operations where duplicates could cause real problems. QoS 2 guarantees exactly-once delivery at the cost of higher latency and battery consumption.";
  }
  return { qos, color, reasoning, score };
}

html`<div style="background: linear-gradient(135deg, #f8f9fa 0%, #e9ecef 100%); border-radius: 8px; padding: 20px; margin: 10px 0; border-left: 4px solid ${recommendation.color};">
  <h4 style="color: ${recommendation.color}; margin-top: 0;">Recommended: ${recommendation.qos}</h4>
  <p style="color: #2C3E50; margin: 8px 0;">${recommendation.reasoning}</p>
  ${recommendation.score <= 4
    ? `<p style="color: #7F8C8D; font-size: 13px; margin: 8px 0 0 0;"><strong>Tip:</strong> Consider designing your commands to be idempotent (safe to repeat). This lets you use QoS 1 instead of QoS 2 in most cases, saving battery and reducing latency.</p>`
    : `<p style="color: #7F8C8D; font-size: 13px; margin: 8px 0 0 0;"><strong>Important:</strong> QoS 2 guarantees delivery between publisher-broker and broker-subscriber independently. For true end-to-end confirmation, add application-level acknowledgments.</p>`}
</div>`
20.12 Broker Message Queue Size Estimator
Estimate broker memory requirements for offline message queuing with persistent sessions:
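The interactive estimator is not reproduced here. A back-of-the-envelope version, under simplifying assumptions, multiplies the number of offline clients by the messages queued for each and by the stored size per message. The formula, the per-message overhead figure, and the example inputs below are illustrative assumptions, not measurements from any particular broker.

```python
def queue_bytes(offline_clients: int,
                msgs_per_client_per_hour: int,
                offline_hours: int,
                payload_bytes: int,
                overhead_bytes: int = 100) -> int:
    """Rough memory needed to queue QoS 1/2 messages for offline persistent sessions.

    overhead_bytes is an ASSUMED per-message bookkeeping cost (topic name,
    headers, broker metadata); real brokers differ.
    """
    queued_per_client = msgs_per_client_per_hour * offline_hours
    return offline_clients * queued_per_client * (payload_bytes + overhead_bytes)


# Hypothetical scenario: 100 subscribers offline for 8 hours,
# each owed 120 messages/hour with 200-byte payloads.
estimate = queue_bytes(100, 120, 8, 200)
print(f"{estimate / 1_000_000:.1f} MB")
```

Many brokers also let you cap the number of queued messages per client, so an estimate like this is best read as the demand before any such limit applies.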
Pitfall 1: Assuming QoS 2 Means End-to-End Delivery
Mistake: Developers often assume that QoS 2 means “the subscriber definitely receives the message.”
Reality: QoS guarantees are hop-by-hop, not end-to-end:
- Publisher to Broker: Guaranteed
- Broker to Subscriber: Guaranteed only if subscriber is connected
If the subscriber is offline, the message waits in the broker (with Clean Session=false). But if the broker crashes, or the subscriber never reconnects, the message is lost.
Solution: Implement application-level acknowledgments where subscribers publish a “received” message back to the publisher for critical workflows.
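One way to sketch this, assuming each critical message carries a unique ID that the subscriber echoes back on a separate acknowledgment topic: the publisher records what it sent and flags anything not confirmed within a timeout. The `AckTracker` class and its method names are hypothetical, and the actual publish and subscribe calls are left to your MQTT client.

```python
import uuid


class AckTracker:
    """Tracks critical messages until the subscriber confirms them end to end."""

    def __init__(self, timeout_s: float = 30.0) -> None:
        self.timeout_s = timeout_s
        self._pending = {}  # message_id -> time the message was sent

    def register(self, now: float) -> str:
        """Call when publishing; returns the ID to embed in the payload."""
        message_id = uuid.uuid4().hex
        self._pending[message_id] = now
        return message_id

    def acknowledge(self, message_id: str) -> bool:
        """Call when a 'received' message arrives; False if unknown or already acked."""
        return self._pending.pop(message_id, None) is not None

    def overdue(self, now: float) -> list:
        """IDs still unconfirmed after the timeout (candidates for resend or alarm)."""
        return [m for m, sent in self._pending.items() if now - sent > self.timeout_s]
```

Passing the clock in as `now` keeps the tracker easy to test; in a device you would typically supply `time.monotonic()`.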
Pitfall 2: Using QoS 2 for All Critical Data
Mistake: “This data is important, so I’ll use QoS 2 for everything.”
Reality: QoS 2 adds significant overhead (3x latency, 3x battery consumption) and should rarely be needed.
Better approach:
Make commands idempotent (safe to receive multiple times)
Use QoS 1 with idempotent design instead of QoS 2
Reserve QoS 2 only for truly non-idempotent operations (financial transactions, medication dispensing)
Example: Instead of toggle_light (dangerous to duplicate), use set_light_state: on (safe to repeat).
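The difference shows up as soon as QoS 1 delivers the same command twice. The sketch below applies a duplicated command to a hypothetical `Light` object; the class and handler names are made up for illustration.

```python
class Light:
    def __init__(self) -> None:
        self.on = False

    def handle_toggle(self) -> None:
        """Non-idempotent: the result depends on how many times it is delivered."""
        self.on = not self.on

    def handle_set_state(self, on: bool) -> None:
        """Idempotent: repeating the command leaves the same final state."""
        self.on = on


toggled = Light()
for _ in range(2):  # the same toggle command delivered twice
    toggled.handle_toggle()

set_on = Light()
for _ in range(2):  # the same set-state command delivered twice
    set_on.handle_set_state(True)

print(toggled.on, set_on.on)  # prints: False True
```

The duplicated toggle leaves the light off again, while the duplicated set-state command still ends with the light on.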
Pitfall 3: Ignoring Clean Session Impact
Mistake: Setting cleanSession=true (the default in many clients) and expecting messages to be queued while offline.
Reality: With cleanSession=true:
- Broker discards all session state on disconnect
- QoS 1/2 messages published while you’re offline are lost
- Subscriptions are forgotten
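Solution: Use cleanSession=false with a stable client ID (the broker keys the stored session on it) for devices that may disconnect and reconnect:
client.connect(host, cleanSession=False)  # Pseudocode: keep session state; the option's name and where it is set vary by client library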
Caveat: This increases broker memory usage - the broker stores messages for offline clients.
Pitfall 4: Wrong QoS on Retained Messages
Mistake: Publishing a retained message with QoS 0, then wondering why new subscribers sometimes don’t get it.
Reality: Retained messages with QoS 0 can be lost during broker restarts or when storage limits are reached.
Solution: Use at least QoS 1 for retained messages that must persist:
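A minimal sketch, assuming a client object whose `publish` method accepts `qos` and `retain` keyword arguments in the style of paho-mqtt's `Client.publish`. The topic name and the `publish_status` helper are hypothetical.

```python
def publish_status(client, status: str) -> None:
    """Publish the latest device status as a retained message at QoS 1.

    `client` is ASSUMED to expose publish(topic, payload, qos=..., retain=...),
    as paho-mqtt's Client does. The topic below is a made-up example.
    """
    client.publish("factory/line1/status", status, qos=1, retain=True)
```

Connection setup is omitted here; the point is only that the retain flag and the QoS level are set together on the same publish call.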
Scenario: You’re running a critical smart factory with 500 sensors publishing to a single MQTT broker. At 2 AM, the broker server crashes due to a power failure.
What happens:
Publishers keep trying to connect with exponential backoff (1s, 2s, 4s, 8s…)
Messages are lost unless devices implement local buffering (QoS 1/2 don’t help if broker is unreachable)
Subscribers disconnect and enter retry loop, showing stale data in dashboards
Critical alerts missed: Fire alarm sensor can’t notify emergency system
Recovery surge: When broker restarts, 500 devices reconnect simultaneously, potentially overwhelming it again
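The retry schedule in the first item, and one common way to soften the reconnect surge in the last, can be sketched as follows. Random jitter is not described above; it is an assumption added here to show how 500 devices can be kept from retrying in lockstep. The function name and the 60-second cap are likewise illustrative.

```python
import random


def reconnect_delay(attempt: int, base_s: float = 1.0, cap_s: float = 60.0,
                    jitter: bool = False) -> float:
    """Delay before reconnect attempt N (0-based): 1s, 2s, 4s, 8s... up to cap_s.

    With jitter=True the delay is drawn uniformly from [0, exponential delay],
    which spreads devices out instead of having them all retry at once.
    """
    delay = min(cap_s, base_s * (2 ** attempt))
    return random.uniform(0, delay) if jitter else delay


print([reconnect_delay(n) for n in range(5)])  # prints: [1.0, 2.0, 4.0, 8.0, 16.0]
```

Without jitter every device that lost the broker at the same instant retries at the same instants too, which is exactly the surge described above.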
Lessons learned:
For critical IoT: Use clustered brokers with high availability (HiveMQ, AWS IoT Core, Azure IoT Hub)
Local buffering: Implement client-side message queues that store-and-forward when connection restores
Hybrid approach: Run a local backup broker that syncs with cloud when available
Graceful degradation: Design devices to work autonomously when disconnected
Monitoring: Set up broker health checks and automatic failover
Best practice: For production systems, never rely on a single broker. Use at least 2 brokers behind a load balancer, or a managed cloud service with built-in redundancy.
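As a sketch of the local-buffering lesson, the class below queues messages while the broker is unreachable and flushes them in order once a publish succeeds again. `StoreAndForward`, the `send` callback, and the buffer limit are assumptions for illustration; `send` stands in for your client's publish call and is expected to return True on success.

```python
from collections import deque
from typing import Callable


class StoreAndForward:
    """Bounded in-memory buffer that replays messages when the link returns."""

    def __init__(self, send: Callable[[str, str], bool], max_items: int = 1000) -> None:
        self._send = send
        # When the buffer is full, the oldest entries are dropped first.
        self._queue = deque(maxlen=max_items)

    def publish(self, topic: str, payload: str) -> None:
        """Queue the message, then try to drain the queue."""
        self._queue.append((topic, payload))
        self.flush()

    def flush(self) -> int:
        """Send queued messages oldest-first; stop at the first failure."""
        sent = 0
        while self._queue:
            topic, payload = self._queue[0]
            if not self._send(topic, payload):
                break
            self._queue.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._queue)
```

An in-memory queue is lost on power failure or reset; a device that must survive those would need to persist the buffer to flash, which this sketch does not attempt.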
20.16 Summary and Key Takeaways
20.16.1 What You Learned
This chapter covered MQTT Quality of Service levels - the mechanism that controls message delivery guarantees between publishers, brokers, and subscribers.
MQTT QoS summary
20.16.2 Key Takeaways
One-Sentence Summary
MQTT QoS levels let you trade message reliability for battery life: use QoS 0 for replaceable telemetry (the bulk of typical IoT traffic), QoS 1 for important events, and QoS 2 only for critical non-idempotent commands.
Essential Points to Remember:
QoS 0 (“fire and forget”) - Fastest, lowest battery, but messages may be lost. Use for frequent, replaceable sensor data.
QoS 1 (“at least once”) - Guaranteed delivery with possible duplicates. Use for important events where duplicates are acceptable.
QoS 2 (“exactly once”) - Perfect delivery with four-way handshake. Use only when duplicates cause real problems (payments, medication dispensing).
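Battery impact can be dramatic - The QoS 0 publish exchange costs ~3x less energy than QoS 2 at the same message rate; the whole-device gain is smaller when a fixed connect cost dominates (31% in the worked example).
QoS guarantees are hop-by-hop - Publisher to Broker and Broker to Subscriber are independent. Offline subscribers miss messages unless they hold a persistent session (cleanSession=false) with a QoS 1/2 subscription; a retained message only supplies the latest value.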
Design for idempotency - Most QoS 2 needs can be avoided by designing commands that are safe to receive multiple times (set_state: on instead of toggle).
20.16.3 Quick Reference
| QoS | Messages (both hops) | Latency | Battery Impact | Use Case |
| --- | --- | --- | --- | --- |
| 0 | 2 | ~8ms | 1x (baseline) | Temperature every minute |
| 1 | 4 | ~24ms | 2x | Door opened alert |
| 2 | 8 | ~67ms | 3x | Payment transaction |
20.16.4 Practical Recommendations
For most IoT deployments:
Start with QoS 0 for all telemetry data
Use QoS 1 only for events that trigger human action (alerts, alarms)
Avoid QoS 2 unless you’ve confirmed the operation is truly non-idempotent
Implement application-level acknowledgments for true end-to-end confirmation
Use clustered brokers or managed cloud services for production systems