MQTT Quality of Service (QoS) determines how reliably messages are delivered between clients and the broker. QoS 0 sends once with no confirmation (like shouting into a crowd), QoS 1 retries until acknowledged (like certified mail), and QoS 2 uses a four-step handshake for exactly-once delivery (like a bank transfer). For most IoT sensor readings, QoS 0 or 1 suffices; reserve QoS 2 for commands where duplication would cause harm, such as billing transactions or actuator controls.
Explain QoS Semantics: Explain what Quality of Service means in MQTT messaging and justify why three distinct delivery guarantee levels exist
Distinguish QoS Levels: Distinguish between QoS 0, 1, and 2 by comparing their handshake sequences, overhead costs, and reliability guarantees
Analyze Session Trade-offs: Analyze the difference between clean and persistent sessions, assessing the impact on broker memory and offline message delivery
Apply Retained Messages: Apply retained messages correctly to device status topics and demonstrate when they provide value over real-time publishing
Select and Justify QoS Choices: Select appropriate QoS levels for common IoT scenarios and justify each selection based on criticality, idempotency, and battery constraints
Key Concepts
QoS 0 (At Most Once): Fire-and-forget delivery with no acknowledgment — lowest overhead, message loss possible
QoS 1 (At Least Once): ACK-based delivery ensuring message arrives at least once — duplicate possible if PUBACK lost
Message Ordering: MQTT guarantees ordered delivery within a client session but not across sessions or publishers
21.2 For Beginners: MQTT QoS Fundamentals
Quality of Service in MQTT determines how hard the system tries to deliver each message. At the lowest level (QoS 0), messages are sent once with no confirmation. At the highest level (QoS 2), a four-step handshake ensures the message arrives exactly once. Choosing the right level is a trade-off between reliability and speed.
Sensor Squad: Understanding Delivery Guarantees
“Why can’t everything just be QoS 2?” asked Sammy the Sensor. “Then nothing would ever get lost!”
Bella the Battery shook her head. “Because QoS 2 uses FOUR messages for every single reading: PUBLISH, PUBREC, PUBREL, PUBCOMP. If you send temperature every 10 seconds, that’s 4 times the radio usage. My battery would drain in weeks instead of months!”
Max the Microcontroller drew it out: “QoS 0 is one arrow: you send it and forget. QoS 1 is two arrows: you send, the broker sends back PUBACK. If you don’t get PUBACK, you resend. QoS 2 is four arrows with a two-phase commit. Each step up doubles the overhead but strengthens the guarantee.”
“The key insight,” said Lila the LED, “is matching QoS to the MESSAGE, not the device. Sammy can use QoS 0 for routine temperature and QoS 1 for critical alerts – in the same connection. You don’t have to pick one QoS for everything. Be smart about it and Bella stays happy!”
Common Pitfalls
1. Using QoS 2 for All IoT Telemetry
QoS 2’s four-packet handshake (PUBLISH, PUBREC, PUBREL, PUBCOMP) sends four times as many packets as QoS 0 and needs two full round trips before delivery completes; on a 2G connection this can mean 2-8 seconds per message. Reserve QoS 2 for commands where a duplicate would cause harm (actuator control, firmware triggers); use QoS 1 for telemetry where occasional duplicates are acceptable.
2. Expecting QoS to Guarantee Subscriber Delivery
MQTT QoS governs publisher-to-broker and broker-to-subscriber segments independently — QoS 1 from publisher to broker combined with QoS 0 subscription results in at-most-once overall. Always set both the publish QoS AND the subscribe QoS to the required level.
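The downgrade rule can be sketched in a few lines. This is a minimal illustration, not broker code; the function name `effective_qos` is made up for this example. It relies on the MQTT rule that a broker forwards each message at the lower of the publish QoS and the QoS granted to the subscription.

```python
def effective_qos(publish_qos: int, subscribe_qos: int) -> int:
    """Return the QoS used on the broker-to-subscriber leg (hypothetical helper)."""
    for level in (publish_qos, subscribe_qos):
        if level not in (0, 1, 2):
            raise ValueError(f"invalid QoS level: {level}")
    # The broker never upgrades a message: it forwards at the lower level.
    return min(publish_qos, subscribe_qos)


if __name__ == "__main__":
    for pub in (0, 1, 2):
        for sub in (0, 1, 2):
            print(f"publish QoS {pub}, subscribe QoS {sub} -> delivered at QoS {effective_qos(pub, sub)}")
```

A publisher using QoS 1 with a subscriber at QoS 0 yields `effective_qos(1, 0) == 0`, which is the at-most-once outcome described above.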
3. Not Handling Duplicate Messages at QoS 1
QoS 1 guarantees at-least-once delivery: when the sender (client or broker) does not receive a PUBACK, it retransmits the message with the DUP flag set. Applications storing sensor readings will create duplicate database entries unless they deduplicate on a message ID carried in the payload or implement idempotent writes.
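One way to make writes idempotent is to let the database reject repeats. The sketch below assumes each payload is JSON carrying a `device_id` and a publisher-assigned `msg_id`; both field names, the table layout, and the helper names are assumptions for illustration, not part of MQTT.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory table keyed on (device_id, msg_id)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE readings ("
        " device_id TEXT NOT NULL,"
        " msg_id INTEGER NOT NULL,"
        " value REAL NOT NULL,"
        " PRIMARY KEY (device_id, msg_id))"
    )
    return db


def store_reading(db: sqlite3.Connection, payload: bytes) -> bool:
    """Insert one reading; return False if the same message was already stored."""
    reading = json.loads(payload)
    cursor = db.execute(
        "INSERT OR IGNORE INTO readings (device_id, msg_id, value) VALUES (?, ?, ?)",
        (reading["device_id"], reading["msg_id"], reading["value"]),
    )
    db.commit()
    # rowcount is 0 when the primary key already existed and the row was ignored.
    return cursor.rowcount == 1


if __name__ == "__main__":
    db = open_store()
    message = b'{"device_id": "sensor1", "msg_id": 7, "value": 24.1}'
    print(store_reading(db, message))  # first delivery is stored
    print(store_reading(db, message))  # QoS 1 redelivery is ignored
```

The subscriber's message callback would call `store_reading` for every arrival, so a retransmitted QoS 1 message leaves the table unchanged.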
21.3 What is QoS (Quality of Service)?
Analogy: QoS is like different shipping methods when sending a package.
Three QoS levels = Three shipping methods:
| QoS Level | Shipping Analogy | Real Meaning | When to Use |
|---|---|---|---|
| QoS 0 | Drop in mailbox | Fire and forget (maybe lost) | Temperature readings every 10 sec |
| QoS 1 | Certified mail | At least once (signature required, might get duplicate) | Door sensor “opened” alert |
| QoS 2 | Registered mail | Exactly once (signed tracking, no duplicates) | Smart lock “UNLOCK” command |
21.4 QoS 0: Fire and Forget
Simple explanation:
Sensor -> Broker: "Temperature: 24C"
Broker: (receives it... or doesn't, no confirmation)
Sensor: (doesn't care, sends next reading in 10 seconds anyway)
Figure 21.1: MQTT Quality of Service levels with increasing reliability guarantees
Minimum Viable Understanding: QoS Semantics
Core Concept: QoS 0 delivers at most once (may lose messages), QoS 1 delivers at least once (may duplicate), and QoS 2 delivers exactly once (no loss, no duplicates). Each level adds handshake overhead in exchange for a stronger guarantee.

Why It Matters: Choosing the wrong QoS wastes battery life (QoS 2 for temperature readings) or loses critical data (QoS 0 for door unlock commands). This single decision can determine whether your IoT deployment succeeds or fails.

Key Takeaway: Use QoS 0 for high-frequency telemetry where the next reading supersedes any lost one, QoS 1 for important alerts and state changes, and reserve QoS 2 only for critical commands where duplicates would cause harm (like financial transactions or actuator commands).
21.7 Real-World Examples
Scenario 1: Temperature Sensor (Use QoS 0)
10:00:00 -> Sensor: "Temp: 24.0C" (QoS 0)
10:00:10 -> Sensor: "Temp: 24.1C" (QoS 0)
10:00:20 -> Sensor: "Temp: 24.1C" (QoS 0) <- This one gets lost!
10:00:30 -> Sensor: "Temp: 24.2C" (QoS 0)
Dashboard shows: 24.0, 24.1, 24.2
Missing one reading? No problem! Next one comes in 10 seconds.
Scenario: Smart Lock (Use QoS 2)
User presses "Unlock" in app
App -> Broker: "UNLOCK door #1" (QoS 2)
Broker -> Lock: "UNLOCK door #1" (QoS 2)
4-way handshake ensures lock receives exactly once
Lock unlocks (never unlocks twice, never missed)
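The receiver's half of the four-step handshake can be sketched as a small state machine. This is a simplified model under stated assumptions: the class name `Qos2Receiver` is invented for the example, the payload is handed to the application when the first PUBLISH arrives (the specification also permits deferring that until PUBREL), and timers and persistence are left out.

```python
class Qos2Receiver:
    """Receiver side of QoS 2: PUBLISH -> PUBREC, then PUBREL -> PUBCOMP."""

    def __init__(self):
        self.pending_ids = set()  # packet IDs received but not yet released
        self.delivered = []       # payloads handed to the application

    def on_publish(self, packet_id: int, payload: str):
        # Act on the payload only the first time this packet ID is seen.
        if packet_id not in self.pending_ids:
            self.pending_ids.add(packet_id)
            self.delivered.append(payload)
        return ("PUBREC", packet_id)

    def on_pubrel(self, packet_id: int):
        # The sender will not resend this PUBLISH, so the ID can be forgotten.
        self.pending_ids.discard(packet_id)
        return ("PUBCOMP", packet_id)


if __name__ == "__main__":
    lock = Qos2Receiver()
    lock.on_publish(1, "UNLOCK door #1")
    lock.on_publish(1, "UNLOCK door #1")  # retransmission after a lost PUBREC
    lock.on_pubrel(1)
    lock.on_pubrel(1)                     # retransmission after a lost PUBCOMP
    print(lock.delivered)                 # ['UNLOCK door #1']
```

Remembering the packet ID between PUBLISH and PUBREL is what lets the lock ignore a retransmitted command instead of acting on it twice.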
21.8 What are Sessions? (Clean vs Persistent)
Analogy: Think of sessions as hotel check-in preferences.
21.8.1 Clean Session (true)
Guest: "I want a fresh room, no baggage from last visit"
Hotel: "Sure! Starting fresh."
(Guest leaves)
Hotel: "They checked out, throw away their stuff"
MQTT Translation:
Client connects with Clean Session = true
Broker forgets everything when client disconnects
No stored messages, no subscriptions remembered
Like checking into a hotel with no history
When to use: Temporary connections, testing, devices that don’t care about missed messages
21.8.2 Persistent Session (false)
Guest: "I'm a regular, keep my preferences!"
Hotel: "Welcome back! We saved your room setup, mail, messages"
(Guest leaves for a week)
Hotel: "Keeping their stuff safe..."
(Guest returns)
Hotel: "Here are the 15 letters that arrived while you were gone"
MQTT Translation:
Client connects with Clean Session = false
Broker remembers subscriptions even after disconnect
Broker stores messages sent while client was offline (if QoS 1 or 2)
Client gets “catch-up” messages when reconnecting
When to use: Devices with intermittent connectivity, battery-powered sensors that sleep, critical subscribers that can’t miss messages
Figure 21.2: MQTT session lifecycle comparison showing Clean Session (temporary, discards everything on disconnect) vs Persistent Session (maintains subscriptions and queues messages for offline clients). Clean sessions are optimal for simple publishers, while persistent sessions are essential for devices with intermittent connectivity that cannot miss critical messages.
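The difference between the two session types can be made concrete with a toy broker. This is a simulation under simplifying assumptions, not how any real broker is implemented: `TinyBroker` and its methods are hypothetical, topics match exactly (no wildcards), and queuing depends only on the publish QoS.

```python
from collections import defaultdict, deque


class TinyBroker:
    """Toy broker that models clean vs persistent sessions."""

    def __init__(self):
        self.subscriptions = defaultdict(set)  # topic -> client IDs
        self.online = set()
        self.persistent = set()                # clients using clean_session=False
        self.queues = defaultdict(deque)       # client ID -> messages held while offline
        self.inbox = defaultdict(list)         # client ID -> messages actually received

    def connect(self, client_id: str, clean_session: bool):
        if clean_session:
            self._forget(client_id)
            self.persistent.discard(client_id)
        else:
            self.persistent.add(client_id)
        self.online.add(client_id)
        # Deliver anything that was queued while the client was offline.
        while self.queues[client_id]:
            self.inbox[client_id].append(self.queues[client_id].popleft())

    def disconnect(self, client_id: str):
        self.online.discard(client_id)
        if client_id not in self.persistent:
            self._forget(client_id)

    def subscribe(self, client_id: str, topic: str):
        self.subscriptions[topic].add(client_id)

    def publish(self, topic: str, payload: str, qos: int):
        for client_id in self.subscriptions[topic]:
            if client_id in self.online:
                self.inbox[client_id].append(payload)
            elif client_id in self.persistent and qos >= 1:
                self.queues[client_id].append(payload)  # held for reconnect

    def _forget(self, client_id: str):
        for subscribers in self.subscriptions.values():
            subscribers.discard(client_id)
        self.queues.pop(client_id, None)


if __name__ == "__main__":
    broker = TinyBroker()
    for name, clean in (("valve", False), ("display", True)):
        broker.connect(name, clean_session=clean)
        broker.subscribe(name, "cmd/valve")
        broker.disconnect(name)
    broker.publish("cmd/valve", "CLOSE", qos=1)
    broker.connect("valve", clean_session=False)
    broker.connect("display", clean_session=True)
    print(broker.inbox["valve"], broker.inbox["display"])
```

In this run the persistent client `valve` receives the queued `CLOSE` on reconnect, while the clean-session client `display` receives nothing because its subscription was discarded at disconnect.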
21.9 Retained Messages: Last-Known State
MQTT retained messages allow the broker to store the most recent message on a topic and deliver it immediately to new subscribers:
Figure 21.3: MQTT retained message workflow showing how brokers store the last published value and deliver it immediately to new subscribers. When a sensor publishes with retain=true, the broker stores that message and automatically sends it to any future subscribers, even if the sensor is offline. This is essential for status topics where clients need the current state immediately upon connection.
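The retained-message behaviour can be sketched with a small in-memory model. `RetainingBroker` is a hypothetical class written for this illustration; it matches topics exactly and ignores QoS. The rule that an empty retained payload clears the stored value follows the MQTT specification.

```python
class RetainingBroker:
    """Toy broker that keeps the last retained payload per topic."""

    def __init__(self):
        self.retained = {}     # topic -> last retained payload
        self.subscribers = {}  # topic -> list of callbacks

    def publish(self, topic: str, payload: str, retain: bool = False):
        if retain:
            if payload == "":
                self.retained.pop(topic, None)  # empty payload clears the retained value
            else:
                self.retained[topic] = payload  # overwrite the previous value
        for callback in self.subscribers.get(topic, []):
            callback(topic, payload)

    def subscribe(self, topic: str, callback):
        self.subscribers.setdefault(topic, []).append(callback)
        # A new subscriber immediately receives the stored value, if any.
        if topic in self.retained:
            callback(topic, self.retained[topic])


if __name__ == "__main__":
    broker = RetainingBroker()
    broker.publish("door/sensor1/status", "OK", retain=True)  # sensor then goes to sleep
    seen = []
    broker.subscribe("door/sensor1/status", lambda topic, payload: seen.append(payload))
    print(seen)  # ['OK']
```

The subscriber connects after the sensor has gone quiet and still learns the last known state, which is the case retained messages are designed for.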
Tradeoff: Persistent Session vs Clean Session
Decision context: When configuring an MQTT client, you must choose whether the broker should remember client state between connections.
| Factor | Persistent Session | Clean Session |
|---|---|---|
| Memory usage | Higher (broker stores state) | Lower (no state stored) |
| Reconnect speed | Faster (subscriptions restored) | Slower (must resubscribe) |
| Offline messages | Queued (QoS 1/2) | Lost |
| Client ID | Must be unique and stable | Can be random/ephemeral |
| Broker scalability | Lower (state per client) | Higher (stateless) |
| Battery impact | Higher (catch-up data on reconnect) | Lower (fresh start) |
Choose Persistent Session when:
Device has intermittent connectivity (sleep cycles, mobile networks)
Cannot afford to miss messages during disconnection
Subscribes to topics and needs subscriptions restored automatically
Choose Clean Session when:
Device only publishes (no subscriptions to maintain)
Frequent sensor readings where missing a few is acceptable
Testing or development environments
Battery-powered devices that prioritize power savings over completeness
High-scale deployments where broker memory is constrained
Default recommendation: Clean Session unless your device subscribes to topics and cannot miss messages during offline periods.
21.10 Quick Self-Check
Q: You have a battery-powered door sensor that:
- Sleeps for 10 minutes to save battery
- Wakes up, connects to broker, publishes “status: OK”
- Disconnects and goes back to sleep

Meanwhile, someone subscribes to see the door sensor status while the sensor is asleep.

Should you use:
- A) QoS 0, Clean Session = true
- B) QoS 1, Clean Session = true
- C) QoS 0, Clean Session = false
- D) QoS 1, Clean Session = false
Answer: B) QoS 1, Clean Session = true
Analysis:
QoS Level:
QoS 1 is correct
“status: OK” is important (want delivery confirmation)
If message is lost, subscriber doesn’t know sensor is alive
QoS 1 ensures broker receives it
Possible duplicates are OK (idempotent message)
QoS 0 is risky
Message might be lost
Subscriber never knows if sensor is working
QoS 2 is overkill
“status: OK” duplicates are harmless
Wastes battery with 4-way handshake
Clean Session:
Clean Session = true is correct
Sensor doesn’t subscribe to anything (only publishes)
Sensor doesn’t care about messages from when it was offline
Saves broker memory (doesn’t store session state)
Clean Session = false is unnecessary
Sensor isn’t subscribing, so no benefit from saved subscriptions
Broker wastes memory storing empty session
Power consumption comparison:
QoS 0, Clean=true: 10 mA x 100ms = 0.28 uAh per wake cycle
QoS 1, Clean=true: 10 mA x 150ms = 0.42 uAh per wake cycle (Best balance)
QoS 1, Clean=false: 10 mA x 200ms = 0.56 uAh per wake cycle
QoS 2, Clean=false: 10 mA x 400ms = 1.11 uAh per wake cycle
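The per-cycle figures follow from charge = current x time. The short sketch below reproduces them, assuming the same 10 mA radio current and active times as the list above; the function name `charge_uah` is illustrative.

```python
def charge_uah(current_ma: float, active_ms: float) -> float:
    """Charge drawn during one wake cycle, in microamp-hours."""
    milliamp_seconds = current_ma * (active_ms / 1000.0)
    return milliamp_seconds / 3600.0 * 1000.0  # mA*s -> mAh -> uAh


if __name__ == "__main__":
    options = [
        ("QoS 0, Clean=true", 100),
        ("QoS 1, Clean=true", 150),
        ("QoS 1, Clean=false", 200),
        ("QoS 2, Clean=false", 400),
    ]
    for label, active_ms in options:
        print(f"{label}: {charge_uah(10, active_ms):.2f} uAh per wake cycle")
```

With a 10-minute sleep interval there are 144 wake cycles per day, so multiplying each result by 144 gives the daily radio budget for that option.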
What about the subscriber getting the status?
Separate concern: Use Retained Messages!
// Door sensor publishes with QoS 1 AND retain flag
mqttClient.publish("door/sensor1/status", "OK", 1, true);
// arguments: topic, msg, QoS, retain
// Now when subscriber connects (even if sensor is asleep):
// Broker immediately sends retained message: "status: OK"
QoS 0: 1x overhead (shortest active time per cycle)
QoS 1: ~1.5x overhead (one extra round-trip for PUBACK)
QoS 2: ~4x overhead (four-step handshake per message)
Pro tip: For battery-powered sensors, use QoS 1 + Retained + Clean Session = true. This ensures reliable delivery without the overhead of persistent sessions.
21.11 Real-World QoS Decision Framework
Selecting the right QoS level requires analyzing three dimensions of your data: criticality, frequency, and idempotency.
Worked Example: QoS Selection for a Smart Hospital
Scenario: A hospital deploys 2,000 IoT devices across four categories. Each category has different reliability requirements and message patterns.
Device inventory and QoS assignment:
| Device Type | Count | Frequency | Critical? | Idempotent? | QoS | Rationale |
|---|---|---|---|---|---|---|
| Room temperature | 800 | Every 60s | No | Yes | 0 | Next reading replaces lost one; 800 x 1 msg/min = manageable |
| Patient heart rate | 600 | Every 5s | Yes | Yes | 1 | Must arrive; duplicate “HR: 72” is harmless |
| IV pump dosage | 400 | On command | Critical | No | 2 | Duplicate “administer 50mg” could be dangerous |
| Nurse call button | 200 | On event | Yes | Yes | 1 | Must arrive; duplicate alert is harmless |
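The assignments in this table follow a simple rule that can be written down directly. The function below is one possible encoding, inferred from the four rows rather than taken from any standard; `select_qos` is a hypothetical name, and real deployments would also weigh message frequency and battery budget.

```python
def select_qos(critical: bool, idempotent: bool) -> int:
    """Pick a QoS level from criticality and idempotency (illustrative rule)."""
    if not critical:
        return 0  # a lost message is acceptable, so skip the acknowledgment
    if idempotent:
        return 1  # must arrive, and a duplicate does no harm
    return 2      # must arrive exactly once


if __name__ == "__main__":
    devices = {
        "Room temperature": (False, True),
        "Patient heart rate": (True, True),
        "IV pump dosage": (True, False),
        "Nurse call button": (True, True),
    }
    for name, (critical, idempotent) in devices.items():
        print(f"{name}: QoS {select_qos(critical, idempotent)}")
```

Running it reproduces the QoS column: 0, 1, 2, and 1 for the four device types.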
Battery and bandwidth impact calculation:
QoS 0 messages: 800 devices x 1/min = 800 msg/min, 1 packet each = 800 packets/min
QoS 1 messages: 600 heart-rate monitors x 12/min = 7,200 msg/min, 2 packets each = 14,400 packets/min (the 200 nurse call buttons are event-driven and add a negligible load)
QoS 2 messages: 400 devices x ~2/hour = 800 msg/hour, 4 packets each = 53 packets/min
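The packet arithmetic can be checked with a short script built from the device table. It is a sketch under stated assumptions: it counts only the publisher-to-broker leg, uses 1, 2, and 4 packets per message for QoS 0, 1, and 2, ignores retransmissions, and leaves out the event-driven nurse call buttons. `DeviceClass` and `PACKETS_PER_MESSAGE` are names chosen for this example.

```python
from dataclasses import dataclass

# Packets exchanged per message on one leg, without retransmissions.
PACKETS_PER_MESSAGE = {0: 1, 1: 2, 2: 4}


@dataclass
class DeviceClass:
    name: str
    count: int
    msgs_per_min: float  # per device
    qos: int

    def packets_per_min(self) -> float:
        return self.count * self.msgs_per_min * PACKETS_PER_MESSAGE[self.qos]


fleet = [
    DeviceClass("Room temperature", 800, 1.0, 0),
    DeviceClass("Patient heart rate", 600, 12.0, 1),  # every 5 s
    DeviceClass("IV pump dosage", 400, 2.0 / 60.0, 2),  # about 2 per hour
]

if __name__ == "__main__":
    for device in fleet:
        print(f"{device.name}: {device.packets_per_min():.0f} packets/min")
    print(f"Total: {sum(d.packets_per_min() for d in fleet):.0f} packets/min")
```

The heart-rate monitors dominate the load even at QoS 1, while the QoS 2 dosage commands are rare enough that their four-packet handshake barely registers.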
See how network reliability affects actual message delivery for each QoS level. QoS 0 delivers at most once; QoS 1 retries to improve success; QoS 2 adds a full handshake.
Delivery Probability by QoS Level (interactive calculator; computed values are not reproduced here). For a chosen per-packet reliability and retry limit, the calculator reports four figures: QoS 0 (single try, 1 packet sent), QoS 1 (with retries), QoS 2 (4-step handshake, single attempt, all 4 packets must succeed), and QoS 2 (handshake retried).

Insight: QoS 0 loses every message whose single packet is dropped. QoS 1 retries reduce that loss. A single QoS 2 attempt succeeds less often than QoS 0 because all four packets must get through, but retries bring its success rate back up.
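The four figures can be approximated with a simple probability model. The formulas below are assumptions made for this sketch, not the calculator's actual code: packet losses are independent, QoS 1 succeeds if any one of the PUBLISH attempts arrives, and a QoS 2 handshake needs four consecutive successes and is retried as a whole. The function and key names are illustrative.

```python
def delivery_probabilities(p: float, max_retries: int) -> dict:
    """Estimate delivery success for per-packet reliability p (0..1)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    attempts = max_retries + 1
    qos2_single = p ** 4  # PUBLISH, PUBREC, PUBREL, PUBCOMP must all arrive
    return {
        "qos0": p,                                   # one packet, one chance
        "qos1": 1 - (1 - p) ** attempts,             # any attempt may succeed
        "qos2_single": qos2_single,
        "qos2_retried": 1 - (1 - qos2_single) ** attempts,
    }


if __name__ == "__main__":
    for name, value in delivery_probabilities(0.90, max_retries=3).items():
        print(f"{name}: {value * 100:.2f}%")
```

At 90% per-packet reliability a single QoS 2 handshake completes only about 66% of the time under this model, which is why retries matter more as the handshake gets longer.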
21.12.4 Session Type Decision Tool
Answer three questions about your device to determine whether to use a clean or persistent MQTT session.
viewof sessSubscribes = Inputs.select(
  ["yes", "no"],
  {value: "no", label: "Does the device subscribe to topics?"}
)

viewof sessMissOk = Inputs.select(
  ["yes", "no"],
  {value: "yes", label: "Can it miss messages while offline?"}
)

viewof sessConnType = Inputs.select(
  ["always_on", "intermittent", "deep_sleep"],
  {value: "intermittent", label: "Connection pattern"}
)

// Persistent session is needed when a subscriber cannot miss messages or deep-sleeps.
sessNeedPersist = (sessSubscribes === "yes" && sessMissOk === "no") ||
  (sessSubscribes === "yes" && sessConnType === "deep_sleep")

sessRecommendation = sessNeedPersist
  ? "Persistent Session (clean_session=false)"
  : "Clean Session (clean_session=true)"

sessColor = sessNeedPersist ? "#E67E22" : "#16A085"

sessExplanation = sessNeedPersist
  ? "Your device subscribes to topics and cannot afford to miss messages. A persistent session lets the broker queue QoS 1/2 messages while the device is offline and restore subscriptions on reconnect."
  : "Your device either only publishes, or can tolerate missed messages. A clean session avoids broker-side state overhead and keeps connections lightweight."