Use this framework to select the appropriate QoS level for different message types in your MQTT deployment:
| Message Type | Recommended QoS | Rationale | Trade-off |
|---|---|---|---|
| Sensor telemetry (frequent) | QoS 0 | Next reading supersedes previous; 1-5% loss acceptable | Lowest latency, minimal broker load |
| Alert notifications | QoS 1 | Must deliver at least once; duplicates handled by application | Moderate overhead, good reliability |
| Actuator commands | QoS 1 or 2 | Command execution critical; QoS 2 if duplicates cause harm | Higher latency, guaranteed delivery |
| Configuration updates | QoS 2 | Exactly-once ensures consistent state; no duplicate config application | Highest overhead, strong guarantee |
| Heartbeat/keepalive | QoS 0 | Periodic signal; single miss won’t trigger alarm | Minimal overhead |
| Device status (retained) | QoS 1 + retain | Last status must reach new subscribers reliably | Stored on broker, moderate cost |
| Financial transactions | QoS 2 | No duplicate payments; exactly-once essential | High latency acceptable for correctness |
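The selection table can be encoded as a small lookup so that every publisher in a codebase makes the same choice. This is a minimal sketch: the message-type keys, the `PublishPolicy` type, and the `duplicates_harmful` flag are hypothetical names chosen for illustration and do not come from any MQTT library.

```python
from typing import NamedTuple


class PublishPolicy(NamedTuple):
    qos: int      # MQTT QoS level: 0, 1, or 2
    retain: bool  # whether the broker should store the last message


# Hypothetical message-type keys mirroring the rows of the table.
QOS_POLICY = {
    "telemetry": PublishPolicy(qos=0, retain=False),
    "alert": PublishPolicy(qos=1, retain=False),
    "actuator_command": PublishPolicy(qos=1, retain=False),
    "config_update": PublishPolicy(qos=2, retain=False),
    "heartbeat": PublishPolicy(qos=0, retain=False),
    "device_status": PublishPolicy(qos=1, retain=True),
    "financial_transaction": PublishPolicy(qos=2, retain=False),
}


def select_policy(message_type: str, duplicates_harmful: bool = False) -> PublishPolicy:
    """Return the QoS/retain policy for a message type.

    Actuator commands are upgraded to QoS 2 when a duplicate would cause harm.
    Unknown types raise KeyError so that new message types get a deliberate choice.
    """
    policy = QOS_POLICY[message_type]
    if message_type == "actuator_command" and duplicates_harmful:
        return policy._replace(qos=2)
    return policy


print(select_policy("device_status"))
print(select_policy("actuator_command", duplicates_harmful=True))
```

Raising on unknown types rather than silently defaulting is a design choice in this sketch; a deployment that prefers a QoS 0 default could use `QOS_POLICY.get(...)` instead.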
Performance Comparison:
| QoS Level | Packets per Message | Overhead | Latency | Delivery Outcome (5% packet loss) |
|---|---|---|---|---|
| QoS 0 | 1 (fire-and-forget) | 1× baseline | 10 ms | 5% of messages lost permanently |
| QoS 1 | 2 (PUBLISH + PUBACK) | 2× baseline | 25 ms | All delivered, ~5% duplicates |
| QoS 2 | 4 (PUBLISH + PUBREC + PUBREL + PUBCOMP) | 4× baseline | 60 ms | All delivered, zero duplicates |
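The loss and duplicate figures for QoS 0 and QoS 1 can be reproduced with a toy Monte Carlo model. This sketch assumes each PUBLISH and each PUBACK is lost independently with probability 5% and that the sender retries without limit; it is an illustration of the at-most-once and at-least-once semantics, not a model of how a real TCP-backed MQTT session retransmits.

```python
import random


def simulate_qos0(n_messages: int, loss: float, rng: random.Random) -> int:
    """Return how many messages are lost when each PUBLISH is sent exactly once."""
    return sum(1 for _ in range(n_messages) if rng.random() < loss)


def simulate_qos1(n_messages: int, loss: float, rng: random.Random) -> tuple:
    """Return (messages delivered, duplicate copies received).

    The sender retransmits PUBLISH until a PUBACK arrives. A duplicate occurs
    whenever the PUBLISH arrived but its PUBACK was lost on the way back.
    """
    delivered = 0
    duplicates = 0
    for _ in range(n_messages):
        copies = 0
        while True:
            if rng.random() < loss:   # PUBLISH lost in transit, retry
                continue
            copies += 1               # receiver got a copy
            if rng.random() >= loss:  # PUBACK made it back, sender stops
                break
        delivered += 1
        duplicates += copies - 1
    return delivered, duplicates


rng = random.Random(42)  # fixed seed so runs are repeatable
n = 100_000
lost = simulate_qos0(n, 0.05, rng)
delivered, dups = simulate_qos1(n, 0.05, rng)
print(f"QoS 0 lost: {lost / n:.1%}")
print(f"QoS 1 delivered: {delivered / n:.1%}, duplicate copies: {dups / n:.1%}")
```

Under these assumptions the QoS 0 loss rate and the QoS 1 duplicate rate both land near the 5% packet-loss rate, which is where the table's figures come from.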
The QoS level directly affects both sensor battery life and broker load. Let’s quantify both for a 1,000-sensor deployment:
Scenario: 1,000 sensors report every 60 seconds, 50-byte payloads.
QoS 0 analysis:
- Messages/day/sensor: 1,440
- Total broker throughput: \(1{,}000 \times 1{,}440 = 1{,}440{,}000\) msg/day
- Radio time/message: \(\frac{50 \times 8}{250{,}000} = 1.6\) ms (at 250 kbit/s); 1.6 ms @ 30 mA = 0.048 mA·s ≈ 0.013 µAh
- Daily energy/sensor: \(1{,}440 \times 0.013 = 18.7\) µAh
- Broker load: 1.44M messages × 1 packet = 1.44M packets/day

QoS 1 analysis (2× overhead):
- Broker load: 1.44M × 2 (PUBLISH + PUBACK) = 2.88M packets/day
- ACK wait time: 5 ms @ 10 mA = 0.014 µAh
- Daily energy: \(1{,}440 \times (0.013 + 0.014) = 38.9\) µAh
- Battery penalty: \(\frac{38.9}{18.7} = 2.08\times\) vs QoS 0

QoS 2 analysis (4× overhead):
- Broker load: 1.44M × 4 (4-way handshake) = 5.76M packets/day
- 3 ACK phases: 15 ms @ 10 mA = 0.042 µAh
- Daily energy: \(1{,}440 \times (0.013 + 0.042) = 79.2\) µAh
- Battery penalty: \(\frac{79.2}{18.7} = 4.24\times\) vs QoS 0
Broker CPU scaling: At the 1.44M msg/day baseline (QoS 0), Mosquitto on a 2-core, 2 GHz machine uses ~5% CPU. With QoS 2 and its 4× packet count, this rises to ~20% CPU.
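The battery and broker arithmetic for this scenario can be recomputed from first principles. The sketch below takes the scenario's inputs (50-byte payload, 250,000 bit/s, 30 mA transmit current, 10 mA while waiting 5 ms per ACK phase) and converts charge using 1 ms × 1 mA = 1 µA·s and 3,600 µA·s = 1 µAh. All constant and function names are illustrative. Because it does not round intermediate values, it gives roughly 19.2 µAh/day for QoS 0 and penalties of about 2.04× and 4.13×, slightly different from the hand-rounded figures.

```python
def charge_uah(duration_ms: float, current_ma: float) -> float:
    """Charge in microamp-hours: ms x mA gives uA*s, and 3600 uA*s = 1 uAh."""
    return duration_ms * current_ma / 3600.0


# Scenario inputs (names are illustrative).
SENSORS = 1_000
MESSAGES_PER_DAY = 24 * 60           # one message every 60 seconds
PAYLOAD_BYTES = 50
DATA_RATE_BPS = 250_000
TX_CURRENT_MA = 30.0
ACK_WAIT_CURRENT_MA = 10.0
ACK_WAIT_MS = {0: 0.0, 1: 5.0, 2: 15.0}   # total ACK wait per message
PACKETS_PER_MESSAGE = {0: 1, 1: 2, 2: 4}

TX_TIME_MS = PAYLOAD_BYTES * 8 / DATA_RATE_BPS * 1000


def daily_energy_uah(qos: int) -> float:
    """Charge drawn per sensor per day at the given QoS level."""
    per_message = (charge_uah(TX_TIME_MS, TX_CURRENT_MA)
                   + charge_uah(ACK_WAIT_MS[qos], ACK_WAIT_CURRENT_MA))
    return MESSAGES_PER_DAY * per_message


def daily_broker_packets(qos: int) -> int:
    """Packets the broker handles per day across all sensors."""
    return SENSORS * MESSAGES_PER_DAY * PACKETS_PER_MESSAGE[qos]


baseline = daily_energy_uah(0)
for qos in (0, 1, 2):
    energy = daily_energy_uah(qos)
    print(f"QoS {qos}: {energy:.1f} uAh/day/sensor "
          f"({energy / baseline:.2f}x), "
          f"{daily_broker_packets(qos):,} broker packets/day")
```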
Worked Example: Smart Factory with 1,000 Sensors
Scenario: 1,000 temperature sensors report every 10 seconds; 100 actuators receive commands hourly.
QoS 0 for sensors:
- Messages/hour: 1,000 × 360 = 360,000
- Broker processing: 360,000 messages (no ACKs)
- Packet loss (5%): 18,000 readings lost
- Impact: Acceptable (next reading in 10 s)
- Bandwidth: 360,000 × 50 bytes = 18 MB/hour

QoS 1 for actuators:
- Commands/hour: 100
- Broker processing: 200 messages (100 PUBLISH + 100 PUBACK)
- Duplicates (5%): 5 duplicate commands
- Impact: Application deduplicates (command_id check)
- Bandwidth: 200 × 50 bytes = 10 KB/hour
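The `command_id` check mentioned for actuators could look like the following. This is a sketch under assumptions: commands are taken to be JSON objects carrying a `command_id` field, and `CommandDeduplicator`, `handle_command`, and the `execute` callback are hypothetical names. A bounded window of recent IDs keeps memory use fixed on a constrained device.

```python
import json
from collections import OrderedDict


class CommandDeduplicator:
    """Remember recently seen command IDs so redelivered commands run once."""

    def __init__(self, max_ids: int = 1024) -> None:
        self._seen = OrderedDict()  # insertion-ordered set of command IDs
        self._max_ids = max_ids

    def should_execute(self, command_id: str) -> bool:
        """Return True the first time an ID is seen, False for duplicates."""
        if command_id in self._seen:
            return False
        self._seen[command_id] = None
        if len(self._seen) > self._max_ids:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True


def handle_command(raw_payload: bytes, dedup: CommandDeduplicator, execute) -> bool:
    """Parse a JSON command and run it unless its command_id was already seen."""
    command = json.loads(raw_payload)
    if not dedup.should_execute(command["command_id"]):
        return False  # QoS 1 redelivery: drop silently
    execute(command)
    return True


executed = []
dedup = CommandDeduplicator(max_ids=2)
payload = json.dumps({"command_id": "cmd-1", "action": "open_valve"}).encode()
first = handle_command(payload, dedup, executed.append)
second = handle_command(payload, dedup, executed.append)  # simulated duplicate
print(first, second, len(executed))
```

The window size is a trade-off: a duplicate that arrives after its ID has been evicted would be executed again, so `max_ids` should comfortably exceed the number of commands that can arrive within the retry window.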

If sensors incorrectly used QoS 1:
- Messages/hour: 360,000 × 2 (with ACKs) = 720,000
- Broker CPU: 2× increase (handles 720K vs 360K)
- Bandwidth: 36 MB/hour (2× increase)
- Benefit: Zero reading loss
- Trade-off: NOT worth it; the next reading in 10 s makes the loss irrelevant

If actuators incorrectly used QoS 0:
- Lost commands: 5 per hour (5% of 100)
- Impact: Valves don't actuate, production line stops
- Cost: $10,000/hour downtime vs $0.01 CPU cost for QoS 1
- Trade-off: Unacceptable risk for minimal savings
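The worked example's arithmetic generalizes to a small helper for comparing QoS choices across device classes. This sketch follows the example's own simplification of counting every packet, ACKs included, at the 50-byte payload size; real acknowledgement packets are much smaller, so the bandwidth figures are upper bounds. The function and key names are illustrative.

```python
def hourly_load(devices: int, messages_per_device_hour: int, payload_bytes: int,
                packets_per_message: int, loss_rate: float) -> dict:
    """Estimate hourly broker load for one class of devices.

    `affected` is the expected number of messages hit by packet loss:
    lost readings at QoS 0, duplicates at QoS 1.
    """
    messages = devices * messages_per_device_hour
    packets = messages * packets_per_message
    return {
        "messages": messages,
        "broker_packets": packets,
        # Simplification from the worked example: every packet counted at payload size.
        "bandwidth_bytes": packets * payload_bytes,
        "affected": round(messages * loss_rate),
    }


sensors_qos0 = hourly_load(1_000, 360, 50, packets_per_message=1, loss_rate=0.05)
sensors_qos1 = hourly_load(1_000, 360, 50, packets_per_message=2, loss_rate=0.05)
actuators_qos1 = hourly_load(100, 1, 50, packets_per_message=2, loss_rate=0.05)

for name, load in [("sensors @ QoS 0", sensors_qos0),
                   ("sensors @ QoS 1", sensors_qos1),
                   ("actuators @ QoS 1", actuators_qos1)]:
    print(f"{name}: {load['broker_packets']:,} packets/h, "
          f"{load['bandwidth_bytes'] / 1e6:.2f} MB/h, "
          f"{load['affected']:,} messages affected by loss")
```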
Key Insights:
- Don’t over-specify: QoS 1 for all messages doubles the packet count, so half or more of the bandwidth/CPU is spent acknowledging replaceable sensor data
- Don’t under-specify: QoS 0 for commands risks operational failures worth far more than the ACK overhead
- Use retain strategically: Device status with retain=true + QoS 1 ensures new subscribers get last known state
- Hybrid architectures work best: QoS 0 for high-volume telemetry, QoS 1 for important events, QoS 2 for critical transactions
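The retain behaviour behind the device-status insight can be illustrated with a toy in-memory broker. This is a sketch for intuition only: `TinyBroker` is a hypothetical class that matches topics exactly (no wildcards) and ignores QoS entirely. It shows that a retained message is stored per topic and handed to subscribers that arrive later, and that a retained message with an empty payload clears the stored value, as MQTT specifies.

```python
from collections import defaultdict


class TinyBroker:
    """Toy broker showing retained-message semantics (exact-match topics only)."""

    def __init__(self) -> None:
        self._retained = {}                    # topic -> last retained payload
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def publish(self, topic: str, payload: bytes, retain: bool = False) -> None:
        if retain:
            if payload:
                self._retained[topic] = payload
            else:
                # An empty retained payload clears the stored message.
                self._retained.pop(topic, None)
        for callback in self._subscribers[topic]:
            callback(topic, payload)

    def subscribe(self, topic: str, callback) -> None:
        self._subscribers[topic].append(callback)
        if topic in self._retained:
            # New subscribers immediately receive the last retained message.
            callback(topic, self._retained[topic])


broker = TinyBroker()
broker.publish("devices/pump1/status", b"online", retain=True)
broker.publish("devices/pump1/temp", b"21.5")  # ordinary telemetry, not retained

received = []
broker.subscribe("devices/pump1/status", lambda t, p: received.append((t, p)))
broker.subscribe("devices/pump1/temp", lambda t, p: received.append((t, p)))
print(received)  # only the retained status reaches the late subscriber
```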
Production Recommendation:
- Default: QoS 0 (assume sensor telemetry)
- Exceptions: QoS 1 for alerts/commands/device status
- Rare: QoS 2 only for financial/legal/safety-critical operations where duplicates cause actual harm