11  Transport Selection and Scenarios

In 60 Seconds

This chapter provides decision trees and selection frameworks for choosing TCP, UDP, or DTLS based on IoT application requirements. Key guidance: use UDP for low-latency sensor telemetry, TCP for reliable firmware delivery, UDP with application-layer reliability for constrained devices, and TLS/DTLS for secure communication. Complex deployments often combine multiple transport strategies across different data flows.

Key Concepts
  • Scenario-Based Selection Framework: For each scenario, answer: What is the data rate? What are device constraints? Is the path over the internet? What reliability is required? What is the latency budget?
  • Smart Home Hub Scenario: Devices → BLE/Zigbee → Hub (CoAP or MQTT bridge) → Cloud (MQTT over TLS); hub acts as protocol mediator translating local wireless protocols to internet-facing TCP/MQTT
  • Industrial Control Scenario: PLC → OPC-UA over TCP (Ethernet) → SCADA; TCP provides ordered reliable delivery for command/response; OPC-UA adds semantic interoperability; no wireless for deterministic timing
  • Wearable Health Scenario: Sensor → BLE GATT → Smartphone → HTTPS/REST → Health cloud; smartphone as gateway; BLE for body-area network; HTTPS for regulated healthcare data (HIPAA compliance)
  • Environmental Monitoring Network: LoRa → LoRaWAN server → MQTT → Database; CoAP alternative for NB-IoT sensors; MQTT persistence for intermittent sensor connectivity
  • Vehicle Telematics Scenario: GNSS+CAN → cellular (LTE-M) → MQTT over TLS → fleet platform; persistent MQTT connection on vehicle power; PSM during parking; CAN data aggregated before upload
  • Supply Chain Tracking Scenario: Tag (BLE/NFC) → Scanner (LTE) → HTTPS webhook → ERP; REST HTTP for integration with enterprise systems; typically event-driven (not continuous)
  • Edge Analytics Scenario: Sensors → MQTT local broker → Edge compute (analytics) → selective cloud upload via HTTPS; edge filtering reduces cloud data volume by 90%; TCP for reliability of aggregated results
  • Disaster Recovery Communications: Primary: 4G LTE + MQTT; Backup: satellite (Iridium SBD, 12-byte messages); Protocol fallback in firmware: if primary fails after 5 retries, switch to backup

11.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Apply Selection Frameworks: Use decision trees to choose appropriate transport protocols for IoT scenarios
  • Evaluate Trade-offs: Balance reliability, latency, power consumption, and security in protocol selection
  • Design for Constraints: Select TCP, UDP, or DTLS based on device capabilities and application requirements
  • Analyze Real-World Scenarios: Map common IoT use cases (telemetry, control, streaming) to optimal protocols
  • Justify Protocol Choices: Provide technical rationale for transport layer decisions in system designs
  • Plan Hybrid Approaches: Combine multiple transport strategies for complex IoT deployments

What is this chapter? Guidance for selecting transport protocols based on IoT application requirements.

When to use:

  • When designing IoT systems
  • To compare and evaluate protocol trade-offs
  • For making informed architecture decisions

Selection Criteria:

  Scenario                   Recommended
  Low latency needed         UDP
  Reliable delivery          TCP
  Constrained devices        UDP + app-level reliability
  Secure communication       TLS/DTLS
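These criteria can be sketched as a small decision helper. This is an illustrative sketch, not a standard API — the function name and rule ordering are my own:

```python
def recommend_transport(loss_tolerable, latency_budget_ms,
                        battery_powered, needs_security):
    """Map the selection criteria above to a recommended transport.

    Rules are applied in priority order: reliability first, then device
    constraints, then latency. A teaching sketch, not an exhaustive framework.
    """
    if not loss_tolerable:
        if battery_powered:
            transport = "UDP + app-level reliability (e.g. CoAP CON)"
        else:
            transport = "TCP"
    elif latency_budget_ms < 100:
        transport = "UDP"
    else:
        transport = "UDP" if battery_powered else "TCP or UDP"
    if needs_security:
        transport += (" secured with DTLS" if transport.startswith("UDP")
                      else " secured with TLS")
    return transport

print(recommend_transport(True, 50, True, False))    # sensor telemetry -> UDP
print(recommend_transport(False, 1000, False, True)) # firmware -> TCP + TLS
```

The ordering matters: a constrained device that cannot tolerate loss still avoids TCP by adding application-layer acknowledgments, exactly as the table recommends.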

Recommended Path:

  1. Review Transport Fundamentals
  2. Study scenarios here
  3. Apply to your project requirements

“Every IoT scenario has different requirements,” said Max the Microcontroller. “A heart rate monitor needs reliability AND low latency. A weather station can tolerate some data loss. A security camera needs high throughput. Each demands a different transport approach.”

“Decision trees make this systematic,” explained Sammy the Sensor. “Start at the top: does data loss matter? If yes, go right toward TCP. If no, go left toward UDP. Then branch on security needs, latency requirements, and device constraints. You end up at the right protocol every time.”

“Real deployments often use hybrid approaches,” added Lila the LED. “A smart home hub might use TCP for commands to door locks, UDP for temperature readings from sensors, and DTLS for encrypted motion detector alerts. Different data flows, different protocols, same device.”

“The key insight is that there is no universal best protocol,” said Bella the Battery. “Anyone who says ‘always use TCP’ or ‘always use UDP’ does not understand the trade-offs. Match the protocol to the specific data flow requirements.”

11.2 Prerequisites

Before diving into this chapter, you should be familiar with:

  • Transport Protocols: Fundamentals: Explain TCP and UDP characteristics, distinguish connection-oriented from connectionless protocols, and identify how DTLS secures UDP — all of which are essential for making informed protocol selection decisions
  • Networking Basics: Apply basic networking concepts — including latency, throughput, packet loss, and reliability — to evaluate trade-offs in transport protocol choices
  • IoT Protocols Overview: Identify how application-layer protocols (CoAP, MQTT) interact with the transport layer so you can assess how protocol choices affect IoT application behavior

11.3 Transport Protocol Selection for IoT Scenarios

⏱️ ~15 min | ⭐⭐ Intermediate | 📋 P07.C34.U01

11.4 How It Works: The Scenario Analysis Framework

Protocol selection for real-world deployments follows a repeatable four-stage process:

Stage 1: Identify Requirements (Inputs)

  1. Reliability analysis: Can we tolerate data loss? If YES → UDP candidate. If NO → TCP or UDP+ACK
  2. Latency analysis: What’s the maximum acceptable delay? <100ms → UDP. <1s → either. >1s → TCP fine
  3. Power analysis: Battery-powered? If YES and frequent transmissions → UDP critical. If NO → TCP acceptable
  4. Security analysis: Encryption needed? If YES + UDP → DTLS. If YES + TCP → TLS

Stage 2: Evaluate Trade-offs (Analysis)

  1. Calculate overhead: For 10-byte payload: UDP = 24B total (42% efficiency), TCP = 244B (4.1% efficiency) for a single-shot connection
  2. Estimate battery impact: At 30-sec interval (2880 readings/day), UDP ≈ 44 years, TCP ≈ 7 years (6× difference) — sleep current dominates at low rates; protocol overhead matters most at high transmission rates
  3. Assess packet loss impact: At 1% loss, UDP loses 1/100 readings (acceptable?), TCP retransmits (minimum 1 second RTO per RFC 6298, exponentially increasing)
  4. Security cost: DTLS 1.2 adds 13B per record header + ~1–2 KB handshake (amortized per session), TLS 1.3 adds similar overhead but with faster handshake (~1 RTT vs 2 RTT for DTLS 1.2)
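The Stage 2 overhead numbers can be reproduced directly (header sizes as used throughout this chapter: 6 B compressed IPv6, 8 B UDP, 20 B minimum TCP; the single-shot TCP total counts handshake, data + ACK, and teardown). A minimal sketch:

```python
IPV6_COMPRESSED = 6   # 6LoWPAN-compressed IPv6 header
UDP_HEADER = 8
TCP_HEADER = 20       # minimum, no options

def udp_total(payload):
    # one datagram carries the whole reading
    return IPV6_COMPRESSED + UDP_HEADER + payload

def tcp_single_shot_total(payload):
    bare = IPV6_COMPRESSED + TCP_HEADER        # segment with no payload
    handshake = 3 * bare                       # SYN, SYN-ACK, ACK
    data_and_ack = (bare + payload) + bare     # data segment + its ACK
    teardown = 4 * bare                        # FIN, ACK, FIN, ACK
    return handshake + data_and_ack + teardown

payload = 10
print(udp_total(payload))              # 24 bytes -> 42% efficiency
print(tcp_single_shot_total(payload))  # 244 bytes -> 4.1% efficiency
```

Amortizing the handshake over many readings (a persistent connection) changes the TCP figure, which is why Stage 2 must state its connection assumptions explicitly.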

Stage 3: Select Protocol (Decision)

  • TCP + TLS: Reliable, secure, high overhead → firmware updates, configuration, critical commands
  • UDP + DTLS: Secure, low latency, moderate overhead → encrypted telemetry, secure CoAP
  • UDP + CoAP: Lightweight, optional reliability, lowest overhead → sensor telemetry, control commands
  • Plain UDP: Minimal overhead, no guarantees → broadcast announcements, video streaming

Stage 4: Validate Performance (Testing)

  1. Lab testing: Measure actual overhead (Wireshark), power consumption (power profiler), latency (ping)
  2. Field pilot: Deploy 10-100 units, monitor for 1 week, collect metrics (battery drain, packet loss, latency)
  3. Load testing: Simulate peak traffic (100× devices), verify no congestion/gateway overload
  4. Failure testing: Inject 10% packet loss, verify system remains functional
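Step 4's failure testing can be rehearsed in simulation before touching hardware. A sketch that injects a 10% loss rate and checks that application-level retries (4 retries here, an illustrative choice) still deliver essentially every reading:

```python
import random

def deliver_with_retries(n_messages, loss_rate, max_retries, rng):
    """Simulate app-level ACK/retry over a lossy link; count deliveries."""
    delivered = 0
    for _ in range(n_messages):
        for _attempt in range(max_retries + 1):
            if rng.random() >= loss_rate:   # this transmission survives
                delivered += 1
                break
    return delivered

rng = random.Random(42)   # fixed seed so the test run is repeatable
ok = deliver_with_retries(10_000, loss_rate=0.10, max_retries=4, rng=rng)
# residual failure probability per message: 0.10 ** 5 = 1e-5
print(f"{ok}/10000 delivered")
```

The same harness can be re-run at 30% loss to model harsh industrial RF environments and decide whether 4 retries is still enough.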

Why This Process Works:

  • Requirements-driven: Start with constraints, not protocols (avoid “solution looking for a problem”)
  • Quantitative: Calculate actual numbers (overhead, battery life) instead of intuition
  • Iterative: Validate assumptions, adjust if testing reveals issues
  • Multi-criteria: Balance competing constraints (security vs power, latency vs reliability)

Common Process Failures:

  • Skipping Stage 2 (evaluation) → choose TCP “because it’s reliable” without calculating battery impact
  • Skipping Stage 4 (validation) → assume 1% lab packet loss applies to field (real-world often 5-30% in industrial environments)
  • Single-criterion optimization → maximize battery life by choosing UDP, ignore reliability needs

11.5 Scenario Analysis

Flowchart showing transport protocol selection framework with four sequential stages: Identify Requirements (analyzing reliability, latency, power, and security needs), Evaluate Trade-offs (comparing protocol characteristics), Select Protocol (choosing from TCP+TLS, UDP+DTLS, UDP+CoAP, or plain UDP options), and Validate Performance (testing the chosen solution). Left branch shows requirement considerations including reliability, latency, power consumption, and security. Right branch shows protocol options with their characteristics like TCP+TLS for reliable secure communication, UDP+DTLS for secure constrained environments, UDP+CoAP for lightweight application-level reliability, and plain UDP for minimal overhead scenarios.
Figure 11.1: Transport Protocol Selection Framework

This variant shows the analysis framework as a time-based process, emphasizing when each step occurs during system design:

Timeline diagram showing the four phases of transport protocol selection (Requirements, Trade-off Evaluation, Protocol Selection, Validation) arranged chronologically along a project timeline, highlighting that selection is an iterative design activity rather than a one-time choice

Transport selection timeline view

This timeline view helps project managers understand that protocol selection is an iterative process requiring analysis, evaluation, selection, and validation phases.

This variant shows how different IoT scenarios prioritize trade-offs differently using a comparative view:

Radar-chart-style comparison diagram showing how four IoT scenario types (telemetry sensor, firmware update, video streaming, smart lock command) each prioritize different trade-off dimensions including reliability, latency, power efficiency, security, and bandwidth, with orange highlights indicating high-priority dimensions per scenario

Protocol trade-off comparison by IoT scenario type

Orange-highlighted requirements indicate high priority for that scenario, showing how different priority profiles lead to different protocol choices.

11.5.1 Example Scenarios

Scenario 1: Temperature Sensor (Battery-Powered)

Requirements:

  • Reports temperature every 5 minutes
  • Battery-powered (must last years)
  • Occasional reading loss acceptable
  • Low latency preferred

Protocol Selection: UDP (with CoAP)

Reasoning:

  • Low overhead: 8-byte UDP header
  • Low power: No connection state, no ACKs
  • Loss tolerable: Missing one reading OK (next reading in 5 min)
  • CoAP: Application-level confirmable messages if needed

Power Impact (802.15.4 at 250 kbps):

  • UDP: ~1 ms radio on time per reading (24-byte packet)
  • TCP: ~13 ms radio on (244 bytes across handshake + data + teardown)
  • 13× radio savings with UDP (protocol overhead dominates at faster transmission rates)
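For concreteness, a minimal sketch of this sensor's send path — the 10-byte field layout and the destination address are illustrative assumptions (with CoAP, the same payload would ride inside a NON message):

```python
import socket
import struct

def pack_reading(timestamp, temp_centi_c, humidity_pct, battery_mv):
    """Pack one reading into 10 bytes: u32 time, i16 temp (0.01 degC),
    u16 humidity (%), u16 battery (mV)."""
    return struct.pack("<IhHH", timestamp, temp_centi_c, humidity_pct, battery_mv)

payload = pack_reading(1_700_000_000, 2215, 48, 2975)  # 22.15 C, 48 %, 2.975 V
assert len(payload) == 10   # matches the 10-byte payload in the calculation

def send_reading(payload, host="192.0.2.10", port=5683):  # documentation address
    # one fire-and-forget datagram: no handshake, no connection state, no ACK
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(payload, (host, port))
```

Because the socket holds no connection state, the MCU can power the radio down immediately after `sendto` returns — the source of the 13× radio-time savings.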

Scenario 2: Firmware Update (Battery or Mains)

Requirements:

  • 500 KB firmware image
  • Must be 100% reliable
  • Can tolerate latency (not time-critical)
  • Corrupted firmware = bricked device

Protocol Selection: TCP (with TLS for security)

Reasoning:

  • Reliability: Cannot tolerate any packet loss
  • Ordering: Firmware must be received in correct order
  • Error recovery: Automatic retransmission
  • Security: TLS prevents malicious firmware injection

Power Impact:

  • TCP overhead acceptable for infrequent operation (once per month)
  • Device can stay awake during update (user-initiated)

Scenario 3: Video Surveillance (Mains-Powered)

Requirements:

  • 1080p video stream (2-4 Mbps)
  • Real-time display (< 200 ms latency)
  • Mains-powered (no battery concern)
  • Some frame loss acceptable

Protocol Selection: UDP (with RTP/RTSP)

Reasoning:

  • Low latency: No retransmission delays
  • Real-time: Predictable latency
  • Bandwidth: High bitrate requires efficient protocol
  • Loss tolerable: Missing frame causes brief artifact, not fatal
  • TCP would cause: Buffering, variable latency (unacceptable for live video)

Security: Can add DTLS/SRTP if needed

Scenario 4: Smart Lock (Battery-Powered, Critical)

Requirements:

  • Lock/unlock commands
  • Must be 100% reliable (security critical)
  • Battery-powered
  • Low latency desired (user waiting)

Protocol Selection: UDP + DTLS (with CoAP confirmable)

Reasoning:

  • Security: DTLS encryption + authentication (prevent spoofing)
  • Reliability: CoAP confirmable messages (application-level ACK)
  • Power: UDP more efficient than TCP
  • Compromise: Slightly higher overhead than plain UDP, but necessary for security

Why not TCP + TLS?

  • TCP adds connection overhead (3-way handshake)
  • TLS handshake is heavy
  • UDP + DTLS with session resumption more efficient
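The application-level ACK that replaces TCP's reliability follows CoAP's confirmable retransmission rules (RFC 7252 defaults: 2 s initial ACK timeout, doubled on each retry, at most 4 retransmissions; in practice a random factor up to 1.5 scales these). A sketch of the deterministic lower-bound schedule:

```python
ACK_TIMEOUT = 2.0     # seconds, RFC 7252 default
MAX_RETRANSMIT = 4    # RFC 7252 default

def con_retransmit_schedule(initial=ACK_TIMEOUT, retries=MAX_RETRANSMIT):
    """Timeout before the original transmission and each retransmission
    of a CoAP CON message (exponential backoff, random factor omitted)."""
    return [initial * (2 ** i) for i in range(retries + 1)]

schedule = con_retransmit_schedule()
print(schedule)       # [2.0, 4.0, 8.0, 16.0, 32.0]
print(sum(schedule))  # 62.0 s worst case before declaring failure
```

The 62-second worst case only applies under sustained loss; in the common case the lock ACKs the first transmission, so the user-facing latency is a single UDP round trip.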

Quick Check: Scenario Analysis

Try It: Protocol Selection Decision Guide

Answer three questions to get a recommended transport protocol for your IoT scenario.

11.6 Hands-On Lab: Protocol Selection Analysis

Lab Activity: Calculate Overhead and Power Consumption

Objective: Compare TCP vs UDP overhead and power impact for IoT sensor

Scenario: Temperature sensor sending 10-byte readings

Device: nRF52840

  • TX current: 5 mA
  • RX current: 5 mA
  • Sleep current: 5 µA
  • Data rate: 250 kbps (802.15.4)

11.6.1 Task 1: Calculate Packet Overhead

Payload: 10 bytes (temperature + humidity + battery)

Calculate total packet size for:

  1. UDP/IPv6
  2. TCP/IPv6

Assume:

  • IPv6 header: 40 bytes (uncompressed) or 6 bytes (6LoWPAN compressed)
  • Use compressed headers

Click to see solution

UDP Packet:

IPv6 header (compressed): 6 bytes
UDP header: 8 bytes
Payload: 10 bytes
Total: 24 bytes

Overhead: 14 bytes / 24 bytes = 58% overhead
Efficiency: 10 bytes / 24 bytes = 42% efficiency

TCP Packet (data transmission):

IPv6 header (compressed): 6 bytes
TCP header (minimum): 20 bytes
Payload: 10 bytes
Total: 36 bytes

Overhead: 26 bytes / 36 bytes = 72% overhead
Efficiency: 10 bytes / 36 bytes = 28% efficiency

TCP Handshake (before data):

SYN: 6 (IPv6) + 20 (TCP) = 26 bytes
SYN-ACK: 26 bytes
ACK: 26 bytes
Total handshake: 78 bytes

TCP Data + ACK:

Data: 36 bytes (as calculated above)
ACK: 6 (IPv6) + 20 (TCP) = 26 bytes
Total: 62 bytes

TCP Connection Teardown:

FIN: 26 bytes
ACK: 26 bytes
FIN: 26 bytes
ACK: 26 bytes
Total teardown: 104 bytes

Total TCP (one reading):

Handshake: 78 bytes
Data + ACK: 62 bytes
Teardown: 104 bytes
Total: 244 bytes

Comparison:

  • UDP: 24 bytes total
  • TCP: 244 bytes total (10× more!)

If TCP connection kept open (multiple readings):

  • Handshake: 78 bytes (once)
  • Per reading: 62 bytes (data + ACK)
  • Much better, but still 2.6× overhead vs UDP

11.6.2 Task 2: Calculate Radio On Time

Using packet sizes from Task 1, calculate radio on time:

Data rate: 250 kbps = 31.25 KB/s

Calculate:

  1. TX time for UDP packet
  2. TX time for TCP (connection + data + teardown)
  3. TX time for TCP (keep-alive, per reading)

Click to see solution

Radio on-time directly determines battery consumption. Note the accounting convention: the timing below counts the full awake window, with the radio held in RX while waiting out each round trip (≈160 ms per TCP exchange), whereas the airtime-only figures at the end of this solution count just transmit/receive time (≈13 ms for TCP). Let’s calculate the timing and energy for each protocol at 250 kbps:

UDP transmission time: \[T_{\text{TX}} = \frac{24\text{ bytes}}{31{,}250\text{ bytes/sec}} = 0.768\text{ ms} \approx 1\text{ ms (with overhead)}\]

TCP full connection time: \[\begin{align} T_{\text{handshake}} &= \frac{78\text{ B}}{31{,}250} + \text{RTT} = 2.5\text{ ms} + 50\text{ ms} = 52.5\text{ ms} \\ T_{\text{data+ACK}} &= \frac{62\text{ B}}{31{,}250} + \text{RTT} = 2\text{ ms} + 50\text{ ms} = 52\text{ ms} \\ T_{\text{teardown}} &= \frac{104\text{ B}}{31{,}250} + \text{RTT} = 3.3\text{ ms} + 50\text{ ms} = 53.3\text{ ms} \\ T_{\text{total}} &= 52.5 + 52 + 53.3 = 157.8\text{ ms} \approx 160\text{ ms} \end{align}\]

Energy consumption per message (5 mA TX/RX, 2880 readings/day = every 30 seconds):

Energy in mAh = current (mA) × time (s) ÷ 3600

\[\begin{align} E_{\text{UDP/day}} &= \frac{5\text{ mA} \times 1\text{ ms} \times 2{,}880}{3{,}600{,}000} = 0.004\text{ mAh/day (radio only)} \\ E_{\text{TCP/day}} &= \frac{5\text{ mA} \times 160\text{ ms} \times 2{,}880}{3{,}600{,}000} = 0.64\text{ mAh/day (radio only)} \end{align}\]

Adding 5 µA sleep current for a 2000 mAh battery (sleep ≈ 0.12 mAh/day dominates):

  Protocol   Radio       Sleep       Total           Battery Life
  UDP        0.004 mAh   0.120 mAh   0.124 mAh/day   ~44 years
  TCP        0.64 mAh    0.120 mAh   0.76 mAh/day    ~7 years
  • UDP: Radio time negligible vs sleep current (0.004 vs 0.12 mAh/day)
  • TCP: 160× more radio time starts to matter — 0.64 mAh/day makes sleep less dominant

Key insight: Protocol overhead becomes significant when radio energy approaches or exceeds sleep energy — at 30-second intervals with this device, TCP already increases daily consumption 6×.

UDP (24 bytes):

TX time = 24 bytes / 31.25 KB/s
        = 24 / 31,250 bytes/s
        = 0.768 ms

Total radio on: ~1 ms (including processing)

TCP (full connection, 244 bytes):

TX time = 244 bytes / 31.25 KB/s
        = 244 / 31,250
        = 7.8 ms

RX time (waiting for ACKs): ~5 ms

Total radio on: ~13 ms (TX + RX + processing)

TCP (keep-alive, 62 bytes per reading):

TX+RX time = 62 bytes / 31.25 KB/s
           = 2 ms

Total radio on: ~3 ms per reading

Comparison:

  • UDP: 1 ms
  • TCP (full): 13 ms (13× longer)
  • TCP (keep-alive): 3 ms (3× longer)

11.6.3 Task 3: Calculate Power Consumption

Sensor reports every 5 minutes. Calculate daily power consumption.

Assumptions:

  • Radio TX/RX: 5 mA
  • Sleep: 5 µA
  • Readings per day: 24 × 60 / 5 = 288

Click to see solution

UDP:

Active time per reading: 1 ms
Active time per day: 288 × 1 ms = 288 ms = 0.288 s

Active power: 5 mA × 0.288 s = 1.44 mA·s = 0.4 µA·h
Sleep power: 5 µA × (86,400 - 0.288) s / 3600 = 119.99 µA·h

Total per day: 0.4 + 120 = 120.4 µA·h = 0.12 mA·h

TCP (full connection per reading):

Active time per reading: 13 ms
Active time per day: 288 × 13 ms = 3.744 s

Active power: 5 mA × 3.744 s = 18.72 mA·s = 5.2 µA·h
Sleep power: 5 µA × (86,400 - 3.744) s / 3600 = 119.97 µA·h

Total per day: 5.2 + 120 = 125.2 µA·h = 0.125 mA·h

TCP (keep-alive):

Active time per reading: 3 ms
Active time per day: 288 × 3 ms = 864 ms = 0.864 s

Active power: 5 mA × 0.864 s = 4.32 mA·s = 1.2 µA·h
Sleep power: 5 µA × (86,400 - 0.864) s / 3600 = 119.99 µA·h

Total per day: 1.2 + 120 = 121.2 µA·h = 0.121 mA·h

Battery Life (2000 mAh battery):

  • UDP: 2000 / 0.12 = 16,667 days = 45.7 years (limited by self-discharge)
  • TCP (full): 2000 / 0.125 = 16,000 days = 43.8 years (similar, limited by self-discharge)
  • TCP (keep-alive): 2000 / 0.121 = 16,529 days = 45.3 years

Analysis: For this scenario (infrequent transmission, small payload), sleep current dominates. Radio on time is < 0.01% of day, so protocol overhead has minimal impact on battery life.

However: If transmitting every 10 seconds (8,640 readings per day):

UDP (8640 × 1 ms = 8.64 s active/day):
  Radio:  5 mA × 8.64 s / 3600 = 0.012 mAh/day
  Sleep:  5 µA × (86,400 - 8.64) s / 3600 / 1000 = 0.120 mAh/day
  Total:  0.132 mAh/day → 2000 / 0.132 = 15,152 days = ~41 years

TCP full (8640 × 13 ms = 112.3 s active/day):
  Radio:  5 mA × 112.3 s / 3600 = 0.156 mAh/day
  Sleep:  5 µA × (86,400 - 112.3) s / 3600 / 1000 = 0.120 mAh/day
  Total:  0.276 mAh/day → 2000 / 0.276 = 7,246 days = ~20 years

TCP keep-alive (8640 × 3 ms = 25.9 s active/day):
  Radio:  5 mA × 25.9 s / 3600 = 0.036 mAh/day
  Sleep:  0.120 mAh/day
  Total:  0.156 mAh/day → 2000 / 0.156 = 12,821 days = ~35 years

At 10-second intervals: UDP (41 yr) vs TCP-full (20 yr) — 2× difference. Sleep current still dominates for this low-power radio; the ratio would be more dramatic with a higher-power radio (e.g., a Wi-Fi device at 70 mA active, where TCP-full would drain the battery in months).

Conclusion: For low-power radios at moderate rates, protocol choice affects battery life measurably but sleep current limits the impact. For high-power radios or very high transmission rates, TCP’s connection overhead can reduce battery life by 5–10× compared to UDP.
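The lab’s arithmetic generalizes to a small estimator (device parameters from this lab: 5 mA active, 5 µA sleep, 2000 mAh cell; per-reading radio-on times from Task 2):

```python
def battery_life_years(readings_per_day, active_ms_per_reading,
                       active_ma=5.0, sleep_ua=5.0, battery_mah=2000.0):
    """Estimate battery life from daily radio-on time plus sleep current."""
    active_s = readings_per_day * active_ms_per_reading / 1000.0
    radio_mah = active_ma * active_s / 3600.0
    sleep_mah = (sleep_ua / 1000.0) * (86_400 - active_s) / 3600.0
    return battery_mah / (radio_mah + sleep_mah) / 365.0

# 5-minute interval (288/day): sleep current dominates, protocol barely matters
print(round(battery_life_years(288, 1), 1))     # UDP, ~45 years
print(round(battery_life_years(288, 13), 1))    # TCP full, ~44 years
# 10-second interval (8640/day): TCP's overhead now costs half the lifetime
print(round(battery_life_years(8640, 1), 1))    # UDP, ~41 years
print(round(battery_life_years(8640, 13), 1))   # TCP full, ~20 years
```

Sweeping `active_ma` up to Wi-Fi-class currents (tens of mA) reproduces the chapter’s claim that radio choice, not just protocol choice, decides when TCP overhead becomes fatal.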

11.8 Protocol Trade-off Analysis

Understanding the key trade-offs between transport protocols helps you make informed decisions for IoT deployments.

Tradeoff: Reliable Delivery vs Low Latency

Option A (TCP Reliable Delivery): Guarantees packet delivery through acknowledgments and retransmissions. Adds 1-3 RTT latency (50-300ms on typical IoT networks) for connection setup. Retransmission timeout (RTO) starts at 1 second, doubles on each retry. Total overhead: 244 bytes for single 10-byte sensor reading (handshake + data + teardown).

Option B (UDP Best-Effort): Zero connection latency, immediate transmission. Single packet overhead: 8-byte UDP header + 6-byte compressed IPv6 (6LoWPAN) = 14-byte header, 24 bytes total for a 10-byte reading. Average latency: 1-5ms on local networks. Packet loss rate depends on network: 0.1-1% on Wi-Fi, 5-30% on lossy wireless sensor networks.

Decision Factors: Choose TCP when data loss is unacceptable (firmware updates, financial transactions, actuator commands) and latency tolerance exceeds 100ms. Choose UDP when fresh data arrives frequently (sensor telemetry every 5-60 seconds) and occasional loss is acceptable. For critical IoT alerts over UDP, add application-layer reliability with CoAP Confirmable messages (2s initial timeout, exponential backoff).

Tradeoff: Connection State vs Connectionless Simplicity

Option A (TCP Connection State): Maintains per-connection state: sequence numbers, acknowledgment tracking, congestion window, receive buffer. Memory cost: 280-500 bytes per connection on constrained devices. Idle connections crossing NAT typically need a keepalive every 2-4 minutes to stop the mapping from expiring. Maximum connections limited by device RAM (e.g., ESP8266: 5-8 concurrent TCP connections).

Option B (UDP Connectionless): Zero connection state, zero memory per “connection.” Each datagram independent: source/destination ports, length, checksum only. Supports unlimited concurrent communication partners. Memory footprint: single 8-byte header structure reused for all transmissions.

Decision Factors: Choose TCP when you need long-lived bidirectional communication (persistent MQTT connections, command channels). Choose UDP when devices communicate with many endpoints (multicast sensor networks), have severe memory constraints (<10KB RAM), or need to support hundreds of simultaneous peers. For battery devices sleeping between transmissions, UDP avoids connection re-establishment overhead on each wake cycle.

Try It: Protocol Overhead and Battery Life Calculator

Adjust the sliders to see how payload size, transmission interval, and radio current affect protocol overhead and battery life.

Scenario: Manufacturing plant needs to monitor bearing vibration on 200 production machines to predict failures. System must detect anomalies within 100ms to prevent damage.

Requirements Analysis:

Sampling rate: 1000 Hz (1 sample per millisecond)
Sample size: 12 bytes (timestamp: 4B, X-axis: 2B, Y-axis: 2B, Z-axis: 2B, machine ID: 2B)
Data rate per machine: 1000 samples/sec × 12 bytes = 12 KB/sec
Total system: 200 machines × 12 KB/sec = 2.4 MB/sec

Real-time requirement: Detect anomaly within 100ms
Network: Gigabit Ethernet (low latency, <1ms)
Processing: Edge gateway runs anomaly detection locally
Alert delivery: To maintenance dashboard when threshold exceeded

Step 1: Evaluate Protocol Options for Sensor → Gateway

Option A: TCP

Pros:
- Reliable delivery (no lost samples)
- In-order delivery (critical for time-series)
- Flow control (prevents gateway overload)

Cons:
- Head-of-line blocking: Lost packet at T=0 delays packets at T=1,2,3...
- With 1000 Hz sampling, HOL blocking causes 10-50ms spikes
- Retransmission timeout: 1 second minimum (RFC 6298) = far exceeds 100ms alert window
- Connection overhead: 200 machines × 500 bytes TCP state = 100 KB RAM

Latency analysis:
- Normal: <1ms (Ethernet)
- With 0.1% packet loss: 1 packet/sec lost per machine
- Retransmit timeout: 1 second minimum (RFC 6298 default RTO)
- Alert delay on loss: ≥1 second (one RTO) >> 100ms requirement FAILED

Option B: UDP

Pros:
- No head-of-line blocking (lost packet doesn't delay future)
- Consistent latency: 1ms (no retransmission delays)
- No connection state: 0 bytes RAM overhead
- Packet loss: 0.1% means 1 sample/sec lost = acceptable for anomaly detection

Cons:
- Lost samples create data gaps
- Out-of-order delivery (Ethernet reordering rare but possible)
- No flow control (gateway must buffer bursts)

Latency analysis:
- Best case: <1ms
- Worst case: 1ms (no retransmit delays)
- With 0.1% loss: 999/1000 samples arrive on time
- Alert latency: 1-2ms << 100ms requirement PASSED

Decision for Sensor → Gateway: UDP

Reasoning: Real-time 100ms requirement makes TCP’s head-of-line blocking and retransmit timeouts unacceptable. 0.1% sample loss doesn’t prevent anomaly detection (999/1000 samples sufficient).
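The gateway-side anomaly detection that consumes this UDP sample stream can be sketched as a sliding-window RMS detector — the window size, metric, and threshold here are illustrative assumptions, not from the scenario:

```python
from collections import deque

class VibrationMonitor:
    """Sliding-window RMS detector over UDP vibration samples.

    Missing samples (UDP loss) simply never enter the window, so
    detection degrades gracefully instead of stalling -- the property
    that rules out TCP's head-of-line blocking here.
    """
    def __init__(self, window=100, rms_threshold=3.0):
        self.samples = deque(maxlen=window)
        self.rms_threshold = rms_threshold

    def add_sample(self, magnitude):
        self.samples.append(magnitude)
        return self.rms() > self.rms_threshold   # True => raise alert

    def rms(self):
        if not self.samples:
            return 0.0
        return (sum(s * s for s in self.samples) / len(self.samples)) ** 0.5

mon = VibrationMonitor(window=100, rms_threshold=3.0)
healthy = [mon.add_sample(1.0) for _ in range(100)]   # normal bearing
assert not any(healthy)
failing = [mon.add_sample(10.0) for _ in range(20)]   # failure signature
assert any(failing)
```

Each `add_sample` call is O(window) at worst, comfortably inside the 10 ms analysis budget for a 100-sample window at 1000 Hz.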

Step 2: Evaluate Protocol Options for Gateway → Dashboard (Alerts)

Alert characteristics:

Frequency: 5-10 alerts/hour per machine (normal operation)
Criticality: HIGH (must not lose alert)
Latency: <1 second acceptable (human response time)
Size: 200 bytes (machine ID, timestamp, severity, vibration metrics, predicted failure time)

Option A: UDP (for consistency with sensor path)

Pros:
- Low latency: ~1ms
- Consistent with sensor protocol

Cons:
- Alert loss risk: 0.1% loss means 1 in 1000 alerts lost
- For critical alerts, loss unacceptable
- No confirmation of delivery

Option B: TCP

Pros:
- Guaranteed delivery: No alert loss
- In-order delivery: Alerts arrive in sequence
- Connection persistence: Dashboard maintains single TCP connection

Cons:
- Connection setup: 3ms (one-time cost, then persistent)
- Head-of-line blocking: Not relevant (alerts are infrequent)
- Latency: <10ms (acceptable for human-scale response)

Cost analysis:
- Handshake: 3ms (once per dashboard connection)
- Per alert: 200 bytes + 40 bytes ACK = 240 bytes
- Latency: ~5ms average

Decision for Gateway → Dashboard: TCP

Reasoning: Alert loss is unacceptable (maintenance window missed = $10K+ machine damage). TCP’s 5ms latency is far below the 1-second requirement. Connection overhead amortizes across multiple alerts.

Step 3: Calculate Overall System Performance

Data Path: Sensor → Gateway (UDP) → Dashboard (TCP)

End-to-end latency for alert:
1. Sensor samples vibration: 0ms (continuous)
2. UDP transmission to gateway: 1ms
3. Gateway anomaly detection: 10ms (sliding window analysis)
4. TCP alert to dashboard: 5ms
Total: 16ms << 100ms requirement PASSED

Packet loss handling:
- Sensor UDP loss: 0.1% = 1 sample/sec lost per machine
- 999/1000 samples arrive, sufficient for anomaly detection
- Alert TCP loss: 0% (reliable delivery)
- Critical alerts always reach dashboard

Resource consumption:
- Gateway RAM: 0 bytes for UDP connections, 1 × 4KB (TCP state plus socket buffers) for dashboard TCP = 4KB total
- vs full TCP to every machine: 200 × 4KB = 800KB saved
- Gateway CPU: UDP socket receive + anomaly detection

Step 4: Verify Under Load

Stress test results:

Normal operation:
- 200 machines × 12 KB/sec = 2.4 MB/sec inbound UDP
- 10 alerts/hour = 1 TCP message every 360 seconds
- Total bandwidth: 2.4 MB/sec (no issues on Gigabit)

Peak failure scenario (10 machines failing simultaneously):
- Same 2.4 MB/sec sensor data (continuous)
- 10 alerts in <1 second via TCP
- Alert burst: 10 × 240 bytes = 2.4 KB in 1 second
- No congestion, no dropped alerts

Alert delivery verification:
- 30-day test: 200 machines × 10 alerts/hour × 24 hours/day × 30 days = 1,440,000 alerts sent
- Alerts received: 1,440,000 (100% delivery via TCP)
- False negatives: 0 (UDP sample loss did not prevent detection)

Key Insight: Hybrid protocol strategy (UDP for high-frequency telemetry, TCP for critical alerts) achieves both real-time performance and reliability. Using TCP for 1000 Hz sensor data would violate the 100ms requirement due to retransmit delays, while using UDP for alerts would risk losing critical failure notifications.

Lesson Learned: Different data flows in the same system have different requirements. High-frequency telemetry prioritizes latency (UDP), while infrequent critical alerts prioritize reliability (TCP). Don’t force a single protocol choice across heterogeneous data streams.

11.9 Concept Relationships

Common Misconceptions:

  • “One scenario = one protocol” - Wrong: Medical wearable example uses three protocols simultaneously
  • “Selection is permanent” - Wrong: Firmware updates can change protocol (CoAP → MQTT migration)
  • “Lab testing is sufficient” - Wrong: Field packet loss often 5-10× higher than lab (interference, obstacles)

Key Insight: The scenario analysis framework prevents two common failures: (1) choosing a protocol before understanding requirements, and (2) optimizing for one constraint (e.g., battery) while ignoring others (e.g., reliability). Always analyze the full requirement set.

11.10 See Also

Specifications:

  • RFC 7252: CoAP - UDP-based RESTful protocol design
  • RFC 8323: CoAP over TCP - TCP variant for firewall traversal
  • RFC 3551: RTP Profile for Audio/Video - UDP streaming protocol

11.11 Try It Yourself

Hands-On Exercise: Multi-Protocol Smart Building Design

Scenario: Design transport protocols for a 10-story office building with five systems.

System Requirements:

  1. HVAC sensors (500 units): Temperature/humidity every 60 seconds, battery-powered, 5-year life target
  2. Access control (200 door readers): Badge swipes (security-critical), mains-powered, <1s latency
  3. Occupancy sensors (500 units): PIR motion detectors, battery-powered, report every 10 seconds when occupied
  4. Fire alarms (100 units): Smoke/heat detectors, mains-powered with battery backup, life-safety critical
  5. Security cameras (50 units): 1080p video, mains-powered, 2 Mbps average, <200ms latency

Your Task: For each system:

  1. Select TCP or UDP (or hybrid)
  2. Add DTLS/TLS if needed
  3. Calculate overhead and justify
  4. Identify failure modes and mitigations

Template:

System: [Name]
Protocol: [TCP/UDP/DTLS/TLS]
Rationale:
  - Reliability: [critical/important/tolerable]
  - Latency: [<100ms/<1s/>1s]
  - Power: [battery/mains]
  - Security: [yes/no]
Overhead: [calculate bytes per message]
Failure modes: [what happens if packet lost?]
Mitigation: [how to handle failures?]

Click for solution

1. HVAC Sensors: UDP + CoAP NON

  • Rationale: Battery-powered (5 years = UDP essential), 60-sec interval (loss tolerable), no security (internal network)
  • Overhead: 18-byte header (6B IPv6 + 8B UDP + 4B CoAP) + 4-byte reading = 22 bytes total (82% overhead, 18% efficiency)
  • Failure: 1% loss = 1 reading/100 lost, averaged hourly (acceptable)
  • Mitigation: Cloud detects missing readings >5 min, flags sensor for maintenance

2. Access Control: TCP + TLS

  • Rationale: Security-critical (prevent unauthorized entry), mains-powered, <1s latency acceptable
  • Overhead: 200 bytes (badge data) + 40 bytes (TCP+TLS headers) = 240 bytes
  • Failure: Lost authorization = user denied (unacceptable) → TCP retransmit ensures delivery
  • Mitigation: TCP persistent connection (keep-alive), local cache (reader stores last 1000 badges for offline operation)

3. Occupancy Sensors: UDP + CoAP NON (when occupied), UDP + CoAP CON (on state change)

  • Rationale: Battery-powered + 10-sec interval = UDP critical, state change (occupied→vacant) must arrive
  • Overhead: 18 bytes for periodic “still occupied”, 26 bytes for confirmable “now vacant”
  • Failure: Lost “still occupied” → OK (next reading in 10s), lost “vacant” → lights stay on (wasteful)
  • Mitigation: CoAP CON for state transitions (occupied↔vacant), NON for periodic heartbeats
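The CON/NON policy above reduces to a tiny, testable rule; `coap_message_type` is an illustrative helper, not a library API:

```python
# Sketch of the occupancy-sensor policy: periodic "still occupied" heartbeats
# go out as non-confirmable (NON) CoAP messages, while state transitions use
# confirmable (CON) messages that the receiver must acknowledge.
def coap_message_type(prev_state: str, new_state: str) -> str:
    """Return 'CON' for occupied<->vacant transitions, 'NON' for heartbeats."""
    return "CON" if new_state != prev_state else "NON"

print(coap_message_type("occupied", "occupied"))  # NON (loss tolerable)
print(coap_message_type("occupied", "vacant"))    # CON (must arrive)
```

This keeps the common case (heartbeats) at minimum radio cost while spending retransmission effort only on the messages whose loss has a real consequence.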

4. Fire Alarms: TCP + UDP dual-path

  • Rationale: Life-safety = zero loss tolerance, <1s latency for evacuation
  • Overhead: 50 bytes alarm data sent via both TCP (reliable) and UDP (low latency)
  • Failure: If the TCP path is slowed by congestion, the UDP copy arrives first; if the UDP datagram is lost, TCP still delivers.
  • Mitigation: Dual-path redundancy, dedicated VLAN for fire alarm traffic
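A minimal sketch of the dual-path fan-out, with the two senders as stand-in callables (real code would hand the frame to a UDP socket and a TCP connection, and the receiver would de-duplicate by sequence number):

```python
# Dual-path fan-out: the same alarm payload is handed to two independent
# senders; whichever path delivers first wins. A failure on one path must
# never prevent the attempt on the other.
def send_dual_path(payload: bytes, seq: int, udp_send, tcp_send) -> dict:
    frame = seq.to_bytes(4, "big") + payload   # sequence number for de-dup
    results = {}
    for name, send in (("udp", udp_send), ("tcp", tcp_send)):
        try:
            send(frame)
            results[name] = "sent"
        except OSError as exc:                  # isolate per-path failures
            results[name] = f"failed: {exc}"
    return results

sent = []
status = send_dual_path(b"SMOKE zone-3", seq=7,
                        udp_send=sent.append, tcp_send=sent.append)
print(status)
```

Because both copies carry the same sequence number, the receiver treats the second arrival as a duplicate rather than a second alarm.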

5. Security Cameras: UDP + RTP

  • Rationale: Real-time display (<200ms), frame loss acceptable (brief glitch), high bitrate (TCP retransmit unacceptable)
  • Overhead: RTP adds 12 bytes per packet, <1% overhead at 2 Mbps
  • Failure: Lost frame = brief pixelation, auto-recovers on next I-frame
  • Mitigation: Adaptive bitrate (reduce quality if loss >5%), local recording (TCP for playback)
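The loss-driven adaptive-bitrate mitigation can be sketched as a simple controller; the thresholds, step sizes, and floor/ceiling values are illustrative assumptions:

```python
# Loss-driven bitrate controller: back off hard when RTP loss exceeds 5%,
# creep back up when the path is clean, hold steady in between.
def adjust_bitrate(current_kbps: int, loss_pct: float,
                   floor_kbps: int = 500, ceiling_kbps: int = 2000) -> int:
    if loss_pct > 5.0:
        return max(floor_kbps, int(current_kbps * 0.75))   # multiplicative decrease
    if loss_pct < 1.0:
        return min(ceiling_kbps, int(current_kbps * 1.05)) # gentle upward probe
    return current_kbps                                    # 1-5% band: hold

print(adjust_bitrate(2000, 8.0))  # 1500: congested, cut quality
print(adjust_bitrate(1500, 0.2))  # 1575: clean path, probe upward
```

The asymmetry (fast decrease, slow increase) mirrors standard congestion-control intuition: recovering picture quality slowly is visible but harmless, while continuing to overload a lossy path is not.
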
Key Insight: All five systems use different protocols based on specific requirements. Forcing TCP or UDP everywhere would fail critical constraints (battery life, latency, reliability).

Challenge: Add a sixth system, elevator position sensors (50 units). What protocol would you choose, and why?

Common Pitfalls

IoT systems that use only one transport path (e.g., MQTT over LTE with no backup) become completely unavailable when the primary path fails. Design for graceful degradation: if cloud connectivity fails, continue local data collection and buffering; if TCP fails, attempt UDP; if primary APN fails, try backup APN. Document the failover logic and test it explicitly: disconnect the primary path and verify the system falls back to secondary without human intervention.
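The fallback chain described here can be made explicit and testable rather than left implicit in error-handling code; `send_with_fallback` and the transport names are illustrative stand-ins for real senders (MQTT over TCP, CoAP over UDP, a backup APN):

```python
# Failover chain: try each transport in priority order and report which one
# succeeded, so the degradation path is explicit and can be unit-tested by
# injecting a failing primary.
def send_with_fallback(payload, transports):
    """transports: ordered list of (name, send_callable) pairs."""
    for name, send in transports:
        try:
            send(payload)
            return name              # first path that works wins
        except OSError:
            continue                 # degrade to the next path
    return None                      # total outage: caller must buffer locally

def broken(_):
    raise OSError("link down")       # simulated dead primary for testing

delivered = []
path = send_with_fallback(b"telemetry",
                          [("primary-tcp", broken),
                           ("fallback-udp", delivered.append)])
print(path)
```

Testing the partition case is as simple as passing only failing transports and asserting the function returns `None`, which exercises the local-buffering branch without touching a real network.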

Scenarios involving edge compute (local analytics, pre-processing) require protocols that can communicate with both constrained devices (CoAP/MQTT) and enterprise systems (HTTP/REST, OPC-UA). A scenario design that uses CoAP device-to-cloud without considering edge integration will require a full protocol translation layer when edge analytics is added later. Design the protocol architecture end-to-end from sensor to enterprise system, identifying translation points at the start.

Cross-border IoT data flows are subject to: EU GDPR (data must not leave EU without adequate protection), China PIPL (personal data processing requirements), India PDPB (data localization for sensitive data). A global fleet tracking scenario transmitting location data through a single cloud region may violate regulations in multiple jurisdictions. Design multi-region scenarios with regional data processing and storage before cross-region aggregation, and review protocol endpoints for compliance with data residency requirements.

Network partitions (split brain: device and server both running but disconnected) cause different protocol behavior: MQTT broker maintains QoS 1 message queue for reconnection; CoAP requests time out and fail; TCP connections enter half-open state. Scenario designs must specify: what happens to each message type during a 1-hour partition? Which messages are buffered, which are dropped, which trigger alarms? Test network partition scenarios explicitly rather than assuming smooth reconnection.
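One way to answer the "what happens to each message type during a partition?" question is to encode the policy as an explicit, reviewable table rather than leaving it to emergent protocol behavior; the categories and retention limits below are illustrative:

```python
# Explicit partition policy: for each message type, decide up front whether it
# is buffered for replay after reconnection or dropped as stale.
PARTITION_POLICY = {
    "telemetry": {"action": "buffer", "max_age_s": 3600},  # replay up to 1 hour
    "heartbeat": {"action": "drop",   "max_age_s": 0},     # stale on arrival
    "alarm":     {"action": "buffer", "max_age_s": None},  # never expire
}

def on_partition(msg_type: str, buffered: list, msg) -> str:
    """Apply the policy to one message while the uplink is down."""
    policy = PARTITION_POLICY.get(msg_type, {"action": "drop"})  # default: drop
    if policy["action"] == "buffer":
        buffered.append(msg)
    return policy["action"]

queue = []
print(on_partition("alarm", queue, "SMOKE zone-3"))   # buffer
print(on_partition("heartbeat", queue, "hb-1042"))    # drop
```

Writing the table down forces the design review the pitfall calls for: every message type gets a deliberate decision, and the default for unclassified types is visible in one place.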

11.12 What’s Next

Build on your transport protocol selection knowledge:

| Next Chapter | What You Will Learn | Why It Matters |
| --- | --- | --- |
| Transport Optimizations and Implementation | Configure TCP keep-alive, implement UDP reliability, tune DTLS session resumption | Apply the selection decisions from this chapter in real device firmware |
| CoAP Protocol Fundamentals | Design CoAP request/response and observe patterns over UDP | Construct the application-layer reliability layer that makes UDP viable for critical IoT data |
| MQTT Protocol Fundamentals | Implement publish-subscribe messaging over TCP with QoS levels | Compare MQTT’s TCP-based reliability model against CoAP’s UDP-based approach |
| DTLS and Security | Configure DTLS handshake, session resumption, and cipher suite selection | Secure the UDP data flows identified in this chapter without sacrificing efficiency |
| Industrial IoT Case Studies | Analyze real factory automation protocol stacks | Evaluate how industrial deployments combine multiple transport strategies across heterogeneous data flows |
| Protocol Testing and Validation | Capture and analyze protocol packets with Wireshark; measure actual overhead | Validate your protocol selection decisions against real-world traffic measurements |