6  TCP vs UDP: Comparison and Selection

In 60 Seconds

The TCP vs UDP choice for IoT comes down to three trade-offs: TCP’s guaranteed delivery costs 20-60 byte headers and 3-4x more energy per transmission for small payloads (widening under real-world packet loss and keep-alives); persistent TCP connections consume 4KB RAM each (40MB+ for 10,000 devices); and transport-layer reliability (TCP) retransmits everything blindly, while application-layer reliability (CoAP over UDP) lets you selectively retry only critical messages. Use TCP for firmware updates and commands; use UDP for periodic sensor telemetry.

Key Concepts
  • TCP Header Size: 20 bytes minimum (without options); connection-oriented, provides: sequence numbers, acknowledgments, window-based flow control, congestion control, error detection
  • UDP Header Size: 8 bytes fixed; connectionless, provides: source/dest port, length, checksum only; all reliability, ordering, and flow control must be implemented at application layer
  • Throughput vs Latency Trade-off: TCP congestion control reduces throughput during loss but prevents network collapse; UDP delivers at line rate but can cause congestion collapse without application-layer rate limiting
  • NAT Traversal: TCP: straightforward (NAT maintains state for each TCP connection); UDP: challenging (no connection to maintain NAT binding; requires STUN/TURN for peer-to-peer UDP)
  • TCP Half-Open Connection: TCP connection state where one side has crashed, rebooted, or lost connectivity without closing the connection, while the other side still considers it established; half-open sockets consume server resources; mitigated by TCP keepalive and application timeouts
  • UDP Broadcast/Multicast: UDP supports broadcast (255.255.255.255) and multicast (224.0.0.0/4) natively; TCP does not support broadcast or multicast; key advantage of UDP for IoT device discovery
  • TCP Nagle Algorithm: Optimization buffering small TCP writes until ACK received or buffer reaches MSS; reduces packet count but adds latency for interactive applications; disable with TCP_NODELAY for real-time IoT
  • SCTP (Stream Control Transmission Protocol): Alternative transport providing: message-oriented delivery (like UDP), multi-streaming (like multiplexed TCP), and reliability (like TCP); rarely used in IoT but relevant for comparison

6.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Evaluate TCP vs UDP trade-offs for IoT scenarios based on reliability, battery, and bandwidth costs
  • Calculate the battery life impact of protocol choice for battery-powered sensor deployments
  • Select the appropriate transport protocol by applying a structured decision framework to real IoT deployment scenarios
  • Distinguish between transport-layer and application-layer reliability for different IoT data types
  • Justify architecture decisions by quantifying connection overhead, memory costs, and gateway scalability limits
  • Diagnose common protocol misconfigurations such as incorrect TCP keep-alive timers and MTU fragmentation errors

TCP and UDP are two different ways to send data across a network. TCP is like sending a registered letter – you get confirmation it arrived, and missing pieces are resent. UDP is like broadcasting a radio signal – it is faster but there is no guarantee every listener hears every word. IoT devices use both depending on the situation.

“Let me settle the TCP vs UDP debate once and for all,” said Max the Microcontroller. “It depends on three things: can you tolerate lost data, are you battery-powered, and how often do you transmit?”

“If you are a smoke detector sending a fire alarm, use TCP,” said Sammy the Sensor. “That message MUST arrive. But if you are a temperature sensor sending readings every 30 seconds, use UDP. Losing one reading is not a crisis.”

“Connection overhead matters more than people think,” added Lila the LED. “Each TCP connection keeps 4 KB of state in memory. If your gateway handles 10,000 devices with persistent TCP connections, that is 40 megabytes of RAM just for connection state! UDP devices need zero connection memory on the gateway.”

“And there is a middle ground,” said Bella the Battery. “CoAP runs over UDP but adds selective reliability – you can mark individual messages as confirmable. That way you get UDP’s efficiency for routine readings and TCP-like reliability for critical commands, all without the full TCP overhead.”

6.2 Engineering Tradeoffs

Tradeoff: TCP Reliability vs UDP Efficiency

Option A: TCP - Guaranteed delivery via acknowledgments and retransmission, ordered packet delivery, flow/congestion control. Cost: 20-60 byte header, 3-way handshake overhead, 3-4x more energy per transmission for small payloads

Option B: UDP - Minimal 8-byte header, no connection state, immediate transmission, 60-78% less header overhead than TCP (8 bytes vs 20-36 bytes with options). Cost: No delivery guarantees, no ordering, application must handle reliability if needed

Decision factors: Use TCP when data loss is catastrophic (firmware updates, financial transactions, configuration). Use UDP when data is replaceable (periodic sensor readings where next value arrives soon), real-time delivery matters more than completeness (video streaming), or battery life is critical. Consider CoAP over UDP for selective reliability without full TCP overhead.
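
The header percentages quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the function names are hypothetical, and the 36-byte figure is the "TCP with options" size used in Option B.

```c
#include <assert.h>

/* Header sizes in bytes, as quoted in the tradeoff above. */
enum { UDP_HEADER = 8, TCP_HEADER_MIN = 20, TCP_HEADER_WITH_OPTIONS = 36 };

/* Percentage of header bytes saved by UDP relative to a TCP header
 * of tcp_header bytes (hypothetical helper name). */
double udp_header_saving_pct(int tcp_header)
{
    assert(tcp_header >= TCP_HEADER_MIN);
    return 100.0 * (tcp_header - UDP_HEADER) / tcp_header;
}

/* Share of a transport segment that is header rather than payload. */
double header_share_pct(int header, int payload)
{
    assert(header > 0 && payload >= 0);
    return 100.0 * header / (header + payload);
}
```

Calling udp_header_saving_pct(TCP_HEADER_MIN) and udp_header_saving_pct(TCP_HEADER_WITH_OPTIONS) gives the two ends of the 60-78% range. header_share_pct shows how much of a small segment is header: for a 50-byte payload, about 14% with UDP versus about 29% with a minimal TCP header.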

Tradeoff: Connection-Oriented vs Connectionless Communication

Option A: Persistent TCP Connections - Keep-alive maintains connection state, reduces handshake latency for subsequent messages, enables bidirectional push notifications. Cost: Memory for connection state (4KB per connection), keep-alive battery drain (2-10% overhead), gateway scalability limits

Option B: Connectionless UDP Messages - No state maintained between transmissions, each message independent, so the number of devices is not limited by per-connection memory on the gateway. Cost: No push notifications without polling, must re-establish context per message, no automatic retry

Decision factors: Use persistent connections for devices requiring bidirectional communication (thermostats receiving commands) or high-frequency data streams. Use connectionless for battery-powered sensors sending periodic telemetry, especially at intervals >1 minute where connection overhead exceeds data overhead. Gateway with 10,000 persistent TCP connections needs 40MB+ RAM just for state.
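
The gateway memory figure follows directly from the per-connection cost. The sketch below assumes the 4 KB of state per connection quoted above; real TCP stacks vary, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed per-connection TCP state, taken from the 4 KB figure above. */
#define TCP_STATE_BYTES_PER_CONN 4096u

/* RAM needed on a gateway just to hold TCP connection state. */
size_t tcp_state_bytes(size_t connections)
{
    assert(connections <= SIZE_MAX / TCP_STATE_BYTES_PER_CONN);
    return connections * TCP_STATE_BYTES_PER_CONN;
}

/* How many persistent connections fit in a given RAM budget. */
size_t max_connections(size_t ram_budget_bytes)
{
    return ram_budget_bytes / TCP_STATE_BYTES_PER_CONN;
}
```

For 10,000 devices, tcp_state_bytes returns 40,960,000 bytes, which is the "40MB+" figure. Running max_connections against a gateway's free RAM gives a quick upper bound before any application buffers are counted.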

Tradeoff: Transport-Layer vs Application-Layer Reliability

Option A: Transport-Layer Reliability (TCP) - Automatic retransmission of all lost packets, in-order delivery guaranteed, transparent to application. Cost: Head-of-line blocking delays fresh data behind lost packets, blind retry of stale data, overkill for replaceable telemetry

Option B: Application-Layer Reliability (CoAP CON over UDP) - Selective reliability per message, exponential backoff retries, application decides what matters. Cost: Must implement retry logic, message IDs for deduplication, more complex application code

Decision factors: TCP works best for bulk transfers (firmware) where every byte matters equally. Application-layer reliability (CoAP Confirmable) works best for IoT telemetry where some messages are critical (alarms) and others expendable (routine readings). In lossy networks (10%+ packet loss), application-layer selective retry often achieves better practical reliability than TCP’s head-of-line blocking.
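
Selective reliability can be sketched as a per-message policy plus a retransmission schedule. The timing constants below are the RFC 7252 defaults (ACK_TIMEOUT of 2 seconds, MAX_RETRANSMIT of 4). The message classification and function names are hypothetical, and the random factor that CoAP applies to the initial timeout is left out to keep the example deterministic.

```c
#include <assert.h>
#include <stdbool.h>

/* Default CoAP transmission parameters from RFC 7252. */
#define ACK_TIMEOUT_MS 2000u
#define MAX_RETRANSMIT 4u

/* Hypothetical application-level classification of outgoing messages. */
typedef enum { MSG_ROUTINE, MSG_CRITICAL } msg_class_t;

/* Critical messages go out as Confirmable (CON); routine ones as NON. */
bool needs_ack(msg_class_t cls)
{
    return cls == MSG_CRITICAL;
}

/* Timeout to wait after transmission number `attempt` (0 = first send).
 * The timeout doubles on every retransmission (exponential backoff). */
unsigned retransmit_timeout_ms(unsigned attempt)
{
    assert(attempt <= MAX_RETRANSMIT);
    return ACK_TIMEOUT_MS << attempt;
}

/* True if the sender should transmit again after `retransmits_done`
 * unacknowledged retransmissions. Routine messages are never retried. */
bool should_retransmit(msg_class_t cls, unsigned retransmits_done)
{
    return needs_ack(cls) && retransmits_done < MAX_RETRANSMIT;
}
```

A routine reading is sent once and forgotten, while an alarm is retried after 2, 4, 8, and 16 seconds before the sender gives up. TCP would instead retransmit both kinds of data and hold newer readings behind the lost segment.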


Worked Example: Calculating Battery Life Impact of TCP vs UDP

Scenario: A soil moisture sensor in a smart agriculture deployment sends 50-byte readings every 15 minutes. The sensor runs on a 2000 mAh battery at 3.3V. Calculate how protocol choice affects battery life.

Given:

  • Payload size: 50 bytes
  • Transmission interval: 15 minutes (96 times/day)
  • Battery capacity: 2000 mAh at 3.3V
  • Radio TX current: 120 mA (typical for Wi-Fi module)
  • Radio RX current: 80 mA
  • Sleep current: 10 uA
  • Transmission rate: 1 Mbps

Step 1: Calculate UDP overhead per transmission

UDP packet structure:
- IP header: 20 bytes
- UDP header: 8 bytes
- Payload: 50 bytes
- Total: 78 bytes = 624 bits

Transmission time: 624 bits / 1 Mbps = 0.624 ms
Energy per TX: 120 mA × 0.624 ms = 0.075 mAs

Daily energy (UDP):
- 96 transmissions × 0.075 mAs = 7.2 mAs/day

Step 2: Calculate TCP overhead per transmission

TCP per-message exchange:
1. SYN: 40 bytes (IP 20 + TCP 20 header, no payload)
2. SYN-ACK: 40 bytes (receive)
3. ACK: 40 bytes (send, acknowledgment of SYN-ACK)
4. DATA: 90 bytes (IP 20 + TCP 20 header + 50-byte payload)
5. DATA-ACK: 40 bytes (receive)
6. FIN: 40 bytes (send, connection teardown)
7. FIN-ACK: 40 bytes (receive)

Total TX: 40 + 40 + 90 + 40 = 210 bytes = 1680 bits
Total RX: 40 + 40 + 40 = 120 bytes = 960 bits

TX time: 1680 bits / 1 Mbps = 1.68 ms
RX time: 960 bits / 1 Mbps = 0.96 ms

Energy per exchange:
- TX: 120 mA × 1.68 ms = 0.202 mAs
- RX: 80 mA × 0.96 ms = 0.077 mAs
- Total: 0.279 mAs per transmission

Daily energy (TCP):
- 96 transmissions × 0.279 mAs = 26.8 mAs/day
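
Steps 1 and 2 can be checked with a small helper. This is a sketch under the same assumptions as the hand calculation (IPv4 header without options, TCP header without options, on-air time only); the function names are hypothetical.

```c
#include <assert.h>

#define IP_HEADER  20   /* bytes, IPv4 without options */
#define UDP_HEADER  8   /* bytes */
#define TCP_HEADER 20   /* bytes, no options */

/* Charge drawn (in mA*s) to move `bytes` over the air at `rate_bps`
 * while the radio draws `current_ma`. */
double charge_mas(int bytes, double current_ma, double rate_bps)
{
    assert(bytes >= 0 && rate_bps > 0.0);
    return current_ma * (bytes * 8.0 / rate_bps);
}

/* One UDP datagram: a single transmitted packet. */
double udp_exchange_mas(int payload, double tx_ma, double rate_bps)
{
    return charge_mas(IP_HEADER + UDP_HEADER + payload, tx_ma, rate_bps);
}

/* One per-message TCP exchange: SYN, ACK, DATA, FIN sent;
 * SYN-ACK, DATA-ACK, FIN-ACK received. */
double tcp_exchange_mas(int payload, double tx_ma, double rx_ma, double rate_bps)
{
    int bare = IP_HEADER + TCP_HEADER;           /* 40-byte control segment */
    int tx_bytes = 3 * bare + (bare + payload);  /* 210 bytes for a 50-byte payload */
    int rx_bytes = 3 * bare;                     /* 120 bytes */
    return charge_mas(tx_bytes, tx_ma, rate_bps)
         + charge_mas(rx_bytes, rx_ma, rate_bps);
}
```

With a 50-byte payload, 120 mA TX, 80 mA RX, and 1 Mbps, the two functions reproduce the 0.075 mAs and 0.279 mAs figures, a ratio of about 3.7 in radio charge per message.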

Step 3: Calculate battery life comparison

Sleep energy (both protocols):
- 23.9 hours/day in sleep (accounting for TX time)
- 10 uA × 23.9 hours = 0.239 mAh/day = 860.4 mAs/day

UDP total daily consumption:
- TX energy: 7.2 mAs = 0.002 mAh
- Sleep: 860.4 mAs = 0.239 mAh
- Total: 0.241 mAh/day
- Battery life: 2000 mAh / 0.241 mAh/day = 8,299 days = 22.7 years

TCP total daily consumption:
- TX/RX energy: 26.8 mAs = 0.007 mAh
- Sleep: 860.4 mAs = 0.239 mAh
- Total: 0.246 mAh/day
- Battery life: 2000 mAh / 0.246 mAh/day = 8,130 days = 22.3 years
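
The Step 3 budget can be expressed as two small functions. This is a sketch with hypothetical names; it takes the daily radio charge from Steps 1 and 2 as input and uses the same 23.9 sleep hours per day as the hand calculation.

```c
#include <assert.h>

/* Daily consumption in mAh: radio charge (given in mA*s per day) converted
 * to mAh, plus sleep current (mA) times hours asleep per day. */
double daily_mah(double radio_mas_per_day, double sleep_ma, double sleep_hours)
{
    assert(radio_mas_per_day >= 0.0 && sleep_ma >= 0.0 && sleep_hours >= 0.0);
    return radio_mas_per_day / 3600.0 + sleep_ma * sleep_hours;
}

/* Battery life in days for a given capacity and daily consumption. */
double battery_life_days(double capacity_mah, double daily_consumption_mah)
{
    assert(daily_consumption_mah > 0.0);
    return capacity_mah / daily_consumption_mah;
}
```

Passing 7.2 mAs/day (UDP) or 26.8 mAs/day (TCP) with a 0.010 mA sleep current and 23.9 sleep hours gives roughly 0.241 and 0.246 mAh/day. Without intermediate rounding, the TCP result comes out slightly below the hand-calculated 8,130 days, at about 8,115 days.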

Step 4: Factor in real-world overhead (retransmissions, keep-alives)

With 5% packet loss requiring retransmission:
- UDP + CoAP CON: TX overhead × 1.05 = 7.56 mAs/day
  Total: (7.56 + 860.4) / 3600 ≈ 0.241 mAh/day -> 22.7 years
- TCP: TX/RX overhead × 1.15 = 30.8 mAs/day (head-of-line blocking adds cascading retransmits)
  Total: (30.8 + 860.4) / 3600 ≈ 0.248 mAh/day -> 22.1 years

With TCP keep-alive every 30 minutes (if connection persisted):
- 48 keep-alives/day × 40 bytes (0.32 ms at 1 Mbps) × 120 mA = 1.84 mAs/day ≈ 0.0005 mAh/day
- New TCP total: (30.8 + 1.84 + 860.4) / 3600 ≈ 0.248 mAh/day -> 22.1 years

Result:

| Protocol | Daily Energy | Battery Life | Difference |
|---|---|---|---|
| UDP (ideal) | 0.241 mAh | 22.7 years | Baseline |
| TCP (ideal) | 0.246 mAh | 22.3 years | -2% |
| UDP + CoAP (5% loss) | 0.241 mAh | 22.7 years | <1% |
| TCP (5% loss + keepalive) | 0.248 mAh | 22.1 years | -3% |

Key Insight: Sleep current dominates total energy consumption for low duty-cycle sensors, which compresses the protocol difference. Per message, TCP draws about 3.7x the radio charge of UDP (0.279 vs 0.075 mAs), yet even with 5% loss and keep-alives it shortens battery life by only about 3% (roughly 0.6 years) in this model. The gap grows as the duty cycle rises, because radio energy becomes a larger share of the daily budget. The model also counts only on-air time: radio wake-up and the time spent listening for ACKs penalize TCP's multi-packet exchange more than a single UDP datagram. For very low duty-cycle sensors, optimizing wake time and sleep current has greater impact than protocol choice alone.


6.3 Common Pitfalls

Pitfall: Using Default TCP Keep-Alive Timers for IoT

The Mistake: Developers deploy MQTT or HTTP connections with default TCP keep-alive settings (typically 2 hours on Linux, 7200 seconds), assuming the OS handles connection maintenance appropriately for IoT devices.

Why It Happens: TCP keep-alive is often configured at the OS level, not the application level. Developers focus on application logic and assume “keep-alive” means the connection stays alive automatically. They don’t realize the default 2-hour interval is designed for desktop computers, not battery-powered sensors or NAT-traversing IoT devices.

The Fix: Configure keep-alive intervals based on your deployment constraints:

  • NAT timeout avoidance: Set keep-alive to 30-60 seconds (most consumer NATs timeout idle connections at 60-300 seconds)
  • Battery-powered devices: Set keep-alive to 5-15 minutes and accept occasional reconnections, or use UDP-based protocols instead
  • Cellular IoT: Set keep-alive to 25-28 seconds (many carrier NATs timeout at 30 seconds)
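
#include <sys/socket.h>
#include <netinet/in.h>    // IPPROTO_TCP
#include <netinet/tcp.h>   // TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT

// Linux: start keep-alive probes after 60 seconds of idle time.
// sock is an already-created TCP socket descriptor.
int keepalive = 1;
int keepidle = 60;    // Start probes after 60s idle (not 7200s)
int keepintvl = 10;   // Probe every 10s (not 75s)
int keepcnt = 3;      // 3 failed probes = dead (not 9)

// Each call returns 0 on success and -1 on error (check errno).
setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &keepalive, sizeof(keepalive));
setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &keepidle, sizeof(keepidle));
setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &keepintvl, sizeof(keepintvl));
setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &keepcnt, sizeof(keepcnt));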

Impact: With default 2-hour keep-alive, NAT-traversing IoT connections silently die within minutes. Users report “random disconnections” that are actually predictable NAT timeout failures. Proper configuration (60s keep-alive) adds only ~1,440 small packets/day but maintains reliable connectivity.

Pitfall: Ignoring MTU and Fragmentation for IoT Payloads

The Mistake: Developers send large UDP datagrams (500-1500 bytes) without considering the path MTU, causing IP fragmentation that dramatically increases packet loss rates on lossy wireless networks.

Why It Happens: On wired Ethernet, the 1500-byte MTU rarely causes problems. Developers test on Wi-Fi (also 1500 MTU at IP layer) and assume it works everywhere. They don’t realize that 6LoWPAN has 127-byte frames, LoRaWAN has 51-222 byte limits, and LTE-M/NB-IoT have varying MTUs. When a packet exceeds the path MTU, IP fragmentation occurs silently.

The Fix: Design payloads to fit within the smallest expected MTU on your network path:

| Network Type | IP MTU | Safe Payload (UDP) | Safe Payload (TCP) |
|---|---|---|---|
| Ethernet/Wi-Fi | 1500 bytes | 1472 bytes | 1460 bytes |
| 6LoWPAN (uncompressed IPv6, 40-byte IP header) | 1280 bytes | 1232 bytes | 1220 bytes |
| 6LoWPAN (compressed) | ~100-200 bytes | ~80-180 bytes | ~60-160 bytes |
| LoRaWAN SF7 | 222 bytes | 214 bytes | 202 bytes |
| LoRaWAN SF12 | 51 bytes | 43 bytes | 31 bytes |
| NB-IoT | ~1358 bytes | 1330 bytes | 1318 bytes |

Fragmentation math: A single 1000-byte UDP packet over 6LoWPAN (127-byte frames) requires 8+ fragments. With 5% per-frame loss rate, delivery probability drops to (0.95)^8 = 66%. The same data sent as 8 independent 125-byte packets with application-level ACKs achieves 95%+ delivery because each packet can be individually retransmitted.

Best practice: For constrained networks, keep UDP payloads under 100 bytes when possible. Use CoAP block-wise transfer for larger data, which handles fragmentation at the application layer with proper retransmission per block.
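
The fragmentation math above can be reproduced with a short sketch. It assumes each fragment is lost independently with the same probability; the function names are hypothetical. Passing the full 127-byte frame size as the usable size gives the lower bound of 8 fragments, and real 6LoWPAN frames carry fewer usable bytes once link-layer and adaptation headers are subtracted.

```c
#include <assert.h>

/* Number of link-layer fragments needed for a datagram, given the usable
 * bytes per frame (ceiling division). */
int fragments_needed(int datagram_bytes, int usable_bytes_per_frame)
{
    assert(datagram_bytes > 0 && usable_bytes_per_frame > 0);
    return (datagram_bytes + usable_bytes_per_frame - 1) / usable_bytes_per_frame;
}

/* Probability that every fragment arrives when each one is lost
 * independently with probability `loss`. A single missing fragment
 * loses the whole datagram. */
double delivery_probability(int fragments, double loss)
{
    assert(fragments >= 0 && loss >= 0.0 && loss <= 1.0);
    double p = 1.0;
    for (int i = 0; i < fragments; i++)
        p *= 1.0 - loss;
    return p;
}

/* Probability that a single packet gets through within `attempts` tries,
 * as with application-level ACKs and retransmission. */
double delivery_with_retries(double loss, int attempts)
{
    assert(attempts >= 1 && loss >= 0.0 && loss <= 1.0);
    double all_lost = 1.0;
    for (int i = 0; i < attempts; i++)
        all_lost *= loss;
    return 1.0 - all_lost;
}
```

delivery_probability(8, 0.05) gives about 0.66, the figure quoted above. delivery_with_retries(0.05, 2) shows why independently acknowledged packets do better: one retry already lifts per-packet delivery above 99%.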


6.4 Summary

6.4.1 Protocol Selection Quick Reference

| Scenario | Protocol | Reasoning | Battery Impact |
|---|---|---|---|
| Temperature sensor (every 5 min) | UDP | Next reading arrives soon, loss tolerable | 45+ years |
| Firmware update (500 KB) | TCP+TLS | Must be 100% reliable and secure | Acceptable (infrequent) |
| Smart lock (command) | UDP+DTLS+CoAP CON | Security critical, low latency | 20+ years |
| Video stream (1080p) | UDP+RTP | Real-time matters more than perfection | N/A (mains) |
| Health alert (critical) | TCP+TLS | Cannot lose alarm data | Acceptable (rare) |

Decision Framework:

1. Is data loss catastrophic? → YES: TCP
2. Is transmission frequent (>1/min)? → YES: UDP
3. Is security required? → YES: Add DTLS/TLS
4. Is latency critical (<100ms)? → YES: UDP
5. Is device battery-powered? → YES: Prefer UDP
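
One way to encode the five questions is shown below. The framework does not say how to resolve conflicting answers or what to choose when no question applies, so this sketch makes two assumptions: question 1 takes priority over the UDP-leaning questions, and TCP is the fallback. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { TRANSPORT_TCP, TRANSPORT_UDP } transport_t;

typedef struct {
    bool loss_catastrophic;   /* question 1 */
    bool frequent;            /* question 2: more than one message per minute */
    bool needs_security;      /* question 3 */
    bool latency_critical;    /* question 4: under 100 ms */
    bool battery_powered;     /* question 5 */
} scenario_t;

typedef struct {
    transport_t transport;
    bool secure;              /* TLS over TCP, DTLS over UDP */
} choice_t;

/* Assumption: catastrophic loss always selects TCP; otherwise any
 * UDP-leaning answer selects UDP, and TCP is the default. */
choice_t select_transport(scenario_t s)
{
    choice_t c = { TRANSPORT_TCP, s.needs_security };
    if (!s.loss_catastrophic &&
        (s.frequent || s.latency_critical || s.battery_powered))
        c.transport = TRANSPORT_UDP;
    return c;
}

const char *transport_name(transport_t t)
{
    assert(t == TRANSPORT_TCP || t == TRANSPORT_UDP);
    return t == TRANSPORT_TCP ? "TCP" : "UDP";
}
```

A firmware update (loss catastrophic, security required) maps to TCP with TLS, and a battery-powered temperature sensor maps to plain UDP. A deployment with different priorities would reorder the checks.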

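Key Metrics (10-byte payload, every 10 seconds, 2000 mAh battery at 120 mA TX, 1 Mbps):

  • UDP: 38 bytes (IP 20 + UDP 8 + payload 10), 0.3 ms radio, ~45-year battery
  • TCP (per exchange): 170 bytes TX + 120 bytes RX = 290 bytes total, 1.4 ms TX + 1.0 ms RX, ~8-year battery
  • DTLS over UDP: ~53 bytes (UDP + DTLS record header), ~0.4 ms radio, ~22-year battery

These battery estimates are dominated by radio energy and imply a sleep current well below the 10 uA of the worked example, where sleep alone caps battery life at about 23 years.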

6.5 Key Takeaways

Summary

Core Concepts:

  • Transport protocols provide end-to-end communication between applications: UDP (connectionless, 8-byte header) vs TCP (connection-oriented, 20-60 byte header)
  • UDP offers minimal overhead and low power consumption but provides no delivery guarantees, ordering, or flow control
  • TCP ensures reliable, ordered delivery through acknowledgments and retransmissions but requires 3-way handshake and connection state
  • DTLS (Datagram TLS) adds security to UDP while maintaining its low-overhead characteristics for IoT applications

Practical Applications:

  • Use UDP for periodic sensor telemetry where fresh data replaces old (temperature readings every 60 seconds, video streaming)
  • Use TCP for critical operations requiring reliability (firmware updates, configuration changes, critical health alerts)
  • CoAP adds application-layer reliability to UDP through Confirmable messages without TCP’s connection overhead
  • Hybrid approach: UDP for routine telemetry, TCP for critical operations in the same IoT system

Design Considerations:

  • UDP checksum is mandatory in IPv6 and critical for detecting bit errors from radio interference in wireless IoT networks
  • TCP persistent connections drain batteries through keep-alive packets (2-10% battery overhead) and connection state maintenance
  • For small payloads, UDP’s 8-byte header vs TCP’s 20-byte header saves 12 bytes per message; on a 20-byte sensor reading that shrinks the transport segment from 40 to 28 bytes (30% smaller)
  • Application-level reliability on UDP (CoAP CON messages) balances efficiency with selective reliability better than full TCP

Common Pitfalls:

  • Choosing TCP for periodic telemetry with acceptable loss (wastes energy on connection overhead for replaceable data)
  • Assuming UDP always means unreliable (CoAP Confirmable messages add reliability with exponential backoff retransmission)
  • Maintaining persistent TCP connections for thousands of IoT devices (gateway memory exhaustion, keep-alive battery drain)
  • Using TCP for real-time applications where retransmission delays cause head-of-line blocking (video, voice)

6.6 Emerging Analysis: QUIC as a Third Option for IoT Transport

While TCP and UDP have dominated IoT transport for two decades, QUIC (RFC 9000) is emerging as a potential third option that addresses specific pain points of both protocols. Understanding where QUIC fits – and where it does not – helps future-proof IoT architecture decisions.

What QUIC Offers IoT:

QUIC combines UDP’s low-latency connection setup with TCP’s reliability, plus built-in TLS 1.3 encryption. For IoT, three features are particularly relevant:

  1. 0-RTT Connection Resumption: A device that has previously connected to a server can send data in the very first packet of a reconnection, with zero round trips of handshake delay. For a sensor that wakes hourly, this eliminates both the TCP 3-way handshake (1 RTT before data can be sent) and the TLS handshake (1 RTT for TLS 1.3, 2 RTT for TLS 1.2), saving 2-3 round trips per connection.

  2. Multiplexed Streams Without Head-of-Line Blocking: QUIC allows multiple independent streams within a single connection. A lost packet on stream 1 (telemetry) does not block stream 2 (firmware chunk). TCP’s single byte-stream model blocks all data behind any lost packet.

  3. Connection Migration: Like DTLS Connection ID, QUIC sessions survive IP address changes. A mobile sensor moving between Wi-Fi access points maintains its QUIC connection without re-handshaking.

Quantitative Comparison:

| Metric | TCP+TLS 1.3 | UDP+DTLS 1.3 | QUIC |
|---|---|---|---|
| Initial connection | 2 RTT (TCP + TLS) | 1 RTT (DTLS) | 1 RTT |
| Resumed connection | 1 RTT (TLS resumption) | 0-1 RTT (session ticket) | 0 RTT |
| Per-packet overhead | 20-60 bytes (TCP+TLS) | 21-29 bytes (UDP+DTLS) | 16-28 bytes |
| Head-of-line blocking | Yes (single stream) | No (independent datagrams) | No (per-stream) |
| Built-in encryption | No (separate TLS) | No (separate DTLS) | Yes (mandatory) |
| NAT traversal | Keep-alive required | Connection ID | Connection ID |
| IoT library maturity | Excellent (mbedTLS, wolfSSL) | Good (tinydtls, mbedTLS) | Early (quiche, msquic) |
| Minimum RAM footprint | ~25 KB (TCP+TLS) | ~18 KB (UDP+DTLS) | ~40 KB (current implementations) |
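
To see what the round-trip counts in the comparison mean for a duty-cycled device, the sketch below multiplies them by an assumed link RTT and number of reconnections per day. The RTT counts are taken from the table. The type and function names are hypothetical, and for resumed DTLS the worst case of the 0-1 RTT range is used.

```c
#include <assert.h>

typedef enum { STACK_TCP_TLS13, STACK_UDP_DTLS13, STACK_QUIC } transport_stack_t;

/* Handshake round trips before application data can be sent,
 * from the comparison table. Resumed DTLS uses the worst case (1 RTT). */
int handshake_rtts(transport_stack_t stack, int resumed)
{
    switch (stack) {
    case STACK_TCP_TLS13:  return resumed ? 1 : 2;
    case STACK_UDP_DTLS13: return 1;
    case STACK_QUIC:       return resumed ? 0 : 1;
    }
    assert(!"unknown stack");
    return -1;
}

/* Milliseconds per day spent waiting on handshakes for a device that
 * reconnects `wakeups_per_day` times over a link with RTT `rtt_ms`. */
long handshake_ms_per_day(transport_stack_t stack, int resumed,
                          int wakeups_per_day, int rtt_ms)
{
    assert(wakeups_per_day >= 0 && rtt_ms >= 0);
    return (long)handshake_rtts(stack, resumed) * wakeups_per_day * rtt_ms;
}
```

For a sensor that wakes hourly on a link with an assumed 500 ms RTT, resumed TCP+TLS spends 12 seconds per day with the radio on just for handshakes, while resumed QUIC spends none.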

Where QUIC Fits in IoT Today:

  • IoT Gateways (>256 KB RAM): QUIC’s multiplexing benefit is strongest on gateways aggregating data from hundreds of sensors and forwarding to cloud. One QUIC connection replaces hundreds of TCP connections.
  • Connected Vehicles: 0-RTT reconnection and connection migration match the requirements of high-mobility, latency-sensitive applications.
  • Edge Computing Nodes: QUIC’s stream multiplexing enables simultaneous firmware download, telemetry upload, and command channel on a single connection.

Where QUIC Does Not Fit (Yet):

  • Constrained Sensors (<32 KB RAM): Current QUIC implementations require 40+ KB RAM – double DTLS’s 18 KB. Until lightweight QUIC stacks mature, CoAP/DTLS remains the better choice for Class 1-2 devices.
  • LPWAN: QUIC requires datagrams carrying Initial packets to be padded to at least 1200 bytes (RFC 9000), which LoRaWAN payloads (51-222 bytes) cannot accommodate and which is costly on NB-IoT links. UDP with minimal CoAP headers remains necessary.
  • Real-Time Industrial Control: QUIC’s congestion control adds variable latency unsuitable for deterministic industrial protocols (OPC UA, Modbus).

Key Insight: QUIC is not a replacement for UDP+DTLS on constrained devices, but it is becoming the preferred transport for IoT gateways and edge nodes communicating with cloud platforms. Google, Cloudflare, and AWS already serve IoT API endpoints over QUIC (HTTP/3). As lightweight QUIC implementations emerge (targeting 15-20 KB RAM), expect QUIC to gradually replace TCP+TLS for medium-constrained IoT devices within 3-5 years, while CoAP/UDP/DTLS remains dominant on the most constrained sensors.

6.7 Knowledge Check

6.8 See Also

Application Protocols:

  • MQTT: Why MQTT uses TCP (QoS 1/2 depends on TCP reliability)
  • CoAP: Why CoAP uses UDP (Confirmable messages for selective reliability)
  • HTTP for IoT: HTTP over TCP

6.9 Try It Yourself

Exercise 1: Persistent vs Per-Message TCP

Measure the overhead difference between keeping a TCP connection open vs opening/closing for each message. Use the battery-life calculation from the worked example.

Exercise 2: NAT Timeout Testing

Configure keep-alive intervals (30s, 60s, 120s) and measure which survives your NAT router’s timeout. Use ping or a simple TCP echo server.

Exercise 3: MTU Fragmentation Analysis

Send UDP packets of different sizes (100B, 500B, 1000B, 1500B) over 6LoWPAN or LoRaWAN. Measure fragmentation impact on delivery rate.

6.10 What’s Next

After mastering TCP vs UDP comparison and selection, continue with these related chapters:

| Chapter | Topic | Why It Matters |
|---|---|---|
| DTLS and Security | Datagram TLS for UDP | Add TLS-grade encryption without TCP overhead for constrained IoT devices |
| Transport Comprehensive Review | Full transport layer recap | Consolidate knowledge of UDP, TCP, DTLS, and QUIC across all transport topics |
| Reliability and Error Handling | IoT reliability pillars | Explore the five strategies for handling packet loss at the transport layer |
| MQTT Fundamentals | Application protocol on TCP | See how MQTT’s QoS levels depend on TCP’s reliable ordered delivery |
| CoAP Fundamentals | Application protocol on UDP | Understand how CoAP adds selective reliability to UDP with Confirmable messages |
| Transport Optimizations | Implementation tuning | Apply TCP keep-alive, MTU optimization, and connection pooling in real IoT systems |