17  Network Traffic Analysis

17.1 Learning Objectives

  • Capture and filter network traffic using Wireshark, tcpdump, and tshark for IoT protocol debugging
  • Analyse MQTT, CoAP, Zigbee, and LoRaWAN traffic patterns to validate protocol implementations
  • Diagnose connectivity issues, packet loss, and latency bottlenecks through traffic inspection
  • Apply traffic analysis techniques to detect security threats and anomalous device behaviour

In 60 Seconds

IoT network traffic analysis uses protocol analyzers (Wireshark, tcpdump, specialized RF sniffers) to capture and decode actual device communications, revealing protocol timing issues, unexpected retransmissions, and security anomalies that are invisible to simulation but critical to production reliability.

Network traffic analysis is like being a detective investigating how your IoT devices communicate. Just as you might watch security camera footage to understand what happened in a building, traffic analysis tools like Wireshark let you “watch” the data packets flowing through your network. You can see exactly what your temperature sensor sent to the cloud, whether the message arrived successfully, and whether there were any problems along the way. This is essential for diagnosing connectivity issues, validating that your protocols work correctly, and detecting security threats.

“Traffic analysis is like being a postal inspector who opens letters to check what is inside,” said Max the Microcontroller. “Except we are opening network packets! Tools like Wireshark let you see every message flowing through your IoT network – what Sammy is sending, where it is going, and how long it takes to arrive.”

Sammy the Sensor demonstrated: “Say I send a temperature reading using MQTT. Traffic analysis shows me the exact packet: timestamp, source address, destination address, protocol, and the data itself. If the reading never arrives at the cloud, I can trace exactly where it got lost – like following footprints in the snow!”

“Traffic analysis is also a security tool,” added Lila the LED. “If a hacker has compromised one of our devices, you might see strange traffic patterns – like a sensor suddenly sending data to an unknown server at 3 AM. That is a red flag!” Bella the Battery noted, “Understanding your traffic also helps me last longer – if you discover your device is sending unnecessary data, you can optimize and save energy!”

17.2 Overview

Network traffic analysis is the process of capturing, examining, and interpreting data packets flowing through a network. For IoT systems, traffic analysis serves multiple critical purposes: validating protocol implementations, diagnosing connectivity issues, measuring performance, detecting security threats, and understanding device behavior.

This comprehensive topic is covered across four focused chapters:

17.2.1 Chapter Guide

| Chapter | Focus | Key Topics |
| --- | --- | --- |
| Traffic Analysis Fundamentals | Capture concepts and strategy | Capture points, promiscuous mode, capture vs. display filters |
| Traffic Capture Tools | Tool usage and configuration | Wireshark, tcpdump, tshark, specialized IoT sniffers |
| Analyzing IoT Protocols | Protocol-specific analysis | MQTT, CoAP, Zigbee, LoRaWAN traffic patterns |
| Traffic Analysis Testing | Testing and monitoring | HIL testing, load generation, anomaly detection, worked examples |

17.3 Learning Path

Recommended Reading Order

Beginner Path (2-3 hours):

  1. Start with Traffic Analysis Fundamentals to understand core concepts
  2. Continue to Traffic Capture Tools for hands-on tool usage

Complete Path (4-5 hours):

  1. Traffic Analysis Fundamentals - Capture strategy
  2. Traffic Capture Tools - Wireshark, tcpdump, tshark
  3. Analyzing IoT Protocols - Protocol-specific patterns
  4. Traffic Analysis Testing - Production testing and monitoring

17.4 Key Concepts

Why Traffic Analysis Matters for IoT:

  • Protocol Validation: Verify correct MQTT, CoAP, Zigbee implementations
  • Performance Troubleshooting: Identify latency, packet loss, bandwidth bottlenecks
  • Security Monitoring: Detect unauthorized access, DDoS attacks, compromised devices
  • Debugging Device Behavior: Understand connection failures and erratic behavior
  • Capacity Planning: Measure actual traffic volumes for infrastructure scaling
  • Compliance Verification: Ensure encryption and data transmission requirements

17.5 Prerequisites

Before diving into traffic analysis, you should be familiar with:

17.6 Quick Reference

Interpretation: Lower complexity is easier to learn. Higher remote-friendliness means better for SSH/headless access. Higher automation means better for CI/CD pipelines.

Essential Wireshark Filters:

mqtt                            # All MQTT traffic
coap                            # All CoAP traffic
ip.addr == 192.168.1.100       # Specific device
tcp.analysis.retransmission    # Retransmissions only
frame.time_delta > 1           # Packets with >1s gap

Essential tcpdump Commands:

# Capture MQTT traffic to file
sudo tcpdump -i eth0 port 1883 -w mqtt_capture.pcap

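# Remote capture with live Wireshark (-U flushes each packet; excluding port 22 keeps the SSH session out of its own capture)
ssh user@gateway "tcpdump -i eth0 -U -w - 'not port 22'" | wireshark -k -i -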
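
The `frame.time_delta > 1` filter above can also be reproduced offline on a saved capture. The following is a minimal sketch, assuming the file is in the classic pcap format with microsecond timestamps (not pcapng); the function names and the synthetic demo capture are illustrative and not part of any tool mentioned here.

```python
import io
import struct

PCAP_MAGIC_USEC = 0xA1B2C3D4  # classic pcap, microsecond timestamps


def read_pcap_timestamps(stream):
    """Return packet timestamps in seconds from a classic pcap byte stream."""
    header = stream.read(24)  # global header is always 24 bytes
    if len(header) < 24:
        raise ValueError("too short for a pcap global header")
    magic = struct.unpack("<I", header[:4])[0]
    if magic == PCAP_MAGIC_USEC:
        endian = "<"  # written by a little-endian machine
    elif magic == 0xD4C3B2A1:
        endian = ">"  # written by a big-endian machine
    else:
        raise ValueError("not a classic microsecond pcap file")
    timestamps = []
    while True:
        record = stream.read(16)  # per-packet header: sec, usec, incl_len, orig_len
        if len(record) < 16:
            break
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        stream.read(incl_len)  # skip the captured packet bytes
        timestamps.append(ts_sec + ts_usec / 1_000_000)
    return timestamps


def find_gaps(timestamps, threshold_s=1.0):
    """Return (packet_index, gap_seconds) where the gap to the previous packet exceeds threshold_s."""
    gaps = []
    for i in range(1, len(timestamps)):
        delta = timestamps[i] - timestamps[i - 1]
        if delta > threshold_s:
            gaps.append((i, delta))
    return gaps


def build_demo_pcap(times):
    """Build a tiny synthetic capture (4 dummy bytes per packet) for demonstration."""
    out = struct.pack("<IHHiIII", PCAP_MAGIC_USEC, 2, 4, 0, 0, 65535, 1)
    for t in times:
        sec = int(t)
        usec = int(round((t - sec) * 1_000_000))
        payload = b"\x00" * 4
        out += struct.pack("<IIII", sec, usec, len(payload), len(payload)) + payload
    return out


demo_timestamps = read_pcap_timestamps(io.BytesIO(build_demo_pcap([0.0, 0.5, 2.0, 2.25])))
demo_gaps = find_gaps(demo_timestamps)
```

For a real file, `read_pcap_timestamps(open("mqtt_capture.pcap", "rb"))` would replace the synthetic stream.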

Scenario: An IoT temperature sensor publishes to AWS IoT Core every 60 seconds. After 2-3 hours, the connection drops and never reconnects. System logs show “MQTT connection lost” but no root cause.

Step 1: Capture Traffic

# Capture MQTT traffic on port 8883 (MQTT over TLS)
sudo tcpdump -i wlan0 port 8883 -w mqtt_failure.pcap -v

Run for 4 hours to capture the failure.

Step 2: Open in Wireshark

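Apply filter: mqtt || tcp.port == 8883

On port 8883 the MQTT fields shown in the following steps are readable only if Wireshark can decrypt the TLS session (for example with a key log file); otherwise only the TCP and TLS layers are visible.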

Step 3: Analyze MQTT Session

Initial connection (timestamp 0:00:05):

CLIENT → BROKER: CONNECT (ClientID: sensor-01, KeepAlive: 60s, CleanSession: true)
BROKER → CLIENT: CONNACK (Session Present: 0, Return Code: 0 - Accepted)
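
To recognise these packets in a raw hex dump, it helps to know how they are laid out on the wire. The sketch below builds an MQTT 3.1.1 CONNECT packet with the same parameters as the capture (client ID sensor-01, keepalive 60 s, clean session). It is an illustrative encoder only: it covers no will, username, or password fields, and the function names are assumptions of this example.

```python
import struct


def encode_remaining_length(n):
    """MQTT variable-length integer: 7 data bits per byte, high bit set when more bytes follow."""
    out = bytearray()
    while True:
        byte = n % 128
        n //= 128
        if n > 0:
            byte |= 0x80
        out.append(byte)
        if n == 0:
            return bytes(out)


def build_connect(client_id, keepalive_s, clean_session=True):
    """Build a minimal MQTT 3.1.1 CONNECT packet (no will, username, or password)."""
    flags = 0x02 if clean_session else 0x00
    cid = client_id.encode("utf-8")
    # Variable header: protocol name "MQTT", protocol level 4, connect flags, keepalive
    variable_header = struct.pack("!H4sBBH", 4, b"MQTT", 4, flags, keepalive_s)
    payload = struct.pack("!H", len(cid)) + cid
    body = variable_header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


PINGREQ = bytes([0xC0, 0x00])  # fixed two-byte packet the client sends to keep the session alive

connect_packet = build_connect("sensor-01", 60)
```

The keepalive value appears as the two bytes `00 3c` (60 decimal) just before the client ID, which is where Wireshark reads the "Keep Alive" field from.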

Periodic publishes every 60 seconds:

0:01:05 - PUBLISH (Topic: iot/sensors/temp, QoS: 1, Payload: {"temp":22.5})
0:02:05 - PUBLISH (Topic: iot/sensors/temp, QoS: 1, Payload: {"temp":22.7})
...
2:45:05 - PUBLISH (Topic: iot/sensors/temp, QoS: 1, Payload: {"temp":23.1})

Step 4: Identify the Problem

At timestamp 2:47:05 (120 seconds after the last publish):

BROKER → CLIENT: TCP FIN (connection close initiated by broker)
CLIENT → BROKER: TCP ACK

No PINGREQ from client!

Expected behavior:

  • KeepAlive = 60s
  • Client should send PINGREQ after 45-60 seconds of silence (0.75× to 1.0× keepalive)
  • Last PUBLISH at 2:45:05, next PINGREQ expected by 2:46:05

Calculate MQTT keepalive timing for this connection failure scenario:

\[t_{keepalive} = 60 \text{ sec (from CONNECT packet)}\]

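The MQTT specification requires the client to send a PINGREQ (or any other packet) before the keepalive interval expires, and it tells the broker to disconnect a client that stays silent for 1.5× the keepalive:

\[t_{max\_silence} = 1.5 \times t_{keepalive} = 1.5 \times 60 = 90 \text{ sec}\]

In this capture the broker closed the connection somewhat later, after the keepalive plus a 60-second grace period:

\[t_{broker\_timeout} = t_{keepalive} + t_{grace} = 60 + 60 = 120 \text{ sec}\]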

Actual timeline:

  • Last PUBLISH: \(t_0 = 0\) (2:45:05)
  • Expected PINGREQ by: \(t_0 + 60 = 60\) sec (2:46:05)
  • Broker expects activity by: \(t_0 + 120 = 120\) sec (2:47:05)
  • No PINGREQ received → broker closes at \(t = 120\) sec
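
The timeline arithmetic can be checked with a few lines of code. This is a small sketch under the numbers used above (60-second keepalive, 60-second broker grace period); the helper names are made up for this example.

```python
def parse_hms(text):
    """Convert an 'H:MM:SS' capture timestamp to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_hms(total_seconds):
    """Convert seconds back to 'H:MM:SS'."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def keepalive_deadlines(last_packet, keepalive_s=60, grace_s=60):
    """Return the key deadlines that follow the last client packet."""
    t0 = parse_hms(last_packet)
    return {
        "pingreq_due": format_hms(t0 + keepalive_s),             # client should have pinged by now
        "spec_limit": format_hms(t0 + 1.5 * keepalive_s),        # 1.5x keepalive from the MQTT spec
        "broker_close": format_hms(t0 + keepalive_s + grace_s),  # timeout observed in this scenario
    }


deadlines = keepalive_deadlines("2:45:05")
```

With the last PUBLISH at 2:45:05, the PINGREQ was due by 2:46:05 and the broker's 120-second timeout lands on 2:47:05, matching the list above.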

Energy impact of fixing with mqttClient.loop() instead of deep sleep:

\[P_{active} = 80 \text{ mW}, \quad t_{publish} = 3 \text{ sec}, \quad t_{period} = 60 \text{ sec}\]

\[E_{correct} = 80 \text{ mW} \times 60 \text{ sec} = 4800 \text{ mJ (continuous wake)}\]

\[E_{broken} = 80 \text{ mW} \times 3 \text{ sec} + 0.01 \text{ mW} \times 57 \text{ sec} = 240.57 \text{ mJ (deep sleep)}\]

Keepalive compliance costs about 20× more energy per 60-second cycle (4800 mJ vs. 240.57 mJ), but a connection that has to stay open cannot do without it.
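
The same energy comparison as a short calculation, using only the figures given above (80 mW active, 0.01 mW in deep sleep, 3 seconds awake per 60-second cycle). The function name is illustrative.

```python
def cycle_energy_mj(active_mw, active_s, sleep_mw, sleep_s):
    """Energy per cycle in millijoules (mW x s = mJ)."""
    return active_mw * active_s + sleep_mw * sleep_s


# Always awake so that mqttClient.loop() can send PINGREQ
always_on_mj = cycle_energy_mj(active_mw=80, active_s=60, sleep_mw=0.01, sleep_s=0)

# Awake for 3 s to publish, deep sleep for the remaining 57 s
deep_sleep_mj = cycle_energy_mj(active_mw=80, active_s=3, sleep_mw=0.01, sleep_s=57)

energy_ratio = always_on_mj / deep_sleep_mj
```

The ratio comes out just under 20, which is where the "20×" figure comes from.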

Actual behavior:

  • Last packet from client: 2:45:05 (PUBLISH)
  • Broker timeout: 2:47:05 (60s keepalive + 60s grace = 120s)
  • Broker closes connection: 2:47:05

Root Cause: Client firmware enters deep sleep after publishing and never wakes to send PINGREQ. Broker assumes client died and closes connection.
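
A check like this can be automated once client packet times have been exported from a capture. The sketch below flags every stretch in which the client stayed silent for longer than the keepalive. The input format (timestamps in seconds plus the time of the broker's FIN or the end of the capture) is an assumption of this example.

```python
def find_silent_intervals(client_packet_times, capture_end, keepalive_s):
    """Return (start, end, duration) for each client-silent stretch longer than keepalive_s.

    client_packet_times: seconds at which the client sent any MQTT packet
    capture_end: time of the broker's FIN, or the end of the capture
    """
    points = sorted(client_packet_times) + [capture_end]
    silent = []
    for previous, current in zip(points, points[1:]):
        duration = current - previous
        if duration > keepalive_s:
            silent.append((previous, current, duration))
    return silent


# Failing session: publishes at 2:43:05, 2:44:05, 2:45:05, broker FIN at 2:47:05
failing = find_silent_intervals([9785, 9845, 9905], capture_end=10025, keepalive_s=60)

# Healthy session: a PUBLISH or PINGREQ at least every 45 seconds
healthy = find_silent_intervals([10805, 10850, 10865, 10910], capture_end=10910, keepalive_s=60)
```

The failing session yields one 120-second silent interval starting at the last PUBLISH; the healthy one yields none.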

Step 5: Fix the Firmware

// WRONG CODE (causes problem)
void loop() {
    publishTemperature();
    ESP.deepSleep(60e6);  // Sleep 60 seconds - NO KEEPALIVE!
}

// CORRECT CODE (maintains connection)
void loop() {
    static unsigned long lastPublish = 0;

    if (millis() - lastPublish >= 60000) {
        publishTemperature();
        lastPublish = millis();
    }

    // Let MQTT library handle keepalive automatically
    mqttClient.loop();  // Sends PINGREQ when needed
    delay(100);
}

Step 6: Verify Fix

New traffic capture shows:

3:00:05 - PUBLISH
3:00:50 - PINGREQ (45 seconds after publish, within keepalive window)
3:00:50 - PINGRESP
3:01:05 - PUBLISH
3:01:50 - PINGREQ
...

Connection stays alive indefinitely!

Lesson: Traffic analysis revealed that the client violated the MQTT keepalive protocol. The system logs reported only “MQTT connection lost”; the packet capture showed the missing PINGREQ that explained it.

When diagnosing IoT network issues, select the appropriate tool based on your access level and requirements:

| Tool | Use Case | Pros | Cons | When to Choose |
| --- | --- | --- | --- | --- |
| Wireshark (GUI) | Deep protocol analysis | Rich protocol dissectors; visual timeline; follow TCP streams | Requires GUI; resource-intensive | Interactive debugging at your desk |
| tcpdump | Remote capture | Command-line (SSH-friendly); lightweight; always available | No GUI; cryptic output | Gateway/edge device capture |
| tshark | Automated analysis | Command-line + filters; scriptable; CI/CD integration | Steep learning curve | Production monitoring, automated tests |
| Zigbee Sniffer | 802.15.4 wireless | Captures over-the-air; sees all mesh traffic | Expensive ($400+); needs decryption keys | Zigbee/Thread troubleshooting |
| BLE Sniffer | Bluetooth Low Energy | Over-the-air capture; pairing analysis | Expensive ($100-300); channel hopping complexity | BLE connection issues |
| Logic Analyzer | UART/SPI/I2C | Precise timing; multi-channel | Hardware probe needed; limited to wired protocols | Sensor communication debugging |

Decision Tree:

Q1: Can you access the network traffic path?
├─ YES: Is it IP-based (Ethernet/Wi-Fi)?
│   ├─ YES: Can you use GUI?
│   │   ├─ YES: Use **Wireshark** (best for learning/exploring)
│   │   └─ NO: Use **tcpdump** → save .pcap → analyze later on desktop
│   └─ NO: Is it Zigbee/Thread?
│       ├─ YES: Use **Zigbee Sniffer** (TI CC2531, Nordic nRF Sniffer)
│       └─ NO: Use **BLE Sniffer** (Nordic, Adafruit)
└─ NO: Is it sensor-to-MCU communication (I2C/SPI/UART)?
    └─ YES: Use **Logic Analyzer** (Saleae, DSLogic)

Q2: Do you need automation/scripting?
├─ YES: Use **tshark** with filters
└─ NO: Use **Wireshark** GUI
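
The decision tree can be expressed as a small lookup function, which is handy when documenting a team's debugging runbook. This is one possible reading of the tree: it folds Q2 (automation) into the IP branch, and the `medium` labels are invented for this example.

```python
def choose_tool(medium, gui_available=True, need_automation=False):
    """Pick a capture tool for the given medium.

    medium: 'ip' (Ethernet/Wi-Fi), 'zigbee', 'thread', 'ble', or 'wired-bus' (I2C/SPI/UART)
    """
    if medium == "ip":
        if need_automation:
            return "tshark"
        return "Wireshark" if gui_available else "tcpdump"
    if medium in ("zigbee", "thread"):
        return "Zigbee sniffer"
    if medium == "ble":
        return "BLE sniffer"
    if medium == "wired-bus":
        return "logic analyzer"
    raise ValueError(f"unknown medium: {medium!r}")


gateway_choice = choose_tool("ip", gui_available=False)
ci_choice = choose_tool("ip", need_automation=True)
```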

Real-World Example:

Scenario: Smart home with 50 Zigbee devices, intermittent light switch failures.

Step 1: Use Zigbee sniffer to capture over-the-air traffic

  • Tool: TI CC2531 USB dongle ($25) + Wireshark Zigbee plugin
  • Capture all 802.15.4 frames on channel 15

Step 2: Filter for failed switch

Display Filter: zbee_aps.src == 0x01 && zbee_nwk.src == 0x4a2f

Step 3: Identify pattern

  • Switch sends command → Coordinator ACKs → No response from light
  • RSSI shows -85 dBm (weak signal)
  • Packet retry count: 3/3 (maxed out)

Decision: Signal strength issue, not protocol problem. Move coordinator or add Zigbee router.

Key Takeaway: Right tool (Zigbee sniffer) revealed RF issue that application-level logs couldn’t show.
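
The reasoning in this example (retries exhausted plus weak RSSI points to an RF problem) can be written down as a simple triage rule. The -80 dBm threshold below is an assumed cut-off for illustration, not a value from the Zigbee specification, and the category names are invented.

```python
def diagnose_delivery(rssi_dbm, retries_used, max_retries, weak_rssi_dbm=-80):
    """Rough triage of a failed delivery based on sniffer observations.

    weak_rssi_dbm is an assumed threshold; tune it for your radios and environment.
    """
    retries_exhausted = retries_used >= max_retries
    if not retries_exhausted:
        return "delivered-or-recovering"
    if rssi_dbm <= weak_rssi_dbm:
        return "weak-rf-link"
    return "check-interference-or-protocol"


switch_diagnosis = diagnose_delivery(rssi_dbm=-85, retries_used=3, max_retries=3)
```

For the light switch above (-85 dBm, 3/3 retries), the rule returns "weak-rf-link", in line with the decision to move the coordinator or add a router.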

Common Mistake: Ignoring TLS Encryption

The Mistake: An engineer tries to analyze MQTT traffic with Wireshark but sees only encrypted gibberish because the IoT device uses MQTT over TLS (port 8883), not plain MQTT (port 1883).

Why It Happens:

Modern IoT protocols use TLS encryption for security:

  • MQTT over TLS: Port 8883 (encrypted)
  • CoAP over DTLS: Port 5684 (encrypted)
  • HTTPS: Port 443 (encrypted)

Wireshark shows:

TLSv1.2 Application Data (encrypted)
TLSv1.2 Application Data (encrypted)
...

No MQTT dissection, no readable payloads!

The Problem:

Without decryption, you can only see:

  • Connection establishment (TCP handshake)
  • TLS handshake (negotiation)
  • Encrypted application data (opaque)

You CANNOT see:

  • MQTT CONNECT packets
  • MQTT PUBLISH payloads
  • CoAP requests
  • HTTP headers/bodies
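
Before opening a capture, a quick look at the destination port usually tells you whether payloads will be readable. A minimal sketch of that check follows; the table holds the IANA-registered default ports, and a device configured with non-standard ports would need its own entries.

```python
# Default ports: (protocol name, payload encrypted?)
KNOWN_PORTS = {
    1883: ("MQTT", False),
    8883: ("MQTT over TLS", True),
    5683: ("CoAP", False),
    5684: ("CoAP over DTLS", True),
    80: ("HTTP", False),
    443: ("HTTPS", True),
}


def payload_visibility(port):
    """Describe what a protocol analyzer can show for traffic on this port without keys."""
    if port not in KNOWN_PORTS:
        return "unknown port: inspect the first packets to identify the protocol"
    name, encrypted = KNOWN_PORTS[port]
    if encrypted:
        return f"{name}: handshake and metadata only, payload needs decryption"
    return f"{name}: payload readable in plaintext"


plain_mqtt = payload_visibility(1883)
tls_mqtt = payload_visibility(8883)
```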

Solutions (Ranked by Difficulty):

1. Use a Test Broker Without TLS (Easiest)

For development/debugging only:

// TEMPORARY DEBUG CODE - Remove before production
#define MQTT_SERVER "test.mosquitto.org"
#define MQTT_PORT 1883  // Plain MQTT (no TLS)

Run Wireshark, apply filter mqtt, see everything unencrypted.

Warning: Never use unencrypted MQTT in production!

2. Capture TLS Keys from Client (Medium Difficulty)

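Export TLS session keys from the client. The stock Arduino WiFiClientSecure class has no one-line key-logging switch, so on an ESP32 this typically needs a debug build that hooks the mbedTLS key export callback:

// ESP32 debug build only (outline, not a drop-in snippet)
// 1. Build mbedTLS with key export enabled (MBEDTLS_SSL_EXPORT_KEYS in mbedTLS 2.x).
// 2. Register a key export callback (for example mbedtls_ssl_conf_export_keys_ext_cb in mbedTLS 2.x).
// 3. In the callback, print "CLIENT_RANDOM <client_random_hex> <master_secret_hex>" to Serial.
// 4. Save those lines on the PC as sslkeylog.txt.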

Configure Wireshark:

  1. Edit → Preferences → Protocols → TLS
  2. “(Pre)-Master-Secret log filename”: /path/to/sslkeylog.txt
  3. Restart capture

Wireshark will decrypt TLS using the key file.
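
When the device firmware cannot export keys, one workaround is to reproduce the session from a desktop test client that can. The sketch below uses Python's standard `ssl` module (Python 3.8 or later), which writes TLS secrets in the key log format Wireshark reads. This swaps the embedded client for a PC-side stand-in, so it is an assumption of this example rather than the device-side method described above; the file name is arbitrary.

```python
import os
import ssl
import tempfile


def make_keylog_context(keylog_path):
    """Create a TLS client context that appends session secrets to keylog_path."""
    context = ssl.create_default_context()
    # Every handshake made with this context logs its secrets to this file,
    # in the format expected by Wireshark's "(Pre)-Master-Secret log filename" setting.
    context.keylog_filename = keylog_path
    return context


demo_keylog_path = os.path.join(tempfile.gettempdir(), "sslkeylog_demo.txt")
demo_context = make_keylog_context(demo_keylog_path)
```

The context can then be handed to any client library that accepts an `ssl.SSLContext` (paho-mqtt's `tls_set_context` is one example). Treat the key log file as sensitive and delete it after debugging.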

3. Use mitmproxy (Man-in-the-Middle) (Hard)

Set up a TLS-intercepting proxy:

# Install mitmproxy
pip install mitmproxy

# Run a TLS reverse proxy on port 8080 (MQTT is not HTTP, so use the raw tls:// scheme; needs a recent mitmproxy)
mitmweb --mode reverse:tls://mqtt.example.com:8883 --listen-port 8080

Configure device to connect to proxy:

#define MQTT_SERVER "192.168.1.100"  // mitmproxy host
#define MQTT_PORT 8080

mitmproxy decrypts and re-encrypts traffic, showing plaintext.

Limitations: Requires installing custom CA certificate on device.

4. Use Built-in Logging (Easiest for Simple Cases)

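Many MQTT libraries and board support packages have debug modes:

// ESP8266 Arduino core debug macros (not part of PubSubClient itself;
// normally set through the IDE's debug port and debug level menus)
#define DEBUG_ESP_PORT Serial
#define DEBUG_ESP_SSL

Application-level log output then looks something like this (the exact format depends on the library and on your own logging):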

MQTT: Sending CONNECT
MQTT: Received CONNACK
MQTT: Publishing to topic iot/sensors/temp: {"temperature":22.5}

Not as detailed as Wireshark, but often sufficient.

Best Practice:

  1. Development: Use unencrypted connections for easy debugging
  2. Staging: Use TLS with key logging
  3. Production: Accept that traffic is opaque (monitor via application logs)

Remember: If you can’t decrypt TLS traffic, you can still analyze:

  • Connection timing and frequency
  • Packet sizes (revealing message patterns)
  • TCP-level errors (retransmissions, resets)
  • TLS handshake failures
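
Timing and size metadata alone can reveal a device's reporting pattern. A minimal sketch, assuming the records have already been exported from the capture as (timestamp in seconds, encrypted record length in bytes) pairs for one direction of one connection; the sample values are invented.

```python
from statistics import median


def summarize_encrypted_flow(records):
    """Summarize an encrypted flow from metadata only.

    records: list of (timestamp_s, record_length_bytes) tuples, in capture order
    """
    times = [timestamp for timestamp, _ in records]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "median_interval_s": median(gaps) if gaps else None,
        "distinct_sizes": sorted({size for _, size in records}),
    }


# Invented sample: four equally sized records 60 s apart, then one small record
sample_flow = [(0, 85), (60, 85), (120, 85), (180, 85), (185, 31)]
flow_summary = summarize_encrypted_flow(sample_flow)
```

A steady 60-second median interval with one dominant record size suggests periodic telemetry, even though the payload itself stays opaque.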

17.7 Knowledge Check

17.8 Concept Relationships

Prerequisites:

Builds Toward:

Complements:

17.9 See Also

Common Pitfalls

By default, a Wi-Fi adapter captures only frames addressed to its own MAC address, plus broadcast and multicast traffic. Observing other devices' frames or over-the-air collisions requires an adapter in monitor mode (promiscuous mode alone is not enough on Wi-Fi), which many built-in laptop adapters do not support. Use a dedicated USB Wi-Fi adapter known to support monitor mode.

Protocol analyzers show both original transmissions and retransmissions. Counting all packets without filtering retransmissions gives inflated traffic estimates. Conversely, ignoring the retransmission rate misses a key reliability indicator — high retransmission rates signal a degraded link.
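
Both numbers fall out of two packet counts, for example the total frame count and the count matching `tcp.analysis.retransmission`. A short sketch with invented sample counts:

```python
def retransmission_stats(total_packets, retransmitted_packets):
    """Separate original traffic from retransmissions and compute the retransmission rate."""
    if total_packets <= 0:
        raise ValueError("total_packets must be positive")
    if not 0 <= retransmitted_packets <= total_packets:
        raise ValueError("retransmitted_packets must be between 0 and total_packets")
    return {
        "original_packets": total_packets - retransmitted_packets,
        "retransmission_rate": retransmitted_packets / total_packets,
    }


# Invented sample: 1000 captured packets, 50 of them flagged as retransmissions
link_stats = retransmission_stats(1000, 50)
```

Here the traffic estimate should use 950 original packets, while the 5% retransmission rate is the figure to track as a link health indicator.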

Capturing all traffic on a busy network fills disk rapidly and makes analysis difficult. Use capture filters (not display filters) to limit captured packets to the specific device MAC address, IP range, or protocol port being analyzed. This reduces storage requirements by 10–100× on congested networks.

Analyze traffic under stress conditions too: during network congestion, node join/rejoin events, OTA firmware updates, and after gateway reboots. Bugs and performance degradations almost always appear under stress, not during steady-state normal operation.

17.10 What’s Next

Begin with Traffic Analysis Fundamentals to learn about capture points, promiscuous mode, and filter strategies essential for effective network analysis.
