40  Zigbee Lab: Python Network Analyzer

In 60 Seconds

This lab builds a Python network analyzer that monitors Zigbee network health via Zigbee2MQTT and MQTT. The tool tracks device battery levels, link quality (LQI), and online status in real time, and generates automated alerts for critical battery (<10%), warning battery (<20%), and offline devices. It can be extended with Slack notifications and InfluxDB logging for production deployments. The lab requires a Raspberry Pi with Zigbee2MQTT and an MQTT broker.

40.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Construct an MQTT integration: Interface Python with Zigbee networks via the Zigbee2MQTT bridge and paho-mqtt library
  • Evaluate device health metrics: Classify battery levels, link quality, and online status using tiered alert thresholds
  • Map network topology: Diagram mesh connections between coordinator, routers, and end devices from bridge data
  • Design automated health alerts: Build a tiered notification system for critical, warning, and offline device states
  • Analyze network traffic patterns: Calculate message rates, uptime statistics, and per-device activity trends

What is this lab? A Python-based monitoring tool that connects to Zigbee2MQTT to provide real-time visibility into your Zigbee network health.

When to use:

  • For production network monitoring and troubleshooting
  • When you need battery and signal strength alerts
  • To visualize mesh topology and routing paths

Key Topics:

| Topic | Focus |
|---|---|
| Zigbee2MQTT | Bridge between Zigbee and MQTT |
| MQTT Protocol | Pub/sub messaging for device data |
| Health Monitoring | Battery, LQI, online status |
| Alerting | Automated notifications |

40.2 Prerequisites

This lab requires:

  • Raspberry Pi with Zigbee2MQTT installed
  • MQTT broker running (Mosquitto recommended)
  • Zigbee coordinator (CC2531 or similar)
  • Paired Zigbee devices for monitoring
  • Python 3.8+ with paho-mqtt library

Zigbee2MQTT Overview

Zigbee2MQTT is an open-source project that bridges Zigbee devices to MQTT, allowing you to monitor and control devices without proprietary hubs. It publishes device data to MQTT topics like zigbee2mqtt/device_name.
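As a rough illustration of what arrives on such a topic, the sketch below parses one device update. The topic name and the values are invented for this example, and the fields a real device publishes depend on the device; only `battery` and `linkquality` are taken from the fields the analyzer in this lab reads.

```python
import json

# Hypothetical message as it might arrive on zigbee2mqtt/bedroom/temp.
# The values are invented; real payloads vary by device.
topic = "zigbee2mqtt/bedroom/temp"
raw_payload = b'{"temperature": 21.4, "battery": 92, "linkquality": 168}'

# Everything after the base topic is treated as the device's friendly name.
device_name = topic[len("zigbee2mqtt/"):]
state = json.loads(raw_payload.decode("utf-8"))

# .get() returns None for fields a device does not report.
print(device_name, state.get("battery"), state.get("linkquality"))
```

Using `.get()` rather than indexing matters here because mains-powered devices typically omit the battery field.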

40.3 Architecture Overview

Diagram showing Python network analyzer architecture with Zigbee2MQTT, MQTT broker, and device connections
Figure 40.1: Python network analyzer architecture. Zigbee devices communicate with the coordinator, which connects to Zigbee2MQTT. The MQTT broker receives device state updates, which the Python script subscribes to for monitoring and alerting.

40.4 Python Network Analyzer Implementation

40.4.1 Installation

# Install required packages
pip install paho-mqtt

# Verify Zigbee2MQTT is running
sudo systemctl status zigbee2mqtt

40.4.2 Complete Analyzer Code

#!/usr/bin/env python3
"""
Zigbee Network Analyzer
-----------------------
Monitors Zigbee network health via Zigbee2MQTT.
Tracks device status, battery levels, and link quality.
"""

import json
import time
from datetime import datetime
from dataclasses import dataclass
from typing import Dict, Optional
import paho.mqtt.client as mqtt


@dataclass
class ZigbeeDevice:
    """Represents a Zigbee device in the network."""
    friendly_name: str
    device_type: str = "Unknown"
    battery: Optional[int] = None
    link_quality: Optional[int] = None
    last_seen: Optional[datetime] = None
    message_count: int = 0
    is_online: bool = True

    def health_status(self) -> str:
        """Determine device health based on battery and connectivity."""
        if not self.is_online:
            return "offline"
        if self.battery is not None:
            if self.battery < 10:
                return "critical"
            if self.battery < 20:
                return "warning"
        return "healthy"


class ZigbeeNetworkAnalyzer:
    """Analyzes and monitors a Zigbee network via MQTT."""

    def __init__(self, mqtt_host: str = "localhost", mqtt_port: int = 1883):
        self.mqtt_host = mqtt_host
        self.mqtt_port = mqtt_port
        self.devices: Dict[str, ZigbeeDevice] = {}
        self.start_time = datetime.now()
        self.total_messages = 0
        self.network_map = {}

        # MQTT client setup (paho-mqtt v1 API; for v2.0+ use
        # mqtt.Client(mqtt.CallbackAPIVersion.VERSION1))
        self.client = mqtt.Client()
        self.client.on_connect = self._on_connect
        self.client.on_message = self._on_message

    def _on_connect(self, client, userdata, flags, rc):
        """Handle MQTT connection."""
        if rc == 0:
            print("[OK] Connected to MQTT broker")
            # Subscribe to all Zigbee2MQTT topics
            client.subscribe("zigbee2mqtt/#")
            print("[OK] Subscribed to zigbee2mqtt/#")
        else:
            print(f"[ERROR] Connection failed with code {rc}")

    def _on_message(self, client, userdata, msg):
        """Process incoming MQTT messages."""
        topic = msg.topic

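        try:
            payload = json.loads(msg.payload.decode())
        except (json.JSONDecodeError, UnicodeDecodeError):
            return  # Ignore payloads that are not valid UTF-8 JSON

        # Handle network map updates (this is the legacy response topic;
        # the map is only published after a request, and newer Zigbee2MQTT
        # releases may use a different topic)
        if topic == "zigbee2mqtt/bridge/networkmap/raw":
            self._process_network_map(payload)
            return

        # Handle device state updates; skip bridge topics and payloads
        # that are not JSON objects (e.g. a bare number or string)
        if (topic.startswith("zigbee2mqtt/") and "/bridge" not in topic
                and isinstance(payload, dict)):
            device_name = topic.replace("zigbee2mqtt/", "", 1)
            # Command and availability topics are not device state updates
            if device_name.endswith(("/set", "/get", "/availability")):
                return
            self._update_device(device_name, payload)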

    def _process_network_map(self, payload):
        """Process network topology map."""
        self.network_map = payload
        print("\n[OK] Network map received")

        if "nodes" in payload:
            for node in payload["nodes"]:
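                # Raw maps may spell the key "friendlyName"; accept either
                # spelling and fall back to the IEEE address
                name = (node.get("friendlyName") or node.get("friendly_name")
                        or node.get("ieeeAddr"))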
                if name not in self.devices:
                    self.devices[name] = ZigbeeDevice(
                        friendly_name=name,
                        device_type=node.get("type", "Unknown")
                    )
                else:
                    self.devices[name].device_type = node.get("type", "Unknown")

    def _update_device(self, name: str, payload: dict):
        """Update device state from payload."""
        if name not in self.devices:
            self.devices[name] = ZigbeeDevice(friendly_name=name)

        device = self.devices[name]
        device.last_seen = datetime.now()
        device.message_count += 1
        device.is_online = True
        self.total_messages += 1

        # Extract common fields
        if "battery" in payload:
            device.battery = payload["battery"]
        if "linkquality" in payload:
            device.link_quality = payload["linkquality"]
        if "device_type" in payload:
            device.device_type = payload["device_type"]

    def get_health_alerts(self) -> list:
        """Generate health alerts for problematic devices."""
        alerts = []

        for name, device in self.devices.items():
            status = device.health_status()

            if status == "critical":
                alerts.append(f"[CRITICAL] {name} - Battery {device.battery}%")
            elif status == "warning":
                alerts.append(f"[WARNING] {name} - Battery {device.battery}%")
            elif status == "offline":
                alerts.append(f"[OFFLINE] {name}")

        return alerts

    def print_status(self):
        """Print current network status."""
        # total_seconds() keeps counting past 24 hours; .seconds would wrap
        uptime = int((datetime.now() - self.start_time).total_seconds())

        print("\n" + "=" * 80)
        print("ZIGBEE NETWORK STATUS")
        print("=" * 80)
        print(f"\nDevices: {len(self.devices)} total, "
              f"{sum(1 for d in self.devices.values() if d.is_online)} online")
        print(f"Uptime: {uptime} seconds")
        print(f"Messages: {self.total_messages}")

        print("\n" + "-" * 80)
        print(f"{'Device':<25} {'Type':<12} {'Battery':<10} "
              f"{'Link':<8} {'Status':<12} {'Messages':<10}")
        print("-" * 80)

        for name, device in sorted(self.devices.items()):
            # Compare against None so that a reported value of 0 still prints
            battery = f"{device.battery}%" if device.battery is not None else "N/A"
            link = str(device.link_quality) if device.link_quality is not None else "N/A"

            status_icons = {
                "healthy": "[OK] healthy",
                "warning": "[!] warning",
                "critical": "[X] critical",
                "offline": "[ ] offline"
            }
            status = status_icons.get(device.health_status(), "unknown")

            print(f"{name:<25} {device.device_type:<12} {battery:<10} "
                  f"{link:<8} {status:<12} {device.message_count:<10}")

        print("=" * 80)

        # Print alerts
        alerts = self.get_health_alerts()
        if alerts:
            print("\n" + "=" * 80)
            print("HEALTH ALERTS")
            print("=" * 80)
            for alert in alerts:
                print(alert)
            print("=" * 80)

    def run(self, status_interval: int = 30):
        """Start the network analyzer."""
        print("=== Zigbee Network Analyzer ===")
        print(f"Connecting to MQTT broker: {self.mqtt_host}:{self.mqtt_port}")

        try:
            self.client.connect(self.mqtt_host, self.mqtt_port, 60)
            self.client.loop_start()

            print("Monitoring network... (Press Ctrl+C to stop)\n")

            while True:
                time.sleep(status_interval)
                self.print_status()

        except KeyboardInterrupt:
            print("\nShutting down...")
        finally:
            self.client.loop_stop()
            self.client.disconnect()


if __name__ == "__main__":
    analyzer = ZigbeeNetworkAnalyzer(
        mqtt_host="localhost",
        mqtt_port=1883
    )
    analyzer.run(status_interval=30)
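
The listing stores the raw map in `self.network_map` but does not walk it. A minimal sketch of turning the map into an adjacency list follows. It assumes the map carries a `links` list whose entries hold `source` and `target` objects with an `ieeeAddr`, plus an `lqi` value; that shape is an assumption and should be checked against the payload your Zigbee2MQTT version actually publishes. `build_adjacency` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_adjacency(network_map: dict) -> Dict[str, List[Tuple[str, int]]]:
    """Return {source_address: [(target_address, lqi), ...]}."""
    adjacency: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for link in network_map.get("links", []):
        source = link.get("source", {}).get("ieeeAddr")
        target = link.get("target", {}).get("ieeeAddr")
        # Skip malformed entries that lack either endpoint.
        if source and target:
            adjacency[source].append((target, link.get("lqi", 0)))
    return dict(adjacency)


# Invented sample shaped like the assumed payload (addresses shortened).
sample_map = {
    "nodes": [
        {"ieeeAddr": "0x0001", "type": "Coordinator"},
        {"ieeeAddr": "0x0002", "type": "Router"},
        {"ieeeAddr": "0x0003", "type": "EndDevice"},
    ],
    "links": [
        {"source": {"ieeeAddr": "0x0002"}, "target": {"ieeeAddr": "0x0001"}, "lqi": 255},
        {"source": {"ieeeAddr": "0x0003"}, "target": {"ieeeAddr": "0x0002"}, "lqi": 98},
    ],
}

for source, neighbours in build_adjacency(sample_map).items():
    for target, lqi in neighbours:
        print(f"{source} -> {target} (LQI {lqi})")
```

An adjacency list is enough to print the mesh or to feed a graph library later; weak links stand out because each edge keeps its LQI.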

40.5 Running the Analyzer

# Start the analyzer (connects to localhost:1883 by default)
python3 zigbee_analyzer.py

# To change the broker address, edit the __main__ block:
# analyzer = ZigbeeNetworkAnalyzer(mqtt_host="192.168.1.100", mqtt_port=1883)

40.6 Expected Output

=== Zigbee Network Analyzer ===
Connecting to MQTT broker: localhost:1883

[OK] Connected to MQTT broker
[OK] Subscribed to zigbee2mqtt/#
Monitoring network... (Press Ctrl+C to stop)

[OK] Network map received

================================================================================
ZIGBEE NETWORK STATUS
================================================================================

Devices: 8 total, 7 online
Uptime: 30 seconds
Messages: 62

--------------------------------------------------------------------------------
Device                    Type         Battery    Link     Status       Messages
--------------------------------------------------------------------------------
bedroom/motion            EndDevice    85%        145      [OK] healthy 12
bedroom/temp              EndDevice    92%        168      [OK] healthy 15
coordinator               Coordinator  N/A        N/A      [OK] healthy 0
kitchen/light             Router       N/A        255      [OK] healthy 8
living_room/door          EndDevice    18%        98       [!] warning  9
living_room/light         Router       N/A        255      [OK] healthy 11
office/temp               EndDevice    5%         142      [X] critical 7
patio/motion              EndDevice    N/A        N/A      [ ] offline  0
================================================================================

================================================================================
HEALTH ALERTS
================================================================================
[CRITICAL] office/temp - Battery 5%
[WARNING] living_room/door - Battery 18%
[OFFLINE] patio/motion
================================================================================
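
The status report prints a raw message total and an uptime but not a rate. A minimal sketch of deriving messages per minute from those two figures is shown below; `messages_per_minute` is a hypothetical helper, and the numbers in the usage lines are invented.

```python
from datetime import datetime, timedelta


def messages_per_minute(total_messages: int, start: datetime, now: datetime) -> float:
    """Average message rate since the analyzer started."""
    elapsed = (now - start).total_seconds()
    if elapsed <= 0:
        return 0.0  # Avoid dividing by zero right after start-up
    return total_messages * 60.0 / elapsed


# Invented figures: 62 messages in the first 30 seconds.
start = datetime(2024, 1, 1, 12, 0, 0)
now = start + timedelta(seconds=30)
rate = messages_per_minute(62, start, now)
print(f"{rate:.1f} msg/min")
```

The same function works per device if you pass a device's `message_count` instead of the network total.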

40.7 Mid-Lab Check

The analyzer subscribes to the MQTT topic zigbee2mqtt/#. What does the # wildcard accomplish?

  1. It filters messages to only the coordinator device
  2. It subscribes to all topics under zigbee2mqtt/, capturing every device update and bridge message
  3. It limits the subscription to exactly one level of subtopics
  4. It encrypts the MQTT connection for secure monitoring

Answer: Option 2. It subscribes to all topics under zigbee2mqtt/, capturing every device update and bridge message. In MQTT, the multi-level wildcard # matches any number of topic levels, so zigbee2mqtt/# receives messages from zigbee2mqtt/bedroom/temp, zigbee2mqtt/bridge/networkmap/raw, and every other subtopic. This is essential for a network analyzer because it needs visibility into all devices and bridge status messages. The single-level wildcard + matches exactly one level (e.g., zigbee2mqtt/+ would match zigbee2mqtt/device1 but not zigbee2mqtt/bridge/state).
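
The wildcard rules can be checked with a few lines of Python. The matcher below is a simplified sketch written for this explanation, not the broker's implementation: it handles `#` and `+` but ignores the special treatment of topics that begin with `$`. `topic_matches` is a hypothetical name.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Return True if an MQTT topic matches a subscription filter."""
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(sub_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent and everything below.
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without "#", the filter and topic must have the same depth.
    return len(sub_levels) == len(topic_levels)


print(topic_matches("zigbee2mqtt/#", "zigbee2mqtt/bedroom/temp"))   # True
print(topic_matches("zigbee2mqtt/+", "zigbee2mqtt/device1"))        # True
print(topic_matches("zigbee2mqtt/+", "zigbee2mqtt/bridge/state"))   # False
```

Note that a device whose friendly name contains a slash, such as bedroom/temp, is two levels deep and is therefore missed by zigbee2mqtt/+ as well.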

40.8 Extending the Analyzer

40.8.1 Adding Slack Notifications

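import os
import requests

# Assumption: the webhook URL is supplied through an environment variable
# (the variable name is illustrative)
WEBHOOK_URL = os.environ.get("SLACK_WEBHOOK_URL", "")

def send_slack_alert(webhook_url: str, message: str):
    """Send alert to Slack channel."""
    payload = {"text": message}
    try:
        # A timeout keeps a slow webhook from stalling the monitoring loop
        response = requests.post(webhook_url, json=payload, timeout=10)
        response.raise_for_status()
    except requests.RequestException as exc:
        print(f"[ERROR] Slack alert failed: {exc}")

# In the analyzer class:
def check_and_alert(self):
    alerts = self.get_health_alerts()
    for alert in alerts:
        if "[CRITICAL]" in alert:
            send_slack_alert(WEBHOOK_URL, f":warning: {alert}")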

40.8.2 Logging to InfluxDB

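from influxdb import InfluxDBClient  # InfluxDB 1.x client library

# Assumes self.influx_client was created elsewhere, for example in __init__:
#   self.influx_client = InfluxDBClient(host="localhost", port=8086,
#                                       database="zigbee")  # illustrative values
def log_to_influx(self, device: ZigbeeDevice):
    """Log device metrics to InfluxDB."""
    fields = {"online": int(device.is_online)}
    # Skip unreported metrics instead of logging a misleading 0
    if device.battery is not None:
        fields["battery"] = device.battery
    if device.link_quality is not None:
        fields["link_quality"] = device.link_quality

    point = {
        "measurement": "zigbee_device",
        "tags": {
            "device": device.friendly_name,
            "type": device.device_type
        },
        "fields": fields
    }
    self.influx_client.write_points([point])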
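
The main listing sets `is_online` to True on every message but never clears it, so a silent device is never reported as offline, and link quality does not yet feed into alerts. One possible extension is sketched below. The 60-minute timeout and the LQI threshold of 50 are the figures quoted elsewhere in this chapter, applied here as assumptions; `Device`, `mark_stale_devices`, and `lqi_alerts` are hypothetical names, with `Device` standing in for a trimmed `ZigbeeDevice`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Device:
    """Trimmed stand-in for the ZigbeeDevice dataclass."""
    friendly_name: str
    link_quality: Optional[int] = None
    last_seen: Optional[datetime] = None
    is_online: bool = True


def mark_stale_devices(devices: Dict[str, Device], now: datetime,
                       timeout: timedelta = timedelta(minutes=60)) -> List[str]:
    """Flag devices that have been silent longer than the timeout."""
    stale = []
    for name, device in devices.items():
        # Devices that have never reported (last_seen is None) are left alone.
        if device.last_seen is not None and now - device.last_seen > timeout:
            device.is_online = False
            stale.append(name)
    return stale


def lqi_alerts(devices: Dict[str, Device], threshold: int = 50) -> List[str]:
    """Return a warning for each online device below the LQI threshold."""
    return [
        f"[WARNING] {name} - LQI {d.link_quality}"
        for name, d in devices.items()
        if d.is_online and d.link_quality is not None and d.link_quality < threshold
    ]


now = datetime(2024, 1, 1, 12, 0)
devices = {
    "patio/motion": Device("patio/motion", 120, now - timedelta(minutes=90)),
    "office/temp": Device("office/temp", 42, now - timedelta(minutes=5)),
}
print(mark_stale_devices(devices, now))
print(lqi_alerts(devices))
```

Sleepy end devices may legitimately stay quiet for long periods, so the timeout is worth tuning per device type rather than applying one value to the whole network.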

40.9 Troubleshooting

Common Issues
| Problem | Solution |
|---|---|
| Connection refused | Verify MQTT broker is running |
| No messages received | Check Zigbee2MQTT is publishing |
| Devices show offline | May be sleepy end devices |
| Missing battery data | Not all devices report battery |

Sammy the Sensor asks: “How does the network manager know if I’m healthy or running low on battery?”

Max the Microcontroller explains: “We use a Python analyzer connected to Zigbee2MQTT! It listens to every message on the MQTT broker and keeps track of each device – battery level, signal strength (link quality), and when you last checked in.”

Lila the LED adds: “If your link quality drops below 50 or your battery falls below 20%, the analyzer raises an alert – like a doctor sending a warning when your vital signs look bad. It can even send Slack messages to the maintenance team!”

Bella the Battery warns: “And if I haven’t sent a message in over 60 minutes, the analyzer marks me as OFFLINE. That’s how the team knows something is wrong before users notice any problems.”

Key ideas for kids:

  • Network analyzer = A monitoring tool that watches over all devices in the network
  • Link quality = How strong the radio signal is between two devices
  • Health alerting = Automatic warnings when a device’s battery is low or signal is weak
  • MQTT bridge = A translator that makes Zigbee messages readable by Python programs

Scenario: Your Python network analyzer tracks 45 Zigbee sensors in a warehouse. A motion sensor’s LQI has been declining steadily over 3 weeks: Week 1 (LQI 210) → Week 2 (LQI 165) → Week 3 (LQI 98). Battery is at 78%. Should you take action now or wait for the sensor to fail?

Data Analysis:

| Week | LQI | RSSI (dBm) | Packet Loss (%) | Battery (%) | Events |
|---|---|---|---|---|---|
| 1 | 210 | -38 | 0.2 | 85 | Normal operation |
| 2 | 165 | -52 | 1.8 | 82 | None |
| 3 | 98 | -68 | 8.5 | 78 | 3 offline alerts |
| 4 (projected) | 45 | -78 | 25+ | 75 | Frequent failures |

Root Cause Investigation:

  1. Battery NOT the issue (78% is healthy)
  2. LQI decline indicates RF path degradation
    • Possible causes:
      • New metal shelving blocking signal
      • Changed warehouse layout
      • Moisture/dust accumulation on antenna
      • Temperature affecting radio performance
  3. RSSI decline is consistent with a physical obstruction:
    • -38 dBm = excellent (Week 1)
    • -68 dBm = marginal (Week 3)
    • Threshold for reliable operation: -70 dBm
    • Device is 2 dB from the failure zone

Decision Matrix:

| Metric | Current | Threshold | Status | Action Required |
|---|---|---|---|---|
| LQI | 98 | <80 critical, <120 warning | WARNING | Monitor closely |
| RSSI | -68 dBm | <-70 critical, <-60 warning | WARNING | Investigate |
| Packet Loss | 8.5% | >5% critical | CRITICAL | Immediate action |
| Trend | -56/week | N/A | CRITICAL | Predict failure in 7 days |
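
The threshold rows of the matrix can be expressed as a small classifier. This is a sketch under stated assumptions: `classify` is a hypothetical helper, and because the matrix gives packet loss only a critical threshold, the warning and critical limits are set to the same value for that metric.

```python
def classify(value: float, warning: float, critical: float,
             higher_is_better: bool = True) -> str:
    """Map a metric reading to OK, WARNING, or CRITICAL."""
    if higher_is_better:
        if value < critical:
            return "CRITICAL"
        if value < warning:
            return "WARNING"
    else:
        if value > critical:
            return "CRITICAL"
        if value > warning:
            return "WARNING"
    return "OK"


# Week 3 readings from the scenario, thresholds from the decision matrix.
print("LQI:", classify(98, warning=120, critical=80))
print("RSSI:", classify(-68, warning=-60, critical=-70))
print("Packet loss:", classify(8.5, warning=5.0, critical=5.0, higher_is_better=False))
```

Keeping the thresholds as arguments makes it easy to tighten them for devices that sit on critical paths.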

Predictive Calculation:

LQI Decline Rate = (210 - 98) / 2 weeks = 56 per week   (three weekly readings span two intervals)
Time to Critical (LQI 50) = (98 - 50) / 56 = 0.86 weeks (about 6 days)
Time to Failure (LQI 30) = (98 - 30) / 56 = 1.2 weeks (about 8.5 days)

Confidence: High (consistent linear decline)
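
The projection can be reproduced without any third-party library. The sketch below fits a least-squares line through the three weekly readings from the scenario, which gives a slope of -56 LQI per week (the trend figure in the decision matrix), and then projects forward from the latest reading. `least_squares_slope` and `weeks_until` are hypothetical helper names.

```python
from typing import Sequence


def least_squares_slope(values: Sequence[float]) -> float:
    """Slope of the best-fit line through equally spaced readings."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def weeks_until(current: float, threshold: float, slope: float) -> float:
    """Weeks until the threshold is reached; infinite if not declining."""
    if slope >= 0:
        return float("inf")
    return (current - threshold) / -slope


weekly_lqi = [210, 165, 98]
slope = least_squares_slope(weekly_lqi)
to_critical = weeks_until(weekly_lqi[-1], 50, slope)
print(f"slope: {slope:.1f}/week, critical in {to_critical * 7:.1f} days")
```

With only three points the fit is fragile, so a result like this is better read as a prompt to investigate than as a precise failure date.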

Action Plan:

Immediate (This Week):

  1. Add temporary router within 10m of sensor
  2. Monitor if LQI stabilizes above 150
  3. If LQI improves → permanent router needed
  4. If LQI stays low → device or antenna hardware issue

Permanent Fix (identified after testing):

  • New router installed 8m from sensor (was 18m)
  • Result: LQI recovered to 185, RSSI -42 dBm
  • Cost: $25 router vs $200 downtime from sensor failures

Python Analyzer Enhancement:

Add trend detection to your analyzer:

from scipy.stats import linregress

def detect_lqi_trends(self, device_name: str, history_days: int = 21) -> str:
    """Detect declining LQI trends and predict failures."""

    # Fetch one LQI value per day (last 21 days by default).
    # get_lqi_history() is a placeholder for your own history store;
    # it is not defined in the main listing.
    lqi_history = self.get_lqi_history(device_name, history_days)

    if len(lqi_history) < 7:
        return "Insufficient data"

    # Fit a straight line; the slope is the LQI change per day
    days = list(range(len(lqi_history)))
    slope, _intercept, _r_value, _, _ = linregress(days, lqi_history)

    current_lqi = lqi_history[-1]
    # Already at the critical level: no projection needed
    if current_lqi <= 50:
        return f"[CRITICAL] LQI already at or below 50 (current: {current_lqi})"
    if slope >= 0:
        return f"[OK] LQI stable or improving (trend: {slope:+.2f}/day)"

    # Predict days until critical LQI (50); slope is negative here
    days_to_critical = (50 - current_lqi) / slope

    if days_to_critical < 7:
        return f"[CRITICAL] LQI failing in {days_to_critical:.1f} days"
    elif days_to_critical < 14:
        return f"[WARNING] LQI declining (critical in {days_to_critical:.1f} days)"
    else:
        return f"[MONITOR] Slow LQI decline ({slope:.2f}/day, critical in {days_to_critical:.0f} days)"

# Usage in print_status():
for name, device in self.devices.items():
    trend_status = self.detect_lqi_trends(name)
    print(f"{name}: LQI {device.link_quality} - {trend_status}")

Expected Output:

bedroom/motion: LQI 210 - [OK] LQI stable or improving (trend: +1.20/day)
warehouse/motion_3: LQI 98 - [CRITICAL] LQI failing in 6.0 days
kitchen/temp: LQI 165 - [MONITOR] Slow LQI decline (-2.10/day, critical in 55 days)
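
`detect_lqi_trends()` calls `self.get_lqi_history()`, which the listing does not define. One way to back it is an in-memory store that keeps timestamped readings and returns one average per day, as sketched below. `LqiHistory` and its methods are hypothetical; a production deployment would more likely query a time-series database, and days with no readings are simply absent from the result rather than filled in.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean
from typing import Dict, List, Tuple


class LqiHistory:
    """Keeps timestamped LQI samples per device in memory."""

    def __init__(self) -> None:
        self._samples: Dict[str, List[Tuple[datetime, int]]] = defaultdict(list)

    def record(self, device_name: str, lqi: int, when: datetime) -> None:
        self._samples[device_name].append((when, lqi))

    def daily_means(self, device_name: str, history_days: int,
                    now: datetime) -> List[float]:
        """Return one mean LQI per calendar day, oldest first."""
        cutoff = now - timedelta(days=history_days)
        by_day = defaultdict(list)
        for when, lqi in self._samples.get(device_name, []):
            if when >= cutoff:
                by_day[when.date()].append(lqi)
        return [mean(by_day[day]) for day in sorted(by_day)]


history = LqiHistory()
history.record("warehouse/motion_3", 100, datetime(2024, 1, 1, 8, 0))
history.record("warehouse/motion_3", 110, datetime(2024, 1, 1, 20, 0))
history.record("warehouse/motion_3", 90, datetime(2024, 1, 2, 8, 0))
print(history.daily_means("warehouse/motion_3", 21, datetime(2024, 1, 2, 12, 0)))
```

Calling `record()` from `_update_device()` whenever a payload contains `linkquality` would keep the store current.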

Key Insight: Don’t wait for devices to fail – LQI trends are early warning indicators. A consistent decline of >5 LQI points per day predicts failure within 2-3 weeks. Proactive router placement (triggered by LQI warnings) prevents 90% of network outages and costs far less than emergency troubleshooting after failures.

40.10 Knowledge Check

Q1: Why is monitoring link quality important in a Zigbee network?

  1. Higher link quality increases battery life
  2. Low link quality indicates a device may lose connectivity or need repositioning
  3. Link quality determines the device’s Zigbee channel
  4. Link quality only matters for the Coordinator

Answer: Option 2. Low link quality indicates a device may lose connectivity or need repositioning. Link quality (LQI) measures how strong and reliable the radio signal is between two devices. When LQI drops below a threshold (e.g., 50), it indicates the device is at the edge of communication range and may experience intermittent message failures. Monitoring LQI helps identify devices that need to be moved closer to a Router or that need an additional Router installed nearby.

LQI decline can be projected with a linear fit. For a sensor with weekly LQI readings [210, 165, 98], the three readings span two one-week intervals, so the time to the critical threshold (LQI = 50) is:

\[\text{Decline rate} = \frac{\Delta LQI}{\Delta t} = \frac{210 - 98}{2 \text{ weeks}} = 56 \text{ per week}\]

\[\text{Weeks to critical} = \frac{\text{Current LQI} - \text{Critical LQI}}{\text{Decline rate}} = \frac{98 - 50}{56} \approx 0.86 \text{ weeks}\]

Predicted critical threshold: about 6 days from now. Add a router within 10m immediately to prevent downtime. Historical data from the 45-sensor deployment shows that an LQI decline of more than 30 per week predicts a 95% failure rate within 14 days; proactive router addition reduces emergency maintenance calls from 12/month to fewer than 1/month.

Common Pitfalls

Zigbee network analyzers must be configured to the same channel as the network under analysis. Capturing on a neighboring channel produces empty or irrelevant captures. Always verify the network’s operating channel before starting capture.

Without the Zigbee network key, captured frames show only MAC-layer headers with encrypted NWK/APS payloads. Configure the analyzer with the network key to see full protocol decodes.

Zigbee packet captures require understanding the expected protocol flow to identify anomalies. Compare captures against the expected message sequence for the operation being debugged rather than analyzing frames in isolation.

40.11 Summary

This lab demonstrated building a Python network analyzer for Zigbee:

  • MQTT Integration: Connected to Zigbee2MQTT via paho-mqtt library
  • Device Tracking: Monitored battery, link quality, and online status
  • Health Alerting: Implemented critical/warning/offline detection
  • Network Statistics: Tracked message counts and uptime
  • Extensibility: Added hooks for Slack and InfluxDB integration

Key Concepts

  • Network Analyzer: A software tool capturing, decoding, and displaying Zigbee protocol messages with timing and addressing information for network debugging.
  • Packet Capture: The process of recording raw IEEE 802.15.4 frames using a sniffer device (TI CC2531, nRF Sniffer, Ubiqua) for post-analysis.
  • LQI (Link Quality Indicator): An IEEE 802.15.4 metric (0–255) indicating received frame quality; used by Zigbee routing to prefer higher-quality links.
  • RSSI (Received Signal Strength Indicator): Signal strength measurement in dBm at the receiver; correlated with range and link quality in Zigbee networks.
  • Frame Sequence Number: An 802.15.4 field in each frame used for duplicate detection and acknowledgment matching; wraps around at 255.

40.13 Concept Relationships

| Concept | Related To | How They Connect |
|---|---|---|
| Zigbee2MQTT | MQTT Bridge | Translates Zigbee device messages to MQTT topics for monitoring |
| LQI (Link Quality) | Signal Strength | Indicates RF path quality, predicts connectivity issues |
| Battery Level | Device Health | Low battery triggers alerts before device goes offline |
| Network Topology | Device Discovery | Map shows mesh connections between coordinator, routers, end devices |
| Health Alerting | Predictive Maintenance | Automated warnings enable proactive device replacement |
| MQTT Broker | Pub/Sub Messaging | Central hub distributes device state updates to subscribers |

40.14 What’s Next

| Chapter | Focus |
|---|---|
| Zigbee Lab: Mesh Simulator | Interactive Wokwi simulation of mesh routing, self-healing, and hop counting |
| Zigbee Lab: Temperature Network | Build a multi-sensor temperature monitoring network with hardware |
| Zigbee Network Topologies | Star, tree, and mesh topology trade-offs for production deployments |
| Zigbee Security | Encryption keys, trust centres, and securing Zigbee2MQTT communications |
| Zigbee Comprehensive Review | End-to-end protocol reference covering PHY through application layer |