40 Zigbee Lab: Python Network Analyzer
40.1 Learning Objectives
By the end of this chapter, you will be able to:
- Construct an MQTT integration: Interface Python with Zigbee networks via the Zigbee2MQTT bridge and paho-mqtt library
- Evaluate device health metrics: Classify battery levels, link quality, and online status using tiered alert thresholds
- Map network topology: Diagram mesh connections between coordinator, routers, and end devices from bridge data
- Design automated health alerts: Build a tiered notification system for critical, warning, and offline device states
- Analyze network traffic patterns: Calculate message rates, uptime statistics, and per-device activity trends
What is this lab? A Python-based monitoring tool that connects to Zigbee2MQTT to provide real-time visibility into your Zigbee network health.
When to use:
- For production network monitoring and troubleshooting
- When you need battery and signal strength alerts
- To visualize mesh topology and routing paths
Key Topics:
| Topic | Focus |
|---|---|
| Zigbee2MQTT | Bridge between Zigbee and MQTT |
| MQTT Protocol | Pub/sub messaging for device data |
| Health Monitoring | Battery, LQI, online status |
| Alerting | Automated notifications |
Prerequisites:
- Zigbee Fundamentals
- Basic Python programming
- MQTT broker running
40.2 Prerequisites
This lab requires:
- Raspberry Pi with Zigbee2MQTT installed
- MQTT broker running (Mosquitto recommended)
- Zigbee coordinator (CC2531 or similar)
- Paired Zigbee devices for monitoring
- Python 3.8+ with paho-mqtt library
Zigbee2MQTT is an open-source project that bridges Zigbee devices to MQTT, allowing you to monitor and control devices without proprietary hubs. It publishes device data to MQTT topics like zigbee2mqtt/device_name.
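To make the topic convention concrete, the sketch below splits a topic into a device name and pulls the two health fields out of a JSON payload. The helper name `parse_device_message`, the `BASE_TOPIC` constant, and the sample payload are illustrative assumptions; real devices publish whichever fields they support.

```python
import json
from typing import Optional, Tuple

BASE_TOPIC = "zigbee2mqtt"  # assumed base topic; adjust if yours differs


def parse_device_message(
    topic: str, raw: bytes
) -> Optional[Tuple[str, Optional[int], Optional[int]]]:
    """Return (device_name, battery, linkquality), or None for non-device messages."""
    prefix = BASE_TOPIC + "/"
    if not topic.startswith(prefix):
        return None
    name = topic[len(prefix):]
    if name == "bridge" or name.startswith("bridge/"):
        return None  # bridge status messages are not device updates
    try:
        payload = json.loads(raw.decode())
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(payload, dict):
        return None
    # Fields a device does not report come back as None
    return name, payload.get("battery"), payload.get("linkquality")


# Hypothetical payload from a temperature sensor
sample = b'{"battery": 92, "linkquality": 168, "temperature": 21.4}'
print(parse_device_message("zigbee2mqtt/bedroom/temp", sample))
# -> ('bedroom/temp', 92, 168)
```

Returning `None` for missing fields, rather than a default such as 0, keeps "not reported" distinct from "empty battery".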
40.3 Architecture Overview
40.4 Python Network Analyzer Implementation
40.4.1 Installation
# Install required packages
pip install paho-mqtt
# Verify Zigbee2MQTT is running
sudo systemctl status zigbee2mqtt
40.4.2 Complete Analyzer Code
#!/usr/bin/env python3
"""
Zigbee Network Analyzer
-----------------------
Monitors Zigbee network health via Zigbee2MQTT.
Tracks device status, battery levels, and link quality.
"""
import json
import time
from datetime import datetime
from dataclasses import dataclass
from typing import Dict, Optional

import paho.mqtt.client as mqtt


@dataclass
class ZigbeeDevice:
    """Represents a Zigbee device in the network."""
    friendly_name: str
    device_type: str = "Unknown"
    battery: Optional[int] = None
    link_quality: Optional[int] = None
    last_seen: Optional[datetime] = None
    message_count: int = 0
    is_online: bool = True

    def health_status(self) -> str:
        """Determine device health based on battery and connectivity."""
        if not self.is_online:
            return "offline"
        if self.battery is not None:
            if self.battery < 10:
                return "critical"
            if self.battery < 20:
                return "warning"
        return "healthy"
class ZigbeeNetworkAnalyzer:
    """Analyzes and monitors a Zigbee network via MQTT."""

    def __init__(self, mqtt_host: str = "localhost", mqtt_port: int = 1883):
        self.mqtt_host = mqtt_host
        self.mqtt_port = mqtt_port
        self.devices: Dict[str, ZigbeeDevice] = {}
        self.start_time = datetime.now()
        self.total_messages = 0
        self.network_map = {}
        # MQTT client setup. paho-mqtt 2.0+ requires an explicit callback API
        # version; VERSION1 keeps the (client, userdata, flags, rc) signature
        # used by _on_connect below. Older releases take no such argument.
        if hasattr(mqtt, "CallbackAPIVersion"):
            self.client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION1)
        else:
            self.client = mqtt.Client()
        self.client.on_connect = self._on_connect
        self.client.on_message = self._on_message

    def _on_connect(self, client, userdata, flags, rc):
        """Handle MQTT connection."""
        if rc == 0:
            print("[OK] Connected to MQTT broker")
            # Subscribe to all Zigbee2MQTT topics
            client.subscribe("zigbee2mqtt/#")
            print("[OK] Subscribed to zigbee2mqtt/#")
        else:
            print(f"[ERROR] Connection failed with code {rc}")

    def _on_message(self, client, userdata, msg):
        """Process incoming MQTT messages."""
        topic = msg.topic
        try:
            payload = json.loads(msg.payload.decode())
        except (UnicodeDecodeError, json.JSONDecodeError):
            return
        # Ignore JSON that is not an object (plain strings, numbers, lists)
        if not isinstance(payload, dict):
            return
        # Handle network map updates
        if topic == "zigbee2mqtt/bridge/networkmap/raw":
            self._process_network_map(payload)
            return
        # Handle device state updates
        if topic.startswith("zigbee2mqtt/") and "/bridge" not in topic:
            device_name = topic.replace("zigbee2mqtt/", "")
            self._update_device(device_name, payload)
    def _process_network_map(self, payload):
        """Process network topology map."""
        self.network_map = payload
        print("\n[OK] Network map received")
        if "nodes" in payload:
            for node in payload["nodes"]:
                name = node.get("friendly_name", node.get("ieeeAddr"))
                if name not in self.devices:
                    self.devices[name] = ZigbeeDevice(
                        friendly_name=name,
                        device_type=node.get("type", "Unknown")
                    )
                else:
                    self.devices[name].device_type = node.get("type", "Unknown")

    def _update_device(self, name: str, payload: dict):
        """Update device state from payload."""
        if name not in self.devices:
            self.devices[name] = ZigbeeDevice(friendly_name=name)
        device = self.devices[name]
        device.last_seen = datetime.now()
        device.message_count += 1
        device.is_online = True
        self.total_messages += 1
        # Extract common fields
        if "battery" in payload:
            device.battery = payload["battery"]
        if "linkquality" in payload:
            device.link_quality = payload["linkquality"]
        if "device_type" in payload:
            device.device_type = payload["device_type"]

    def get_health_alerts(self) -> list:
        """Generate health alerts for problematic devices."""
        alerts = []
        for name, device in self.devices.items():
            status = device.health_status()
            if status == "critical":
                alerts.append(f"[CRITICAL] {name} - Battery {device.battery}%")
            elif status == "warning":
                alerts.append(f"[WARNING] {name} - Battery {device.battery}%")
            elif status == "offline":
                alerts.append(f"[OFFLINE] {name}")
        return alerts
    def print_status(self):
        """Print current network status."""
        # total_seconds() keeps counting past 24 hours; .seconds wraps daily
        uptime = int((datetime.now() - self.start_time).total_seconds())
        print("\n" + "=" * 80)
        print("ZIGBEE NETWORK STATUS")
        print("=" * 80)
        print(f"\nDevices: {len(self.devices)} total, "
              f"{sum(1 for d in self.devices.values() if d.is_online)} online")
        print(f"Uptime: {uptime} seconds")
        print(f"Messages: {self.total_messages}")
        print("\n" + "-" * 80)
        print(f"{'Device':<25} {'Type':<12} {'Battery':<10} "
              f"{'Link':<8} {'Status':<12} {'Messages':<10}")
        print("-" * 80)
        status_icons = {
            "healthy": "[OK] healthy",
            "warning": "[!] warning",
            "critical": "[X] critical",
            "offline": "[ ] offline"
        }
        for name, device in sorted(self.devices.items()):
            # Compare against None so a reported value of 0 is still shown
            battery = f"{device.battery}%" if device.battery is not None else "N/A"
            link = str(device.link_quality) if device.link_quality is not None else "N/A"
            status = status_icons.get(device.health_status(), "unknown")
            print(f"{name:<25} {device.device_type:<12} {battery:<10} "
                  f"{link:<8} {status:<12} {device.message_count:<10}")
        print("=" * 80)
        # Print alerts
        alerts = self.get_health_alerts()
        if alerts:
            print("\n" + "=" * 80)
            print("HEALTH ALERTS")
            print("=" * 80)
            for alert in alerts:
                print(alert)
            print("=" * 80)
    def run(self, status_interval: int = 30):
        """Start the network analyzer."""
        print("=== Zigbee Network Analyzer ===")
        print(f"Connecting to MQTT broker: {self.mqtt_host}:{self.mqtt_port}")
        try:
            self.client.connect(self.mqtt_host, self.mqtt_port, 60)
            self.client.loop_start()
            print("Monitoring network... (Press Ctrl+C to stop)\n")
            while True:
                time.sleep(status_interval)
                self.print_status()
        except KeyboardInterrupt:
            print("\nShutting down...")
        finally:
            self.client.loop_stop()
            self.client.disconnect()


if __name__ == "__main__":
    analyzer = ZigbeeNetworkAnalyzer(
        mqtt_host="localhost",
        mqtt_port=1883
    )
    analyzer.run(status_interval=30)
40.5 Running the Analyzer
# Start the analyzer (connects to localhost:1883 by default)
python3 zigbee_analyzer.py
# To change the broker address, edit the __main__ block:
# analyzer = ZigbeeNetworkAnalyzer(mqtt_host="192.168.1.100", mqtt_port=1883)
40.6 Expected Output
=== Zigbee Network Analyzer ===
Connecting to MQTT broker: localhost:1883
[OK] Connected to MQTT broker
[OK] Subscribed to zigbee2mqtt/#
Monitoring network... (Press Ctrl+C to stop)
[OK] Network map received
================================================================================
ZIGBEE NETWORK STATUS
================================================================================
Devices: 8 total, 7 online
Uptime: 30 seconds
Messages: 127
--------------------------------------------------------------------------------
Device                    Type         Battery    Link     Status       Messages
--------------------------------------------------------------------------------
bedroom/motion            EndDevice    85%        145      [OK] healthy 12
bedroom/temp              EndDevice    92%        168      [OK] healthy 15
coordinator               Coordinator  N/A        N/A      [OK] healthy 0
kitchen/light             Router       N/A        255      [OK] healthy 8
living_room/door          EndDevice    18%        98       [!] warning  9
living_room/light         Router       N/A        255      [OK] healthy 11
office/temp               EndDevice    5%         142      [X] critical 7
patio/motion              EndDevice    N/A        N/A      [ ] offline  0
================================================================================
================================================================================
HEALTH ALERTS
================================================================================
[CRITICAL] office/temp - Battery 5%
[WARNING] living_room/door - Battery 18%
[OFFLINE] patio/motion
================================================================================
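The status header reports raw totals (uptime and message count), while the learning objectives also mention message rates. A minimal sketch of that calculation follows; the function name `messages_per_minute` is an assumption and is not part of the analyzer listing.

```python
from datetime import datetime, timedelta


def messages_per_minute(message_count: int, start_time: datetime, now: datetime) -> float:
    """Average message rate since start_time, in messages per minute."""
    elapsed = (now - start_time).total_seconds()
    if elapsed <= 0:
        return 0.0  # avoid dividing by zero right after startup
    return message_count * 60.0 / elapsed


# The sample output shows 127 messages after 30 seconds of uptime
start = datetime(2024, 1, 1, 12, 0, 0)
print(messages_per_minute(127, start, start + timedelta(seconds=30)))
# -> 254.0
```

The same function works per device if you pass a device's `message_count` instead of the network total.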
40.7 Mid-Lab Check
The analyzer subscribes to the MQTT topic zigbee2mqtt/#. What does the # wildcard accomplish?
- It filters messages to only the coordinator device
- It subscribes to all topics under zigbee2mqtt/, capturing every device update and bridge message
- It limits the subscription to exactly one level of subtopics
- It encrypts the MQTT connection for secure monitoring
B) It subscribes to all topics under zigbee2mqtt/, capturing every device update and bridge message – In MQTT, the multi-level wildcard # matches any number of topic levels. So zigbee2mqtt/# receives messages from zigbee2mqtt/bedroom/temp, zigbee2mqtt/bridge/networkmap/raw, and every other subtopic. This is essential for a network analyzer because it needs visibility into all devices and bridge status messages. The single-level wildcard + would match only one level (e.g., zigbee2mqtt/+ would match zigbee2mqtt/device1 but not zigbee2mqtt/bridge/state).
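The difference between the two wildcards can be checked with a few lines of plain Python. This is a simplified matcher written for illustration (it ignores `$`-prefixed system topics and does not validate the filter), not the broker's own implementation.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent and everything below it
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without '#', filter and topic must have the same number of levels
    return len(filter_levels) == len(topic_levels)


print(topic_matches("zigbee2mqtt/#", "zigbee2mqtt/bridge/state"))  # True
print(topic_matches("zigbee2mqtt/+", "zigbee2mqtt/device1"))       # True
print(topic_matches("zigbee2mqtt/+", "zigbee2mqtt/bridge/state"))  # False
```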
40.8 Extending the Analyzer
40.8.1 Adding Slack Notifications
import requests

def send_slack_alert(webhook_url: str, message: str):
    """Send alert to Slack channel."""
    payload = {"text": message}
    # A timeout stops a slow webhook from blocking the monitoring loop
    requests.post(webhook_url, json=payload, timeout=10)

# In the analyzer class (WEBHOOK_URL is assumed to be defined elsewhere
# with your own Slack incoming-webhook URL):
def check_and_alert(self):
    alerts = self.get_health_alerts()
    for alert in alerts:
        if "[CRITICAL]" in alert:
            send_slack_alert(WEBHOOK_URL, f":warning: {alert}")
40.8.2 Logging to InfluxDB
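from influxdb import InfluxDBClient

def log_to_influx(self, device: ZigbeeDevice):
    """Log device metrics to InfluxDB."""
    # Write only the metrics the device has reported. Substituting 0 for a
    # missing battery value would look like a dead battery in dashboards.
    fields = {"online": int(device.is_online)}
    if device.battery is not None:
        fields["battery"] = device.battery
    if device.link_quality is not None:
        fields["link_quality"] = device.link_quality
    point = {
        "measurement": "zigbee_device",
        "tags": {
            "device": device.friendly_name,
            "type": device.device_type
        },
        "fields": fields
    }
    # self.influx_client is assumed to be an InfluxDBClient created in __init__
    self.influx_client.write_points([point])
40.9 Troubleshooting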
| Problem | Solution |
|---|---|
| Connection refused | Verify MQTT broker is running |
| No messages received | Check Zigbee2MQTT is publishing |
| Devices show offline | May be sleepy end devices |
| Missing battery data | Not all devices report battery |
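For the first row of the table, a quick way to tell "broker not running" apart from "analyzer misconfigured" is to test whether the broker's TCP port accepts connections at all. The sketch below uses only the standard library; `broker_reachable` is a hypothetical helper name, and a successful TCP connection shows only that something is listening on the port, not that MQTT authentication will succeed.

```python
import socket


def broker_reachable(host: str = "localhost", port: int = 1883, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures
        return False


# Usage: call before starting the analyzer, for example
#   if not broker_reachable("localhost", 1883):
#       print("[ERROR] Nothing is listening on localhost:1883")
```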
Sammy the Sensor asks: “How does the network manager know if I’m healthy or running low on battery?”
Max the Microcontroller explains: “We use a Python analyzer connected to Zigbee2MQTT! It listens to every message on the MQTT broker and keeps track of each device – battery level, signal strength (link quality), and when you last checked in.”
Lila the LED adds: “If your link quality drops below 50 or your battery falls below 20%, the analyzer raises an alert – like a doctor sending a warning when your vital signs look bad. It can even send Slack messages to the maintenance team!”
Bella the Battery warns: “And if I haven’t sent a message in over 60 minutes, the analyzer marks me as OFFLINE. That’s how the team knows something is wrong before users notice any problems.”
Key ideas for kids:
- Network analyzer = A monitoring tool that watches over all devices in the network
- Link quality = How strong the radio signal is between two devices
- Health alerting = Automatic warnings when a device’s battery is low or signal is weak
- MQTT bridge = A translator that makes Zigbee messages readable by Python programs
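The analyzer listing in 40.4.2 only checks battery level and never marks a device offline by itself, so the link-quality and 60-minute rules described in the dialogue need a little extra logic. One possible sketch follows. The thresholds are taken from the dialogue; the function name `classify` and its return format are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Optional

OFFLINE_AFTER = timedelta(minutes=60)  # silence longer than this counts as offline
LOW_LQI = 50                           # link quality alert threshold
LOW_BATTERY = 20                       # battery alert threshold, in percent


def classify(last_seen: Optional[datetime], battery: Optional[int],
             link_quality: Optional[int], now: datetime) -> List[str]:
    """Return alert labels for one device; an empty list means healthy."""
    if last_seen is None or now - last_seen > OFFLINE_AFTER:
        return ["offline"]
    alerts = []
    if battery is not None and battery < LOW_BATTERY:
        alerts.append("low battery")
    if link_quality is not None and link_quality < LOW_LQI:
        alerts.append("weak link")
    return alerts


now = datetime(2024, 1, 1, 12, 0)
print(classify(now - timedelta(minutes=5), 18, 42, now))   # ['low battery', 'weak link']
print(classify(now - timedelta(minutes=90), 85, 150, now)) # ['offline']
```

Sleepy end devices may legitimately stay quiet for longer than an hour, so `OFFLINE_AFTER` usually needs tuning per device type.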
Scenario: Your Python network analyzer tracks 45 Zigbee sensors in a warehouse. A motion sensor’s LQI has been declining steadily over 3 weeks: Week 1 (LQI 210) → Week 2 (LQI 165) → Week 3 (LQI 98). Battery is at 78%. Should you take action now or wait for the sensor to fail?
Data Analysis:
| Week | LQI | RSSI (dBm) | Packet Loss (%) | Battery (%) | Events |
|---|---|---|---|---|---|
| 1 | 210 | -38 | 0.2% | 85% | Normal operation |
| 2 | 165 | -52 | 1.8% | 82% | None |
| 3 | 98 | -68 | 8.5% | 78% | 3 offline alerts |
| 4 (projected) | 45 | -78 | 25%+ | 75% | Frequent failures |
Root Cause Investigation:
- Battery NOT the issue (78% is healthy)
- LQI decline indicates RF path degradation
- Possible causes:
  - New metal shelving blocking signal
  - Changed warehouse layout
  - Moisture/dust accumulation on antenna
  - Temperature affecting radio performance
- RSSI decline is consistent with a physical obstruction:
  - -38 dBm = excellent (Week 1)
  - -68 dBm = marginal (Week 3)
  - Threshold for reliable operation: -70 dBm
  - Device is 2 dB from the failure zone
Decision Matrix:
| Metric | Current | Threshold | Status | Action Required |
|---|---|---|---|---|
| LQI | 98 | <80 critical, <120 warning | WARNING | Monitor closely |
| RSSI | -68 dBm | <-70 critical, <-60 warning | WARNING | Investigate |
| Packet Loss | 8.5% | >5% critical | CRITICAL | Immediate action |
| Trend | -56/week | N/A | CRITICAL | Predict failure in 7 days |
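The threshold columns of the matrix translate directly into a small grading function. The sketch below encodes the LQI, RSSI, and packet-loss rows; the function name `grade` and its parameters are assumptions, and since the table gives no warning level for packet loss, the warning and critical limits are set to the same value there.

```python
def grade(value: float, warning: float, critical: float, higher_is_better: bool = True) -> str:
    """Map a metric to OK / WARNING / CRITICAL using two thresholds."""
    if higher_is_better:
        if value < critical:
            return "CRITICAL"
        if value < warning:
            return "WARNING"
    else:
        if value > critical:
            return "CRITICAL"
        if value > warning:
            return "WARNING"
    return "OK"


# Week 3 readings from the scenario
print("LQI        :", grade(98, warning=120, critical=80))     # WARNING
print("RSSI       :", grade(-68, warning=-60, critical=-70))   # WARNING
print("Packet loss:", grade(8.5, warning=5.0, critical=5.0,
                            higher_is_better=False))           # CRITICAL
```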
Predictive Calculation:
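LQI Decline Rate = (210 - 98) / 2 weeks = 56 per week (three weekly readings span two intervals)
Time to Critical (LQI 50) = (98 - 50) / 56 = 0.9 weeks (about 6 days)
Time to Failure (LQI 30) = (98 - 30) / 56 = 1.2 weeks (about 8.5 days)
Confidence: Moderate to high (steady decline that is accelerating slightly: -45, then -67 per week)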
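The same arithmetic as a small function, for readers who want to try other readings or thresholds. It divides the total drop by the number of intervals between readings (two, for three weekly readings), which gives the 56 per week listed in the Trend row of the decision matrix. The name `weeks_until` is an assumption, and the estimate is a plain average slope, not a regression.

```python
from typing import Sequence


def weeks_until(readings: Sequence[float], threshold: float) -> float:
    """Estimate weeks until `threshold` from evenly spaced weekly readings."""
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    intervals = len(readings) - 1
    rate = (readings[0] - readings[-1]) / intervals  # LQI lost per week
    if rate <= 0:
        return float("inf")  # stable or improving: no predicted crossing
    return max(0.0, (readings[-1] - threshold) / rate)


weekly_lqi = [210, 165, 98]
print(f"to LQI 50: {weekly_lqi and weeks_until(weekly_lqi, 50):.2f} weeks")  # 0.86
print(f"to LQI 30: {weeks_until(weekly_lqi, 30):.2f} weeks")                 # 1.21
```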
Action Plan:
Immediate (This Week):
- Add temporary router within 10m of sensor
- Monitor if LQI stabilizes above 150
- If LQI improves → permanent router needed
- If LQI stays low → device or antenna hardware issue
Permanent Fix (identified after testing):
- New router installed 8m from sensor (was 18m)
- Result: LQI recovered to 185, RSSI -42 dBm
- Cost: $25 router vs $200 downtime from sensor failures
Python Analyzer Enhancement:
Add trend detection to your analyzer:
from scipy.stats import linregress

def detect_lqi_trends(self, device_name: str, history_days: int = 21) -> str:
    """Detect declining LQI trends and predict failures."""
    # Fetch one LQI value per day for the last history_days days.
    # get_lqi_history() is a helper you must supply; it is not defined
    # in the analyzer listing.
    lqi_history = self.get_lqi_history(device_name, history_days)
    if len(lqi_history) < 7:
        return "Insufficient data"
    # Fit a straight line through the daily readings
    days = list(range(len(lqi_history)))
    slope, intercept, r_value, _, _ = linregress(days, lqi_history)
    if slope >= 0:
        return f"[OK] LQI stable or improving (trend: {slope:+.2f}/day)"
    # Predict days until critical LQI (50); 0 if it is already below 50
    current_lqi = lqi_history[-1]
    days_to_critical = max(0.0, (50 - current_lqi) / slope)
    if days_to_critical < 7:
        return f"[CRITICAL] LQI failing in {days_to_critical:.1f} days"
    elif days_to_critical < 14:
        return f"[WARNING] LQI declining (critical in {days_to_critical:.1f} days)"
    else:
        return f"[MONITOR] Slow LQI decline ({slope:.2f}/day, critical in {days_to_critical:.0f} days)"

# Usage in print_status():
for name, device in self.devices.items():
    trend_status = self.detect_lqi_trends(name)
    print(f"{name}: LQI {device.link_quality} - {trend_status}")
Expected Output:
bedroom/motion: LQI 210 - [OK] LQI stable or improving (trend: +1.20/day)
warehouse/motion_3: LQI 98 - [CRITICAL] LQI failing in 6.0 days
kitchen/temp: LQI 165 - [MONITOR] Slow LQI decline (-2.10/day, critical in 55 days)
Key Insight: Don’t wait for devices to fail – LQI trends are early warning indicators. A consistent decline of >5 LQI points per day predicts failure within 2-3 weeks. Proactive router placement (triggered by LQI warnings) prevents 90% of network outages and costs far less than emergency troubleshooting after failures.
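Trend detection needs stored history, which the analyzer listing does not keep. The sketch below shows one hypothetical way to hold recent samples in memory, along with a dependency-free least-squares slope for setups where SciPy is not installed. The class `LqiHistory` and the function `slope_per_sample` are illustrative names, and the slope is per sample, so it equals LQI per day only if you record one sample per day.

```python
from collections import defaultdict, deque
from statistics import mean
from typing import Dict, List, Sequence


class LqiHistory:
    """Keeps the most recent LQI samples per device (in memory only)."""

    def __init__(self, max_samples: int = 21):
        self._samples: Dict[str, deque] = defaultdict(lambda: deque(maxlen=max_samples))

    def record(self, device_name: str, lqi: int) -> None:
        self._samples[device_name].append(lqi)  # oldest sample drops off when full

    def get(self, device_name: str) -> List[int]:
        return list(self._samples[device_name])


def slope_per_sample(values: Sequence[float]) -> float:
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


history = LqiHistory(max_samples=3)
for lqi in (220, 210, 165, 98):
    history.record("warehouse/motion_3", lqi)
print(history.get("warehouse/motion_3"))  # [210, 165, 98]
```

An in-memory history is lost on restart; persisting samples to a database is the more durable option for long trend windows.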
40.10 Knowledge Check
Q1: Why is monitoring link quality important in a Zigbee network?
- Higher link quality increases battery life
- Low link quality indicates a device may lose connectivity or need repositioning
- Link quality determines the device’s Zigbee channel
- Link quality only matters for the Coordinator
B) Low link quality indicates a device may lose connectivity or need repositioning – Link quality (LQI) measures how strong and reliable the radio signal is between two devices. When LQI drops below a threshold (e.g., 50), it indicates the device is at the edge of communication range and may experience intermittent message failures. Monitoring LQI helps identify devices that need to be moved closer to a Router or that need an additional Router installed nearby.
A simple trend estimate predicts when LQI will reach the critical threshold. For a sensor with weekly LQI readings [210, 165, 98], the three readings span two one-week intervals, so the time to the critical threshold (LQI = 50) is:
\[\text{Decline rate} = \frac{\Delta LQI}{\Delta t} = \frac{210 - 98}{2 \text{ weeks}} = 56 \text{ per week}\]
\[\text{Weeks to critical} = \frac{\text{Current LQI} - \text{Critical LQI}}{\text{Decline rate}} = \frac{98 - 50}{56} \approx 0.86 \text{ weeks}\]
Predicted to reach the critical threshold in about 6 days. Add router within 10m immediately to prevent downtime. Historical data from 45-sensor deployment shows LQI decline >30/week predicts 95% failure rate within 14 days; proactive router addition reduces emergency maintenance calls from 12/month to <1/month.
Common Pitfalls
Zigbee network analyzers must be configured to the same channel as the network under analysis. Capturing on a neighboring channel produces empty or irrelevant captures. Always verify the network’s operating channel before starting capture.
Without the Zigbee network key, captured frames show only MAC-layer headers with encrypted NWK/APS payloads. Configure the analyzer with the network key to see full protocol decodes.
Zigbee packet captures require understanding the expected protocol flow to identify anomalies. Compare captures against the expected message sequence for the operation being debugged rather than analyzing frames in isolation.
40.11 Summary
This lab demonstrated building a Python network analyzer for Zigbee:
- MQTT Integration: Connected to Zigbee2MQTT via paho-mqtt library
- Device Tracking: Monitored battery, link quality, and online status
- Health Alerting: Implemented critical/warning/offline detection
- Network Statistics: Tracked message counts and uptime
- Extensibility: Added hooks for Slack and InfluxDB integration
40.12 Knowledge Check
Key Concepts
- Network Analyzer: A software tool capturing, decoding, and displaying Zigbee protocol messages with timing and addressing information for network debugging.
- Packet Capture: The process of recording raw IEEE 802.15.4 frames using a sniffer device (TI CC2531, nRF Sniffer, Ubiqua) for post-analysis.
- LQI (Link Quality Indicator): An IEEE 802.15.4 metric (0–255) indicating received frame quality; used by Zigbee routing to prefer higher-quality links.
- RSSI (Received Signal Strength Indicator): Signal strength measurement in dBm at the receiver; correlated with range and link quality in Zigbee networks.
- Frame Sequence Number: An 802.15.4 field in each frame used for duplicate detection and acknowledgment matching; wraps around at 255.
40.13 Concept Relationships
| Concept | Related To | How They Connect |
|---|---|---|
| Zigbee2MQTT | MQTT Bridge | Translates Zigbee device messages to MQTT topics for monitoring |
| LQI (Link Quality) | Signal Strength | Indicates RF path quality, predicts connectivity issues |
| Battery Level | Device Health | Low battery triggers alerts before device goes offline |
| Network Topology | Device Discovery | Map shows mesh connections between coordinator, routers, end devices |
| Health Alerting | Predictive Maintenance | Automated warnings enable proactive device replacement |
| MQTT Broker | Pub/Sub Messaging | Central hub distributes device state updates to subscribers |
40.14 What’s Next
| Chapter | Focus |
|---|---|
| Zigbee Lab: Mesh Simulator | Interactive Wokwi simulation of mesh routing, self-healing, and hop counting |
| Zigbee Lab: Temperature Network | Build a multi-sensor temperature monitoring network with hardware |
| Zigbee Network Topologies | Star, tree, and mesh topology trade-offs for production deployments |
| Zigbee Security | Encryption keys, trust centres, and securing Zigbee2MQTT communications |
| Zigbee Comprehensive Review | End-to-end protocol reference covering PHY through application layer |