49  Thread Network Ops

In 60 Seconds

Thread networks self-organize through automatic Leader election and Router promotion, then self-heal within seconds when devices fail. This chapter covers network formation, IPv6 addressing (RLOC, EID, Link-Local, Global), and power optimization that enables 7-10+ year battery life on coin cells through careful poll interval selection and deep sleep modes.

Sammy the Sensor wondered: “How does our Thread neighborhood get started?” Max the Microcontroller explained: “The first device to turn on becomes the block captain (Leader). When new neighbors move in, they get approved through a secret handshake (commissioning). The active neighbors help deliver messages (Routers), and sleepy ones like Bella just check their mailbox now and then.” Bella the Battery was proud: “I sleep 99.9% of the time and only wake up once a minute to check for mail – that is why I can last 10 years on a tiny coin battery!” Lila the LED added: “The coolest part is self-healing: if the block captain moves away, another neighbor automatically becomes the new captain in just seconds – no one even notices!”

49.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Trace Network Formation: Diagram how Thread networks self-organize through Leader election, commissioning, and Router promotion sequences
  • Evaluate Self-Healing Behavior: Predict how Thread networks recover from specific device failures and estimate recovery timelines
  • Classify IPv6 Addressing: Distinguish between RLOC, EID, Link-Local, and Global addresses and justify when each is used in Thread communication
  • Design Power-Optimized Configurations: Select appropriate device roles and poll intervals to achieve target battery life for Sleepy End Devices
  • Calculate Battery Life: Apply duty-cycle power models to estimate device battery life across different Thread device configurations

49.2 Prerequisites

Before diving into this chapter, you should be familiar with:

  • Thread Fundamentals and Roles: Understanding Thread’s device types (Router, REED, SED, MED), network architecture, and Border Router functionality is essential before exploring operational details
  • 6LoWPAN Fundamentals and Architecture: Thread’s IPv6 addressing scheme, RLOC/EID concepts, and mesh routing build directly on 6LoWPAN principles

Thread networks are like a self-organizing neighborhood watch. Devices automatically find each other, establish communication routes, and recover when something goes wrong—all without you having to configure anything manually.

Network Formation Analogy:

Imagine moving to a new neighborhood:

  1. First person moves in and becomes the “block captain” (Leader)
  2. New neighbors join by getting approved (commissioning)
  3. Active neighbors help relay messages (Routers)
  4. Less active neighbors just listen occasionally (Sleepy End Devices)

Key Concepts Made Simple:

| Thread Term | Simple Explanation | Example |
|---|---|---|
| Leader | The coordinator who manages the network | First device powered on |
| Router | Devices that forward messages for others | Smart plugs, light switches |
| End Device | Devices that only talk to their parent | Motion sensors, buttons |
| Sleepy End Device | Battery devices that wake up periodically | Door/window sensors |
| Border Router | The bridge to your home internet | Special gateway device |

Self-Healing Magic:

If the Leader fails, another device automatically takes over. If a Router loses power, messages find a new path. This happens automatically in seconds—no human intervention needed.

49.3 Thread Network Formation

⏱️ ~15 min | ⭐⭐ Intermediate | 📋 P08.C30.U01

49.3.1 Formation Sequence

When Thread devices power on, they follow a deterministic formation process:

Thread network formation sequence showing first device becoming Leader, second device joining as Router, and third device attaching as End Device through commissioning process
Figure 49.1: Thread Network Formation Sequence with Device Role Assignment

Formation Steps:

  1. First Device: Powers on, scans 802.15.4 channels 11-26, finds no existing network, creates new network as Leader
  2. Subsequent Devices: Scan for networks, find existing network, request to join via Commissioner-controlled process
  3. Role Assignment: Mains-powered devices become Routers (up to 32), battery devices become End Devices
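The formation steps above can be sketched as a small decision function. This is a toy model, not OpenThread code: the `Network` class, the `power_on` helper, and the assumption that commissioning always succeeds are all illustrative. The Leader is counted as one of the 32 routers.

```python
from dataclasses import dataclass, field

MAX_ROUTERS = 32  # router limit stated in the formation steps


@dataclass
class Network:
    channel: int
    members: dict = field(default_factory=dict)  # device name -> role

    def router_count(self):
        # The Leader is itself a router, so it counts toward the limit
        return sum(1 for role in self.members.values()
                   if role in ("Leader", "Router"))


def power_on(name, mains_powered, networks_on_air):
    """Return the role a device takes after scanning channels 11-26."""
    # Step 1: an empty scan result means no network exists yet
    if not networks_on_air:
        net = Network(channel=15)  # channel choice is arbitrary in this sketch
        net.members[name] = "Leader"
        networks_on_air.append(net)
        return "Leader"
    # Step 2: join the existing network (commissioning assumed to succeed)
    net = networks_on_air[0]
    # Step 3: mains-powered devices route while router slots remain
    if mains_powered and net.router_count() < MAX_ROUTERS:
        role = "Router"
    else:
        role = "End Device"
    net.members[name] = role
    return role


air = []
for name, mains in [("plug", True), ("switch", True), ("door-sensor", False)]:
    print(name, "->", power_on(name, mains, air))
```

The first device finds nothing on air and becomes Leader; later devices split into Routers and End Devices by power source until the router limit is reached.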

49.3.2 Self-Healing Mesh

Thread networks automatically heal when devices fail or move:

Thread self-healing mesh showing network before Router 2 failure (End Device connected to Router 2) and after automatic recovery (End Device reconnected to Router 3, mesh routes recalculated)
Figure 49.2: Thread Self-Healing Mesh Network Recovery After Router Failure

Recovery Steps:

  1. Detection: End device detects parent (Router 2) is gone (missed keep-alives)
  2. Scan: End device scans for new parent routers
  3. Attach: End device attaches to new parent (Router 3)
  4. Update: Network data updated with new topology
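  5. Convergence: Once the child has reattached, routes typically stabilize in under a second; total recovery time is dominated by how long detection takes in step 1

Diagram illustrating Thread Role Transitions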
Figure 49.3: Thread device role state machine showing transitions between Detached, Child, Router, Leader, and sleep modes (SED/MED). Devices dynamically promote/demote based on network needs.

49.4 Thread Addressing

Thread uses multiple addressing schemes for different purposes:

| Address Type | Format | Scope | Purpose |
|---|---|---|---|
| Link-Local | fe80::/64 | Single hop | Direct neighbor communication |
| Mesh-Local | fd00::/64 | Thread network | Intra-network, stable addresses |
| Global | 2000::/3 | Internet | Internet-reachable (via border router) |
| RLOC | Mesh-Local + Router ID | Thread network | Routing locator (topology-based) |
| EID | Mesh-Local | Thread network | Endpoint identifier (stable) |

Address Types Explained

1. Link-Local Address (fe80::):

  • Used for direct neighbor communication
  • Not routed beyond single hop
  • Auto-configured from MAC address

2. Mesh-Local Address (fd00::):

  • Unique within Thread network
  • RLOC (Routing Locator): Changes if device role/position changes
  • EID (Endpoint Identifier): Stable address that doesn’t change

3. Global Address (2000::):

  • Provided by border router
  • Allows devices to communicate with internet
  • Uses NAT64 for IPv4 internet access

Example: A Thread temperature sensor might have:

  • Link-Local: fe80::1234:5678:9abc:def0
  • RLOC: fd12:3456::1234:5678 (changes if topology changes)
  • EID: fd12:3456::abcd:ef01 (stable, doesn’t change)
  • Global: 2001:db8:1::abcd:ef01 (via border router)
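To make the scopes concrete, the sketch below classifies the example addresses by prefix using Python's standard `ipaddress` module. The `classify` helper is hypothetical. It uses fe80::/10 (the full IPv6 link-local block, which contains the fe80::/64 shown in the table) and fd00::/8 for mesh-local prefixes. Prefix matching alone cannot tell an RLOC from an EID, since both are mesh-local.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("fe80::/10")     # single-hop scope
UNIQUE_LOCAL = ipaddress.ip_network("fd00::/8")    # mesh-local prefixes live here
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")  # internet-routable space


def classify(addr: str) -> str:
    """Map an IPv6 address string to the scope names used in this chapter."""
    ip = ipaddress.ip_address(addr)
    if ip in LINK_LOCAL:
        return "link-local"
    if ip in UNIQUE_LOCAL:
        return "mesh-local"
    if ip in GLOBAL_UNICAST:
        return "global"
    return "other"


for label, addr in [
    ("Link-Local", "fe80::1234:5678:9abc:def0"),
    ("RLOC", "fd12:3456::1234:5678"),
    ("EID", "fd12:3456::abcd:ef01"),
    ("Global", "2001:db8:1::abcd:ef01"),
]:
    print(f"{label:10s} {addr:28s} -> {classify(addr)}")
```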

49.5 Power Consumption and Battery Life

Thread’s power consumption depends on device role:

| Device Type | Typical Current | Battery Life (2000 mAh) | Use Case |
|---|---|---|---|
| Router | 20-40 mA (always on) | Days | Mains powered only |
| FED | 15-30 mA (rx on) | Days | Mains or large battery |
| MED (poll: 5s) | 50-200 µA avg | 1-2 years | AA/AAA batteries |
| SED (poll: 60s) | 10-50 µA avg | 3-10 years | Coin cell (CR2032) |

Optimizing Thread Battery Life

For Sleepy End Devices:

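  1. Increase Poll Interval: Poll every 60s instead of 5s → 12x fewer polls (about 3.7x lower average current with the example figures used in the calculation below, because sleep current sets a floor)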
  2. Use Data Polling: Poll only when expecting data (event-driven)
  3. Optimize Transmissions: Send multiple readings in one message
  4. Efficient Code: Minimize wake time (process quickly, return to sleep)
  5. Proper Sleep Modes: Use deep sleep, not light sleep

Example Calculation (SED with 60s poll):

Active time per hour:
- Wake + poll + sleep: 60 times/hour × 10ms = 600ms
- TX when data (once/hour): 20ms
- Total active: 620ms/hour

Sleep time per hour: 3600s - 0.62s = 3599.38s

Average current:
- Active (0.62s per hour): 20 mA
- Sleep (3599.38s per hour): 10 µA

Avg = (20mA × 0.62s + 0.01mA × 3599.38s) / 3600s
    = (12.4 + 35.99) / 3600
    = 13.4 µA average

Battery life (2000 mAh AA battery):
= 2000 mAh / 0.0134 mA
= 149,254 hours
= 17 years (limited by battery self-discharge to ~10 years)

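Battery life for SEDs increases with poll interval, but not linearly: polling charge shrinks as the interval grows while sleep charge stays fixed, so the gains flatten once sleep current dominates. General formula: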

Average current for an SED: \[I_{\text{avg}} = \frac{n \cdot (I_{\text{active}} \cdot t_{\text{active}}) + I_{\text{sleep}} \cdot t_{\text{sleep}}}{t_{\text{total}}}\]

where \(n = \frac{3600}{T_{\text{poll}}}\) polls per hour.

Battery life in years: \[L = \frac{C_{\text{battery}}}{I_{\text{avg}} \times 8760}\]

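For a CR2032 (225 mAh) with \(T_{\text{poll}} = 60\) s, working in mA and seconds: \[I_{\text{avg}} = \frac{60 \times (20 \times 0.01) + 10 \times 10^{-3} \times 3599.4}{3600} \approx 0.0133 \text{ mA} = 13.3 \text{ µA}\]

\[L = \frac{225}{0.0133 \times 8760} \approx 1.9 \text{ years}\]

This figure covers polling only; the hourly 20 ms transmission from the example calculation adds about 0.1 µA. Doubling the poll interval from 60s to 120s lowers the average current only from about 13.3 µA to about 11.7 µA, which extends CR2032 life from about 1.9 to about 2.2 years (roughly 14%). At these intervals the 10 µA sleep current dominates, so lowering sleep current matters more than polling less often.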
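A short sweep makes the shape of the poll-interval curve visible. This sketch evaluates the average-current and battery-life formulas for a 225 mAh CR2032 using the chapter's example figures (20 mA active, 10 ms per poll, 10 µA sleep). The function names are illustrative, and event transmissions are left out, as in the formula.

```python
def sed_average_current_ma(poll_s, i_active_ma=20.0, t_active_s=0.010,
                           i_sleep_ma=0.010):
    """Average current in mA over one hour for a given poll interval."""
    polls_per_hour = 3600.0 / poll_s
    active_s = polls_per_hour * t_active_s
    sleep_s = 3600.0 - active_s
    return (i_active_ma * active_s + i_sleep_ma * sleep_s) / 3600.0


def battery_life_years(capacity_mah, i_avg_ma):
    """Ideal battery life in years, ignoring self-discharge."""
    return capacity_mah / (i_avg_ma * 8760.0)


for poll_s in (5, 10, 30, 60, 120, 300):
    i_avg = sed_average_current_ma(poll_s)
    years = battery_life_years(225.0, i_avg)
    print(f"poll {poll_s:>3d} s: {i_avg * 1000:6.1f} uA avg, "
          f"{years:4.2f} years on CR2032")
```

Going from 5 s to 60 s cuts the average current several-fold, while going from 60 s to 300 s changes it much less, because the sleep term then accounts for most of the charge.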

49.6 Interactive: Thread Battery Life Calculator

Use this tool to estimate battery life for Thread devices with different configurations:

Objective: Simulate a Thread Sleepy End Device (SED) on ESP32, demonstrating poll intervals, wake/sleep cycles, battery life estimation, and the dramatic power savings of longer poll periods.

Paste this code into the Wokwi editor:

#include <WiFi.h>

// Thread device role simulation
enum ThreadRole { ROUTER, FED, MED, SED };
const char* roleNames[] = {"Router", "FED", "MED", "SED"};

struct DeviceConfig {
  ThreadRole role;
  int pollIntervalSec;
  float activeCurrent_mA;
  float sleepCurrent_mA;
  float pollDuration_ms;
  float txDuration_ms;
};

// Simulated network state
int parentRLOC = 0x4C01;
int leaderRLOC = 0x0001;
uint32_t partitionId = 0x12345678;
int networkChannel = 15;

void setup() {
  Serial.begin(115200);
  delay(1000);

  Serial.println("=== Thread SED Power Mode Simulator ===\n");

  // Show Thread network formation
  Serial.println("--- Network Formation ---");
  Serial.printf("Channel: %d (802.15.4, 250 kbps)\n", networkChannel);
  Serial.printf("PAN ID: 0x%04X\n", 0xABCD);
  Serial.printf("Partition ID: 0x%08X\n", partitionId);
  Serial.printf("Leader RLOC: 0x%04X\n", leaderRLOC);
  Serial.println("Network key: [encrypted]\n");

  // Define device configurations
  DeviceConfig configs[] = {
    {SED, 60,  20.0, 0.010, 10.0, 20.0},  // 60s poll
    {SED, 10,  20.0, 0.010, 10.0, 20.0},  // 10s poll
    {SED, 5,   20.0, 0.010, 10.0, 20.0},  // 5s poll
    {MED, 5,   25.0, 0.050, 10.0, 20.0},  // MED
    {FED, 0,   30.0, 15.0,  0,    20.0},  // FED (rx always on)
    {ROUTER, 0, 35.0, 25.0, 0,    20.0}   // Router (always on)
  };
  int numConfigs = 6;

  // Battery analysis
  float batteryMah = 2000.0;  // CR2032 = 225 mAh, AA = 2000 mAh
  int msgsPerHour = 1;

  Serial.println("--- Battery Life Comparison (2000 mAh AA battery) ---");
  Serial.println("Role    Poll   Active(mA)  Sleep(mA)  Avg(uA)    Battery Life");
  Serial.println("---------------------------------------------------------------");

  for (int i = 0; i < numConfigs; i++) {
    DeviceConfig& c = configs[i];
    float avgCurrent;

    if (c.pollIntervalSec > 0) {
      // SED/MED: Calculate duty cycle
      float pollsPerHour = 3600.0 / c.pollIntervalSec;
      float activeMs = pollsPerHour * c.pollDuration_ms +
                       msgsPerHour * c.txDuration_ms;
      float activeSec = activeMs / 1000.0;
      float sleepSec = 3600.0 - activeSec;

      avgCurrent = (c.activeCurrent_mA * activeSec +
                    c.sleepCurrent_mA * sleepSec) / 3600.0;
    } else {
      // FED/Router: always on
      avgCurrent = c.sleepCurrent_mA;  // "sleep" is idle current
    }

    float batteryHours = (batteryMah * 1000.0) / (avgCurrent * 1000.0);
    float batteryYears = batteryHours / 8760.0;
    float effectiveYears = min(batteryYears, 10.0f);  // Self-discharge limit

    Serial.printf("%-7s %3ds    %5.1f       %5.3f     %7.1f    ",
                  roleNames[c.role], c.pollIntervalSec,
                  c.activeCurrent_mA, c.sleepCurrent_mA,
                  avgCurrent * 1000.0);

    if (effectiveYears >= 1.0) {
      Serial.printf("%.1f years", effectiveYears);
      if (batteryYears > 10.0) Serial.print(" (self-discharge limited)");
    } else {
      Serial.printf("%.0f days", effectiveYears * 365.0);
    }
    Serial.println();
  }

  // Simulate SED wake/sleep cycle
  Serial.println("\n--- SED Wake/Sleep Cycle Simulation (60s poll) ---");
  Serial.println("Time      State       Current   Duration   Action");
  Serial.println("----------------------------------------------------------");

  const char* events[][5] = {
    {"0.000s", "SLEEP",   "10 uA",  "59.97s", "Deep sleep (radio off)"},
    {"59.97s", "WAKE",    "20 mA",  "3 ms",   "Wake from sleep, init radio"},
    {"59.973", "POLL",    "20 mA",  "5 ms",   "Data Request to parent 0x4C01"},
    {"59.978", "RX",      "20 mA",  "2 ms",   "Receive: No pending data"},
    {"59.980", "SLEEP",   "10 uA",  "59.97s", "Return to deep sleep"},
    {"119.95", "WAKE",    "20 mA",  "3 ms",   "Wake from sleep"},
    {"119.95", "POLL",    "20 mA",  "5 ms",   "Data Request to parent"},
    {"119.96", "RX",      "20 mA",  "4 ms",   "Receive: Pending data (1 msg)"},
    {"119.96", "PROCESS", "20 mA",  "2 ms",   "Process CoAP GET response"},
    {"119.96", "TX",      "20 mA",  "8 ms",   "Send sensor reading (temp=22.5C)"},
    {"119.97", "ACK",     "20 mA",  "3 ms",   "Receive ACK from parent"},
    {"119.97", "SLEEP",   "10 uA",  "59.95s", "Return to deep sleep"}
  };

  for (int i = 0; i < 12; i++) {
    Serial.printf("%-9s %-10s %-9s %-10s %s\n",
                  events[i][0], events[i][1], events[i][2],
                  events[i][3], events[i][4]);
  }

  Serial.println("\nDuty cycle: ~0.02% active (~99.98% sleeping)");
  Serial.println("Average current: 13.4 uA");
  Serial.println("Battery life: ~10 years (self-discharge limited)\n");

  Serial.println("=== Key Takeaway ===");
  Serial.println("60s poll: 10 years (capped) | 10s poll: 7.6 years | 5s poll: 4.6 years");
  Serial.println("Longer poll intervals extend SED battery life until sleep current dominates.");
}

void loop() {
  delay(10000);
}

What to Observe:

  1. SED with 60s poll uses only 13.4 uA average current (about 99.98% sleeping), reaching the ~10 year self-discharge limit on a single AA battery
  2. Poll interval trade-off: 60s poll gives 10 years, but 5s poll gives only about 4.6 years – response latency vs battery life
  3. Wake/sleep cycle shows the SED is active for only ~10ms per poll (radio init, data request, receive) then returns to 10 uA deep sleep
  4. Router and FED roles cannot run on batteries (always-on radios drain 15-35 mA continuously) – they must be mains-powered

49.7 Hands-On Lab: Thread Network Capacity

Lab Activity: Understanding Thread Network Capacity

Objective: Calculate Thread network capacity and design for scale

49.7.1 Task 1: Network Capacity Analysis

A Thread network operates on 2.4 GHz channel 15 (802.15.4, 250 kbps).

Given:

  • Data rate: 250 kbps
  • Efficiency: 40% (accounting for overhead, collisions, retries)
  • Average message size: 80 bytes
  • Message frequency: Varies by device type

Calculate maximum supportable devices for each scenario:

Scenario A: All devices are routers

  • Each router sends 1 message/second (status updates)

Scenario B: Mixed network

  • 16 routers (always-on, 1 msg/s each)
  • Remaining are SEDs (1 msg/minute each)

Scenario C: Realistic smart home

  • 1 border router (20 msgs/s routing overhead)
  • 10 routers (1 msg/s each)
  • Remaining SEDs (1 msg/5 minutes each)

Solution:

Step 1: Calculate Available Bandwidth

  • Data rate: 250 kbps = 250,000 bps
  • Effective: 250,000 × 0.40 = 100,000 bps = 12,500 bytes/second

Step 2: Calculate Message Overhead

  • IPv6 header: 40 bytes (compressed to ~6 bytes with 6LoWPAN)
  • UDP header: 8 bytes (compressed to ~4 bytes)
  • 802.15.4 MAC: 25 bytes
  • Total overhead: ~35 bytes
  • Message + overhead: 80 + 35 = 115 bytes

Step 3: Calculate Messages per Second Capacity

  • Max msgs/s: 12,500 / 115 = 108 messages/second

Scenario A: All Routers (1 msg/s each)

  • Each router: 1 msg/s
  • Max devices: 108 / 1 = 108 routers
  • Problem: Thread limit is 32 routers!
  • Actual capacity: 32 routers max

Scenario B: Mixed Network

  • 16 routers: 16 msgs/s
  • Remaining bandwidth: 108 - 16 = 92 msgs/s
  • SEDs (1 msg/min = 0.0167 msg/s): 92 / 0.0167 = 5,509 SEDs
  • Total: 16 routers + 5,509 SEDs = 5,525 devices
  • Thread limit: 250 devices per network
  • Actual capacity: 16 routers + 234 SEDs = 250 devices

Scenario C: Realistic Smart Home

  • Border router: 20 msgs/s
  • 10 routers: 10 msgs/s
  • Overhead: 30 msgs/s
  • Remaining: 108 - 30 = 78 msgs/s
  • SEDs (1 msg / 5 min = 0.00333 msg/s): 78 / 0.00333 = 23,423 SEDs
  • Total: 1 BR + 10 routers + 23,423 SEDs
  • Thread limit: 250 devices
  • Actual capacity: 1 BR + 10 routers + 239 SEDs = 250 devices

Conclusion:

  • Bandwidth is not the bottleneck for Thread networks
  • 250 device limit is the constraint (by design)
  • Thread network can easily support 250 battery-powered sensors
  • For > 250 devices, use multiple Thread networks
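The solution's arithmetic can be checked with a few lines of Python. This sketch takes the lab's givens (250 kbps, 40% efficiency, 80-byte messages, ~35 bytes of overhead) and the limits used in this chapter (32 routers, 250 devices). Function names are illustrative. The bandwidth-limited SED counts come out slightly different from the solution (5,520 and 23,400) because the solution rounds the per-SED message rate before dividing.

```python
DATA_RATE_BPS = 250_000
EFFICIENCY = 0.40
PAYLOAD_BYTES = 80
OVERHEAD_BYTES = 35   # compressed IPv6 + UDP + 802.15.4 MAC, per Step 2
MAX_DEVICES = 250     # per-network device limit used in this chapter
MAX_ROUTERS = 32


def channel_capacity_msgs_per_s():
    """Whole messages per second the channel can carry at the given efficiency."""
    usable_bytes_per_s = DATA_RATE_BPS * EFFICIENCY / 8
    return int(usable_bytes_per_s // (PAYLOAD_BYTES + OVERHEAD_BYTES))


def max_seds(fixed_load_msgs_per_s, fixed_devices, sed_period_s):
    """Return (SEDs allowed by bandwidth, SEDs allowed after the device cap)."""
    spare = channel_capacity_msgs_per_s() - fixed_load_msgs_per_s
    by_bandwidth = int(spare * sed_period_s)
    by_device_cap = MAX_DEVICES - fixed_devices
    return by_bandwidth, min(by_bandwidth, by_device_cap)


capacity = channel_capacity_msgs_per_s()
print("Capacity:", capacity, "msgs/s")
print("Scenario A routers:", min(capacity, MAX_ROUTERS))
print("Scenario B SEDs (bandwidth, capped):", max_seds(16, 16, 60))
print("Scenario C SEDs (bandwidth, capped):", max_seds(30, 11, 300))
```

In every scenario the capped value is far below the bandwidth-limited value, which is the lab's conclusion in numeric form.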

Use this framework to select the optimal Thread device role based on power source, response requirements, and network participation:

| Question | Answer → Role | Battery Life (typical) | Use Cases |
|---|---|---|---|
| Power Source? | Mains → Router | N/A (always on) | Smart plugs, bulbs, switches |
| Power Source? | Battery → Next question | | |
| Response Time? | Instant (<100ms) → FED | Days-weeks | Smoke alarms, critical sensors |
| Response Time? | Fast (1-10s) → MED | 1-3 years | Motion sensors, leak detectors |
| Response Time? | Moderate (30-300s) → SED | 7-10+ years | Door sensors, temp sensors |
| Routing Needed? | Yes → REED | Varies | Smart locks, hybrid devices |

Decision Rules:

  1. Always use mains power for Routers (mesh backbone requires 20-40 mA continuous)
  2. Use SED for maximum battery life (poll interval 60-300s achieves 7-10+ years on CR2032)
  3. Use MED when response time matters (poll interval 5-10s gives 1-3 years, acceptable for leak detection)
  4. Use FED only for safety-critical applications (smoke alarms, security sensors that cannot tolerate latency)
  5. Consider REED for devices that might plug in occasionally (smart locks can act as routers when USB-powered during firmware updates)

Trade-offs:

  • SED 60s poll: 10-year battery, 0-60s latency for incoming commands
  • MED 5s poll: 2-year battery, 0-5s latency for incoming commands
  • FED always-on: Days battery life, instant response
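The decision framework can be written as a small function. This is a sketch of this chapter's rules only. The `select_role` name and its parameters are hypothetical. The cut-offs between the table's latency bands (anything under 100 ms is FED, up to 10 s is MED, longer is SED) are assumptions that fill the gaps the table leaves open.

```python
def select_role(mains_powered: bool, max_latency_s: float,
                may_route: bool = False) -> str:
    """Pick a Thread device role from power source and latency tolerance."""
    if mains_powered:
        return "Router"      # rule 1: the mesh backbone needs continuous power
    if may_route:
        return "REED"        # rule 5: battery device that is sometimes plugged in
    if max_latency_s < 0.1:
        return "FED"         # rule 4: safety-critical, cannot tolerate latency
    if max_latency_s <= 10:
        return "MED"         # rule 3: fast response, shorter battery life
    return "SED"             # rule 2: maximum battery life


for device, args in [
    ("smart plug", (True, 1.0)),
    ("smoke alarm", (False, 0.05)),
    ("leak detector", (False, 5.0)),
    ("door sensor", (False, 60.0)),
]:
    print(f"{device:14s} -> {select_role(*args)}")
```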

49.8 Worked Example: Thread SED Battery Life Calculation

Scenario: You are selecting a battery for a Thread door/window sensor that will be deployed in a vacation home (infrequent door activity). Calculate expected battery life for different polling intervals to determine the optimal configuration.

Given:

  • Sensor: Thread Sleepy End Device (SED)
  • Battery options: CR2032 (240 mAh), CR2450 (620 mAh)
  • Active current (radio TX/RX): 15 mA
  • Sleep current (deep sleep): 2 uA
  • Poll duration: 8 ms per poll (wake, sync, poll, sleep)
  • TX event duration: 15 ms (door open/close report)
  • Door events: 10 per day (vacation home)
  • Target battery life: 3+ years

Steps:

  1. Calculate polling overhead for 30-second interval:
    • Polls per hour: 3600s / 30s = 120 polls/hour
    • Active time per hour: 120 polls x 8 ms = 960 ms = 0.96 seconds
    • Polling charge: 15 mA x 0.96s = 14.4 mAs per hour
  2. Calculate event transmission overhead:
    • Events per day: 10
    • TX time per event: 15 ms
    • Daily TX time: 10 x 15 ms = 150 ms
    • Hourly TX time: 150 ms / 24 = 6.25 ms
    • Event charge: 15 mA x 0.00625s = 0.094 mAs per hour
  3. Calculate sleep charge:
    • Sleep time per hour: 3600s - 0.96s - 0.00625s = 3599.03 seconds
    • Sleep charge: 0.002 mA x 3599.03s = 7.2 mAs per hour
  4. Total hourly charge and average current:
    • Total: 14.4 + 0.094 + 7.2 = 21.7 mAs per hour
    • Average current: 21.7 mAs / 3600s = 0.0060 mA = 6.0 uA
  5. Calculate battery life:
    • CR2032 (240 mAh): 240,000 uAh / 6.0 uA = 40,000 hours = 4.6 years
    • CR2450 (620 mAh): 620,000 uAh / 6.0 uA = 103,333 hours = 11.8 years
  6. Optimize with 60-second polling interval:
    • Polls per hour: 60, Active time: 480 ms
    • Polling charge: 15 mA x 0.48s = 7.2 mAs/hour
    • Total: 7.2 + 0.094 + 7.2 = 14.5 mAs/hour = 4.0 uA average
    • CR2032 life: 240,000 / 4.0 = 60,000 hours = 6.8 years
    • CR2450 life: 620,000 / 4.0 = 155,000 hours = 17.7 years (limited by self-discharge to ~10 years)

Result: With 30-second polling, a CR2032 provides about 4.6 years of battery life. Increasing to 60-second polling extends CR2032 life to about 6.8 years. For vacation home use (latency-tolerant), 60-second polling is recommended. The CR2450 provides excellent margin but is limited by lithium self-discharge (~10 year practical maximum).

Key Insight: Thread SED battery life increases with poll interval, but with diminishing returns. Doubling the interval from 30 to 60 seconds halves the polling charge (14.4 to 7.2 mAs per hour) yet extends CR2032 life only about 1.5x (4.6 to 6.8 years), because the 7.2 mAs per hour of sleep charge stays fixed. For infrequently accessed locations (vacation homes, storage areas), longer poll intervals (60-120 seconds) still extend battery life noticeably with minimal impact on user experience. Event-driven transmissions (door opens) are nearly instantaneous regardless of poll interval because hardware interrupts trigger immediate wake-up.
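The worked example can also be turned around: given a target battery life, find the shortest poll interval (lowest latency) that still meets it. The sketch below reuses the scenario's figures (15 mA active, 2 µA sleep, 8 ms per poll, 15 ms per event, 10 events per day). The function names and the candidate interval list are illustrative, and self-discharge is ignored.

```python
def average_current_ua(poll_s, events_per_day=10, i_active_ma=15.0,
                       i_sleep_ma=0.002, poll_ms=8.0, tx_ms=15.0):
    """Average current in uA from the hourly charge budget (poll + event + sleep)."""
    poll_active_s = (3600.0 / poll_s) * poll_ms / 1000.0
    event_active_s = (events_per_day / 24.0) * tx_ms / 1000.0
    sleep_s = 3600.0 - poll_active_s - event_active_s
    charge_mas = (i_active_ma * (poll_active_s + event_active_s)
                  + i_sleep_ma * sleep_s)
    return charge_mas / 3600.0 * 1000.0


def life_years(capacity_mah, poll_s):
    """Ideal battery life in years for a capacity in mAh."""
    return capacity_mah * 1000.0 / average_current_ua(poll_s) / 8760.0


def shortest_poll_for_target(capacity_mah, target_years,
                             candidates=(5, 10, 15, 30, 60, 120)):
    """Smallest candidate poll interval (s) that meets the target, else None."""
    for poll_s in sorted(candidates):
        if life_years(capacity_mah, poll_s) >= target_years:
            return poll_s
    return None


for name, capacity in [("CR2032", 240), ("CR2450", 620)]:
    poll = shortest_poll_for_target(capacity, target_years=3)
    print(f"{name}: shortest poll interval for 3+ years = {poll} s")
```

Under these assumptions the larger CR2450 can afford a noticeably shorter poll interval than the CR2032 for the same 3-year target, which is another way to spend the extra capacity.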

49.9 How It Works: Thread Leader Election and Router Promotion

How It Works: Self-Organizing Mesh Formation

Thread networks automatically elect a Leader and promote Routers without central configuration:

Leader Election Process (Network Formation):

  1. Initial Network Scan: First device powers on and scans all 16 channels (11-26) for existing Thread networks
  2. No Network Found: Device creates new network partition with itself as Leader
  3. Leader Weighting: Leader role determined by configurable weight:
    • Each Router has a Leader Weight value (0-255, default 64)
    • Higher weight = higher priority for Leader role
    • Partition quality considers weight, active router count, and data version
    • Border Routers typically configured with higher weight (prefers BR as Leader)
  4. MLE Advertisement: Leader broadcasts MLE Advertisement every 32 seconds with Leader Data:
    • Partition ID (unique network identifier)
    • Leader Router ID
    • Network parameters (PAN ID, channel, security policy)
    • Router ID assignment table (which Router IDs are allocated)

Router Promotion Process (Joining Devices):

  1. Device Joins as Child: New device attaches to existing router as end device (child role)
  2. REED Status: If device is mains-powered, it becomes REED (Router Eligible End Device)
  3. Promotion Triggers:
    • Automatic: Leader promotes REEDs to Router when network needs more mesh coverage
    • Criteria: Leader monitors router count, hop count distribution, and mesh connectivity
    • Target: Maintain 16-32 active routers for optimal mesh density
  4. Promotion Sequence:
    • Leader sends MLE Router Promotion message to selected REED
    • REED accepts promotion, allocates Router ID from Leader’s table
    • New router sends MLE Advertisement to announce its router status
    • Network topology updates (other devices use new router as potential parent)
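The promotion logic described above can be sketched as follows. This is a simplified model, not the Thread specification's procedure. The 16 and 32 thresholds come from this chapter's target range. The lowest-free-ID allocation policy, the 0-62 Router ID range, and all function names are assumptions made for illustration.

```python
TARGET_MIN_ROUTERS = 16  # lower bound of the chapter's 16-32 target
MAX_ROUTERS = 32
MAX_ROUTER_ID = 62       # assumption: usable IDs in a 6-bit Router ID space


def should_promote(router_count, coverage_gap):
    """Promote when below the target, or on a coverage gap while slots remain."""
    if router_count >= MAX_ROUTERS:
        return False
    return router_count < TARGET_MIN_ROUTERS or coverage_gap


def allocate_router_id(allocated):
    """Return the lowest free Router ID, or None when the table is full."""
    for router_id in range(MAX_ROUTER_ID + 1):
        if router_id not in allocated:
            return router_id
    return None


def promote(allocated, router_count, coverage_gap=False):
    """Return the new router's RLOC16, or None if no promotion happens."""
    if not should_promote(router_count, coverage_gap):
        return None
    router_id = allocate_router_id(allocated)
    if router_id is None:
        return None
    allocated.add(router_id)
    return router_id << 10  # Router ID occupies the top bits of the RLOC16


table = {0, 8, 24}  # hypothetical IDs already held by three routers
rloc = promote(table, router_count=3)
print(f"Promoted REED now has RLOC16 0x{rloc:04X}")
```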

Leader Re-Election (Failure Recovery):

  1. Leader Failure: Routers detect missing Leader heartbeats (no MLE Advertisements for 96 seconds)
  2. Partition Formation: Each router promotes itself to Leader of a separate partition
  3. Partition Discovery: Routers broadcast MLE Advertisement with their Partition ID
  4. Weight Comparison: Routers compare Leader Weight values (highest wins)
  5. Network Merge: Lower-weight partitions merge into higher-weight partition
  6. Stable Leader: Highest-weight router becomes network Leader
  7. Time to Stability: Leader re-election completes in 10-30 seconds (transparent to end devices)

Self-Healing Example: If Border Router (Leader, weight 128) fails:

  • Router A (weight 64) and Router B (weight 72) both promote to Leader of separate partitions
  • They discover each other via MLE Advertisement
  • Router B (higher weight) becomes new Leader after partition merge
  • Router A downgrades to Router role
  • Network continues operating with Router B as Leader
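The weight comparison in this example reduces to picking the maximum over the competing partitions. The sketch below models only Leader Weight. Breaking ties by the higher Partition ID is an assumption made so the result is deterministic, and the `Partition` class and the ID values are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    leader: str
    weight: int        # Leader Weight, 0-255
    partition_id: int  # illustrative values, normally chosen at random


def merge_partitions(partitions):
    """Return the surviving partition: highest weight wins.

    Ties fall back to the higher partition ID (an assumption in this sketch).
    """
    return max(partitions, key=lambda p: (p.weight, p.partition_id))


candidates = [
    Partition("Router A", weight=64, partition_id=0x1111),
    Partition("Router B", weight=72, partition_id=0x2222),
]
winner = merge_partitions(candidates)
for p in candidates:
    role = "Leader" if p == winner else "Router"
    print(f"{p.leader}: weight {p.weight} -> {role}")
```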

49.10 Try It Yourself: Thread Network Self-Healing Simulation

Scenario: You have a Thread network with 1 Border Router (Leader), 3 Routers, and 8 SEDs. Simulate what happens when Router 2 fails suddenly.

Initial Topology:

Border Router (RLOC 0x0000, Leader)
├── Router 1 (RLOC 0x2000)
│   ├── SED 1 (door sensor)
│   └── SED 2 (motion sensor)
├── Router 2 (RLOC 0x4000) [FAILS]
│   ├── SED 3 (temperature sensor)
│   ├── SED 4 (leak detector)
│   └── SED 5 (door sensor)
└── Router 3 (RLOC 0x6000)
    ├── SED 6 (motion sensor)
    ├── SED 7 (temperature sensor)
    └── SED 8 (door sensor)

Tasks:

  1. Predict Impact (T+0 seconds):
    • Which devices are immediately affected by Router 2 failure?
    • Which devices continue operating normally?
    • What happens to ongoing message transmission through Router 2?
  2. Parent Discovery (T+15-60 seconds):
    • SED 3, SED 4, and SED 5 lose their parent (Router 2)
    • They will scan for new parents (MLE Parent Request broadcast)
    • Which routers are likely to respond? (Router 1, Router 3, Border Router)
    • What criteria will SEDs use to select new parent?
  3. Calculate Recovery Time:
    • SED 3: Poll interval 60s, last poll was 30s ago → wakes in 30s
    • SED 4: Poll interval 120s, last poll was 90s ago → wakes in 30s
    • SED 5: Poll interval 30s, last poll was 10s ago → wakes in 20s
    • Which SED recovers first? Which recovers last?
  4. Network Topology After Recovery:
    • Assume SEDs redistribute based on signal strength:
      • SED 3 (temperature): RSSI -55 dBm to Router 1, -70 dBm to Router 3 → chooses Router 1
      • SED 4 (leak detector): RSSI -65 dBm to Router 1, -58 dBm to Router 3 → chooses Router 3
      • SED 5 (door sensor): RSSI -60 dBm to Router 1, -62 dBm to Router 3 → chooses Router 1
    • Draw new topology after recovery
    • What’s the new child distribution? (Router 1: 4 children, Router 3: 4 children)
  5. REED Promotion (T+120 seconds):
    • If Router 2 doesn’t return, Leader may promote a REED to Router
    • Assume SED 3 is actually a mains-powered temperature sensor (REED)
    • Leader promotes SED 3 → Router 4 (RLOC 0x8000)
    • SED 4 and SED 5 can now attach to Router 4 (closer than Router 1/3)
    • Draw final topology after REED promotion

Simulation Code (Python pseudo-code):

class ThreadDevice:
    def __init__(self, name, rloc, role, parent_rloc, poll_interval,
                 time_until_wake=0):
        self.name = name
        self.rloc = rloc
        self.role = role  # "Leader", "Router", "SED", or "FAILED"
        self.parent_rloc = parent_rloc
        self.poll_interval = poll_interval      # seconds; 0 for always-on devices
        self.time_until_wake = time_until_wake  # seconds until the next poll

    def find_new_parent(self, candidates):
        # Simplified parent selection: the strongest RSSI wins
        best = max(candidates, key=lambda r: r["rssi"])
        self.parent_rloc = best["rloc"]
        print(f"  {self.name} selects new parent {best['name']} "
              f"(RSSI {best['rssi']} dBm)")

# Network from the scenario; wake offsets are taken from Task 3
network = [
    ThreadDevice("Border Router", "0x0000", "Leader", None, 0),
    ThreadDevice("Router 1", "0x2000", "Router", "0x0000", 0),
    ThreadDevice("Router 2", "0x4000", "Router", "0x0000", 0),  # Will fail
    ThreadDevice("Router 3", "0x6000", "Router", "0x0000", 0),
    ThreadDevice("SED 3", "0x4001", "SED", "0x4000", 60, time_until_wake=30),
    ThreadDevice("SED 4", "0x4002", "SED", "0x4000", 120, time_until_wake=30),
    ThreadDevice("SED 5", "0x4003", "SED", "0x4000", 30, time_until_wake=20),
]
by_rloc = {d.rloc: d for d in network}

# RSSI from each orphaned SED to the surviving routers (values from Task 4)
rssi_table = {
    "SED 3": {"Router 1": -55, "Router 3": -70},
    "SED 4": {"Router 1": -65, "Router 3": -58},
    "SED 5": {"Router 1": -60, "Router 3": -62},
}

# T+0: Router 2 fails
print("T+0: Router 2 (0x4000) fails")
by_rloc["0x4000"].role = "FAILED"

# T+1 to T+120: SEDs wake on their poll schedule and discover the failure.
# Simplification: one missed poll counts as detection, and the SED's own
# RLOC is not reassigned after it reattaches.
for t in range(1, 121):
    for device in network:
        if device.role != "SED":
            continue
        device.time_until_wake -= 1
        if device.time_until_wake > 0:
            continue
        device.time_until_wake = device.poll_interval
        if by_rloc[device.parent_rloc].role != "FAILED":
            continue  # parent is healthy, normal poll
        print(f"T+{t}: {device.name} wakes for poll, detects parent failure")
        candidates = [
            {"name": r.name, "rloc": r.rloc,
             "rssi": rssi_table[device.name][r.name]}
            for r in network if r.role == "Router"
        ]
        device.find_new_parent(candidates)

What to Observe:

  • Recovery time depends on SED poll intervals (not instant)
  • Network continues operating (other devices unaffected)
  • SEDs automatically redistribute across remaining routers
  • REED promotion provides long-term mesh coverage restoration

49.11 Concept Check

49.12 Concept Relationships

| Concept | Relationship | Connected Concept |
|---|---|---|
| Leader Election | Uses | Configurable Leader Weight (0-255) with partition quality comparison |
| Router Promotion | Triggered By | Leader monitoring mesh coverage and router count (16-32 optimal) |
| RLOC (Routing Locator) | Encodes | Router ID (bits 10-15) and Child ID (bits 0-8) for topology-based routing |
| SED Poll Interval | Balances | Battery life (longer = better) vs response latency (shorter = better) |
| MLE Advertisement | Broadcasts | Network topology and Leader status (every 32 seconds) |
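The RLOC encoding can be illustrated with a few bit operations on the 16-bit locators used in this chapter's topology (0x2000, 0x4000, 0x6000). The sketch assumes a 6-bit Router ID in the top bits and a 9-bit Child ID in the low bits. The exact field widths and the 0-62 Router ID range are stated here as assumptions, and the helper names are hypothetical.

```python
ROUTER_ID_SHIFT = 10
CHILD_ID_MASK = 0x1FF  # assumption: 9-bit Child ID
MAX_ROUTER_ID = 62     # assumption: usable Router ID range is 0-62


def make_rloc16(router_id, child_id=0):
    """Pack a Router ID and Child ID into a 16-bit routing locator."""
    if not 0 <= router_id <= MAX_ROUTER_ID:
        raise ValueError("router_id out of range")
    if not 0 <= child_id <= CHILD_ID_MASK:
        raise ValueError("child_id out of range")
    return (router_id << ROUTER_ID_SHIFT) | child_id


def split_rloc16(rloc16):
    """Return (router_id, child_id); child_id 0 means the router itself."""
    return rloc16 >> ROUTER_ID_SHIFT, rloc16 & CHILD_ID_MASK


for rloc in (0x2000, 0x4000, 0x4001, 0x6000):
    router_id, child_id = split_rloc16(rloc)
    print(f"0x{rloc:04X}: router {router_id}, child {child_id}")
```

Because the parent's Router ID is embedded in a child's locator, a child that reattaches to a different router receives a new RLOC, which is why applications should address devices by their stable EID.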

49.13 See Also

Common Pitfalls

Thread partitions can form silently when connectivity between groups of devices is lost. Applications that do not detect partition events may continue operating in an isolated sub-network, producing stale or missing data without any error indication.

Thread automatically assigns device roles (leader, router, end device) based on network conditions. Attempting to manually force specific role assignments through application code fights against Thread’s dynamic topology management and often causes instability.

Thread networks require time to reconverge after planned maintenance (border router restart, firmware updates). Test that applications handle the reconvergence period gracefully with appropriate timeouts and retry logic.

49.14 Summary

This chapter covered Thread network operations and power management:

  • Network Formation: Leader election, router promotion, and automatic self-healing mesh topology ensure reliable device-to-device communication without single points of failure
  • Self-Healing: Automatic rerouting occurs within seconds when routers fail—orphaned children find new parents, REEDs promote to routers if needed, and leader re-election maintains network stability
  • IPv6 Addressing: Thread devices use multiple address types: Link-Local (fe80::), Mesh-Local (fd00::), RLOC (topology-dependent routing), EID (stable identifier), and Global (internet access)
  • Power Optimization: SEDs reach multi-year battery life through deep sleep (2-10 µA), infrequent polling (60s-5min intervals), and minimized active time (<0.1% duty cycle); the worked example reaches about 6.8 years on a CR2032 with 2 µA sleep current and 60s polling
  • Network Capacity: Thread networks support 250 devices maximum with 16-32 routers forming the mesh backbone; bandwidth is not the limiting factor

49.15 Knowledge Check

49.16 What’s Next

| Chapter | Focus |
|---|---|
| Thread Development and Integration | OpenThread SDK, device configuration, and Matter protocol integration |
| Thread Deployment Guide | Border Router setup, production deployment, and troubleshooting |
| Zigbee Comprehensive Review | Compare Thread’s mesh-local IPv6 approach with Zigbee’s cluster-based architecture |
| 6LoWPAN Fundamentals and Architecture | IPv6 header compression and adaptation layer that Thread builds upon |
| 802.15.4 Comprehensive Review | Physical and MAC layer details underlying Thread’s radio communication |