25.1 Firmware Updates and Over-the-Air (OTA) Security
This chapter covers the critical topic of securely updating IoT device firmware, including OTA (Over-the-Air) update architectures, code signing, rollback protection, and strategies for managing updates across large device fleets.
25.2 Learning Objectives
By the end of this chapter, you will be able to:
Architect secure OTA update pipelines for IoT device fleets
Implement cryptographic code signing for firmware authenticity verification
Configure anti-rollback protection using persistent version counters
Devise staged update strategies for large-scale device deployments
Assess regulatory compliance requirements for IoT software updates
Key Concepts
Firmware signing: Using a cryptographic signature (RSA, ECDSA) to guarantee that firmware came from the legitimate manufacturer and has not been modified, verified by the device before applying any update.
Secure boot: A hardware-enforced process that verifies the cryptographic signature of each software component before executing it, ensuring only authenticated firmware runs on the device.
Delta update: An OTA update that transmits only the differences between the current and new firmware version rather than the complete image, dramatically reducing update size and transmission time for bandwidth-constrained devices.
Rollback protection: A mechanism preventing an attacker from downgrading device firmware to a vulnerable older version, typically implemented using a monotonic version counter stored in hardware.
Update manifest: A metadata document (signed by the manufacturer) describing a firmware update: version number, target device type, cryptographic hash, and rollout policy — verified before the device downloads the firmware binary.
SUIT (Software Updates for Internet of Things): An IETF effort that defines OTA update manifests and procedures for constrained IoT devices (architecture in RFC 9019, manifest information model in RFC 9124), aiming for interoperability across vendors and platforms.
In 60 Seconds
Secure OTA (Over-the-Air) firmware updates are the mechanism by which IoT security vulnerabilities discovered after deployment can be patched — without OTA capability, a deployed fleet of vulnerable devices cannot be remediated except by physical replacement. The security of the OTA mechanism itself is critical: an insecure update channel can turn an OTA system into an attack vector for installing malicious firmware at scale.
For Beginners: OTA Updates and Security
Firmware security is about protecting the core software that runs on IoT devices and ensuring updates are delivered safely. Think of firmware as the operating system of your smart device – if an attacker can modify it, they control the device completely. Secure update mechanisms ensure that only genuine, untampered software gets installed.
Sensor Squad: The Safe Software Delivery!
“Great news, everyone! There is a new update for us!” Max the Microcontroller announced excitedly. “But wait – how do we know it is really from our manufacturer and not a trick from a bad guy?”
Sammy the Sensor raised a good point. “Imagine ordering a pizza. How do you know the delivery person is real and nobody swapped your pizza for something gross on the way? That is exactly the problem with over-the-air updates! The update travels across the internet, and someone could tamper with it.”
“That is why we use code signing!” Lila the LED explained. “The manufacturer puts a special digital seal on the update – like a wax stamp on a royal letter. When the update arrives, Max checks the seal. If it matches, the update is genuine. If the seal is broken or missing, Max refuses to install it. No fake updates getting past us!”
“And I make sure we have enough power to finish the whole update before we start,” Bella the Battery added. “Imagine if the power died halfway through installing – you would be left with half-old, half-new code that does not work! That is why we keep a backup copy: if a new update will not start properly, we can safely go back to the previous working version. But we never let a bad guy push us back to an old version with known holes – that is what rollback protection is for. Updating safely is just as important as updating at all!”
25.3 The Update Imperative
IoT devices face a critical challenge: they must be updateable throughout their 10-20 year lifespans, yet updates themselves can be attack vectors. A well-designed OTA system balances security with reliability.
25.3.1 Why Updates Matter
| Without Updates | With Secure Updates |
| --- | --- |
| Known vulnerabilities persist indefinitely | Patches deployed within days of disclosure |
| Device becomes part of botnets (Mirai) | Security improvements continuously applied |
| Compliance violations (GDPR, HIPAA) | Regulatory requirements met |
| Customer trust erodes | Product value increases over time |
| Liability exposure increases | Documented security posture |
25.3.2 OTA Update Architecture
OTA Update Architecture Overview
25.4 Code Signing for Firmware Updates
Code signing is the foundation of OTA security. It ensures that only firmware created by the authorized manufacturer can be installed on devices.
Academic Resource: Code Signing Process
Code Signing and Verification Flow
Source: NPTEL IoT Security Course
This diagram illustrates the fundamental code signing workflow used for firmware updates:
Signing (Developer/Build Server):
Compute cryptographic hash (SHA-256) of firmware
Sign the hash with the private key to produce the digital signature
Attach signature to firmware package
Verification (IoT Device):
Download firmware + signature
Compute hash of downloaded firmware
Check the signature against that hash using the public key (with RSA, this recovers the signed hash for comparison; ECDSA verifies the signature against the computed hash directly)
If the check passes, the firmware is authentic and unmodified; otherwise reject the update
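The signing and verification steps above can be sketched in a few lines of Python. This is a toy under loud assumptions: it uses textbook RSA without padding, and the key is built from two publicly known Mersenne primes, so it is insecure and only shows the data flow. The function names are hypothetical; a real pipeline would use a vetted library and an HSM-held key.

```python
import hashlib

# Toy RSA key built from two well-known Mersenne primes. These primes are
# public knowledge, so this key is INSECURE and for illustration only.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (build server only)


def firmware_digest(firmware: bytes) -> int:
    """SHA-256 of the image as an integer (always smaller than N here)."""
    return int.from_bytes(hashlib.sha256(firmware).digest(), "big")


def sign_firmware(firmware: bytes) -> int:
    """Build server: sign the hash with the private key."""
    return pow(firmware_digest(firmware), D, N)


def verify_firmware(firmware: bytes, signature: int) -> bool:
    """Device: recover the signed hash with the public key and compare."""
    recovered = pow(signature, E, N)
    return recovered == firmware_digest(firmware)


image = b"firmware v2.0 payload"
sig = sign_firmware(image)
print(verify_firmware(image, sig))             # genuine image
print(verify_firmware(image + b"\x00", sig))   # image modified in transit
```

Only the public values `N` and `E` need to be stored on the device, which is why a compromised device does not reveal the ability to sign new firmware.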
Visual Reference: Code Signing and OTA Architecture
Figure 25.1: Code Signing Process - Ensuring firmware authenticity from build to device
Figure 25.2: Secure OTA Architecture - End-to-end update pipeline with cryptographic verification
25.5 OTA Security Checklist
Secure OTA Update Requirements
Before Release:
Distribution:
Device-Side:
Post-Update:
25.6 Dual-Bank (A/B) Update Scheme
The A/B partition scheme ensures devices can always boot, even if an update fails:
A/B Partition Flash Layout
Update Process:
New firmware downloaded to inactive bank (B)
Signature verified
Bootloader flag set to boot from B
Device reboots
If B boots successfully, mark B as active
If B fails to boot, bootloader automatically reverts to A
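The six steps above can be modelled as a small state machine. The sketch below is a simplification with hypothetical names (`Bank`, `Device`, `stage_update`, `reboot`); the `signature_ok` and `boots_ok` flags stand in for a real signature check and a real watchdog-confirmed boot.

```python
from dataclasses import dataclass, field


@dataclass
class Bank:
    version: int
    signature_ok: bool = True   # stand-in for a real signature check
    boots_ok: bool = True       # stand-in for "runs without crashing"


@dataclass
class Device:
    banks: dict = field(default_factory=dict)
    active: str = "A"           # last confirmed-good bank
    pending: str = ""           # bank awaiting its first successful boot

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"

    def stage_update(self, image: Bank) -> bool:
        """Write to the inactive bank, verify it, then set the boot flag."""
        target = self.inactive()
        self.banks[target] = image
        if not image.signature_ok:
            return False        # never set the flag for an unverified image
        self.pending = target
        return True

    def reboot(self) -> str:
        """Try the pending bank once; fall back to the active bank on failure."""
        if self.pending:
            trial, self.pending = self.pending, ""
            if self.banks[trial].boots_ok:
                self.active = trial     # trial boot succeeded: confirm it
        return self.active


dev = Device(banks={"A": Bank(version=1)})
dev.stage_update(Bank(version=2, boots_ok=False))
print(dev.reboot())   # buggy image: device stays on A
dev.stage_update(Bank(version=3))
print(dev.reboot())   # good image: device switches to B
```

The key design point is that the boot flag is a one-shot "try this bank" request: the old bank stays the confirmed one until the new image has proven that it runs.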
25.7 Anti-Rollback Protection
Anti-rollback protection prevents attackers from flashing older, vulnerable firmware that still carries a valid manufacturer signature:
Anti-Rollback Protection Mechanism
Common Mistake: Anti-Rollback Counters Stored in Volatile Memory
The Mistake: Developers store the anti-rollback counter in RAM or a variable that resets on reboot, believing the signature check alone prevents malicious firmware.
Flawed Implementation:
```cpp
// Global variable (resets to 0 on every boot)
int currentFirmwareVersion = 0;

bool verifyFirmwareUpdate(byte* firmware, int newVersion) {
  // Check signature
  if (!verifySignature(firmware)) return false;

  // BROKEN: currentFirmwareVersion resets to 0 on reboot
  if (newVersion <= currentFirmwareVersion) {
    return false;  // Prevent rollback
  }
  currentFirmwareVersion = newVersion;
  return true;
}
```
Why This Fails:
Attacker can downgrade on every reboot: Simply power-cycle the device and flash vulnerable firmware v1.0 (counter resets to 0, so 1.0 > 0 passes check)
Signature check insufficient: Old firmware v1.0 is still legitimately signed by manufacturer, so signature verification passes
Known vulnerabilities exploitable: Device can be forced back to firmware containing CVE-2023-1234 (a placeholder identifier), which was patched in v3.0
Attack Scenario (illustrative):
Timeline:
- Device ships with firmware v1.0 (no known vulnerabilities)
- Manufacturer releases v2.0 (adds features)
- Security researcher discovers CVE-2023-1234 in v1.0 (remote code execution)
- Manufacturer releases v3.0 (patches CVE-2023-1234)
- Attacker power-cycles device, flashes signed v1.0 firmware
- RAM counter resets to 0, so v1.0 > 0 check passes
- Attacker exploits CVE-2023-1234 to take full control
Correct Implementation (counter persisted in NVS):
```cpp
#include <Preferences.h>

Preferences prefs;
const char* ROLLBACK_KEY = "fw_min_ver";

// Assumed to be defined elsewhere, as in the flawed example
bool verifySignature(byte* firmware);

bool verifyFirmwareUpdate(byte* firmware, int newVersion) {
  // Read minimum version from non-volatile storage
  prefs.begin("ota", false);
  int minVersion = prefs.getInt(ROLLBACK_KEY, 0);

  // Verify signature
  if (!verifySignature(firmware)) {
    prefs.end();
    return false;
  }

  // SECURE: Counter persists across reboots
  if (newVersion < minVersion) {
    Serial.printf("[ROLLBACK BLOCKED] New v%d < Min v%d\n", newVersion, minVersion);
    prefs.end();
    return false;
  }

  // Raise the minimum version only when it actually increases
  // (keeps the counter monotonic and avoids needless flash writes)
  if (newVersion > minVersion) {
    prefs.putInt(ROLLBACK_KEY, newVersion);
  }
  prefs.end();
  return true;
}

// One-time init at manufacturing
void burnAntiRollbackCounter(int initialVersion) {
  prefs.begin("ota", false);
  prefs.putInt(ROLLBACK_KEY, initialVersion);
  prefs.end();
}
```
Additional Hardening (eFuse for tamper-proof counter on ESP32):
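```cpp
#include "esp_efuse.h"

// The secure-version eFuse field is a bit field, not a binary integer: the
// version is the number of burned bits, and the length argument of the
// *_field_blob calls is in bits, so passing sizeof(version) there is a bug.
// ESP-IDF provides helpers that handle the encoding (they require
// anti-rollback support to be enabled in the bootloader configuration).

// Burn version into one-time-programmable eFuse (irreversible)
esp_err_t burnVersionToEFuse(uint32_t version) {
  return esp_efuse_update_secure_version(version);
}

uint32_t getMinVersionFromEFuse() {
  return esp_efuse_read_secure_version();
}
```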
Real-World Impact: A medical device manufacturer deployed 50,000 insulin pumps with volatile rollback counters. Security researchers discovered they could downgrade devices to vulnerable firmware by power-cycling during the update process. The FDA mandated a recall requiring hardware rework to add persistent counters in secure EEPROM. Cost: $8.5 million (field service visits + regulatory fines) vs. $0 to implement correctly during initial development.
Key Lesson: Anti-rollback counters MUST survive power loss, firmware corruption, and deliberate attacks. Store in non-volatile memory (EEPROM, NVS, or one-time-programmable eFuse for maximum security). Test by power-cycling device mid-update and verifying counter persists.
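As a rough mental model of why a one-time-programmable counter resists downgrade, the sketch below simulates a fuse field whose bits can be set but never cleared. It assumes a count-of-burned-bits encoding, as used for the ESP32 secure version field; the class and method names are hypothetical and no real hardware API is involved.

```python
class OneTimeProgrammableCounter:
    """Models an eFuse-style field: bits can be set but never cleared."""

    def __init__(self, width_bits: int = 32):
        self.width_bits = width_bits
        self.bits = 0                      # all fuses intact

    def value(self) -> int:
        return bin(self.bits).count("1")   # version = number of burned bits

    def burn_up_to(self, version: int) -> None:
        if not 0 <= version <= self.width_bits:
            raise ValueError("version outside counter range")
        # OR-ing can only add set bits, so a lower version changes nothing.
        self.bits |= (1 << version) - 1

    def allows(self, image_version: int) -> bool:
        return image_version >= self.value()


counter = OneTimeProgrammableCounter()
counter.burn_up_to(3)
counter.burn_up_to(1)          # attempted "downgrade" of the counter
print(counter.value())         # still 3
print(counter.allows(2))       # older image is rejected
```

One consequence of this encoding is that the number of security-relevant version bumps over the product lifetime is bounded by the field width, so the counter should advance only for releases that fix vulnerabilities, not for every build.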
25.8 Case Study: Chrysler 2015 OTA Vulnerability
What Happened:
Researchers demonstrated remote control of a Jeep Cherokee over its cellular connection
Uconnect entertainment system had no authentication for incoming connections
No network segmentation between entertainment and vehicle control systems
The Fix:
Chrysler had to recall 1.4 million vehicles
USB-based firmware update (no OTA capability at the time)
Added network segmentation and authentication
Lesson: OTA update capability would have allowed rapid response without physical recalls.
Common Misconception: “We’ll Add Security Updates Later”
The Myth: “We can ship with basic update functionality and add security features in future firmware.”
The Reality:
If the bootloader doesn’t verify signatures, adding verification in a later release doesn’t help: attackers can simply flash the old bootloader, which performs no verification
Anti-rollback counters must be in place from the start
Secure boot must be enabled at manufacturing time (eFuse burning)
Key rotation capability must be designed in from day one
The Truth: Security architecture decisions are made at design time. Retrofitting secure OTA is extremely difficult and often impossible without hardware recalls.
25.9 EU Cyber Resilience Act Requirements (2024)
The EU CRA mandates security update requirements for IoT products:
| Requirement | Description |
| --- | --- |
| Update Capability | Products must be designed for secure updates |
| Free Security Updates | For a support period of at least 5 years, or the expected product lifetime if that is shorter |
| Vulnerability Disclosure | Early warning to ENISA within 24 hours of becoming aware of an actively exploited vulnerability |
| SBOM | Software Bill of Materials must be maintained |
| Default Security | Products secure out of the box |
25.10 Worked Example: Calculating OTA Update Costs and Risks
Scenario: A water utility manages 50,000 smart meters deployed across a city. A critical vulnerability is discovered that requires a firmware update. The meters connect via NB-IoT with 20 KB/s effective throughput. Current firmware is 256 KB.
Bandwidth and Time Analysis:
Firmware size: 256 KB
Delta update size: 48 KB (81% reduction using binary diff)
Full update per device: 256 KB / 20 KB/s = 12.8 seconds
Delta update per device: 48 KB / 20 KB/s = 2.4 seconds
Fleet-wide bandwidth (full): 50,000 x 256 KB = 12,500 MB = 12.2 GB
Fleet-wide bandwidth (delta): 50,000 x 48 KB = 2,344 MB = 2.3 GB
NB-IoT data cost: ~$0.01 per MB
Full update cost: 12,500 MB x $0.01 = $125
Delta update cost: 2,344 MB x $0.01 = $23.44
Annual savings from delta updates (4 updates/year): $406
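The arithmetic above is easy to re-run for other fleet sizes or image sizes. The short script below reproduces it using the scenario's figures; the function and variable names are illustrative, and 1 MB is taken as 1,024 KB to match the numbers in the text.

```python
FLEET_SIZE = 50_000
THROUGHPUT_KB_S = 20
COST_PER_MB = 0.01
UPDATES_PER_YEAR = 4


def fleet_update(image_kb: float) -> dict:
    """Per-device transfer time and fleet-wide volume/cost for one update."""
    fleet_mb = FLEET_SIZE * image_kb / 1024
    return {
        "seconds_per_device": image_kb / THROUGHPUT_KB_S,
        "fleet_mb": fleet_mb,
        "cost_usd": fleet_mb * COST_PER_MB,
    }


full = fleet_update(256)
delta = fleet_update(48)
annual_savings = (full["cost_usd"] - delta["cost_usd"]) * UPDATES_PER_YEAR

for name, r in (("full", full), ("delta", delta)):
    print(f"{name}: {r['seconds_per_device']:.1f} s/device, "
          f"{r['fleet_mb']:,.0f} MB, ${r['cost_usd']:.2f}")
print(f"annual savings: ${annual_savings:.2f}")
```

The script carries unrounded values through, so the annual figure comes out a cent higher than the hand calculation, which rounds the delta cost to $23.44 first.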
Assume the update has a 2% failure rate on hardware revision 3.1 (unknown before deployment). With 8,000 devices on rev 3.1 (16% of fleet):
| Rollout Strategy | Devices Affected Before Detection | Recovery Cost |
| --- | --- | --- |
| 100% simultaneous | 160 devices (2% of 8,000 rev 3.1) | $16,000 (truck rolls at $100 each) |
| 1% canary, 24h pause | 2 devices (2% of 80 rev 3.1 in 1% sample) | $200 |
| 1% canary + HW-rev grouping | ~1 device (test rev 3.1 subset specifically) | $100 |
Key Insight: Staged rollout with hardware-revision-aware grouping reduces worst-case failure impact by 160x compared to simultaneous deployment.
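The exposure figures in the table follow from one small calculation, sketched below with integer arithmetic so the rounding is explicit. The first two scenarios use the numbers from the text; the size of the hardware-revision canary group (50 devices) is an assumption chosen for illustration, since the text does not state it.

```python
REV31_DEVICES = 8_000        # 16% of the 50,000-meter fleet
FAILURE_PERCENT = 2          # failure rate on rev 3.1 hardware
TRUCK_ROLL_USD = 100


def first_wave_impact(rev31_in_wave: int) -> tuple:
    """Failed devices (rounded up) and recovery cost before the bug is noticed."""
    failed = (rev31_in_wave * FAILURE_PERCENT + 99) // 100   # integer ceiling
    return failed, failed * TRUCK_ROLL_USD


scenarios = {
    "100% simultaneous": REV31_DEVICES,           # every rev 3.1 device
    "1% canary": REV31_DEVICES // 100,            # 80 rev 3.1 devices
    "rev 3.1 canary group (assumed 50)": 50,      # hypothetical group size
}
for name, count in scenarios.items():
    failed, cost = first_wave_impact(count)
    print(f"{name}: {failed} failed, ${cost:,}")
```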
A/B Partition Flash Layout (real ESP32 example):

| Address Range | Size | Content |
| --- | --- | --- |
| 0x001000 - 0x008FFF | 32 KB | Bootloader (write-protected) |
| 0x009000 - 0x00AFFF | 8 KB | Partition table |
| 0x00B000 - 0x00BFFF | 4 KB | OTA data (which bank to boot) |
| 0x00C000 - 0x01BFFF | 64 KB | NVS (anti-rollback counter, config) |
| 0x020000 - 0x1FFFFF | 1.9 MB | App partition A (current firmware) |
| 0x200000 - 0x3DFFFF | 1.9 MB | App partition B (update target) |
| 0x3E0000 - 0x3FFFFF | 128 KB | Reserved (coredump, factory data) |

The 16 KB between 0x01C000 and 0x01FFFF is left unused so that the app partitions start on a 64 KB boundary. Total flash required: 4 MB. The A/B scheme doubles firmware storage cost but eliminates bricking risk.
25.11 Update Strategies for Large Fleets
25.11.1 Staged Rollout
| Stage | Percentage | Duration | Purpose |
| --- | --- | --- | --- |
| Canary | 1% | 48 hours | Detect critical bugs |
| Early Adopters | 10% | 1 week | Broader testing |
| Gradual | 25% | 2 weeks | Monitor for issues |
| Majority | 50% | 2 weeks | Wide deployment |
| Complete | 100% | Ongoing | Catch stragglers |
25.11.2 Update Scheduling
Maintenance Windows: Update during low-usage periods
Bandwidth Management: Limit concurrent downloads
Geographic Rollout: Start with specific regions
Device Grouping: Update by device type, firmware version, or customer
Try It: OTA Staged Rollout Simulator
Run this Python code to simulate OTA update deployment across a device fleet. Compare simultaneous vs staged rollout strategies and see how staging catches bugs before fleet-wide impact.
Simultaneous rollout deploys the buggy firmware to all 10,000 devices – bricking ~20 devices on rev3.1 hardware
Staged rollout detects the high failure rate in the canary/early adopter phase and aborts before reaching the majority
The abort threshold (1% failure rate) catches the rev3.1 bug early, protecting thousands of devices
Staged rollout reduces truck roll costs by 80-90% compared to simultaneous deployment
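A minimal stand-in for such a simulator is sketched below. It is not the original interactive code, and its parameters are assumptions: 10,000 devices, roughly 10% on rev 3.1 hardware, a 2% failure rate on that revision only, cumulative stages of 1/10/25/50/100%, and an abort rule that halts when any hardware revision in a wave exceeds a 1% failure rate. Because device outcomes are drawn from a seeded random generator, exact counts depend on the seed and will not necessarily match the figures quoted above.

```python
import random

STAGE_PERCENTS = [1, 10, 25, 50, 100]   # cumulative share of fleet per stage
ABORT_THRESHOLD = 0.01                  # max failure rate per hardware revision


def make_fleet(size=10_000, rev31_share=0.10, rev31_failure=0.02, seed=1):
    """Each device is (hardware_rev, update_will_fail), fixed up front."""
    rng = random.Random(seed)
    fleet = []
    for _ in range(size):
        rev = "3.1" if rng.random() < rev31_share else "3.0"
        fails = rev == "3.1" and rng.random() < rev31_failure
        fleet.append((rev, fails))
    return fleet


def simultaneous(fleet):
    """Push to everyone at once: every latent failure becomes a bricked device."""
    return sum(fails for _, fails in fleet)


def staged(fleet):
    """Roll out wave by wave; abort when any revision exceeds the threshold."""
    bricked, done = 0, 0
    for percent in STAGE_PERCENTS:
        target = len(fleet) * percent // 100
        wave = fleet[done:target]
        done = target
        bricked += sum(fails for _, fails in wave)
        for rev in {r for r, _ in wave}:
            outcomes = [fails for r, fails in wave if r == rev]
            if sum(outcomes) / len(outcomes) > ABORT_THRESHOLD:
                return bricked, done    # halt: remaining devices untouched
    return bricked, done


fleet = make_fleet()
print("simultaneous bricked:", simultaneous(fleet))
bricked, reached = staged(fleet)
print(f"staged bricked: {bricked} (stopped after {reached} devices)")
```

Try changing `seed`, `rev31_share`, or `STAGE_PERCENTS`. A very small canary may contain too few rev 3.1 devices to surface a 2% failure rate, which is one argument for grouping canaries by hardware revision.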
Worked Example: A/B Partition Flash Layout Design for Constrained Device
Scenario: You’re designing a LoRa soil moisture sensor that will be deployed in 5,000 agricultural fields across remote regions. The device must support OTA firmware updates to fix bugs and add features over its 10-year lifespan. The microcontroller has only 1 MB of flash memory. Design an A/B partition scheme that allows safe updates while minimizing flash usage.
Constraints:
Total flash: 1 MB (1,048,576 bytes = 1,024 KB)
Current firmware size: 384 KB
Bootloader: 32 KB
Configuration/NVS: 16 KB
OTA metadata: 4 KB
Anti-rollback counter: 4 KB
Requirement: Must support full firmware rollback
Flash Layout Design:
| Address | Size | Purpose | Justification |
| --- | --- | --- | --- |
| 0x000000 | 32 KB | Bootloader (write-protected) | Immutable, verifies A/B partitions |
| 0x008000 | 4 KB | Partition table | Maps A/B locations |
| 0x009000 | 4 KB | OTA selection | Flags active partition (A or B) |
| 0x00A000 | 4 KB | Anti-rollback counter | Prevents downgrade attacks |
| 0x00B000 | 16 KB | NVS (config, calibration) | Persistent across updates |
| 0x00F000 | 448 KB | Application A (current) | Primary firmware slot |
| 0x07F000 | 448 KB | Application B (update target) | Update download slot |
| 0x0EF000 | 68 KB | (Reserved for growth) | Future firmware expansion |
Calculations:
Total allocated:
32 KB (bootloader) + 4 KB (partition table) + 4 KB (OTA select) + 4 KB (anti-rollback) + 16 KB (NVS)
+ 448 KB (App A) + 448 KB (App B) + 68 KB (reserved) = 1,024 KB
Remaining: 1,024 KB - 1,024 KB = 0 KB (fully utilized with reserved block as safety margin)
Firmware growth headroom:
Current: 384 KB -> Allocated: 448 KB -> Headroom: 64 KB (17% growth capacity)
Can accommodate firmware up to 448 KB before needing hardware redesign
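A layout like this is easy to get wrong by one sector, so it can help to derive the addresses from the sizes rather than type them by hand. The sketch below does that for the regions listed above; the region names and the `build_layout` helper are illustrative.

```python
KB = 1024
FLASH_KB = 1024

# (name, size in KB), in flash order, taken from the layout in the text
REGIONS = [
    ("bootloader", 32),
    ("partition_table", 4),
    ("ota_select", 4),
    ("anti_rollback", 4),
    ("nvs", 16),
    ("app_a", 448),
    ("app_b", 448),
    ("reserved", 68),
]


def build_layout(regions):
    """Assign consecutive start addresses; return (layout, total KB used)."""
    layout, offset = [], 0
    for name, size_kb in regions:
        layout.append((name, offset, size_kb))
        offset += size_kb * KB
    return layout, offset // KB


layout, used_kb = build_layout(REGIONS)
for name, start, size_kb in layout:
    print(f"0x{start:06X}  {size_kb:>4} KB  {name}")
print(f"used: {used_kb} KB of {FLASH_KB} KB")

firmware_kb = 384
slot_kb = dict(REGIONS)["app_a"]
headroom_kb = slot_kb - firmware_kb
print(f"headroom: {headroom_kb} KB ({headroom_kb / firmware_kb:.0%})")
```

Running a check like this in the build pipeline catches a layout that silently exceeds the flash size or a firmware image that has outgrown its slot.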
Update Process Flow:
Download Phase:
Bootloader verifies current active partition (A)
Download new firmware to inactive partition (B): 448 KB @ 100 bytes/sec LoRa = 458,752 bytes / 100 bytes/sec, about 4,588 seconds or 1.27 hours
Set the OTA selection flag to boot from B
Reboot device
First Boot from B:
Bootloader verifies signature of partition B
If boot succeeds for 5 minutes: Mark B as “confirmed” in OTA selection, then increment the anti-rollback counter (22 -> 23) in its dedicated flash sector
If boot fails (watchdog timeout, crash): Automatic rollback to A, which is still permitted because the counter has not yet advanced
Flash Wear Leveling Consideration:
Typical flash endurance: 10,000 erase cycles
OTA updates per year: 4 (quarterly security patches)
Years until sector wear-out: 10,000 cycles / 4 updates/year = 2,500 years
Anti-rollback counter writes:
4 updates/year x 10 years = 40 writes
Flash endurance: 10,000 writes -> Safe (99.6% lifespan remaining)
Key Design Decision: Why 448 KB per partition instead of 512 KB?
512 KB x 2 = 1,024 KB leaves 0 KB for bootloader, NVS, and metadata (impossible)
448 KB x 2 = 896 KB leaves 128 KB for critical components (comfortable margin)
17% firmware growth headroom balances future-proofing with current constraints
Real-World Outcome: With this design, the soil sensor deployment achieved 99.7% successful update rate across 5,000 devices over 3 years. The A/B rollback saved 42 devices from bricking when a buggy firmware was deployed (detected crashes, automatic rollback to partition A). The anti-rollback counter prevented 8 attempted downgrade attacks where compromised devices tried to revert to vulnerable firmware versions.
Decision Framework: Choosing OTA Update Strategy
| Criterion | Full Image Update | Delta (Binary Diff) Update | Modular Component Update | Dual-Boot (A/B) Required? |
| --- | --- | --- | --- | --- |
| Bandwidth Usage | High (100% firmware size) | Low (5-30% of full size) | Very Low (single module) | N/A (orthogonal choice) |
| Flash Requirements | 2x firmware size | 1.5x firmware size + patch buffer | 1.2x firmware size | Yes for all strategies |
| Update Complexity | Simple (replace entire image) | Complex (compute diff, apply patch) | Very Complex (dependency management) | Bootloader complexity +40% |
| Rollback Capability | Easy with dual-boot | Difficult (must reverse patch) | Module-specific only | Essential |
| Update Time | Slow (full download) | Fast (small diff) | Very Fast (one module) | N/A |
| Risk of Bricking | Medium (all-or-nothing) | High (partial patch failure) | Low (isolated module) | Dual-boot reduces 90% |
| Best For | First deployment, major version changes | Frequent small updates, bandwidth-limited | Microservices architecture, plugin systems | All production systems |
Decision Tree:
1. Is device flash <512 KB?
   YES -> Use Delta Updates (full image won’t fit in A/B partitions)
   NO -> Continue to step 2
2. Is network bandwidth expensive (cellular, satellite, LoRa)?
   YES -> Use Delta Updates (save on data costs)
   NO -> Continue to step 3
3. Can device tolerate 10+ minute update downtime?
   YES -> Full Image Update acceptable
   NO -> Use Delta Updates (faster)
4. Does firmware have plugin/module architecture?
   YES -> Consider Modular Updates for flexible deployment
   NO -> Use Full Image or Delta
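The decision tree can be written as a small function, which makes the precedence of the questions explicit. This sketch reads a YES at step 3 as "continue to step 4", which is one interpretation of the tree; the function and parameter names are hypothetical.

```python
def choose_update_strategy(flash_kb: int,
                           metered_link: bool,
                           tolerates_long_downtime: bool,
                           modular_firmware: bool) -> str:
    """Return 'delta', 'modular', or 'full', following the four questions in order."""
    if flash_kb < 512:                 # step 1: full image will not fit twice
        return "delta"
    if metered_link:                   # step 2: cellular, satellite, LoRa
        return "delta"
    if not tolerates_long_downtime:    # step 3: needs a fast update
        return "delta"
    if modular_firmware:               # step 4: plugin/module architecture
        return "modular"
    return "full"


# Example: Wi-Fi camera with 4 MB flash and a monolithic firmware image
print(choose_update_strategy(4096, False, True, False))
# Example: LoRa soil sensor with 1 MB flash
print(choose_update_strategy(1024, True, True, False))
```

Whatever the function returns, the dual-boot recommendation still applies on top of it.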
Dual-Boot (A/B) Recommendation: ALWAYS use dual-boot for production devices, regardless of update strategy. The cost (2x flash, 1 day dev time) is negligible compared to bricking prevention.
Result: For a 50,000-device fleet, 48,760 devices (97.52%) update successfully, 1,150 roll back automatically instead of bricking, and 90 require manual intervention.
In practice: Without A/B partitioning and automatic rollback, those 1,150 devices (2.3% of the fleet) would be bricked and need field service ($115,000 at $100/visit). Rollback recovers about 93% of the 1,240 failed updates.
Tesla OTA update system (automotive best practices)
Common Pitfalls
1. Transmitting firmware over HTTP without signing
Sending firmware updates in cleartext without cryptographic signing allows network-positioned attackers to perform a man-in-the-middle attack, replacing the legitimate firmware with malicious code. Always sign firmware and verify the signature on device before applying.
2. Using the same signing key for development and production
Development signing keys stored on developer workstations are high risk for compromise. Use separate, HSM-protected production signing keys with strict access controls, and revoke development keys before devices leave the lab.
3. Not implementing rollback protection
Without rollback protection, an attacker who can send OTA updates can downgrade devices to vulnerable firmware versions for which exploits are known. Implement a hardware monotonic counter that prevents rolling back to versions with lower version numbers.
4. Designing OTA without considering constrained network conditions
IoT devices in the field often have intermittent, low-bandwidth, and high-latency connectivity. OTA designs that assume reliable high-bandwidth connections will fail to update devices in poor network conditions. Design for interrupted downloads, partial updates, and resumable transfers.
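To make the last pitfall concrete, here is a minimal sketch of a resumable, offset-based download that survives dropped connections and checks the full-image hash at the end. `FlakyLink` is a hypothetical test transport, not a real protocol client, and an in-memory buffer stands in for the inactive flash bank.

```python
import hashlib


class FlakyLink:
    """Hypothetical transport that drops the connection on every Nth read."""

    def __init__(self, image: bytes, fail_every: int = 3):
        self.image, self.fail_every, self.calls = image, fail_every, 0

    def read(self, offset: int, length: int) -> bytes:
        self.calls += 1
        if self.calls % self.fail_every == 0:
            raise ConnectionError("link dropped")
        return self.image[offset:offset + length]


def resumable_download(link, total_size, expected_sha256, chunk=64, max_retries=10):
    """Fetch by offset so a dropped link resumes instead of restarting."""
    received = bytearray()      # on a device this would be the inactive bank
    retries = 0
    while len(received) < total_size:
        try:
            received += link.read(len(received), chunk)
        except ConnectionError:
            retries += 1        # keep what we have, ask for the same offset again
            if retries > max_retries:
                raise
    if hashlib.sha256(received).hexdigest() != expected_sha256:
        raise ValueError("image hash mismatch; discard and retry")
    return bytes(received), retries


image = bytes(range(256)) * 4
digest = hashlib.sha256(image).hexdigest()
data, retries = resumable_download(FlakyLink(image), len(image), digest)
print(len(data), "bytes received after", retries, "dropped reads")
```

The hash check here only confirms transfer integrity; the signature check described under pitfall 1 is still required before the image is booted.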
25.14 Summary
This chapter covered OTA update security:
Code Signing: ECDSA/RSA signatures ensure only authenticated firmware executes
A/B Partitioning: Dual-bank schemes enable reliable updates with rollback
Anti-Rollback: Monotonic counters in non-volatile memory prevent downgrade attacks
Staged Rollout: Gradual deployment catches issues before fleet-wide impact
Regulatory Compliance: EU CRA mandates secure update capability for 5+ years
25.15 Knowledge Check
Quiz: OTA Updates and Security
25.16 What’s Next
The next chapter explores Hardware Vulnerabilities including hardware trojans, side-channel attacks, and supply chain security risks that threaten IoT devices at the physical level.