Error detection ensures IoT data arrives uncorrupted. Simple checksums (sum of bytes) catch basic errors but miss byte-swap and burst corruption. CRC (Cyclic Redundancy Check) uses polynomial division to detect all single-bit, double-bit, and burst errors up to the CRC width – CRC-32 detects all bursts up to 32 consecutive bits. IoT protocols choose between these based on the trade-off: checksums use minimal CPU/memory while CRC provides stronger guarantees for critical sensor data.
Learning Objectives
By the end of this section, you will be able to:
Explain error detection fundamentals: Justify why IoT networks require mechanisms to detect data corruption caused by wireless interference and noise
Compare CRC and checksums: Distinguish between simple checksums and CRC algorithms in terms of error detection capability and computational cost
Calculate CRC values: Apply the polynomial division process to compute CRC check values for given data sequences
Identify error detection limitations: Classify the types of errors each mechanism can and cannot detect, including burst errors and byte swaps
Select appropriate algorithms: Evaluate reliability requirements and resource constraints to choose the correct error detection method for a given IoT deployment
For Beginners: Error Detection
When data travels across a network, bits can get flipped by interference, like static on a phone line garbling words. Error detection methods like CRC and checksums are mathematical tricks that let the receiving device check whether the data arrived intact. Think of it as a self-checking code, like the last digit on a credit card number.
Sensor Squad: Catching Corrupted Data!
“Wireless signals are noisy,” said Sammy the Sensor. “Electromagnetic interference can flip bits during transmission. Without error detection, my temperature reading of 25°C could arrive as 125°C – and nobody would know it was wrong!”
“Simple checksums are like adding up all the digits,” explained Max the Microcontroller. “If I send ‘1, 2, 3’ with a checksum of 6, the receiver adds them up and checks. If they get 6, the data is probably correct. But checksums miss some errors – like if two numbers get swapped.”
“CRC is much more powerful,” added Lila the LED. “It uses polynomial division to create a mathematical fingerprint of the data. CRC-32 can detect ALL single-bit errors, ALL double-bit errors, and ALL burst errors up to 32 bits long. It is the gold standard for data integrity.”
“The trade-off is simple,” said Bella the Battery. “Checksums use minimal CPU and memory – great for tiny sensors sending routine readings. CRC uses more processing power but catches more errors – essential for critical data like medical sensors or industrial controls. Pick based on how important the data is.”
19.1 Prerequisites
Before diving into this chapter, you should be familiar with:
Transport Fundamentals: Understanding TCP vs UDP trade-offs and basic acknowledgment concepts provides essential context for error detection
Reliability Overview: The parent chapter introducing all five pillars of IoT reliability
Binary and Hexadecimal: Familiarity with bitwise operations is essential for understanding checksum calculations
Why Error Detection Matters
Wireless channels are inherently noisy. Electromagnetic interference from motors, microwaves, and other devices can flip bits during transmission. Signal fading in multipath environments corrupts data randomly. Without error detection, your temperature sensor might report 125°C instead of 25°C after just a few bit flips – and your system would have no way to know the data is wrong.
19.2 Error Detection Fundamentals
Time: ~15 min | Level: Intermediate | Unit: P07.REL.U02
Error detection identifies when data has been corrupted during transmission. The sender computes a mathematical function over the data and appends the result (checksum or CRC). The receiver performs the same computation over the received data; if its result differs from the appended value, corruption occurred.
19.2.1 The Error Detection Process
Figure 19.1: Error detection workflow: sender calculates check value, channel may corrupt data, receiver verifies by recalculating.
19.2.2 Types of Transmission Errors
| Error Type | Description | Example | Detection Difficulty |
|---|---|---|---|
| Single bit | One bit flipped | 0x48 becomes 0x49 | Easy |
| Burst error | Multiple consecutive bits flipped | 0x4845 becomes 0x3912 | Moderate |
| Byte swap | Two bytes exchange positions | 0x1234 becomes 0x3412 | Hard for checksums |
| Insertion | Extra byte added | “AB” becomes “ACB” | Moderate |
| Deletion | Byte removed | “ABC” becomes “AC” | Moderate |
19.3 Simple Checksum Algorithms
The simplest approach is to sum all bytes in the message. Several variants exist with different trade-offs.
19.3.1 Sum-of-Bytes Checksum
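#include <stdint.h>
#include <stddef.h>

// 8-bit sum-of-bytes checksum; overflow wraps modulo 256.
uint8_t calculateChecksum(const uint8_t* data, size_t length) {
    uint8_t sum = 0;
    for (size_t i = 0; i < length; i++) {
        sum += data[i];
    }
    return sum;
}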
Advantages:
Extremely fast (single addition per byte)
Minimal code size
Works on any processor
Disadvantages:
Cannot detect byte reordering (addition is commutative)
The Internet checksum, used in IP, TCP, and UDP headers, treats data as 16-bit words and performs one’s complement addition:
uint16_t internetChecksum(const uint16_t* data, size_t wordCount) {
    uint32_t sum = 0;
    for (size_t i = 0; i < wordCount; i++) {
        sum += data[i];
    }
    // Fold 32-bit sum to 16 bits (add carry bits)
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return (uint16_t)~sum;  // One's complement
}
Key property: The one’s complement sum of the data together with its checksum equals 0xFFFF (all ones), so the receiver verifies a packet by summing every word, checksum included, and checking for all ones.
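The verification property can be checked with a few lines of Python. This is a minimal sketch rather than a production implementation: the helper names `ones_complement_sum` and `internet_checksum` and the sample header words are assumptions chosen only for illustration.

```python
def ones_complement_sum(words):
    """One's complement sum of 16-bit words, with end-around carry."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def internet_checksum(words):
    """Checksum is the bitwise complement of the one's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF


# Hypothetical 16-bit header words, for illustration only.
words = [0x4500, 0x0030, 0x4422, 0x4000, 0x8006]
checksum = internet_checksum(words)

# Receiver side: summing the data plus its checksum yields all ones.
verify = ones_complement_sum(words + [checksum])
print(f"checksum=0x{checksum:04X} verify=0x{verify:04X}")
```

Because addition is commutative, reordering the 16-bit words leaves this checksum unchanged, which is the same weakness the sum-of-bytes checksum has.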
Checksum Weakness: Byte Swap
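Simple checksums cannot detect when bytes are transposed, because addition is commutative: the byte sequences [0x12, 0x34] and [0x34, 0x12] both sum to 0x46.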
This is why TCP/IP uses checksums for headers (where errors are unlikely) but CRC for link-layer frames (where bit errors are common).
19.4 Cyclic Redundancy Check (CRC)
CRC provides much stronger error detection by treating the data as a polynomial and dividing by a fixed generator polynomial. The remainder becomes the CRC value.
Worked Example: Checksum vs CRC on a Corrupted Sensor Reading
A temperature sensor sends the 4-byte payload [0x19, 0x00, 0xE8, 0x03] representing temperature 25.0°C encoded as 0x0019 (25 raw units) and humidity 1,000 encoded as 0x03E8, both in little-endian format (low byte first).
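The 8-bit sum-of-bytes checksum of the original payload is 0x19 + 0x00 + 0xE8 + 0x03 = 0x104, which truncates to 0x04. Now suppose the two humidity bytes swap during transmission: [0x19, 0x00, 0x03, 0xE8]. New checksum = 0x19 + 0x00 + 0x03 + 0xE8 = 0x104, truncated to 0x04 – identical!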
The receiver accepts the corrupted data. Humidity reads as 59,395 instead of 1,000 – completely wrong, but the integrity check passes silently.
With CRC-16-CCITT: CRC-16 treats data as a polynomial where bit positions matter. The original and byte-swapped versions produce different CRC values, so the corruption is detected and the receiver rejects the packet.
Lesson: For IoT sensor data where byte-order corruption has real consequences (medical devices, industrial controls), CRC is essential. Simple checksums are acceptable only for non-critical telemetry on reliable links.
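The worked example can be reproduced in Python. The sketch below assumes the 8-bit sum-of-bytes checksum and the CRC-16-CCITT parameters used in this chapter (polynomial 0x1021, initial value 0xFFFF); the function names `sum_checksum` and `crc16_ccitt` are illustrative.

```python
def sum_checksum(data: bytes) -> int:
    """8-bit sum-of-bytes checksum (wraps modulo 256)."""
    return sum(data) & 0xFF


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16-CCITT: polynomial 0x1021, MSB first."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


original = bytes([0x19, 0x00, 0xE8, 0x03])
swapped = bytes([0x19, 0x00, 0x03, 0xE8])  # humidity bytes transposed

for label, payload in (("original", original), ("swapped", swapped)):
    humidity = int.from_bytes(payload[2:4], "little")
    print(f"{label}: humidity={humidity} "
          f"checksum=0x{sum_checksum(payload):02X} "
          f"crc16=0x{crc16_ccitt(payload):04X}")
```

Both payloads print the same checksum (0x04) but different CRC-16 values. The swap alters a span of only 16 bits, which falls within the burst length that a 16-bit CRC is guaranteed to detect.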
19.4.1 How CRC Works Conceptually
Figure 19.2: CRC calculation process: message treated as polynomial, divided by generator polynomial, remainder is the CRC.
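To make the polynomial division concrete, here is a small sketch of modulo-2 long division on bit strings. The 10-bit message and the 5-bit generator 10011 (x^4 + x + 1) are a common textbook-sized illustration, not the parameters of any protocol in this chapter, and `mod2_remainder` is a hypothetical helper name.

```python
def mod2_remainder(bits: str, generator: str) -> str:
    """Remainder of dividing `bits` by `generator` using XOR (modulo-2) arithmetic."""
    degree = len(generator) - 1
    divisor = int(generator, 2)
    remainder = 0
    for bit in bits:
        remainder = (remainder << 1) | int(bit)  # bring down the next bit
        if remainder >> degree:                  # leading bit set: subtract (XOR) the generator
            remainder ^= divisor
    return format(remainder, f"0{degree}b")


message = "1101011011"
generator = "10011"

# Sender: append as many zero bits as the generator's degree, then divide.
crc = mod2_remainder(message + "0000", generator)
print("CRC bits:", crc)

# Receiver: dividing message + CRC leaves a zero remainder when nothing changed.
print("Receiver remainder:", mod2_remainder(message + crc, generator))
```

Real CRC implementations perform the same division with a shift register, processing one bit or one byte at a time instead of holding the whole message as a bit string.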
19.4.2 CRC-16 Implementation
#define CRC16_POLYNOMIAL 0x1021  // CRC-CCITT
#define CRC16_INITIAL    0xFFFF

uint16_t calculateCRC16(const uint8_t* data, size_t length) {
    uint16_t crc = CRC16_INITIAL;
    for (size_t i = 0; i < length; i++) {
        // XOR the next byte into the upper 8 bits of the register
        crc ^= ((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (crc << 1) ^ CRC16_POLYNOMIAL;
            } else {
                crc = crc << 1;
            }
        }
    }
    return crc;
}
Practical Reliability
For CRC-32 with random errors:
- Probability of undetected error = \(2^{-32} \approx 2.3 \times 10^{-10}\), or 0.0000000233% (about 2.3 per 10 billion corrupted messages)
- For 1 billion corrupted messages, expect ~0.23 undetected corruptions
This means you’d expect less than one undetected corruption across a billion corrupted transmissions. CRC-16 provides \(2^{-16} \approx 1.5 \times 10^{-5}\), which would give approximately 15,000 undetected errors per billion corrupted packets – demonstrating why CRC-32 is preferred for critical data.
Because the residual probability is small but never zero, CRC is combined with other mechanisms (sequence numbers, ACKs) for critical applications.
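The figures above follow from a one-line formula, sketched here as a small calculator. It assumes the usual approximation that a randomly corrupted message slips past an n-bit check with probability 2^-n; the function names are illustrative.

```python
def undetected_probability(check_bits: int) -> float:
    """Approximate chance that a randomly corrupted message passes an n-bit check."""
    return 2.0 ** -check_bits


def expected_undetected(check_bits: int, corrupted_messages: float) -> float:
    """Expected number of corrupted messages that pass the check unnoticed."""
    return corrupted_messages * undetected_probability(check_bits)


for bits in (8, 16, 32):
    missed = expected_undetected(bits, 1e9)
    print(f"{bits:2d}-bit check: p={undetected_probability(bits):.3e}, "
          f"expected misses per billion corrupted messages={missed:,.2f}")
```

The count scales with the number of corrupted messages, not the number sent, so a cleaner channel reduces undetected errors in direct proportion.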
While not an IoT case, the 1991 Patriot missile failure illustrates why data integrity matters in every system with constrained arithmetic. The Patriot missile system tracked time using a 24-bit fixed-point register with 0.1-second resolution. After 100 hours of continuous operation, cumulative rounding error reached 0.34 seconds – enough for the system to miscalculate an incoming Scud missile’s position by 687 meters. The missile struck a military barracks, killing 28 soldiers.
The lesson for IoT: Sensors accumulating readings over long periods (smart meters, environmental monitors) face the same class of error. A 16-bit temperature sensor with 0.01°C resolution accumulates rounding drift of 0.006°C per hour – after 30 days, that is 4.3°C of uncorrected drift. CRC detects transmission corruption but cannot detect computation drift; IoT systems need both error detection on the wire AND periodic calibration validation at the application layer.
19.9.2 Toyota Unintended Acceleration (2005-2010)
NASA’s investigation of Toyota’s electronic throttle control system found that single-event upsets (cosmic ray-induced bit flips in SRAM) could corrupt throttle position variables. The investigation identified insufficient software fault tolerance: safety-critical RAM variables lacked robust integrity verification, and the task scheduler had defects that could cause stack overflow, corrupting memory unpredictably. The system’s integrity checks were inadequate to detect the specific bit-flip patterns that could unlock the throttle.
Quantified impact: 89 deaths attributed to unintended acceleration incidents over the investigation period. Remediation required upgrading integrity verification of safety-critical RAM variables using stronger checksums, adding redundant variable storage with cross-checking (store each value twice and compare), and implementing a hardware watchdog timer – demonstrating that application-layer integrity checks are essential even when lower-level hardware appears reliable.
19.9.3 Choosing Error Detection for IoT: A Practical Decision Framework
Cost insight: The difference between “no detection” and “CRC-32 with hardware acceleration” is essentially zero for modern MCUs – STM32L0 series ($0.89, includes CRC peripheral) computes CRC-32 over a 128-byte sensor payload in 2 microseconds. There is no technical or economic reason to skip CRC on any wireless IoT link.
19.10 Choosing the Right Error Detection for Your IoT Protocol
Different IoT protocols use different error detection algorithms, each optimized for specific constraints. Understanding why each protocol made its choice helps you make the right choice for custom implementations.
19.10.1 Error Detection Across the IoT Protocol Stack
| Protocol | Algorithm | Strength | CPU Cost (8-bit MCU) | Why This Choice |
|---|---|---|---|---|
| IEEE 802.15.4 (Zigbee, Thread) | CRC-16 (ITU-T / 0x1021) | All bursts <=16 bits | Hardware offloaded to radio chip | Short frames (127 bytes max), noisy RF channel; radio hardware computes CRC |
| Ethernet | CRC-32 | All bursts <=32 bits | Hardware offloaded | Large frames (1500 bytes), copper/fiber reliability |
If you are designing a custom binary protocol for sensor data (common in industrial IoT), choose your error detection based on three factors:
Factor 1: Bit Error Rate (BER) of the channel
  BER < 10^-6 (wired Ethernet, fiber):
    Simple checksum is adequate. Errors are extremely rare.
  BER 10^-6 to 10^-4 (Wi-Fi, BLE, indoor RF):
    CRC-16 minimum. Burst errors from interference are common.
  BER > 10^-4 (LoRa at SF12, noisy industrial environments):
    CRC-32 plus application-level integrity (HMAC or MIC).
    At BER=10^-3, a 100-byte packet has 55% chance of containing an error.
Factor 2: Consequence of undetected corruption
  Low consequence (ambient temperature, lighting level):
    Checksum-16 is sufficient. Worst case: one bad reading displayed.
  Medium consequence (HVAC control, inventory tracking):
    CRC-16 recommended. Bad data triggers incorrect actions.
  High consequence (medical dosing, industrial valve position):
    CRC-32 + authenticated integrity check (HMAC-SHA256 or AES-CMAC).
    Undetected corruption could cause physical harm.
Factor 3: Available CPU and ROM
  8-bit MCU, <8 KB ROM (ATtiny85):
    Checksum-8 or Fletcher-16 (table-free, ~20 bytes of code)
  8-bit MCU, >8 KB ROM (ATmega328P):
    CRC-16-CCITT with 256-byte lookup table
  32-bit MCU (ESP32, nRF52840):
    CRC-32 with hardware acceleration or 1 KB lookup table
    HMAC-SHA256 via hardware crypto peripheral
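The 55% figure under Factor 1 comes from the probability that at least one of a packet's bits is flipped. A minimal sketch, assuming independent bit errors (real channels often produce bursts, so treat this as a rough estimate) and a hypothetical function name:

```python
def packet_error_probability(ber: float, packet_bytes: int) -> float:
    """Probability that a packet contains at least one bit error,
    assuming each bit is flipped independently with probability `ber`."""
    bits = 8 * packet_bytes
    return 1.0 - (1.0 - ber) ** bits


for ber in (1e-6, 1e-5, 1e-4, 1e-3):
    p = packet_error_probability(ber, 100)
    print(f"BER={ber:.0e}: 100-byte packet error probability = {p:.2%}")
```

At a BER of 10^-3 this gives roughly 55% for a 100-byte packet, while at 10^-6 fewer than 1 packet in 1,000 is affected.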
Practical example: A custom RS-485 Modbus device sending 16-bit pressure readings over a 200 m cable in a factory with welding machines nearby (high EMI):
BER: ~10^-5 (RS-485 with good shielding in EMI environment)
Consequence: Medium (controls a pressure relief valve)
MCU: ATmega328P (8-bit, 32 KB ROM)
Choice: CRC-16 (Modbus already uses CRC-16 in its standard; consistent with protocol spec and adequate for the BER and consequence level)
19.11 How It Works: CRC-16-CCITT Step-by-Step
Understanding the polynomial division behind CRC error detection:
Step 1: Conceptually, append 16 zero bits to the message (the shift-register steps below do this implicitly)
    Message bits: 01001000 01101001 0000000000000000
                  (0x48   )(0x69   )(appended zeros)
Step 2: Initialize CRC register to 0xFFFF
    CRC = 1111111111111111
Step 3: Process first byte (0x48 = 01001000)
    XOR message byte into upper 8 bits of CRC:
    CRC = 1111111111111111 XOR 0100100000000000
        = 1011011111111111
    For each bit (8 times):
        If MSB = 1:
            Shift left, XOR with 0x1021
        Else:
            Shift left only
    Bit 1 (MSB=1): 0110111111111110 XOR 0001000000100001 = 0111111111011111
    Bit 2 (MSB=0): 1111111110111110
    ...
    (Continue for all 8 bits; the register then holds 0x283C)
Step 4: Process second byte (0x69)
    XOR into upper 8 bits, repeat shift-XOR for 8 bits
Step 5: Final CRC value
    CRC = 0x64E5 (after processing all bits with init=0xFFFF)
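The walkthrough can be replayed bit by bit with a short tracing script. This is a sketch under the same assumptions as the steps above (polynomial 0x1021, initial value 0xFFFF, MSB-first processing, message bytes 0x48 0x69); `crc16_ccitt_trace` is an illustrative name.

```python
def crc16_ccitt_trace(data: bytes, init: int = 0xFFFF, verbose: bool = True) -> int:
    """Bitwise CRC-16-CCITT that optionally prints the register after every step."""
    crc = init
    for byte in data:
        crc ^= byte << 8  # XOR the byte into the upper 8 bits
        if verbose:
            print(f"byte 0x{byte:02X}: after XOR  {crc:016b}")
        for bit in range(1, 9):
            msb_set = bool(crc & 0x8000)
            crc = (crc << 1) & 0xFFFF  # shift left, keep 16 bits
            if msb_set:
                crc ^= 0x1021          # subtract the generator (XOR)
            if verbose:
                print(f"  bit {bit} (MSB={int(msb_set)}): {crc:016b}")
    return crc


result = crc16_ccitt_trace(bytes([0x48, 0x69]))
print(f"CRC = 0x{result:04X}")
```

The first two printed bit steps match the Bit 1 and Bit 2 lines in the walkthrough, and the final value is 0x64E5.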
Why It Works:
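Each bit contributes to the remainder according to its position, so the same bits in a different order produce a different CRC
Any single-bit error changes the CRC value
Burst errors up to 16 bits are guaranteed detected
The polynomial 0x1021 (x^16 + x^12 + x^5 + 1) contains the factor (x + 1), so it also detects every error pattern with an odd number of flipped bits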
What to Observe: Byte-swap preserves the sum but changes the CRC. This demonstrates why IoT protocols use CRC for wireless links where byte-order corruption is possible.
Exercise 2: Burst Error Detection Threshold
Find the maximum burst length CRC-32 can detect:
# burst_error_test.py
import random
import zlib

def inject_burst_error(data, burst_start_bit, burst_length):
    """Flip burst_length consecutive bits starting at burst_start_bit."""
    data_bits = bytearray(data)
    for i in range(burst_length):
        bit_pos = burst_start_bit + i
        byte_idx = bit_pos // 8
        bit_idx = bit_pos % 8
        data_bits[byte_idx] ^= (1 << (7 - bit_idx))
    return bytes(data_bits)

def test_burst_detection(burst_length, trials=1000):
    detected = 0
    for _ in range(trials):
        # Generate random 100-byte packet
        original = bytes([random.randint(0, 255) for _ in range(100)])
        original_crc = zlib.crc32(original) & 0xFFFFFFFF
        # Inject burst error at random position
        burst_start = random.randint(0, len(original) * 8 - burst_length)
        corrupted = inject_burst_error(original, burst_start, burst_length)
        corrupted_crc = zlib.crc32(corrupted) & 0xFFFFFFFF
        if original_crc != corrupted_crc:
            detected += 1
    return detected / trials * 100

# Test different burst lengths
for burst_len in [1, 2, 5, 10, 16, 17, 20, 32]:
    detection_rate = test_burst_detection(burst_len)
    status = "GUARANTEED" if burst_len <= 32 else "PROBABLE"
    print(f"Burst length {burst_len:2d} bits: {detection_rate:5.1f}% detected ({status})")
Up to 32 bits: 100% detection (guaranteed by CRC-32 theory)
Beyond 32 bits: Still very high detection rate (probability = 1 - 2^-32)
Exercise 3: Hardware vs Software CRC Performance
Measure the speed difference on ESP32:
// crc_performance_test.ino (ESP32)
#include "esp_rom_crc.h"

uint8_t test_data[1000];

// Software implementation (defined below, declared here for clarity)
uint16_t crc16_ccitt_software(const uint8_t* data, size_t len);

void setup() {
    Serial.begin(115200);
    // Generate random test data
    for (int i = 0; i < 1000; i++) {
        test_data[i] = random(256);
    }
}

void loop() {
    // Software CRC-16 (bitwise)
    unsigned long t1 = micros();
    uint16_t crc_sw = crc16_ccitt_software(test_data, 1000);
    unsigned long t2 = micros();
    Serial.printf("Software CRC-16: 0x%04X in %lu µs\n", crc_sw, t2 - t1);

    // Hardware CRC-16 (ESP32 ROM function)
    unsigned long t3 = micros();
    uint16_t crc_hw = esp_rom_crc16_le(0xFFFF, test_data, 1000);
    unsigned long t4 = micros();
    Serial.printf("Hardware CRC-16: 0x%04X in %lu µs\n", crc_hw, t4 - t3);

    Serial.printf("Speedup: %.1fx\n\n", (float)(t2 - t1) / (t4 - t3));
    delay(5000);
}

// Software implementation (for comparison)
uint16_t crc16_ccitt_software(const uint8_t* data, size_t len) {
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (data[i] << 8);
        for (int j = 0; j < 8; j++) {
            if (crc & 0x8000) {
                crc = (crc << 1) ^ 0x1021;
            } else {
                crc <<= 1;
            }
        }
    }
    return crc;
}
Expected Output (representative timing on ESP32 at 240 MHz):
Software CRC-16: 0xB3F2 in 320 µs
Hardware CRC-16: 0x6D4A in 4 µs
Speedup: 80.0x
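Note: the CRC values differ because the two functions use different CRC-16 conventions for the same 0x1021 polynomial: esp_rom_crc16_le is the little-endian (bit-reflected) variant, while the software implementation processes each byte MSB-first. Their outputs are therefore not expected to match – the key observation is the 80x speedup, not the specific values.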
What to Observe: Hardware CRC is typically 50-100x faster. Always use hardware when available (check MCU datasheet for CRC peripheral).
Common Pitfalls
1. Using CRC-16 for Data Where CRC-32 is Required
CRC-16 has a 1-in-65,536 chance of failing to detect a random error – acceptable for small 64-byte packets but dangerous for 1 KB firmware images. CRC-32 reduces the false-pass probability to 1-in-4,294,967,296. For firmware integrity verification, OTA image validation, and anything larger than 256 bytes, use CRC-32 (polynomial 0x04C11DB7, written as 0xEDB88320 in bit-reflected implementations). Never use simple 8-bit checksums (1-in-256 false-pass) for any security-relevant data integrity check.
2. Calculating CRC Over Wrong Data Boundaries
A common CRC bug: the sender computes the CRC over a buffer that includes the CRC field itself (initialized to 0), writes the result into that field, and transmits. The receiver then computes the CRC over the whole frame, including the now-filled CRC field, and the two values disagree unless the receiver also zeroes the field first. Some CRC variants do produce a constant residue (0 or a specific value) when run over the data plus a correctly appended CRC, but only when the CRC covers the data alone and is appended in the right byte order, and many implementations get this wrong. Clearly define in the protocol specification: “CRC is computed over bytes N–M, not including the CRC field.”
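The residue behaviour can be demonstrated for the CRC-16-CCITT variant used in this chapter (polynomial 0x1021, initial value 0xFFFF, no final XOR). The sketch below is illustrative only: the payload and names are assumptions, and other CRC variants leave a different constant residue or none at all.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16-CCITT: polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


payload = bytes([0x48, 0x69])  # example payload, for illustration only
crc = crc16_ccitt(payload)     # CRC covers the payload only, not the CRC field

# Correct framing for this variant: append the CRC high byte first.
frame = payload + crc.to_bytes(2, "big")
print(f"residue over whole frame: 0x{crc16_ccitt(frame):04X}")

# Common mistake: appending the CRC low byte first breaks the zero residue.
bad_frame = payload + crc.to_bytes(2, "little")
print(f"residue with swapped CRC bytes: 0x{crc16_ccitt(bad_frame):04X}")
```

A receiver that recomputes the CRC over bytes N–M and compares it with the received field sidesteps the residue question entirely, which is why the specification should state the covered range explicitly.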
3. Not Testing CRC With Corrupted Data
Implementing CRC in firmware without testing it with deliberately corrupted data is incomplete. A CRC implementation that always returns 0 or always returns the same value will pass “happy path” tests (uncorrupted data → CRC matches) but fail silently when corruption occurs. Test with: single bit flip (byte XOR 1), byte insertion, byte deletion, and truncation. Verify CRC detection on at least 1,000 randomly corrupted messages.
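A corruption test along these lines might look like the sketch below. It assumes the CRC-16-CCITT parameters from this chapter; the helper names, the 32-byte message size, and the fixed seed are arbitrary choices for illustration. The known-answer check uses the standard "123456789" test string, whose CRC for this variant is 0x29B1.

```python
import random


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16-CCITT: polynomial 0x1021, MSB first."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (bit 0 = MSB of byte 0)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 0x80 >> (bit_index % 8)
    return bytes(corrupted)


def count_misses(trials: int = 1000, seed: int = 1) -> dict:
    """Count corruptions the CRC fails to flag, per corruption type."""
    rng = random.Random(seed)
    misses = {"bit_flip": 0, "insertion": 0, "deletion": 0, "truncation": 0}
    for _ in range(trials):
        message = bytes(rng.randrange(256) for _ in range(32))
        expected = crc16_ccitt(message)
        pos = rng.randrange(len(message))
        corruptions = {
            "bit_flip": flip_bit(message, rng.randrange(len(message) * 8)),
            "insertion": message[:pos] + bytes([rng.randrange(256)]) + message[pos:],
            "deletion": message[:pos] + message[pos + 1:],
            "truncation": message[:-1],
        }
        for kind, corrupted in corruptions.items():
            if crc16_ccitt(corrupted) == expected:
                misses[kind] += 1
    return misses


# Known-answer test: catches an implementation that returns a constant.
assert crc16_ccitt(b"123456789") == 0x29B1
misses = count_misses()
print(misses)
```

Single-bit flips must never be missed. Insertions, deletions, and truncations are caught only with high probability, so an occasional miss in those categories is expected over enough trials and does not by itself indicate a bug.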
4. Assuming CRC Catches All Errors
CRC detects all single-bit errors, all 2-bit errors (within the message lengths the polynomial was designed for), and all burst errors up to n bits long for an n-bit CRC. However, CRC does NOT detect all errors: any error pattern that is itself a multiple of the generator polynomial leaves the CRC unchanged, and a random corruption has roughly a 1/2^n chance of being such a pattern. For cryptographic data integrity (firmware, security keys, certificates), use HMAC-SHA256 or SHA-256 hash verification instead of CRC. CRC is for accidental corruption detection, not tamper detection.
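An undetectable error is easy to construct on purpose, which is exactly why CRC offers no protection against tampering. The sketch below assumes the CRC-16-CCITT variant from this chapter with the CRC appended high byte first. It XORs a shifted copy of the full 17-bit generator (0x11021) into a valid frame; the payload and the shift amount are arbitrary illustrative choices.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16-CCITT: polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


GENERATOR = 0x11021  # x^16 + x^12 + x^5 + 1, including the leading term

payload = bytes([0x48, 0x69])
frame = payload + crc16_ccitt(payload).to_bytes(2, "big")

# Error pattern = generator shifted by 3 bits: four flipped bits spanning 17 bits.
error = GENERATOR << 3
corrupted = (int.from_bytes(frame, "big") ^ error).to_bytes(len(frame), "big")

print("frame    :", frame.hex())
print("corrupted:", corrupted.hex())
print(f"residue of corrupted frame: 0x{crc16_ccitt(corrupted):04X}")
```

The corrupted frame differs from the original yet still leaves a zero residue, so the receiver accepts it. The flipped bits span 17 positions, one more than the 16-bit burst length that CRC-16 is guaranteed to catch.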