52 Error Detection: Checksums and CRC

52.1 Learning Objectives

By the end of this chapter, you will be able to:

Calculate error detection: Compute checksums and understand CRC principles
Compare detection methods: Evaluate trade-offs between checksums and CRC
Choose appropriate methods: Select error detection for different reliability requirements
Debug corrupted packets: Identify error sources using detection mechanisms

Related Chapters

Prerequisites (Read These First):
- Packet Anatomy - Headers, payloads, and trailers fundamentals
- Data Representation - Binary and hexadecimal for calculations

Companion Chapters (Packet Structure Series):
- Packet Structure Overview - Index of all packet structure topics
- Frame Delimiters and Boundaries - How receivers detect packet boundaries
- Protocol Overhead - Header comparison and encapsulation

Security:
- Encryption Architecture - How encryption affects packet structure
- Threats and Attacks - Packet sniffing, replay attacks

52.2 Error Detection: Checksums and CRC

Time: ~9 min | Difficulty: Intermediate | Unit: P02.C02.U03

Problem: Noise, interference, or hardware faults can corrupt data during transmission.

Solution: Add a calculated value in the trailer that the receiver can verify.

52.2.1 Simple Checksum

Add all bytes, take lowest 8 bits:

Payload bytes: [0x45, 0x3F, 0x12]
Checksum = (0x45 + 0x3F + 0x12) & 0xFF = 0x96
Trailer: [0x96]

Pros: Simple, fast

Cons: Weak error detection (misses reordered bytes and any errors whose effects cancel in the sum)
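The calculation can be sketched in a few lines of Python. The function name `checksum8` and the corrupted byte values are illustrative choices, not part of any protocol. The last lines show the weakness in practice by swapping two bytes.

```python
def checksum8(data: bytes) -> int:
    """Sum all bytes and keep only the lowest 8 bits."""
    return sum(data) & 0xFF


payload = bytes([0x45, 0x3F, 0x12])
trailer = checksum8(payload)
print(f"checksum = 0x{trailer:02X}")      # 0x45 + 0x3F + 0x12 = 0x96

# Receiver side: recompute over the received payload and compare with the trailer.
received = bytes([0x44, 0x3F, 0x12])      # first byte corrupted in transit
print(checksum8(received) == trailer)     # False: corruption detected

# Weakness: addition ignores byte order, so swapped bytes go unnoticed.
swapped = bytes([0x3F, 0x45, 0x12])
print(checksum8(swapped) == trailer)      # True: corruption missed
```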

52.2.2 Cyclic Redundancy Check (CRC)

Uses polynomial division for robust error detection:
- CRC-16: 16-bit value, detects all single-bit and double-bit errors and all burst errors up to 16 bits
- CRC-32: 32-bit value, detects all burst errors up to 32 bits and misses only about 1 in 2^32 random corruptions
- Used by: Ethernet, USB, LoRaWAN, Modbus

Example: Ethernet Frame Check Sequence (FCS) uses CRC-32

52.3 How CRC Works

CRC treats the data as a polynomial and divides it by a generator polynomial. The remainder becomes the CRC value:

Data as polynomial: Each bit position represents a coefficient (e.g., 0x45 = x^6 + x^2 + 1)
Generator polynomial: Standardized for each CRC variant (CRC-32 uses 0x04C11DB7)
Division: XOR-based polynomial division (no carry, just XOR)
Remainder: The final remainder is appended as the CRC
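The four steps above can be tried out on a toy scale. The sketch below is a minimal illustration, assuming the 8-bit generator x^8 + x^2 + x + 1 (polynomial 0x07) and plain division with no extras. Production CRCs such as Ethernet's CRC-32 add bit reflection and initial/final XOR values on top of this division. The helper names `poly_terms` and `poly_remainder` are hypothetical.

```python
def poly_terms(value: int) -> str:
    """Render an integer's set bits as polynomial terms, highest power first."""
    powers = [i for i in range(value.bit_length() - 1, -1, -1) if (value >> i) & 1]
    names = {0: "1", 1: "x"}
    return " + ".join(names.get(p, f"x^{p}") for p in powers)


def poly_remainder(value: int, generator: int) -> int:
    """Remainder of GF(2) long division: XOR the generator under the
    leading 1 bit until the value is shorter than the generator."""
    gen_len = generator.bit_length()
    while value.bit_length() >= gen_len:
        value ^= generator << (value.bit_length() - gen_len)
    return value


GENERATOR = 0x107  # x^8 + x^2 + x + 1 (0x07 plus the implicit x^8 term)
WIDTH = 8          # number of CRC bits produced by this generator

data = 0x45
print(poly_terms(data))  # x^6 + x^2 + 1

# Transmitter: shift the data left by WIDTH bits (room for the CRC), then divide.
crc = poly_remainder(data << WIDTH, GENERATOR)
codeword = (data << WIDTH) | crc

# Receiver: dividing the whole codeword leaves remainder 0 when nothing changed.
print(poly_remainder(codeword, GENERATOR) == 0)           # True
print(poly_remainder(codeword ^ 0x0100, GENERATOR) == 0)  # False: one flipped bit
```

Appending the remainder makes the whole codeword divisible by the generator, which is why the receiver can check for a zero remainder instead of comparing two values.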

Why CRC is better than checksum:

Error Type	Checksum	CRC-16	CRC-32
Single bit flip	100%	100%	100%
Two bit flips	Most (misses flips that cancel in the sum)	100%	100%
Transposed bytes	0% (undetected!)	100%	100%
Burst up to 16 bits	Poor	100%	100%
Burst up to 32 bits	Poor	~99.998%	100%
Random errors	~99.6%	~99.998%	~99.99999998%
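A quick experiment makes the bit-flip rows concrete. This is a sketch on one short, arbitrarily chosen 4-byte message, not a proof. It uses Python's standard `zlib.crc32` as the CRC-32, and the helper names are hypothetical. It tries every possible 1-bit and 2-bit error pattern and counts how many leave the check value unchanged.

```python
import zlib
from itertools import combinations


def checksum8(data: bytes) -> int:
    """Simple additive checksum: sum of bytes, lowest 8 bits."""
    return sum(data) & 0xFF


def flip_bits(data: bytes, positions) -> bytes:
    """Return a copy of data with the given bit positions inverted."""
    out = bytearray(data)
    for pos in positions:
        out[pos // 8] ^= 1 << (pos % 8)
    return bytes(out)


def count_missed(check, data: bytes, flips: int) -> int:
    """Count error patterns with `flips` flipped bits that go undetected."""
    reference = check(data)
    total_bits = len(data) * 8
    return sum(
        1
        for positions in combinations(range(total_bits), flips)
        if check(flip_bits(data, positions)) == reference
    )


message = bytes([0x45, 0x3F, 0x12, 0x80])
for flips in (1, 2):
    print(
        f"{flips} bit flip(s):",
        "checksum missed", count_missed(checksum8, message, flips),
        "| CRC-32 missed", count_missed(zlib.crc32, message, flips),
    )
```

The checksum only misses two-bit patterns whose changes cancel, for example the same bit position going 0 to 1 in one byte and 1 to 0 in another.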

Figure: Three-panel comparison of the simple checksum algorithm, the CRC algorithm, and their error detection capabilities. The checksum panel sums the data bytes 0x45, 0x3F, 0x12, masks the result with 0xFF (0x96), and appends it as a 1-byte trailer. The CRC panel divides the data by a generator polynomial and appends the 32-bit remainder as a 4-byte trailer. UDP and ICMP use checksums, while Ethernet, LoRaWAN, and USB use CRC.

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#7F8C8D'}}}%%
sequenceDiagram
    participant TX as Transmitter
    participant CH as Noisy Channel
    participant RX as Receiver

    Note over TX: Original Data: [0x45, 0x3F, 0x12]

    TX->>TX: Calculate CRC-32
    Note over TX: CRC = 0x7F82A3E9

    TX->>CH: Send [Data + CRC]<br/>[0x45, 0x3F, 0x12, 0x7F, 0x82, 0xA3, 0xE9]

    Note over CH: Noise corrupts<br/>0x45 becomes 0x44

    CH->>RX: Receive [0x44, 0x3F, 0x12, 0x7F, 0x82, 0xA3, 0xE9]

    RX->>RX: Recalculate CRC
    Note over RX: Expected: 0x7F82A3E9<br/>Calculated: 0xB2C1D4F3<br/>MISMATCH!

    RX->>TX: NAK - Request retransmit

    TX->>CH: Resend [Data + CRC]
    CH->>RX: Received intact

    RX->>RX: CRC matches!
    Note over RX: Data verified<br/>Accept packet

Figure 52.2: Alternative view: Error Detection in Action - This sequence diagram shows error detection as a conversation over time. The transmitter calculates CRC and sends data. The noisy channel corrupts one byte. The receiver recalculates CRC, finds a mismatch, and requests retransmission. On the second attempt, CRC matches and data is accepted. This timeline view helps students understand that error detection is a process, not just a calculation. {fig-alt=“Sequence diagram showing CRC error detection over time between Transmitter, Noisy Channel, and Receiver. Transmitter starts with data 0x45 0x3F 0x12, calculates CRC-32 producing 0x7F82A3E9, sends combined data plus CRC to channel. Noisy Channel corrupts first byte from 0x45 to 0x44. Receiver gets corrupted packet, recalculates CRC expecting 0x7F82A3E9 but calculating 0xB2C1D4F3, detects MISMATCH. Receiver sends NAK requesting retransmit. Transmitter resends original data. Channel delivers intact this time. Receiver recalculates CRC which now matches, accepts packet as verified.”}
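The same conversation can be sketched in Python using the standard library's `zlib.crc32`. The big-endian trailer order, the `noisy_channel` model, and all function names are assumptions made for illustration. Real protocols define their own CRC variant and byte order.

```python
import zlib

CRC_LEN = 4  # CRC-32 trailer size in bytes


def make_frame(data: bytes) -> bytes:
    """Append the CRC-32 of the data as a 4-byte big-endian trailer (assumed order)."""
    return data + zlib.crc32(data).to_bytes(CRC_LEN, "big")


def frame_is_valid(frame: bytes) -> bool:
    """Recompute the CRC over the data part and compare it with the trailer."""
    data, trailer = frame[:-CRC_LEN], frame[-CRC_LEN:]
    return zlib.crc32(data).to_bytes(CRC_LEN, "big") == trailer


def noisy_channel(frame: bytes, corrupt: bool) -> bytes:
    """Hypothetical channel: optionally flips the lowest bit of the first byte."""
    if not corrupt:
        return frame
    return bytes([frame[0] ^ 0x01]) + frame[1:]


data = bytes([0x45, 0x3F, 0x12])
frame = make_frame(data)

# First attempt is corrupted (0x45 -> 0x44), second attempt arrives intact.
for attempt, corrupt in enumerate([True, False], start=1):
    received = noisy_channel(frame, corrupt)
    if frame_is_valid(received):
        print(f"attempt {attempt}: CRC matches, accept {received[:-CRC_LEN].hex()}")
        break
    print(f"attempt {attempt}: CRC mismatch, send NAK")
```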

52.4 Common CRC Polynomials

CRC Type	Polynomial	Size	Used By
CRC-8	0x07	1 byte	SMBus (packet error checking), ATM (header error control)
CRC-16-CCITT	0x1021	2 bytes	Bluetooth, X.25
CRC-16-Modbus	0x8005	2 bytes	Modbus RTU
CRC-32	0x04C11DB7	4 bytes	Ethernet, USB, Zip
CRC-32C	0x1EDC6F41	4 bytes	iSCSI, SCTP
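A polynomial alone does not fully define a CRC: the initial register value, bit reflection, and final XOR also matter. The sketch below is a generic bit-by-bit implementation. The table gives only polynomials, so the remaining parameters are assumptions taken from commonly published variants (CRC-8/SMBUS, CRC-16/CCITT-FALSE, CRC-16/MODBUS, CRC-32, CRC-32C). A specific protocol may use different settings with the same polynomial.

```python
import zlib


def reflect(value: int, width: int) -> int:
    """Reverse the order of the lowest `width` bits."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def crc(data: bytes, width: int, poly: int, init: int,
        reflected: bool, xor_out: int) -> int:
    """Bit-by-bit CRC for widths of 8 bits or more."""
    top_bit = 1 << (width - 1)
    mask = (1 << width) - 1
    reg = init
    for byte in data:
        if reflected:
            byte = reflect(byte, 8)
        reg ^= byte << (width - 8)
        for _ in range(8):
            reg = ((reg << 1) ^ poly if reg & top_bit else reg << 1) & mask
    if reflected:
        reg = reflect(reg, width)
    return reg ^ xor_out


# name: (width, poly, init, reflected, xor_out); assumed variants, see lead-in
VARIANTS = {
    "CRC-8": (8, 0x07, 0x00, False, 0x00),
    "CRC-16-CCITT": (16, 0x1021, 0xFFFF, False, 0x0000),
    "CRC-16-Modbus": (16, 0x8005, 0xFFFF, True, 0x0000),
    "CRC-32": (32, 0x04C11DB7, 0xFFFFFFFF, True, 0xFFFFFFFF),
    "CRC-32C": (32, 0x1EDC6F41, 0xFFFFFFFF, True, 0xFFFFFFFF),
}

sample = b"123456789"  # conventional test input for CRC check values
for name, params in VARIANTS.items():
    width = params[0]
    print(f"{name:14s} 0x{crc(sample, *params):0{width // 4}X}")

# Cross-check the CRC-32 result against the standard library.
print(crc(sample, *VARIANTS["CRC-32"]) == zlib.crc32(sample))
```

A bit-by-bit loop is easy to follow but slow. Table-driven or hardware-assisted implementations are the usual choice on real devices.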

52.5 Error Detection vs. Error Correction

Error Detection (this chapter): Identifies that an error occurred, triggers retransmission

Error Correction (Forward Error Correction): Fixes errors without retransmission

Approach	Overhead	Latency	Use Case
Detection + Retransmit	Low (2-4 bytes)	Variable	Wi-Fi, TCP, most IoT
FEC (e.g., Reed-Solomon, Hamming)	High (10-30%)	Fixed	Satellite, LoRa physical layer
Hybrid ARQ	Medium	Medium	LTE, 5G

For most IoT applications, error detection with retransmission is preferred because:
1. Errors are rare (< 1% on good links)
2. FEC overhead is expensive for constrained devices
3. Retransmission latency is acceptable for sensor data
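The first two reasons can be put into rough numbers. The sketch below is a back-of-the-envelope model, not a link simulation. It assumes independent packet errors (so the expected number of attempts is 1 / (1 - p)), ignores acknowledgement traffic, and uses a hypothetical 100-byte payload with a 2-byte CRC against FEC at the low end of the 10-30% range.

```python
def arq_bytes(payload: int, crc_len: int, packet_error_rate: float) -> float:
    """Expected bytes sent per delivered packet with detect-and-retransmit.

    Assumes independent packet errors, so the number of attempts is
    geometric with mean 1 / (1 - p). Acknowledgement traffic is ignored.
    """
    return (payload + crc_len) / (1.0 - packet_error_rate)


def fec_bytes(payload: int, overhead: float) -> float:
    """Bytes sent per packet when FEC adds a fixed fraction of redundancy."""
    return payload * (1.0 + overhead)


payload = 100  # hypothetical sensor report size in bytes
for per in (0.001, 0.01, 0.10, 0.30):
    arq = arq_bytes(payload, crc_len=2, packet_error_rate=per)
    fec = fec_bytes(payload, 0.10)
    print(f"packet error rate {per:5.1%}: ARQ {arq:6.1f} B, FEC at 10% {fec:6.1f} B")
```

Under these assumptions retransmission costs fewer bytes while the packet error rate stays low, and the fixed FEC overhead only pays off once a sizeable fraction of packets is lost.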

52.6 Knowledge Check: Error Detection

Knowledge Check: Error Detection Methods Quick Check

Concept: Comparing checksum and CRC error detection.

Which error detection method can detect more types of errors: simple checksum or CRC-16?

C is correct: CRC-16 detects more types of errors. With the standard 16-bit generators, a CRC detects all single-bit errors, all double-bit errors, all errors with an odd number of flipped bits, all burst errors up to 16 bits, and about 99.998% of longer bursts. Simple checksums can miss transposed bytes (0x12 0x34 and 0x34 0x12 have the same sum).

52.7 Scenario-Based Practice

Scenario: Designing a Custom Protocol for Industrial Sensors

Situation: You’re designing a communication protocol for 1,000 pressure sensors in an oil refinery.

Requirements:
- Each sensor sends: Sensor ID (16-bit), pressure (32-bit float), temperature (16-bit), timestamp (32-bit)
- Transmission medium: RS-485 serial bus (noisy industrial environment)
- Message boundaries must be detectable even if the receiver joins mid-transmission
- Critical safety system: undetected errors could cause explosions

Question: Design the packet structure including header, payload, and trailer. Justify your choice of framing method and error detection mechanism.

Question: Which packet design best fits “receiver may join mid-frame” and “safety-critical error detection” on a noisy RS-485 bus?

B. A delimiter helps resynchronize mid-stream, a length field makes framing robust/extendable, and CRC is appropriate for safety-critical integrity checks.

Answer

Recommended Packet Structure:

Field	Size	Value/Purpose
Start Delimiter	2 bytes	`0x55 0xAA` (unique pattern, unlikely in data)
Length	1 byte	Total payload length (12 bytes for this message)
Sensor ID	2 bytes	16-bit sensor identifier
Pressure	4 bytes	32-bit IEEE 754 float
Temperature	2 bytes	16-bit signed integer (degrees C x 100)
Timestamp	4 bytes	32-bit Unix epoch (seconds)
CRC-32	4 bytes	Polynomial: 0x04C11DB7
End Delimiter	1 byte	`0x7E`
Total	20 bytes

Error Detection: CRC-32 (not just checksum)

Why CRC-32 for safety-critical systems:
- Checksum weakness: Can miss errors where bytes are transposed (0x45 0x32 vs 0x32 0x45 have the same sum)
- CRC-32 detects: All single-bit errors, all double-bit errors, and all burst errors up to 32 bits
- Safety margin: For random errors, undetected corruption is about 1 in 4.3 billion (about 1/2^32)

Real-world consideration: Many industrial protocols use shorter CRCs, which are often sufficient: Modbus RTU uses CRC-16 and classic CAN uses a 15-bit CRC. CRC-32 adds 2 bytes of overhead compared with CRC-16 but provides extra safety margin for explosion-risk environments.
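The recommended structure can be sketched with Python's `struct` and `zlib` modules. Several details are assumptions not fixed by the table: big-endian field order, a CRC that covers the length byte plus payload, the CRC-32 variant provided by `zlib.crc32`, and the function names. The parser scans for the start delimiter, so a receiver that joins mid-frame skips the partial bytes and locks onto the next full frame.

```python
import struct
import zlib

START = b"\x55\xAA"
END = b"\x7E"
# Big-endian (assumed): sensor ID uint16, pressure float32, temperature int16, timestamp uint32
PAYLOAD = struct.Struct(">HfhI")


def build_frame(sensor_id: int, pressure: float, temp_c: float, timestamp: int) -> bytes:
    payload = PAYLOAD.pack(sensor_id, pressure, round(temp_c * 100), timestamp)
    body = bytes([len(payload)]) + payload       # length byte + 12-byte payload
    crc = struct.pack(">I", zlib.crc32(body))    # CRC assumed to cover length + payload
    return START + body + crc + END


def parse_stream(stream: bytes):
    """Return the first valid payload tuple found in the stream, or None."""
    pos = stream.find(START)
    while pos != -1:
        length_at = pos + len(START)
        if length_at < len(stream):
            end_at = length_at + 1 + stream[length_at] + 4   # index of end delimiter
            frame_ok = (
                end_at < len(stream)
                and stream[length_at] == PAYLOAD.size
                and stream[end_at:end_at + 1] == END
                and struct.pack(">I", zlib.crc32(stream[length_at:end_at - 4]))
                == stream[end_at - 4:end_at]
            )
            if frame_ok:
                return PAYLOAD.unpack(stream[length_at + 1:end_at - 4])
        pos = stream.find(START, pos + 1)   # false delimiter or bad CRC: keep looking
    return None


frame = build_frame(sensor_id=417, pressure=101.325, temp_c=25.0, timestamp=1_700_000_000)
print(len(frame))  # 20

tail_of_previous = b"\x13\x37\x7E"   # receiver joined in the middle of an earlier frame
print(parse_stream(tail_of_previous + frame))

corrupted = bytearray(frame)
corrupted[5] ^= 0x01                 # flip one payload bit
print(parse_stream(bytes(corrupted)))  # None: CRC rejects the frame
```

The delimiter alone is not trusted: a 0x55 0xAA pair can still occur inside data, so a candidate frame is accepted only when the length, end delimiter, and CRC all agree.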

Scenario: Debugging a Corrupted Packet

Situation: Your smart home gateway received this hex dump from a Zigbee temperature sensor, but the reading seems wrong (showing 500C instead of expected 25C):

61 04 00 08 02 01 F4 01 48 2A

The expected packet format is:
- Frame Control: 2 bytes
- Sequence: 1 byte
- Cluster ID: 2 bytes
- Attribute ID: 2 bytes
- Data Type: 1 byte
- Value: 2 bytes (temperature x 100, little-endian)

Question: Parse the packet byte-by-byte and identify where the error might be. What temperature does the packet actually encode?

Question: If the value field is 48 2A and encoded as little-endian temperature x 100 (int16), what temperature does it represent?

C. Little-endian 48 2A becomes 0x2A48 = 10824, then divide by 100 gives 108.24C.

Answer

Byte-by-Byte Parsing:

Position	Hex	Field	Interpretation
0-1	`61 04`	Frame Control	0x0461 (ZCL Global, Server to Client)
2	`00`	Sequence	Message #0
3-4	`08 02`	Cluster ID	0x0208? Should be 0x0402 (Temperature)!
5-6	`01 F4`	Attribute ID	0xF401? Should be 0x0000 (MeasuredValue)
7	`01`	Data Type	Likely wrong position
8-9	`48 2A`	Value	0x2A48 = 10,824 -> 108.24C?

The actual temperature value:

Looking at bytes 8-9: 48 2A
- Little-endian: 0x2A48 = 10,824
- As signed int16: 10,824 / 100 = 108.24C (still wrong!)

Root cause found:

The bytes should be: C4 09 for 25.0C (2500 = 0x09C4, sent little-endian)

But we have: 48 2A (0x2A48 = 10824 = 108.24C)

Likely causes:
1. Sensor malfunction - reading garbage
2. Byte corruption - multiple bit errors in transmission (0x2A48 and 0x09C4 differ in 6 bit positions, so a single bit flip cannot explain it)
3. Wrong sensor type - maybe it’s humidity (0-100%) encoded differently

Debug steps:
1. Check CRC/FCS (not shown in dump) - was it valid?
2. Request retransmission
3. Check sensor wiring and calibration
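The byte-by-byte parsing can be reproduced with Python's `struct` module. This sketch assumes every multi-byte field is little-endian (the format only states this for the value field) and that the dump contains exactly the six listed fields with no FCS.

```python
import struct

dump = bytes.fromhex("61 04 00 08 02 01 F4 01 48 2A")

# Little-endian layout from the expected packet format:
# frame control (uint16), sequence (uint8), cluster ID (uint16),
# attribute ID (uint16), data type (uint8), value (int16)
frame_control, sequence, cluster_id, attribute_id, data_type, raw_value = struct.unpack(
    "<HBHHBh", dump
)

print(f"cluster ID   0x{cluster_id:04X}")        # 0x0208
print(f"attribute ID 0x{attribute_id:04X}")      # 0xF401
print(f"temperature  {raw_value / 100:.2f} C")   # 108.24 C

# How far is the received value from the expected 25.00 C reading (2500 = 0x09C4)?
expected = 2500
differing_bits = bin(raw_value ^ expected).count("1")
print(f"{differing_bits} bits differ")           # 6 bits differ
print(struct.pack("<h", expected).hex(" "))      # c4 09: expected bytes on the wire
```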

52.8 Summary

Error detection ensures data integrity across noisy networks:

Checksums: Simple addition-based method, fast but weak detection
CRC: Polynomial-based method; an n-bit CRC catches all burst errors up to n bits and misses only about 1 in 2^n random corruptions
CRC-16/CRC-32: Standard choices for IoT protocols
Trade-offs: More robust detection requires more computation and bytes

Key Takeaways:
- CRC is much more reliable than simple checksums
- Checksums can miss transposed bytes that CRC catches
- Safety-critical systems should use CRC-32 or better
- Error detection enables retransmission; error correction avoids it

52.9 What’s Next

Continue exploring packet structure with the final chapter in this series:

Protocol Overhead - Comparing header sizes and encapsulation across protocols
