22  Testing & Debugging

22.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Write Unit Tests: Create testable firmware functions using Unity and PlatformIO
  • Debug with Serial Output: Implement structured logging with debug levels
  • Use Hardware Debuggers: Set breakpoints and inspect variables with JTAG/SWD
  • Debug Remotely: Monitor deployed devices via OTA logging and telnet

Testing and debugging are how you find and fix problems in your IoT code before it reaches users. Testing means writing code that checks if your main code works correctly – like having a quality inspector verify every part of your product. Debugging means investigating why code is not working as expected – like being a detective who solves mysteries by gathering clues. IoT debugging is especially challenging because you are working with physical hardware, sensors, and wireless connections, where problems can come from code, electronics, or even environmental factors.

“Debugging IoT devices is like being a detective!” said Max the Microcontroller. “When something goes wrong, you have to gather clues. The first tool is serial output – printing messages to your computer that tell you what the code is doing step by step.”

Sammy the Sensor shared a common problem. “Sometimes I read a sensor value of negative 999. Is the sensor broken? Is the wiring loose? Is the code reading the wrong pin? Serial debug messages at each step help you narrow down where the problem is.”

Lila the LED described a more advanced tool. “A hardware debugger connects to the microcontroller’s JTAG or SWD port. It lets you pause the code at any line, inspect variables, and step through execution one line at a time. It is like watching the code in slow motion.”

“Unit tests catch bugs before they reach the hardware!” added Max. “You write small test functions that verify each piece of code works correctly. Does the temperature conversion formula produce the right output? Does the MQTT message format correctly? Run these tests automatically every time you change code.”

Bella the Battery concluded, “And for deployed devices, use remote logging. Send debug messages to the cloud so you can troubleshoot problems without being physically present. But be careful – too much logging wastes my energy and bandwidth!”

Key Concepts

  • Firmware: Low-level software stored in a device’s non-volatile flash memory that directly controls hardware peripherals.
  • SDK (Software Development Kit): Collection of libraries, tools, and documentation provided by a platform vendor to accelerate application development.
  • RTOS (Real-Time Operating System): Lightweight OS providing task scheduling and timing guarantees for embedded systems with concurrent requirements.
  • Over-the-Air (OTA) Update: Mechanism for delivering new firmware to deployed devices without physical access or a cable connection.
  • Unit Test: Automated test verifying a single function or module in isolation, catching bugs before hardware integration.
  • CI/CD Pipeline: Automated build, test, and deployment workflow that validates firmware quality on every code change.
  • Hardware Abstraction Layer (HAL): Software interface decoupling application code from specific hardware, enabling portability across MCU variants.

22.2 Prerequisites

Before diving into this chapter, you should be familiar with basic microcontroller programming in Arduino-style C/C++ and with serial communication, as covered in earlier chapters.

In 60 Seconds

IoT firmware debugging requires systematic isolation of hardware, firmware, connectivity, and cloud layers because symptoms in one layer often originate in another, making holistic diagnostic tools and skills essential.


22.3 Why Test Embedded Code?

Firmware bugs discovered after deployment are 100-1,000 times more expensive to fix than bugs caught during development. A smart thermostat recall costs $15-50 per unit (shipping, reflashing, return handling), while a unit test catches the same bug for $0. For a 50,000-unit deployment, that difference is $750,000 to $2.5M versus essentially free.

The IoT testing challenge: Unlike web applications where you can deploy a fix in minutes, firmware on deployed devices may be unreachable, require physical access, or brick the device if the OTA update fails. Testing before shipping is your only reliable safety net.

22.4 Unit Testing

22.4.1 Worked Example: Testing a Sensor Calibration Module with Mocks

Scenario: Your soil moisture sensor outputs raw ADC values (0-4095). A calibration function converts these to volumetric water content (0-100%). You need to verify the conversion is correct without connecting real hardware.

Why mock the hardware: Unit tests must run on your development PC in milliseconds, not on a physical ESP32 with a real sensor. Mocking the ADC read function lets you inject known values and verify outputs deterministically.

// sensor_calibration.h - The code under test
#ifndef SENSOR_CALIBRATION_H
#define SENSOR_CALIBRATION_H

// Abstract the hardware read so tests can mock it
extern int (*adc_read_fn)(int channel);

typedef struct {
    int dry_value;    // ADC reading in dry air
    int wet_value;    // ADC reading in water
} CalibrationData;

float raw_to_moisture(int raw_adc, CalibrationData cal);
bool  is_reading_valid(int raw_adc);
float apply_temperature_compensation(float moisture, float temp_c);

#endif
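The header declares the interface; the tests that follow pin down exact behavior. Here is one implementation sketch consistent with those tests – illustrative, not the book's reference code. The typedef is repeated so the sketch is self-contained, and the 0.3%-per-degree compensation coefficient is an assumption chosen to satisfy the temperature tests:

```cpp
// sensor_calibration.cpp - one possible implementation (illustrative sketch)

typedef struct {
    int dry_value;    // ADC reading in dry air
    int wet_value;    // ADC reading in water
} CalibrationData;

float raw_to_moisture(int raw_adc, CalibrationData cal) {
    // Linear map: dry_value -> 0%, wet_value -> 100% (capacitive sensors
    // read HIGHER when dry), clamped to the valid percentage range.
    float span = (float)(cal.dry_value - cal.wet_value);
    float moisture = 100.0f * (float)(cal.dry_value - raw_adc) / span;
    if (moisture < 0.0f)   moisture = 0.0f;
    if (moisture > 100.0f) moisture = 100.0f;
    return moisture;
}

bool is_reading_valid(int raw_adc) {
    return raw_adc >= 0 && raw_adc <= 4095;  // 12-bit ADC range
}

float apply_temperature_compensation(float moisture, float temp_c) {
    // Assumed coefficient: subtract 0.3% per degree above the 25 C
    // calibration temperature (sensors read wetter when hot).
    return moisture - 0.3f * (temp_c - 25.0f);
}
```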

// test_sensor_calibration.cpp - Unit tests with Unity framework
#include <unity.h>
#include "sensor_calibration.h"

static CalibrationData test_cal = { .dry_value = 3800, .wet_value = 1200 };

void test_dry_soil_returns_zero(void) {
    float moisture = raw_to_moisture(3800, test_cal);
    TEST_ASSERT_FLOAT_WITHIN(0.5, 0.0, moisture);
}

void test_saturated_soil_returns_hundred(void) {
    float moisture = raw_to_moisture(1200, test_cal);
    TEST_ASSERT_FLOAT_WITHIN(0.5, 100.0, moisture);
}

void test_midpoint_returns_fifty(void) {
    int midpoint = (3800 + 1200) / 2;  // 2500
    float moisture = raw_to_moisture(midpoint, test_cal);
    TEST_ASSERT_FLOAT_WITHIN(2.0, 50.0, moisture);
}

void test_out_of_range_high_is_invalid(void) {
    TEST_ASSERT_FALSE(is_reading_valid(4096));  // Above 12-bit max
    TEST_ASSERT_FALSE(is_reading_valid(-1));     // Negative
}

void test_temperature_compensation_hot(void) {
    // At 40C, capacitive sensors read 3-5% wetter than actual
    float compensated = apply_temperature_compensation(50.0, 40.0);
    TEST_ASSERT_TRUE(compensated < 50.0);  // Should reduce reading
}

void test_temperature_compensation_normal(void) {
    // At 25C (calibration temp), no adjustment
    float compensated = apply_temperature_compensation(50.0, 25.0);
    TEST_ASSERT_FLOAT_WITHIN(0.1, 50.0, compensated);
}

int main(void) {
    UNITY_BEGIN();
    RUN_TEST(test_dry_soil_returns_zero);
    RUN_TEST(test_saturated_soil_returns_hundred);
    RUN_TEST(test_midpoint_returns_fifty);
    RUN_TEST(test_out_of_range_high_is_invalid);
    RUN_TEST(test_temperature_compensation_hot);
    RUN_TEST(test_temperature_compensation_normal);
    return UNITY_END();
}

Running these tests: pio test -e native executes on your PC in under 1 second. No hardware needed. Run on every commit.
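The adc_read_fn function pointer declared in the header is the mocking seam itself. In this illustrative sketch (mock_adc_read, fake_adc_value, and read_raw_soil are invented names, not part of the chapter's API), the test binary points the seam at a fake that returns whatever value the test scripts:

```cpp
// Mocking through the function-pointer seam (illustrative sketch)
int (*adc_read_fn)(int channel);   // firmware assigns the real ADC driver here

static int fake_adc_value = 0;

static int mock_adc_read(int channel) {
    (void)channel;                 // the fake ignores the channel
    return fake_adc_value;         // return the value the test scripted
}

// Hypothetical production helper that reads through the seam
int read_raw_soil(int channel) {
    return adc_read_fn(channel);
}

void use_in_test(void) {
    adc_read_fn = mock_adc_read;   // swap in the fake before running tests
    fake_adc_value = 2500;         // script a mid-range reading
    // read_raw_soil(0) now returns 2500, deterministically
}
```

Production code never knows the difference: it calls through the pointer either way, so no `#ifdef` clutter is needed in the firmware itself.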

22.4.2 Basic PlatformIO Unit Tests

PlatformIO Unit Tests:

#include <Arduino.h>
#include <unity.h>

void test_temperature_reading() {
    float temp = readTemperature();
    TEST_ASSERT_TRUE(temp > -40.0 && temp < 85.0);
}

void test_sensor_initialization() {
    bool initialized = initSensor();
    TEST_ASSERT_TRUE(initialized);
}

void setup() {
    UNITY_BEGIN();
    RUN_TEST(test_temperature_reading);
    RUN_TEST(test_sensor_initialization);
    UNITY_END();
}

void loop() {
    // Tests run once in setup
}

Test Directory Structure:

test/
├── test_sensor_logic/
│   └── test_main.cpp      # Tests on HOST (your PC)
└── test_embedded/
    └── test_main.cpp      # Tests on TARGET (ESP32)

Running Tests:

# Run tests on host machine (fast, no hardware)
pio test -e native

# Run tests on actual hardware
pio test -e esp32dev

# Run specific test folder
pio test -e native -f test_sensor_logic
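For pio test -e native to work, platformio.ini needs a host environment alongside the board environment. A minimal sketch (the test_ignore patterns assume the directory layout shown above):

```ini
[env:native]
platform = native               ; builds and runs tests on your PC
test_ignore = test_embedded     ; skip on-target tests on the host

[env:esp32dev]
platform = espressif32
board = esp32dev
framework = arduino
test_ignore = test_sensor_logic ; skip host-only tests on the device
```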

Unit Test Coverage ROI: Tests pay for themselves after preventing just 2-3 field bugs. Field debugging costs 10-20x more than development testing due to remote access, limited visibility, and user impact.

22.5 Serial Debugging

Debug Macros:

#define DEBUG 1

#if DEBUG
  #define DEBUG_PRINT(x) Serial.print(x)
  #define DEBUG_PRINTLN(x) Serial.println(x)
#else
  #define DEBUG_PRINT(x)
  #define DEBUG_PRINTLN(x)
#endif

void loop() {
  float temp = readSensor();
  DEBUG_PRINT("Temperature: ");
  DEBUG_PRINTLN(temp);
}

Structured Logging:

enum LogLevel {
  LOG_ERROR,
  LOG_WARN,
  LOG_INFO,
  LOG_DEBUG
};

LogLevel currentLevel = LOG_INFO;

void log(LogLevel level, const char* message) {
  if (level > currentLevel) return;

  const char* levelStr[] = {"ERROR", "WARN", "INFO", "DEBUG"};
  Serial.print("[");
  Serial.print(millis());
  Serial.print("] [");
  Serial.print(levelStr[level]);
  Serial.print("] ");
  Serial.println(message);
}

// Usage
log(LOG_INFO, "Sensor initialized");
log(LOG_ERROR, "Wi-Fi connection failed");
log(LOG_DEBUG, "Temperature: 25.5");

Formatted Logging:

#include <stdarg.h>  // va_list, va_start, va_end

void logf(LogLevel level, const char* format, ...) {
  if (level > currentLevel) return;

  char buffer[256];
  va_list args;
  va_start(args, format);
  vsnprintf(buffer, sizeof(buffer), format, args);
  va_end(args);

  log(level, buffer);
}

// Usage
logf(LOG_INFO, "Temperature: %.2f C, Humidity: %.1f%%", temp, humidity);

22.6 Hardware Debugging

JTAG/SWD Debugging:

  • Set breakpoints in code
  • Step through execution
  • Inspect variables
  • View call stack
  • Supported by professional IDEs (STM32CubeIDE, PlatformIO with debugger)

PlatformIO Debug Configuration:

[env:esp32dev]
platform = espressif32
board = esp32dev
framework = arduino

; Debug configuration
debug_tool = esp-builtin     ; For ESP32-S3/C3
; debug_tool = esp-prog      ; For ESP32 with ESP-PROG
debug_init_break = tbreak setup
debug_speed = 5000
build_type = debug

Common GDB Commands:

# In PlatformIO Debug Console
info registers         # Show CPU registers
print variable_name    # Print variable value
watch variable_name    # Break when variable changes
bt                     # Backtrace (call stack)
list                   # Show source code
continue               # Resume execution

Logic Analyzer Debugging:

  • Capture I2C, SPI, UART communication
  • Verify timing and protocols
  • Identify bus conflicts or errors
  • Decode protocol messages automatically

LED Debugging:

// Blink patterns indicate status
void indicateError(int errorCode) {
  for(int i = 0; i < errorCode; i++) {
    digitalWrite(LED_PIN, HIGH);
    delay(200);
    digitalWrite(LED_PIN, LOW);
    delay(200);
  }
  delay(2000);
}

// Error codes
// 1 blink: Sensor error
// 2 blinks: Wi-Fi error
// 3 blinks: MQTT error
// 4 blinks: OTA error

22.7 Remote Debugging

OTA Logging (MQTT):

// 'client' is assumed to be an MQTT client (e.g., PubSubClient) connected in setup()
void logToCloud(String message) {
  if (WiFi.status() == WL_CONNECTED) {
    client.publish("device/logs", message.c_str());
  }
}

// Usage
logToCloud("Boot complete, firmware v1.2.3");
logToCloud("Sensor error: NaN reading");
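As Bella warned earlier, every published message costs radio time and battery, so deployed devices usually gate cloud logging by severity and rate before sending. A hardware-independent sketch of that gating logic (CloudLogFilter and its parameters are illustrative names; nowMs would come from millis() on a real device):

```cpp
#include <cstdint>

enum CloudLogLevel { CLOUD_ERROR = 0, CLOUD_WARN = 1, CLOUD_INFO = 2 };

// Decides whether a message may be published: it must meet the minimum
// severity AND fit within a fixed budget of messages per time window.
class CloudLogFilter {
public:
    CloudLogFilter(CloudLogLevel minLevel, uint32_t maxPerWindow, uint32_t windowMs)
        : minLevel_(minLevel), maxPerWindow_(maxPerWindow), windowMs_(windowMs) {}

    bool shouldPublish(CloudLogLevel level, uint32_t nowMs) {
        if (level > minLevel_) return false;        // below severity cutoff
        if (nowMs - windowStart_ >= windowMs_) {    // new window: reset budget
            windowStart_ = nowMs;
            sentInWindow_ = 0;
        }
        if (sentInWindow_ >= maxPerWindow_) return false;  // budget spent
        ++sentInWindow_;
        return true;
    }

private:
    CloudLogLevel minLevel_;
    uint32_t maxPerWindow_;
    uint32_t windowMs_;
    uint32_t windowStart_ = 0;
    uint32_t sentInWindow_ = 0;
};
```

A device could then wrap logToCloud() so only messages that pass shouldPublish() are actually sent over MQTT.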

Telnet Debugging:

#include <TelnetStream.h>

void setup() {
  Serial.begin(115200);
  TelnetStream.begin();
}

void loop() {
  // Output goes to both Serial and Telnet
  TelnetStream.println("Debug message over telnet");
  Serial.println("Debug message over serial");

  // Read Telnet commands
  if (TelnetStream.available()) {
    char cmd = TelnetStream.read();
    handleCommand(cmd);
  }
}

Remote Console:

#include <RemoteDebug.h>

RemoteDebug Debug;

void setup() {
  Debug.begin("esp32-device");
  Debug.setResetCmdEnabled(true);
}

void loop() {
  Debug.handle();

  // Debug levels: verbose, debug, info, warning, error
  debugV("Verbose message");
  debugD("Debug message");
  debugI("Info message");
  debugW("Warning message");
  debugE("Error message");
}

22.8 Debugging Tools Comparison

| Tool | Best For | Advantages | Limitations |
| --- | --- | --- | --- |
| Serial.print() | Quick debugging | Simple, universal | Adds timing overhead |
| JTAG/SWD | Complex bugs | Real-time, non-intrusive | Requires hardware probe |
| Logic Analyzer | Protocol issues | Shows timing, decodes protocols | External equipment |
| OTA Logging | Deployed devices | Remote monitoring | Requires connectivity |
| Unit Tests | Regression prevention | Automated, fast | Can’t test hardware interaction |

22.9 Debugging Technique Selection

Figure: Decision flowchart for selecting IoT debugging techniques by problem type – serial output for quick debugging, JTAG/SWD for complex bugs, a logic analyzer for protocol issues, OTA logging for deployed devices, and unit tests for regression prevention.

22.10 Common Pitfalls

Writing tests only for expected inputs and successful outcomes leaves failure modes untested. In IoT firmware, the most common real-world scenarios (connection timeout, sensor read failure, corrupted packet) are exactly the edge cases omitted from happy-path test suites. Explicitly write test cases for every error branch and boundary condition.

Unit tests running on a PC simulator can pass while the same code fails on target hardware due to endianness differences, timer resolution, or peripheral timing constraints. Include at least one hardware-in-the-loop test stage that exercises critical paths on real or emulated target hardware.

High code coverage (>90%) gives a false sense of security if tests only exercise code paths without asserting correct outputs. A test that calls every function without checking return values provides coverage but no quality assurance. Define assertions for every test case and review coverage together with mutation testing results.
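The contrast is easy to show side by side. In this illustrative sketch (the calibration function is a minimal stand-in for the one in 22.4.1), both tests execute the same code path, but only the second would catch a broken formula:

```cpp
#include <cmath>

// Minimal stand-in for the calibration math from 22.4.1:
// linear map from ADC counts to percent, clamped to [0, 100].
float raw_to_moisture(int raw, int dry, int wet) {
    float m = 100.0f * (float)(dry - raw) / (float)(dry - wet);
    return m < 0.0f ? 0.0f : (m > 100.0f ? 100.0f : m);
}

// Coverage without assurance: calls the function, asserts nothing.
// This "test" still passes if the formula is inverted.
bool test_conversion_runs() {
    raw_to_moisture(2000, 3800, 1200);
    return true;
}

// Coverage with assurance: pins the output to the specification.
bool test_conversion_correct() {
    float m = raw_to_moisture(2000, 3800, 1200);
    return std::fabs(m - 69.23f) < 0.5f;   // 1800/2600 = 69.23%
}
```

Both tests count identically toward line coverage; only the second contributes to quality.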

22.11 What’s Next

| If you want to… | Read this |
| --- | --- |
| Apply debugging skills to a full prototype project | Microcontroller Programming Essentials |
| Learn best practices for preventing bugs in the first place | Programming Best Practices |
| Explore the professional tools that speed up debugging | Programming Development Tools |