24  Software Best Practices

24.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Organize Code Effectively: Structure firmware into modular, maintainable components
  • Manage Configuration: Separate secrets from code and enable runtime configuration
  • Optimize Power Consumption: Implement sleep modes and duty cycling
  • Handle Errors Gracefully: Build robust error recovery and watchdog protection
  • Avoid Common Mistakes: Recognize and prevent the most frequent IoT firmware bugs

This hands-on chapter gives you practical prototyping experience with real (or simulated) IoT hardware. Think of it as workshop time – reading about prototyping is useful, but the real learning happens when you pick up the tools and start building. Each exercise builds skills you will use in your own IoT projects.

“Let me share the mistakes I have made so you do not repeat them!” said Max the Microcontroller with a sigh. “Rule one: organize your code into modules. Sensor code in one file, communication in another, configuration in a third. When everything is in one giant file, debugging becomes a nightmare.”

Sammy the Sensor shared his tip. “Never hard-code your Wi-Fi password or API keys in the firmware! Put them in a separate configuration file or use environment variables. If you accidentally share your code, you do not want secrets exposed.” Lila the LED added, “And always add logging! Print debug messages with severity levels – INFO for normal events, WARNING for concerning things, and ERROR for failures. When something breaks at 3 AM, those log messages are your only clue.”

Bella the Battery had the most important advice. “Use sleep modes! After reading a sensor and sending data, put the microcontroller into deep sleep until the next reading is due. I have seen projects waste 100 times more power than necessary just because the code forgot to sleep. And always set up a watchdog timer – if the code crashes or hangs, the watchdog automatically reboots the device. Without it, a frozen sensor stays frozen forever!”

Key Concepts

  • Microcontroller Unit (MCU): Integrated circuit combining CPU, RAM, flash, and peripherals optimised for embedded control applications.
  • Microprocessor Unit (MPU): High-performance processor requiring external RAM, storage, and peripherals, used in Linux-based IoT devices like Raspberry Pi.
  • Schematic: Electrical diagram showing component connections using standardised symbols, used to guide PCB layout.
  • PCB (Printed Circuit Board): Fiberglass substrate with etched copper traces connecting electronic components into a permanent assembly.
  • ESD Protection: Diodes and resistors protecting sensitive IC pins from electrostatic discharge during handling and in-field use.
  • Decoupling Capacitor: Small capacitor placed close to IC power pins to suppress high-frequency noise on the supply rail.
  • Design Rule Check (DRC): Automated PCB verification ensuring trace widths, clearances, and drill sizes meet the fabrication process constraints.

24.2 Prerequisites

Before diving into this chapter, you should be familiar with:


24.3 Code Organization

Modular Structure:

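// sensors.h
#ifndef SENSORS_H
#define SENSORS_H

bool initSensors();  // Returns false if the sensor is not detected
float readTemperature();
float readHumidity();

#endif

// sensors.cpp
#include "sensors.h"
#include <Adafruit_BME280.h>

static Adafruit_BME280 bme;  // Private to this module

bool initSensors() {
  return bme.begin(0x76);  // I2C address 0x76
}

float readTemperature() {
  return bme.readTemperature();
}

float readHumidity() {
  return bme.readHumidity();
}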

Benefits:

  • Reusable components
  • Easier testing
  • Clearer dependencies
  • Maintainable codebase

24.4 Configuration Management

Centralized Configuration:

// config.h
#ifndef CONFIG_H
#define CONFIG_H

#include "secrets.h"  // WIFI_PASSWORD, API_KEY (kept out of version control)

// Wi-Fi Configuration
#define WIFI_SSID "YourSSID"

// MQTT Configuration
#define MQTT_SERVER "mqtt.example.com"
#define MQTT_PORT 1883  // Plain MQTT; 8883 is the usual port for MQTT over TLS

// Sensor Configuration
#define SENSOR_READ_INTERVAL 60000  // ms
#define TEMPERATURE_OFFSET 0.0

#endif

Secrets Management:

// secrets.h (add to .gitignore)
#define WIFI_PASSWORD "actual_password"
#define API_KEY "actual_api_key"

// secrets.h.template (commit to repo)
#define WIFI_PASSWORD "your_password_here"
#define API_KEY "your_api_key_here"

Runtime Configuration (EEPROM/NVS):

#include <Preferences.h>

Preferences prefs;
String wifiSsid;      // Filled in by loadConfig()
String wifiPassword;

void loadConfig() {
  prefs.begin("config", true);  // Read-only
  wifiSsid = prefs.getString("wifi_ssid", "");
  wifiPassword = prefs.getString("wifi_pass", "");
  prefs.end();
}

void saveConfig(const String& ssid, const String& password) {
  prefs.begin("config", false);  // Read-write
  prefs.putString("wifi_ssid", ssid);
  prefs.putString("wifi_pass", password);
  prefs.end();
}

24.5 Power Management

Sleep Modes:

#include <esp_sleep.h>

void enterDeepSleep(uint32_t seconds) {
  esp_sleep_enable_timer_wakeup(seconds * 1000000ULL);  // Argument is in microseconds
  esp_deep_sleep_start();  // Does not return
}

void loop() {
  readSensorsAndSend();
  // Sleep 5 minutes; on wake the chip restarts and runs setup() again
  enterDeepSleep(300);
}

Peripheral Power Down:

void powerDown() {
  WiFi.disconnect(true);
  WiFi.mode(WIFI_OFF);
  btStop();
  // Disable unused peripherals
}

Deep Sleep Energy Savings Calculation: ESP32 environmental sensor with a 2000 mAh battery.

Duty cycle breakdown (60 s interval). The mAs and mAh figures are charge drawn from the battery, which is what a capacity rating in mAh measures:

\[E_{\text{cycle}} = (t_{\text{wake}} \times I_{\text{active}}) + (t_{\text{sleep}} \times I_{\text{sleep}})\]

\[E_{\text{cycle}} = (0.3\,\text{s} \times 100\,\text{mA}) + (59.7\,\text{s} \times 0.01\,\text{mA}) = 30\,\text{mAs} + 0.597\,\text{mAs} = 30.597\,\text{mAs}\]

Converting to mAh:

\[E_{\text{cycle}} = \frac{30.597\,\text{mAs}}{3600\,\text{s/h}} = 0.00850\,\text{mAh}\]

Average current draw:

\[I_{\text{avg}} = \frac{0.00850\,\text{mAh}}{(60/3600)\,\text{h}} = \frac{0.00850\,\text{mAh}}{(1/60)\,\text{h}} = 0.51\,\text{mA}\]

Battery lifetime:

\[\text{Days} = \frac{2000\,\text{mAh}}{0.51\,\text{mA} \times 24\,\text{h/day}} = \frac{2000}{12.24} \approx 163 \text{ days}\]

Without deep sleep (continuous 25 mA): 2000 mAh / (25 mA × 24 h/day) ≈ 3.3 days. Deep sleep provides roughly a 49x improvement.
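
The same arithmetic can be wrapped in two small helpers so that different wake times and sleep currents are easy to compare. This is a minimal sketch in plain C++; the function names `averageCurrentMilliamps` and `batteryLifeDays` are illustrative, not from any library, and the model ignores battery self-discharge and regulator losses.

```cpp
// Average current (mA) over one wake/sleep cycle, weighted by time in each state.
double averageCurrentMilliamps(double wakeSeconds, double activeMilliamps,
                               double sleepSeconds, double sleepMilliamps) {
    const double chargePerCycle = wakeSeconds * activeMilliamps
                                + sleepSeconds * sleepMilliamps;  // mA*s
    return chargePerCycle / (wakeSeconds + sleepSeconds);
}

// Idealised battery lifetime in days for a given capacity and average current.
double batteryLifeDays(double capacityMilliampHours, double averageMilliamps) {
    return capacityMilliampHours / (averageMilliamps * 24.0);
}
```

With the figures above, `averageCurrentMilliamps(0.3, 100.0, 59.7, 0.01)` gives about 0.51 mA, and feeding that into `batteryLifeDays(2000.0, ...)` gives about 163 days.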

Test with Power Profiling Tools Early

A critical oversight is developing firmware without measuring actual power consumption until deployment. Prototype testing on USB power masks the reality of battery operation.

Example: A “low power” environmental sensor prototype ran fine on USB for months, but deployed units with 2000 mAh batteries lasted only 3 days instead of the projected 6 months. Wi-Fi was not sleeping properly and kept consuming 80 mA continuously.

Solution: Use power profilers (Nordic Power Profiler Kit, Joulescope, or simple INA219 breakout) during development. Measure current in all states: active, idle, sleep, transmission.

24.6 Error Handling

Graceful Degradation:

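const int MAX_RETRIES = 3;

// Uses the sketch's global MQTT `client`, `topic`, and `data`
bool sendData() {
  for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
    if (WiFi.status() == WL_CONNECTED && client.publish(topic, data)) {
      return true;
    }
    if (attempt < MAX_RETRIES) {
      delay(1000);  // Wait before the next attempt
    }
  }
  // All attempts failed: store data locally for later transmission
  storeDataLocally(data);
  return false;
}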

Watchdog Timer:

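#include <esp_task_wdt.h>

void setup() {
  // Signature for Arduino-ESP32 2.x (ESP-IDF 4.x). Core 3.x (ESP-IDF 5.x)
  // takes a pointer to an esp_task_wdt_config_t instead.
  esp_task_wdt_init(30, true);  // 30 second watchdog, reset on expiry
  esp_task_wdt_add(NULL);       // Watch the current task
}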

void loop() {
  // Feed watchdog to prevent reset
  esp_task_wdt_reset();

  // Your code
  doWork();
}

24.7 Memory Management

Avoid Memory Leaks:

// Bad: Memory leak
void loop() {
  char* buffer = new char[1024];
  processData(buffer);
  // Missing: delete[] buffer;
}

// Good: Proper cleanup
void loop() {
  char* buffer = new char[1024];
  processData(buffer);
  delete[] buffer;
}

// Better: Stack allocation
void loop() {
  char buffer[1024];
  processData(buffer);
  // Automatically freed
}

Monitor Memory Usage:

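void printMemoryUsage() {
  Serial.print("Free heap: ");
  Serial.println(ESP.getFreeHeap());
  // ESP.getHeapFragmentation() exists only in the ESP8266 core. On ESP32, a
  // largest free block much smaller than the free heap indicates fragmentation.
  Serial.print("Largest free block: ");
  Serial.println(ESP.getMaxAllocHeap());
}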

24.8 Common Pitfalls and Solutions

24.8.1 Pitfall 1: Blocking Code

Problem:

// BAD: This blocks for 5 seconds!
void loop() {
  delay(5000);  // Device can't respond to anything
  readSensor();
}

Solution:

// GOOD: Non-blocking delay
unsigned long lastRead = 0;
void loop() {
  if (millis() - lastRead >= 5000) {
    readSensor();
    lastRead = millis();
  }
  // Can do other things while "waiting"
}
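
The subtraction form `millis() - lastRead >= 5000` used in the solution keeps working when the 32-bit millisecond counter wraps around (after roughly 49.7 days), because unsigned subtraction is performed modulo 2^32. The sketch below isolates that check in a plain C++ helper so it can be exercised off-device; `intervalElapsed` is an illustrative name, not an Arduino function.

```cpp
#include <cstdint>

// True when at least intervalMs has passed since lastMs, even if the
// 32-bit millisecond counter wrapped around between the two readings.
bool intervalElapsed(uint32_t nowMs, uint32_t lastMs, uint32_t intervalMs) {
    // Unsigned subtraction wraps modulo 2^32, so the difference is the true
    // elapsed time as long as less than about 49.7 days have passed.
    return static_cast<uint32_t>(nowMs - lastMs) >= intervalMs;
}
```

By contrast, a comparison written as `millis() >= lastRead + 5000` can misfire near the wrap point, which is why the subtraction form is preferred.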

24.8.2 Pitfall 2: No Error Handling

Problem:

// BAD: Assumes everything always works
void loop() {
  float temp = dht.readTemperature();
  sendToCloud(temp);
}

Solution:

// GOOD: Always check return values
void loop() {
  float temp = dht.readTemperature();
  if (isnan(temp)) {
    Serial.println("ERROR: Sensor read failed!");
    return;
  }

  if (WiFi.status() != WL_CONNECTED) {
    saveToLocalStorage(temp);
    return;
  }

  if (!sendToCloud(temp)) {
    retryQueue.add(temp);
  }
}
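
The `retryQueue` object in the solution is not defined in this chapter. One possible shape for it, assuming readings are plain `float` values and that losing the oldest reading is acceptable when the buffer is full, is a fixed-capacity ring buffer that never touches the heap. The class name `RetryQueue` and its `pop` and `size` methods are hypothetical.

```cpp
#include <cstddef>

// Fixed-capacity FIFO of readings with no heap allocation.
// When the queue is full, adding a value overwrites the oldest one.
template <std::size_t N>
class RetryQueue {
public:
    void add(float value) {
        buf_[(head_ + count_) % N] = value;
        if (count_ < N) {
            ++count_;
        } else {
            head_ = (head_ + 1) % N;  // Full: the oldest reading was overwritten
        }
    }

    // Removes the oldest reading into `out`; returns false if the queue is empty.
    bool pop(float& out) {
        if (count_ == 0) {
            return false;
        }
        out = buf_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    float buf_[N] = {};
    std::size_t head_ = 0;   // Index of the oldest reading
    std::size_t count_ = 0;  // Number of stored readings
};
```

A sketch would declare something like `RetryQueue<32> retryQueue;` and drain it with `pop()` once the connection is back. The contents are lost on reset or deep sleep unless they are also written to flash or RTC memory.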

24.8.3 Pitfall 3: Interrupt Safety

Problem:

// BAD: Race condition!
int counter = 0;

void IRAM_ATTR buttonISR() {
  counter++;
}

void loop() {
  Serial.println(counter);
}

Solution:

// GOOD: Volatile and atomic access
volatile int counter = 0;

void IRAM_ATTR buttonISR() {
  counter++;
}

void loop() {
  noInterrupts();
  int localCounter = counter;
  interrupts();
  Serial.println(localCounter);
}
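
An alternative to disabling interrupts around the read is to make the counter itself atomic. The following is a sketch of that different technique using C++11 `std::atomic`, not the method shown above; whether `std::atomic<int>` is lock-free and safe inside an ISR depends on the target and toolchain, so treat that as an assumption to verify. The names `pressCount`, `onButtonPress`, and `takePressCount` are illustrative.

```cpp
#include <atomic>

std::atomic<int> pressCount{0};

// Body of the interrupt handler: a single atomic increment.
void onButtonPress() {
    pressCount.fetch_add(1, std::memory_order_relaxed);
}

// Called from the main loop: returns the presses seen since the last call
// and resets the counter in one atomic step, so no press is lost or counted twice.
int takePressCount() {
    return pressCount.exchange(0, std::memory_order_relaxed);
}
```

Reading and resetting with a single `exchange` avoids the window between a separate read and a separate write in which an interrupt could slip in.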

24.8.4 Pitfall 4: No Watchdog Timer

Problem:

// Prototype runs fine, then hangs forever
void loop() {
  readSensor();
  sendToCloud();  // If this hangs, device freezes!
}

Real consequence: 500 environmental sensors were deployed in a remote rainforest, and 100 of them froze within the first week. With no easy physical access to restart them, technicians had to be sent on a 6-hour hike ($15,000 cost).

Solution:

#include <esp_task_wdt.h>

void setup() {
  esp_task_wdt_init(30, true);  // 30-second watchdog
  esp_task_wdt_add(NULL);
}

void loop() {
  esp_task_wdt_reset();  // Pet the dog every loop
  readSensor();
  sendToCloud();
  // If loop hangs for 30+ seconds, watchdog resets device
}

24.9 Requirements Pitfalls

Pitfall: Requirements Scope Creep

The mistake: Continuously adding features during prototyping without re-evaluating timeline, budget, or feasibility.

Symptoms:

  • Feature list grows after every stakeholder meeting
  • Original 3-month timeline becomes 9 months
  • Team working on multiple half-finished features simultaneously

The fix:

  • Define MVP before prototyping starts
  • Create a “feature parking lot” for post-MVP ideas
  • Require trade-off analysis: “What do we cut to add this?”
  • Use timeboxing: fixed end dates, not feature gates

Pitfall: Ignoring Non-Functional Requirements

The mistake: Focusing entirely on features while ignoring security, power consumption, and reliability.

Symptoms:

  • Prototype sends data over unencrypted HTTP
  • Battery life measured in hours instead of months
  • Device crashes after 24 hours due to memory leaks
  • No OTA update mechanism

The fix:

  • Security: Use TLS from the first prototype
  • Power: Measure current consumption weekly
  • Reliability: Implement watchdog timers in prototype code
  • Maintainability: Design OTA mechanism before field deployment

24.10 Summary Table

| Mistake | Symptom | Fix |
|---|---|---|
| Blocking code | Device unresponsive | Use millis() instead of delay() |
| No error handling | Random crashes | Check all return values |
| Hardcoded credentials | Can’t deploy to different sites | Store in EEPROM/NVS, use config portal (see detailed callout below) |
| Interrupt safety issues | Race conditions, corrupted data | Use volatile, atomic access with noInterrupts() |
| Memory leaks | Crashes after hours/days | Free allocated memory, use stack allocation |
| Not testing edge cases | Field failures | Test Wi-Fi drops, sensor errors, broker timeouts |
| No watchdog timer | Freezes require manual restart | Enable hardware watchdog, feed regularly |

Golden Rule: Every “I’ll fix it later” in your prototype becomes a $10,000 bug in production. Fix it NOW while it’s easy!

24.11 Knowledge Check

Scenario: An ESP32 weather station randomly resets every 2-4 hours in the field, but never during bench testing. Serial logs show: rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT) — an RTC watchdog reset. How do you diagnose and fix this?

Step 1: Enable Detailed Logging

Add structured logging with timestamps and severity levels:

#define LOG_ERROR(msg) Serial.printf("[%lu] ERROR: %s\n", millis(), msg)
#define LOG_WARN(msg)  Serial.printf("[%lu] WARN: %s\n", millis(), msg)
#define LOG_INFO(msg)  Serial.printf("[%lu] INFO: %s\n", millis(), msg)

void loop() {
  LOG_INFO("Reading sensor...");
  float temp = readSensor();

  LOG_INFO("Connecting to Wi-Fi...");
  if (!connectWiFi()) {
    LOG_ERROR("Wi-Fi connection failed!");
    return;
  }

  LOG_INFO("Sending to MQTT...");
  if (!mqtt.publish("temp", String(temp).c_str())) {
    LOG_ERROR("MQTT publish failed!");
  }

  LOG_INFO("Entering deep sleep...");
  esp_deep_sleep_start();
}

Step 2: Analyze the Logs

After 3 hours in the field, the device resets. Last logs before reset:

[10234567] INFO: Reading sensor...
[10234589] INFO: Connecting to Wi-Fi...
[10234612] INFO: Wi-Fi connected
[10234615] INFO: Sending to MQTT...
[RTC WDT RESET]

Finding: Reset happens during MQTT publish, not during sensor reading or Wi-Fi connection. This suggests the MQTT publish is blocking for too long (>10 seconds), triggering the RTC watchdog.

Step 3: Reproduce the Issue

Hypothesis: MQTT broker becomes unreachable (network partition), causing publish to block indefinitely.

Test: Disconnect the MQTT broker (unplug Ethernet cable) and watch the device behavior. After 10-15 seconds, device resets with same error.

Root cause: The mqtt.publish() call has no timeout. When the broker is unreachable, the underlying TCP socket blocks for 15+ seconds waiting for a response, triggering the watchdog.

Step 4: Apply the Fix

Add timeout to MQTT publish:

bool publishWithTimeout(const char* topic, const char* payload, unsigned long timeoutMs) {
  unsigned long startTime = millis();

  // Set socket timeout (converted from milliseconds to whole seconds)
  mqtt.setSocketTimeout(timeoutMs / 1000);

  // Attempt publish
  bool success = mqtt.publish(topic, payload);

  // Check if we exceeded timeout
  if (millis() - startTime > timeoutMs) {
    LOG_WARN("MQTT publish timed out!");
    mqtt.disconnect();  // Force disconnect hung connection
    return false;
  }

  return success;
}

void loop() {
  float temp = readSensor();

  LOG_INFO("Sending to MQTT...");
  if (!publishWithTimeout("temp", String(temp).c_str(), 5000)) {
    LOG_ERROR("MQTT publish failed or timed out");
    // Store data locally for retry on next cycle
    storeToSD(temp);
  }
}

Step 5: Add Watchdog Timer Management

Feed the watchdog before any long-running operation:

#include <esp_task_wdt.h>

void setup() {
  esp_task_wdt_init(30, true);  // 30-second watchdog
  esp_task_wdt_add(NULL);  // Subscribe current task
}

void loop() {
  esp_task_wdt_reset();  // Pet the dog

  float temp = readSensor();  // Keep the reading so it can be published below
  esp_task_wdt_reset();  // Pet again before network ops

  connectWiFi();
  esp_task_wdt_reset();

  publishWithTimeout("temp", String(temp).c_str(), 5000);
  esp_task_wdt_reset();

  esp_deep_sleep_start();
}

Step 6: Field Test

Deployed with fix: 7 days, zero resets. Problem solved!

Lessons learned:

  1. Structured logging is essential for debugging intermittent field issues
  2. Put timeouts on all blocking operations (network, SD card, sensor reads)
  3. Watchdog timers catch hangs, but you must feed them appropriately
  4. Test failure scenarios (unplug broker, bad sensor, no SD card) before deployment
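
Lessons 2 and 4 combine naturally: if the waiting logic takes its clock as a parameter, the timeout path can be tested on a development machine without unplugging anything. The helper below is a minimal sketch under that assumption; `waitWithTimeout` and its parameters are hypothetical names, and on a device the clock argument would typically wrap `millis()`.

```cpp
#include <cstdint>
#include <functional>

// Polls `done` until it returns true or until timeoutMs has elapsed
// according to `nowMs`. Returns false on timeout so the caller can recover.
bool waitWithTimeout(const std::function<bool()>& done,
                     const std::function<uint32_t()>& nowMs,
                     uint32_t timeoutMs) {
    const uint32_t start = nowMs();
    while (!done()) {
        // Unsigned subtraction keeps the check valid across counter wrap-around.
        if (static_cast<uint32_t>(nowMs() - start) >= timeoutMs) {
            return false;
        }
    }
    return true;
}
```

In firmware the polling loop should also yield or feed the watchdog on each pass; that is left out here to keep the sketch portable.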

| Project Characteristic | Monolithic (Single main.cpp) | Modular (Multiple files, classes) |
|---|---|---|
| Code size | <500 lines | >500 lines |
| Number of features | 1-3 (read sensor, send data, sleep) | 5+ (sensors, network, display, OTA, config UI) |
| Team size | 1 developer | 2+ developers |
| Testing approach | Manual on-device testing | Unit tests + integration tests |
| Maintenance timeline | Prototype (1-3 months) | Production (1+ years) |
| Code reuse | None (one-off project) | Multiple similar projects |

Monolithic example (acceptable for simple projects):

// main.cpp (200 lines)
void setup() {
  WiFi.begin(SSID, PASSWORD);  // Connect once, not on every loop
}

void loop() {
  float temp = dht.readTemperature();
  if (WiFi.status() == WL_CONNECTED) {
    mqtt.publish("temp", String(temp).c_str());
  }
  delay(60000);
}

Modular example (better for complex/production projects):

src/
├── main.cpp (orchestration only, 50 lines)
├── sensors/
│   ├── SensorManager.h
│   └── SensorManager.cpp (200 lines)
├── network/
│   ├── WiFiManager.h
│   └── WiFiManager.cpp (300 lines)
├── storage/
│   ├── DataStore.h
│   └── DataStore.cpp (150 lines)
└── config/
    ├── Config.h
    └── Config.cpp (100 lines)

Benefits of modular:

  • Each module testable independently
  • Changes to network code don’t affect sensor code
  • Multiple developers can work in parallel
  • Code reusable across projects

Cost of modular:

  • Initial setup time (create file structure)
  • Learning curve for beginners
  • Slightly larger binary size

When to refactor from monolithic to modular:

  • File exceeds 500 lines (hard to navigate)
  • Adding features requires changing unrelated code (tight coupling)
  • Team grows beyond 1 developer (merge conflicts)
  • You want to write unit tests (hard to test monolithic code)
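
The testability benefit is easiest to see when application logic depends on a small interface rather than on a concrete driver. The sketch below assumes a hypothetical `TemperatureSensor` interface; `FakeSensor` and `isOverheating` are likewise illustrative names and not part of any library used in this chapter.

```cpp
// Interface the application code depends on; hardware details stay behind it.
class TemperatureSensor {
public:
    virtual ~TemperatureSensor() = default;
    // Writes the reading to `celsius`; returns false if the read failed.
    virtual bool read(float& celsius) = 0;
};

// Test double that returns a fixed value or simulates a failed read.
class FakeSensor : public TemperatureSensor {
public:
    FakeSensor(float value, bool ok) : value_(value), ok_(ok) {}
    bool read(float& celsius) override {
        if (ok_) {
            celsius = value_;
        }
        return ok_;
    }

private:
    float value_;
    bool ok_;
};

// Application logic under test: a failed read never counts as overheating.
bool isOverheating(TemperatureSensor& sensor, float limitCelsius) {
    float reading = 0.0f;
    return sensor.read(reading) && reading > limitCelsius;
}
```

On the device, a second implementation of `TemperatureSensor` would wrap the real driver, while host-side unit tests pass in `FakeSensor` to cover the hot, cool, and broken-sensor cases.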

Common Mistake: Hardcoding Secrets in Firmware

The Problem: A developer commits this code to GitHub:

const char* WIFI_SSID = "HomeNetwork";
const char* WIFI_PASSWORD = "MyPassword123";
const char* MQTT_USER = "admin";
const char* MQTT_PASS = "secretPassword";
const char* API_KEY = "sk_live_51HabcdefghijklmnopqrstuvwxyzABCDEF";

Within 24 hours:

  1. GitHub scanning bots find the API key
  2. Credential stuffing attacks try the Wi-Fi password on other services
  3. MQTT broker receives thousands of unauthorized connection attempts

Why it’s dangerous:

  • Git history is permanent: Even if you delete the file, credentials remain in commit history
  • Forks and clones: Once public, credentials spread uncontrollably
  • Credential reuse: Many users reuse passwords across services

In 60 Seconds

IoT prototyping best practices—version control for hardware and firmware, modular code architecture, hardware abstraction layers, and automated testing—dramatically reduce the time from concept to validated prototype.

The fix (secrets management strategy):

Option 1: Runtime Configuration (Best)

Store credentials in device EEPROM/NVS, configured via web interface:

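#include <Preferences.h>
#include <WiFi.h>
#include <WebServer.h>

Preferences prefs;
WebServer server(80);

void startConfigPortal();  // Defined below

void loadCredentials() {
  prefs.begin("wifi", true);  // Read-only
  String ssid = prefs.getString("ssid", "");
  String password = prefs.getString("password", "");
  prefs.end();

  if (ssid.length() == 0) {
    // First boot: start captive portal for configuration
    startConfigPortal();
    return;
  }
  WiFi.begin(ssid.c_str(), password.c_str());  // Use the stored credentials
}

void startConfigPortal() {
  // Start Wi-Fi AP mode with web server (open AP here; set a password for real deployments)
  WiFi.softAP("ESP32-Setup");
  // handleRoot() and handleSave() are request handlers defined elsewhere
  server.on("/", handleRoot);  // HTML form for credentials
  server.on("/save", handleSave);  // Save to NVS
  server.begin();  // loop() must then call server.handleClient()
}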

Option 2: Separate Secrets File (Good)

// config.h (committed to Git)
#include "secrets.h"  // Not committed

#define MQTT_SERVER "mqtt.example.com"
#define MQTT_PORT 1883

// secrets.h (in .gitignore)
#define WIFI_SSID "HomeNetwork"
#define WIFI_PASSWORD "MyPassword123"
#define MQTT_USER "admin"
#define MQTT_PASS "secretPassword"

// secrets.h.template (committed to Git as example)
#define WIFI_SSID "your_wifi_ssid"
#define WIFI_PASSWORD "your_wifi_password"

.gitignore:

secrets.h
*.env
.env.*

Option 3: Environment Variables (CI/CD)

For automated builds, use platform environment variables:

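// config.h
#ifndef WIFI_SSID
  #define WIFI_SSID "default_ssid"  // Fallback for local dev
#endif
#ifndef WIFI_PASSWORD
  #define WIFI_PASSWORD "default_password"  // Fallback for local dev
#endif

// Build command: `platformio run` has no -D option of its own, so pass the
// defines as compiler flags via the PLATFORMIO_BUILD_FLAGS environment
// variable (exact quoting depends on your shell):
// PLATFORMIO_BUILD_FLAGS='-DWIFI_SSID=\"ActualSSID\" -DWIFI_PASSWORD=\"ActualPassword\"' platformio run -e esp32
// Note: values passed this way are still compiled into the firmware binary.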

Damage control if credentials leaked:

  1. Rotate all secrets immediately (new API keys, new passwords)
  2. Revoke old credentials at the service provider
  3. Rewrite Git history (BFG Repo-Cleaner, but doesn’t help with forks)
  4. Notify affected users if customer data was exposed
  5. Enable 2FA on all accounts with leaked credentials

Prevention checklist:

  • ✅ Add secrets.h, .env, *.key to .gitignore before first commit
  • ✅ Use pre-commit hooks to scan for patterns like password=, api_key=
  • ✅ Store production credentials in device NVS, not firmware
  • ✅ Never commit real credentials, even temporarily
  • ✅ Use secret scanning tools (GitHub Advanced Security, GitGuardian)

24.12 What’s Next

| If you want to… | Read this |
|---|---|
| Apply best practices to a complete prototype project | Programming Code Examples |
| Learn about CI toolchain setup | Programming Development Tools |
| Explore professional debugging workflows | Programming Paradigms and Tools |