75 State Machines in IoT
75.1 Learning Objectives
By the end of this section, you will be able to:
- Explain finite state machines: Describe why FSMs are essential for structuring IoT device behavior with explicit states, transitions, and event handling
- Compare FSM types: Differentiate Moore, Mealy, and Hierarchical state machines and select the appropriate type for specific IoT applications
- Design state machines: Construct FSMs with explicit states, transitions, guard conditions, and entry/exit actions for embedded systems
- Implement event-driven FSMs: Build state machines on ESP32 using C/C++ event queues, guard conditions, and non-blocking patterns
- Apply proven design patterns: Select and adapt connection management, duty cycling, and safety patterns for production IoT deployments
- Debug and persist state: Trace state machine execution and maintain state integrity across power cycles using NVS storage
Minimum Viable Understanding
- A state machine models an IoT device as being in exactly one state at a time (e.g., IDLE, SAMPLING, TRANSMITTING), with clear rules (transitions) governing how it moves between states in response to events.
- State machines prevent “spaghetti code” by making every possible behavior explicit and verifiable, which is critical for devices that must run unattended for years in the field.
- Three essential IoT patterns exist: a Connection pattern (with exponential backoff), a Sampling/Duty-Cycle pattern (for power efficiency), and a Safety pattern (for actuator protection), each reusable across projects.
Sensor Squad: State Machines Are Like Board Games!
Sammy the Sensor was trying to explain state machines to new members of the Sensor Squad.
“Think of it like a board game!” said Sammy. “In a board game, your game piece is on one square at a time. That square is your state. When you roll the dice or draw a card, that is an event. The rules tell you which square to move to next – that is a transition!”
Lila the Light Sensor jumped in: “My traffic light works the same way! I am always in exactly ONE color: GREEN, YELLOW, or RED. When my timer goes off, I follow the rules to switch to the next color. I never get confused because I can only be one color at a time!”
Max the Motion Detector added: “And sometimes there are extra rules! I only switch from SLEEPING to ALERT if someone moves AND it is nighttime. That extra ‘AND it is nighttime’ part is called a guard – like a bouncer at a door who checks your ticket!”
Bella the Buzzer said: “The best part is that you can draw the whole game board on paper before you build anything. If you can see all the squares and arrows, you know your device will never get stuck!”
Key Ideas for Young Engineers:
| Board Game Concept | State Machine Concept | Real IoT Example |
|---|---|---|
| The square you are on | State (current mode) | Door lock is LOCKED |
| Rolling the dice | Event (something happens) | Correct PIN entered |
| Moving to a new square | Transition (changing modes) | Lock moves to UNLOCKED |
| Special rule (“only if…”) | Guard (extra condition) | Only unlock if battery > 10% |
For Beginners: Why State Machines Matter for IoT
What is a State Machine?
A state machine is a way to organize the behavior of any system that can be in one of several distinct modes. Your phone is a state machine: it can be LOCKED, UNLOCKED, RINGING, or in DO_NOT_DISTURB mode. At any moment, it is in exactly one of these states, and clear rules determine how it transitions between them.
Why not just use if/else?
For simple devices with two or three conditions, if/else works fine. But IoT devices often have dozens of conditions that interact:
- What if the network drops while you are sending data?
- What if the battery is low AND a sensor reading is critical?
- What if the device has been running for 6 months and memory is fragmented?
With nested if/else, the logic becomes impossible to reason about. A state machine makes every possible situation and response explicit and visible.
The Three Core Ideas
- States – The distinct modes your device can be in (IDLE, ACTIVE, ERROR, SLEEP)
- Events – Things that happen (button press, timer fires, data arrives, error detected)
- Transitions – Rules mapping (current state + event) to a new state
If you understand these three ideas, you understand state machines. Everything else (guards, actions, hierarchies) builds on this foundation.
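The three ideas map directly onto code. The sketch below is a minimal illustration, not a production engine: the state and event names come from the lists above, but the specific transitions chosen are assumptions made only for the example.

```cpp
#include <cstdint>

// Hypothetical states and events, named after the examples listed above.
enum class State : uint8_t { IDLE, ACTIVE, ERROR, SLEEP };
enum class Event : uint8_t { BUTTON_PRESS, TIMER_FIRED, DATA_ARRIVED, ERROR_DETECTED };

// Transition rule: (current state, event) -> next state.
// Any pair that is not listed leaves the state unchanged.
State nextState(State current, Event event) {
    if (event == Event::ERROR_DETECTED) {
        return State::ERROR;  // assumed: a fault is accepted from every state
    }
    switch (current) {
        case State::IDLE:
            if (event == Event::BUTTON_PRESS) return State::ACTIVE;
            if (event == Event::TIMER_FIRED)  return State::SLEEP;
            break;
        case State::ACTIVE:
            if (event == Event::DATA_ARRIVED) return State::IDLE;
            break;
        case State::SLEEP:
            if (event == Event::TIMER_FIRED)  return State::IDLE;
            break;
        case State::ERROR:
            if (event == Event::BUTTON_PRESS) return State::IDLE;  // assumed manual reset
            break;
    }
    return current;
}
```

Because the whole rule set lives in one function, every (state, event) pair can be read off and tested in isolation.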
75.2 Overview
State machines are one of the most powerful design patterns in embedded systems and IoT development. They provide a structured way to model complex device behavior that responds to events, handles errors gracefully, and maintains predictable operation even in challenging conditions.
This topic is covered in the following focused chapters:
75.3 Chapters in This Section
75.3.1 State Machine Fundamentals
Learn the core concepts of finite state machines for IoT:
- States, Transitions, and Events: The building blocks of state machines
- Moore vs Mealy vs Hierarchical: Choose the right type for your application
- Guards and Actions: Add conditional logic and behaviors
- Anti-Patterns: Common mistakes to avoid
Time: ~15 min | Difficulty: Intermediate
75.3.2 State Machine Lab: ESP32 Implementation
Hands-on implementation with Wokwi ESP32 simulation:
- Complete FSM Engine: Production-quality C++ implementation
- Guard Conditions: Potentiometer-based threshold testing
- Hierarchical States: Parent/child state relationships
- State Persistence: Save and restore across power cycles
- Event Queues: Handle multiple events safely
- Challenge Exercises: Extend the state machine with new features
Time: ~45 min | Difficulty: Intermediate
75.3.3 State Machine Design Patterns
Proven patterns for common IoT scenarios:
- Connection Pattern: Network reliability with backoff and retry
- Sampling Pattern: Power-efficient duty cycling
- Safety Pattern: Actuator control with protection layers
- Pattern Selection: Choose the right pattern for your needs
Time: ~15 min | Difficulty: Intermediate
75.4 How State Machine Types Compare
Understanding the three main types helps you pick the right model for your IoT application.
| Type | Output Determined By | Complexity | Best For | IoT Example |
|---|---|---|---|---|
| Moore | Current state only | Low | Simple, predictable devices | LED status indicator, alarm states |
| Mealy | State + current input | Medium | Responsive, fewer states needed | Button debouncing, protocol parsers |
| Hierarchical | Nested state context | High | Complex multi-mode behaviors | Smart thermostat, gateway manager |
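The Moore and Mealy rows differ only in what the output function is allowed to read. A rough sketch of that difference follows; the enum names, the link-status LED mapping, and the 50 ms and 2 s thresholds are illustrative assumptions.

```cpp
#include <cstdint>

// Moore style: the output is a function of the state alone.
enum class LinkState : uint8_t { DISCONNECTED, CONNECTING, CONNECTED };
enum class Led : uint8_t { OFF, BLINK, ON };

Led mooreOutput(LinkState s) {
    switch (s) {
        case LinkState::DISCONNECTED: return Led::OFF;
        case LinkState::CONNECTING:   return Led::BLINK;
        case LinkState::CONNECTED:    return Led::ON;
    }
    return Led::OFF;
}

// Mealy style: the output depends on the state and the current input.
enum class ButtonState : uint8_t { RELEASED, PRESSED };
enum class Action : uint8_t { NONE, SHORT_PRESS, LONG_PRESS };

Action mealyOutput(ButtonState s, bool buttonDown, uint32_t heldMs) {
    // An action is produced only on the release edge while in PRESSED.
    if (s == ButtonState::PRESSED && !buttonDown) {
        if (heldMs >= 2000) return Action::LONG_PRESS;   // assumed 2 s threshold
        if (heldMs >= 50)   return Action::SHORT_PRESS;  // assumed 50 ms debounce
    }
    return Action::NONE;  // still held, or too short to count
}
```

The Moore function needs one state per distinct output, while the Mealy function gets three outcomes from a single PRESSED state by also reading the input.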
75.5 Learning Path
The three chapters in this section build on each other. Start with fundamentals, then practice in the lab, and finally learn the reusable patterns.
75.6 Prerequisites
Before starting these chapters, you should be familiar with:
- Processes and Systems Fundamentals: System behavior and feedback loops
- Microcontroller Programming Essentials: Basic C/C++ and GPIO handling
- Sensor Fundamentals: How sensors trigger events
Key Concepts
- State Machine: A computational model with a finite set of states, transitions triggered by events, and actions executed on entry/exit — used to model IoT device behavior deterministically.
- Finite State Machine (FSM): A state machine with a fixed number of states and deterministic transitions — used in firmware for mode management (idle, sensing, transmitting, sleeping) where complexity is bounded.
- Hierarchical State Machine (HSM): A state machine where states can contain sub-states — parent state behavior is inherited by child states, dramatically reducing transition count compared to flat FSMs for complex devices.
- Event-Driven Architecture: A programming paradigm where state transitions are triggered by events (sensor interrupt, timer expiry, message received) rather than polling — essential for power-efficient IoT firmware.
- Guard Condition: A boolean expression that must be true for a transition to fire even when the triggering event occurs — allows conditional transitions based on system context (e.g., only transition to TX if battery > 10%).
- Entry and Exit Actions: Operations automatically executed when entering or leaving a state — entry actions initialize state resources; exit actions clean up, ensuring consistent behavior regardless of which transition was taken.
- State Explosion Problem: The rapid growth in the number of required states when modeling complex systems with flat FSMs — addressed by using hierarchical state machines, which compress equivalent states into parent-child relationships.
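Guards and entry/exit actions are easiest to see together. The following sketch assumes a two-state device with a hypothetical `Device` struct standing in for real drivers; the 10% battery guard mirrors the example in the Guard Condition entry above.

```cpp
#include <cstdint>

enum class State : uint8_t { IDLE, TX };
enum class Event : uint8_t { DATA_READY, TX_DONE };

// Hypothetical device context; radioOn stands in for a real radio driver.
struct Device {
    State    state = State::IDLE;
    uint8_t  batteryPercent = 100;
    bool     radioOn = false;
    uint32_t txCount = 0;
};

// Entry and exit actions run on every transition into or out of a state,
// regardless of which transition was taken.
void onEntry(Device& d, State s) {
    if (s == State::TX) { d.radioOn = true; ++d.txCount; }
}
void onExit(Device& d, State s) {
    if (s == State::TX) { d.radioOn = false; }
}

void transitionTo(Device& d, State next) {
    onExit(d, d.state);
    d.state = next;
    onEntry(d, next);
}

// Returns true if the event caused a transition.
bool handleEvent(Device& d, Event e) {
    if (d.state == State::IDLE && e == Event::DATA_READY) {
        if (d.batteryPercent > 10) {   // guard: only transmit if battery > 10%
            transitionTo(d, State::TX);
            return true;
        }
        return false;                  // guard failed, stay in IDLE
    }
    if (d.state == State::TX && e == Event::TX_DONE) {
        transitionTo(d, State::IDLE);
        return true;
    }
    return false;                      // event not handled in this state
}
```

Routing every state change through `transitionTo` is what guarantees the radio is switched off on exit from TX, whichever path leaves that state.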
75.7 Worked Example: Smart Irrigation Controller
This end-to-end example shows how you would design a state machine for a real IoT product – a solar-powered garden irrigation controller that waters plants based on soil moisture readings.
Scenario: A community garden uses ESP32-based controllers with soil moisture sensors, a solenoid valve, a solar panel with battery, and LoRaWAN for remote monitoring. The device must run for 6 months on a 3.7 V, 6000 mAh LiPo battery supplemented by solar.
75.7.1 Step 1: Identify States
| State | Purpose | Power Draw |
|---|---|---|
| DEEP_SLEEP | Wait for next check cycle | 10 uA |
| WAKING | Initialize sensors, check battery | 20 mA |
| MEASURING | Read soil moisture (3 samples, averaged) | 35 mA |
| WATERING | Open solenoid valve | 200 mA |
| REPORTING | Transmit data via LoRaWAN | 120 mA |
| ERROR | Handle faults (low battery, sensor failure) | 5 mA |
75.7.2 Step 2: Define Transitions
75.7.3 Step 3: Add Guard Conditions
| Transition | Guard | Rationale |
|---|---|---|
| WAKING to MEASURING | battery > 15% | Ensure enough power for full cycle |
| MEASURING to WATERING | moisture < 30% AND battery > 25% | Only water if soil is dry AND enough battery to complete |
| WATERING duration | elapsed < 5 min | Prevent flooding if sensor fails |
| REPORTING timeout | elapsed < 10 s | Do not drain battery on failed TX |
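One way to express these guards in firmware is a single step function. The sketch below assumes a linear cycle (DEEP_SLEEP, WAKING, MEASURING, optional WATERING, REPORTING, back to DEEP_SLEEP) and an assumed recovery path out of ERROR; the `Inputs` struct and its field names are hypothetical.

```cpp
#include <cstdint>

enum class State : uint8_t { DEEP_SLEEP, WAKING, MEASURING, WATERING, REPORTING, ERROR };

// Snapshot of the values the guards read. Field names are illustrative.
struct Inputs {
    uint8_t  batteryPercent;
    uint8_t  moisturePercent;
    uint32_t elapsedInStateMs;  // time since the current state was entered
    bool     wakeTimerFired;
    bool     sensorOk;
    bool     wateringDone;      // planned watering time reached
    bool     txDone;
};

constexpr uint32_t kMaxWateringMs   = 5UL * 60UL * 1000UL;  // 5 min cap from the table
constexpr uint32_t kReportTimeoutMs = 10UL * 1000UL;        // 10 s cap from the table

State step(State s, const Inputs& in) {
    switch (s) {
        case State::DEEP_SLEEP:
            return in.wakeTimerFired ? State::WAKING : State::DEEP_SLEEP;
        case State::WAKING:
            return in.batteryPercent > 15 ? State::MEASURING : State::ERROR;
        case State::MEASURING:
            if (!in.sensorOk) return State::ERROR;
            if (in.moisturePercent < 30 && in.batteryPercent > 25) return State::WATERING;
            return State::REPORTING;
        case State::WATERING:
            // Leave when finished or when the 5 min safety cap is reached.
            if (in.wateringDone || in.elapsedInStateMs >= kMaxWateringMs) return State::REPORTING;
            return State::WATERING;
        case State::REPORTING:
            if (in.txDone || in.elapsedInStateMs >= kReportTimeoutMs) return State::DEEP_SLEEP;
            return State::REPORTING;
        case State::ERROR:
            return State::DEEP_SLEEP;  // assumed recovery: sleep and retry next cycle
    }
    return State::ERROR;
}
```

Keeping the function pure (state and inputs in, state out) lets the guard table be checked on a development machine before any hardware is attached.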
75.7.4 Step 4: Calculate Battery Life
- Cycle: 30 min sleep + ~15 s active = 1800 s sleep + 15 s active
- Sleep energy: 10 uA x 1800 s = 18 mAs per cycle
- Active energy: ~50 mA avg x 15 s = 750 mAs per cycle (without watering)
- Cycles per day: 48
- Daily energy: 48 x (18 + 750) = 36,864 mAs = ~10.2 mAh/day
- Battery: 6000 mAh / 10.2 mAh/day = ~588 days (without solar, without watering)
- With watering (2x/day): adds ~2 x 200 mA x 180 s = 72,000 mAs = 20 mAh/day
- Adjusted: 6000 / 30.2 = ~198 days, which clears the 6-month (~183-day) target even before solar charging is counted
75.7.5 Interactive: Irrigation Controller Battery Life Calculator
Adjust duty cycle and watering parameters to estimate battery life for your irrigation design.
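A minimal sketch of the estimate such a calculator performs, assuming the Step 4 figures as defaults. The struct and function names are illustrative, and the model ignores solar input, battery self-discharge, and regulator losses.

```cpp
// All defaults mirror the Step 4 figures; names are illustrative.
struct IrrigationBudget {
    double sleepCurrentMa    = 0.010;   // 10 uA deep sleep
    double sleepSeconds      = 1800.0;  // 30 min between checks
    double activeCurrentMa   = 50.0;    // average over the active window
    double activeSeconds     = 15.0;
    double cyclesPerDay      = 48.0;
    double wateringCurrentMa = 200.0;
    double wateringSeconds   = 180.0;
    double wateringsPerDay   = 2.0;
    double batteryMah        = 6000.0;
};

// Daily charge consumption in mAh (mA*s divided by 3600).
double dailyConsumptionMah(const IrrigationBudget& b) {
    const double cycleMas = b.sleepCurrentMa * b.sleepSeconds
                          + b.activeCurrentMa * b.activeSeconds;
    const double wateringMas =
        b.wateringsPerDay * b.wateringCurrentMa * b.wateringSeconds;
    return (b.cyclesPerDay * cycleMas + wateringMas) / 3600.0;
}

// Battery life in days on the battery alone.
double batteryLifeDays(const IrrigationBudget& b) {
    return b.batteryMah / dailyConsumptionMah(b);
}
```

With the defaults this reproduces the Step 4 result of roughly 30.2 mAh/day and about 198 days; setting `wateringsPerDay` to 0 gives the roughly 10.2 mAh/day baseline.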
Putting Numbers to It
The Step 4 calculation assumes instant transitions between sleep and active states. In reality, each wake-up from DEEP_SLEEP to MEASURING also costs MCU stabilization plus sensor warm-up:
\[\text{Transition Overhead} = \text{MCU wake-up (5ms @ 20mA)} + \text{Sensor init (50ms @ 30mA)}\] \[= (0.005 \times 20) + (0.05 \times 30) = 0.1 + 1.5 = 1.6 \text{ mAs per transition}\]
Converting to mAh: \(1.6 \text{ mAs} = \frac{1.6}{3600} = 0.000444 \text{ mAh per transition}\).
With 48 wake-ups per day:
\[\text{Daily Transition Cost} = 48 \times 0.000444 = 0.0213 \text{ mAh/day}\]
This is negligible compared to the 10.2 mAh/day active energy budget (0.2% overhead). The dominant cost remains the active-state current during sensing and transmission, not the wake-up transitions. State machine optimization should focus on reducing the number of transmissions (batch data across multiple measurement cycles) rather than reducing transitions.
This worked example demonstrates how a state machine naturally organizes complex device logic into manageable, verifiable pieces.
Common Pitfalls in IoT State Machine Design
1. State Explosion (Too Many States): Adding a new state for every minor condition leads to unmaintainable diagrams. Use hierarchical states or guard conditions instead. If your diagram has more than 8-10 states at one level, refactor.
2. Forgetting the ERROR State: Every IoT device will encounter faults: sensor failures, low battery, network timeouts. If your state machine has no ERROR or FAULT state, it will hang or behave unpredictably when things go wrong.
3. Blocking in State Handlers: Never use delay() or blocking I/O inside a state handler. This prevents the state machine from processing other events. Use timers and non-blocking patterns instead.
4. No Timeout on TRANSMITTING States: Network operations can hang indefinitely. Always add a timeout guard on any state that involves network communication. A stuck TRANSMITTING state will drain your battery in hours.
5. Ignoring State Persistence: If your device resets (watchdog, power glitch, OTA update), it loses its current state. For critical applications, persist the state to flash/EEPROM so the device can resume correctly after restart.
6. Testing Only the Happy Path: State machines are only as reliable as the transitions you test. Create a test matrix covering every (state, event) combination, including unexpected events in each state.
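Pitfalls 3 and 4 share one remedy: record when a state was entered and compare against the clock on every loop pass instead of waiting. The sketch below is one possible shape, assuming a 10 s limit and a caller-supplied millisecond timestamp (on Arduino-style targets this would typically come from millis()); the `Fsm` struct is hypothetical.

```cpp
#include <cstdint>

enum class State : uint8_t { IDLE, TRANSMITTING, ERROR };

struct Fsm {
    State    state = State::IDLE;
    uint32_t stateEntryMs = 0;  // timestamp taken when the state was entered
};

constexpr uint32_t kTxTimeoutMs = 10000;  // assumed 10 s limit

void enter(Fsm& f, State s, uint32_t nowMs) {
    f.state = s;
    f.stateEntryMs = nowMs;
}

// Called from the main loop on every pass. It returns immediately,
// so other events keep being processed while a transmission is pending.
void poll(Fsm& f, uint32_t nowMs, bool txComplete) {
    if (f.state != State::TRANSMITTING) return;
    if (txComplete) {
        enter(f, State::IDLE, nowMs);
    } else if (nowMs - f.stateEntryMs >= kTxTimeoutMs) {
        // Unsigned subtraction stays correct across the 32-bit rollover.
        enter(f, State::ERROR, nowMs);
    }
}
```

Passing the time in as a parameter, rather than reading a clock inside `poll`, also makes the timeout path easy to exercise in a unit test.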
75.8 Knowledge Check
75.9 Key Takeaways
After completing all chapters in this section, you will be able to:
- Design clear state machines with explicit states, transitions, and behaviors
- Implement FSMs in C/C++ with guard conditions and event queues
- Handle power management through state-based duty cycling
- Build safety-critical systems with multiple protection layers
- Persist state across resets for reliable long-running operation
- Debug state logic through comprehensive logging and tracing
75.9.1 Summary of Key Concepts
| Concept | What It Does | Why It Matters for IoT |
|---|---|---|
| States | Define distinct operating modes | Constrained devices need clear, simple modes |
| Transitions | Rules for moving between states | Make all behavior paths explicit and testable |
| Guards | Conditional checks on transitions | Prevent unsafe operations (low battery, no network) |
| Entry/Exit Actions | Code run on state change | Initialize peripherals on entry, clean up on exit |
| Hierarchical States | Nested sub-states | Manage complexity without state explosion |
| Event Queues | Buffer incoming events | Handle bursts without losing events |
| State Persistence | Save state to flash/EEPROM | Recover correctly after power loss or reset |
Worked Example: Sizing State Machine Memory for Battery-Powered IoT Devices
Scenario: A smart water meter uses an STM32L0 microcontroller (20 KB RAM) with a state machine for valve control and leak detection. Calculate the memory footprint to ensure it fits within the constrained RAM budget.
Given Data:
- STM32L0: 20 KB RAM total, 8 KB reserved for firmware stack, 4 KB for communication buffers
- Available for application: 8 KB RAM
- State machine configuration:
- 6 states: IDLE, MEASURING, REPORTING, VALVE_CONTROL, LEAK_ALERT, ERROR
- Event queue: 16 events max
- State history: last 10 transitions (for debugging)
- Flow statistics: 50 samples averaged (leak detection algorithm)
Step 1: Calculate State Machine Core Memory
// State machine context structure
struct StateMachine {
    uint8_t currentState; // 1 byte (6 states fit in uint8)
    uint8_t previousState; // 1 byte
    uint32_t stateEntryTime; // 4 bytes (millis() timestamp)
    uint32_t lastEventTime; // 4 bytes
    uint32_t transitionCount; // 4 bytes (diagnostic counter)
    uint16_t errorCode; // 2 bytes
    uint8_t flags; // 1 byte (fault, recoverable, etc.)
} __attribute__((packed)); // Total: 17 bytes
Core state machine: 17 bytes
Step 2: Calculate Event Queue Memory
struct Event {
    uint8_t type; // 1 byte (16 event types fit in uint8)
    uint16_t data; // 2 bytes (event-specific payload)
    uint32_t timestamp; // 4 bytes
} __attribute__((packed)); // Total: 7 bytes per event
struct EventQueue {
    Event events[16]; // 16 × 7 = 112 bytes
    uint8_t head; // 1 byte
    uint8_t tail; // 1 byte
    uint8_t count; // 1 byte
}; // Total: 115 bytes
Event queue: 115 bytes
Step 3: Calculate State History Buffer
struct StateTransition {
    uint8_t fromState; // 1 byte
    uint8_t toState; // 1 byte
    uint8_t triggerEvent; // 1 byte
    uint32_t timestamp; // 4 bytes
} __attribute__((packed)); // Total: 7 bytes per transition
StateTransition history[10]; // 10 × 7 = 70 bytes
State history: 70 bytes
Step 4: Calculate Application-Specific State Data
struct LeakDetectionData {
    float flowSamples[50]; // 50 × 4 = 200 bytes
    float averageFlow; // 4 bytes
    float peakFlow; // 4 bytes
    uint32_t lastSampleTime; // 4 bytes
    uint16_t sampleCount; // 2 bytes
}; // Total: 214 bytes of fields (not packed, so sizeof is typically 216 with 4-byte alignment)
Application data: 214 bytes
Step 5: Total Memory Calculation
| Component | Size | Percentage of 8 KB budget |
|---|---|---|
| State machine core | 17 bytes | 0.2% |
| Event queue | 115 bytes | 1.4% |
| State history | 70 bytes | 0.9% |
| Leak detection data | 214 bytes | 2.6% |
| Total state machine | 416 bytes | 5.1% |
| Remaining RAM | 7,776 bytes | 94.9% ✓ |
Decision: The state machine fits comfortably in the 8 KB (8,192-byte) RAM budget, leaving about 95% for other application needs (sensor drivers, communication stack, temporary buffers).
Memory Optimization Tips Discovered:
- Use uint8_t for state/event types (saves 3 bytes per field vs uint32_t)
- Pack structures with __attribute__((packed)) to eliminate padding (saves ~15%)
- Store timestamps as uint32_t (millis(), 49 days before rollover) instead of uint64_t
- Circular buffers (event queue, history) provide bounded memory usage
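The bounded-memory property of a circular buffer comes from wrapping the indices instead of growing the array. A minimal sketch using the `Event` and `EventQueue` layout from Step 2 follows; the `push` and `pop` helpers are hypothetical, assume GCC or Clang for the packed attribute, and are not interrupt-safe as written.

```cpp
#include <cstdint>

struct Event {
    uint8_t  type;
    uint16_t data;
    uint32_t timestamp;
} __attribute__((packed));

constexpr uint8_t kCapacity = 16;

struct EventQueue {
    Event   events[kCapacity];
    uint8_t head = 0;   // index of the oldest event
    uint8_t tail = 0;   // index where the next event is written
    uint8_t count = 0;  // number of events currently stored
};

// Returns false and drops the event when the queue is full.
bool push(EventQueue& q, const Event& e) {
    if (q.count == kCapacity) return false;
    q.events[q.tail] = e;
    q.tail = static_cast<uint8_t>((q.tail + 1) % kCapacity);
    ++q.count;
    return true;
}

// Returns false when the queue is empty; otherwise copies the oldest event out.
bool pop(EventQueue& q, Event& out) {
    if (q.count == 0) return false;
    out = q.events[q.head];
    q.head = static_cast<uint8_t>((q.head + 1) % kCapacity);
    --q.count;
    return true;
}
```

If events are pushed from an interrupt handler, the updates to `count` would need protection (for example, briefly disabling interrupts), which this sketch leaves out.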
Validation: After implementing, actual measured RAM usage was 448 bytes (about 8% higher than calculated due to compiler alignment), still well within budget at about 5.5% of the 8 KB available.
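The hand-counted sizes from Steps 1 to 4 can also be checked at compile time. The sketch below repeats the four structures and asserts their sizes, assuming GCC or Clang (for the packed attribute) and a 4-byte `float`; `kTotalBytes` is a name introduced only for this example.

```cpp
#include <cstddef>
#include <cstdint>

struct StateMachine {
    uint8_t  currentState;
    uint8_t  previousState;
    uint32_t stateEntryTime;
    uint32_t lastEventTime;
    uint32_t transitionCount;
    uint16_t errorCode;
    uint8_t  flags;
} __attribute__((packed));

struct Event {
    uint8_t  type;
    uint16_t data;
    uint32_t timestamp;
} __attribute__((packed));

struct EventQueue {
    Event   events[16];
    uint8_t head;
    uint8_t tail;
    uint8_t count;
};

struct StateTransition {
    uint8_t  fromState;
    uint8_t  toState;
    uint8_t  triggerEvent;
    uint32_t timestamp;
} __attribute__((packed));

struct LeakDetectionData {
    float    flowSamples[50];
    float    averageFlow;
    float    peakFlow;
    uint32_t lastSampleTime;
    uint16_t sampleCount;
};

static_assert(sizeof(StateMachine) == 17, "packed core context");
static_assert(sizeof(Event) == 7, "packed event");
static_assert(sizeof(EventQueue) == 115, "16 events plus three index bytes");
static_assert(sizeof(StateTransition) == 7, "packed history entry");
// Not packed: 214 bytes of fields rounded up to the 4-byte alignment of float.
static_assert(sizeof(LeakDetectionData) == 216, "two bytes of tail padding");

constexpr std::size_t kTotalBytes = sizeof(StateMachine) + sizeof(EventQueue)
                                  + 10 * sizeof(StateTransition)
                                  + sizeof(LeakDetectionData);
```

A failing `static_assert` stops the build, so a later change that grows one of the structures is caught before it reaches the device.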
Decision Framework: Choosing State Machine Type for IoT Applications
Different IoT applications benefit from different state machine types. This framework maps application requirements to Moore, Mealy, or Hierarchical state machines.
| Application Type | Key Requirements | Best State Machine Type | Rationale |
|---|---|---|---|
| LED Status Indicator | Simple on/off/blink states; output depends only on current state | Moore | Output tied to state (ON state = LED on); no input variations; simplest debugging |
| Button Debouncer | Response varies by press duration; short press vs long press actions | Mealy | Output depends on state + current input (PRESSED state + 50ms = short, +2s = long); fewer states needed |
| Smart Thermostat | Multiple operating modes; Heating/Cooling each with sub-states | Hierarchical | Top level: OFF/HEATING/COOLING/FAN; sub-states: HEATING→{WARMUP, ACTIVE, STABILIZING}; prevents state explosion |
| Irrigation Controller | Sequential operation stages; Zone 1 → Zone 2 → Zone 3 | Moore | Each zone is a distinct state with fixed output (valve control); sequential transitions obvious |
| Door Access Control | Multiple authentication methods; PIN vs RFID vs biometric | Hierarchical | Top: LOCKED/UNLOCKING/UNLOCKED; sub: UNLOCKING→{PIN_ENTRY, RFID_SCAN, FINGERPRINT}; separates concerns |
| Motor Speed Controller | PID control loop; speed varies continuously with input | Mealy | Output (PWM duty cycle) varies with state (ACCELERATING) + input (current speed vs target) |
| Battery Charge Manager | Charge stages with temperature compensation; Trickle/Fast/Absorption | Hierarchical | Top: IDLE/CHARGING/FULL/ERROR; sub: CHARGING→{TRICKLE, FAST, ABSORPTION, TEMPERATURE_HOLD} |
| Network Connection Handler | Disconnected/Connecting/Connected/Reconnecting; backoff retry logic | Hierarchical | Top: DISCONNECTED/CONNECTED; sub: DISCONNECTED→{INITIAL_ATTEMPT, RETRY_1, RETRY_2, BACKOFF}; manages retry complexity |
Quick Decision Rules:
Choose Moore if:
- Outputs are fully determined by the state (e.g., LED colors, valve positions)
- Debugging is critical (easier to visualize: state = output)
- Minimal state count (<10 states)
Choose Mealy if:
- Same state produces different outputs based on inputs (e.g., button press duration)
- Fewer states needed (Mealy can replace 2-3 Moore states with 1 state + input conditions)
- Response latency is critical (a Mealy output can change in the same step as its input; a Moore output changes only after the state updates)
Choose Hierarchical if:
- State count exceeds 8-10 (sign of state explosion)
- Nested behavior (e.g., connection states within each operating mode)
- Want to reuse sub-state machines (e.g., “authentication” sub-machine used by multiple top-level states)
Red Flag: Use Hierarchical if you find yourself thinking:
- “I need a HEATING_ZONE1, HEATING_ZONE2, COOLING_ZONE1, COOLING_ZONE2…” (state explosion)
- “Every state needs the same error handling code” (missing parent state)
- “I keep copying the same state logic for different contexts” (missing sub-state abstraction)
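The "every state needs the same error handling code" red flag is what a parent state removes. One lightweight way to get that effect without a full HSM framework is to let child handlers fall through to a shared parent handler. The sketch below assumes a thermostat-like device; the state names follow the HEATING sub-states in the table above, while the events and transitions are illustrative.

```cpp
#include <cstdint>

enum class State : uint8_t { OFF, HEATING_WARMUP, HEATING_ACTIVE, FAULT };
enum class Event : uint8_t { POWER_ON, WARMUP_DONE, SENSOR_FAULT, POWER_OFF };

// Parent handler for the HEATING super-state: behavior shared by all children.
// Returns true if it consumed the event.
bool handleHeating(State& s, Event e) {
    if (e == Event::SENSOR_FAULT) { s = State::FAULT; return true; }
    if (e == Event::POWER_OFF)    { s = State::OFF;   return true; }
    return false;
}

// Child handlers get the first chance; anything they do not handle
// bubbles up to the parent.
bool dispatch(State& s, Event e) {
    switch (s) {
        case State::OFF:
            if (e == Event::POWER_ON) { s = State::HEATING_WARMUP; return true; }
            return false;
        case State::HEATING_WARMUP:
            if (e == Event::WARMUP_DONE) { s = State::HEATING_ACTIVE; return true; }
            return handleHeating(s, e);
        case State::HEATING_ACTIVE:
            return handleHeating(s, e);
        case State::FAULT:
            if (e == Event::POWER_OFF) { s = State::OFF; return true; }
            return false;
    }
    return false;
}
```

Adding a third heating sub-state then costs one `case` that delegates to `handleHeating`, rather than another copy of the fault and power-off transitions.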
Common Mistake: State Explosion from Orthogonal Concerns
The Problem: A smart home hub manages 3 devices (light, thermostat, lock) with 3 states each (OFF, ON, ERROR). A naive flat state machine creates 3³ = 27 states:
- LIGHT_OFF_THERMOSTAT_OFF_LOCK_LOCKED
- LIGHT_ON_THERMOSTAT_OFF_LOCK_LOCKED
- LIGHT_OFF_THERMOSTAT_HEATING_LOCK_LOCKED
- … (24 more combinations)
With 5 devices of 4 states each, this explodes to 4⁵ = 1,024 states – unmaintainable.
Why It Happens: Developers model independent devices as a single monolithic state machine, forgetting that devices operate orthogonally (independently). The light state has nothing to do with the lock state.
The Solution: Parallel State Machines (Orthogonal Regions)
Instead of one state machine with 27 states, create 3 independent state machines:
struct DeviceStateMachine {
    enum State { OFF, ON, ERROR };
    State currentState;
    // ... transition logic
};
DeviceStateMachine lightSM;
DeviceStateMachine thermostatSM;
DeviceStateMachine lockSM;
State count reduction:
- Flat (combinatorial): 27 states
- Parallel (orthogonal): 3 + 3 + 3 = 9 states (3x reduction)
- With 5 devices: 1,024 states → 20 states (51x reduction)
Interaction Between Devices: When devices must coordinate (e.g., “lock door when thermostat enters AWAY mode”), use events to coordinate independent state machines:
// Assumes transitionTo() and handleEvent() are part of the elided transition logic,
// and that AWAY is a thermostat-specific state.
void handleThermostatTransition(State newState) {
    // Thermostat state machine handles its own transition
    thermostatSM.transitionTo(newState);
    // Notify other state machines of relevant events
    if (newState == AWAY) {
        lockSM.handleEvent(EVENT_HOUSE_EMPTY); // Prompts lock to LOCKED state
        lightSM.handleEvent(EVENT_HOUSE_EMPTY); // Prompts light to OFF state
    }
}
Decision Rule: If you can describe devices/concerns as “X operates independently of Y most of the time, but occasionally they coordinate,” use parallel state machines with event-based coordination, not a single flat machine.
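A self-contained sketch of the same idea, for readers who want something that compiles on its own. It assumes a generic three-state device, a hypothetical `Hub` wrapper, and that the lock's ON state stands in for LOCKED; none of these names come from a specific framework.

```cpp
#include <cstdint>

enum class Event : uint8_t { TURN_ON, TURN_OFF, FAULT };

// One small machine per device; each owns only its own state.
struct DeviceStateMachine {
    enum class State : uint8_t { OFF, ON, ERROR };
    State currentState = State::OFF;

    void handleEvent(Event e) {
        switch (e) {
            case Event::TURN_ON:
                if (currentState == State::OFF) currentState = State::ON;
                break;
            case Event::TURN_OFF:
                if (currentState == State::ON) currentState = State::OFF;
                break;
            case Event::FAULT:
                currentState = State::ERROR;
                break;
        }
    }
};

// Coordination layer: translates one house-level event into
// device-level events without merging the machines.
struct Hub {
    DeviceStateMachine light;
    DeviceStateMachine thermostat;
    DeviceStateMachine lock;  // ON stands in for LOCKED in this sketch

    void onHouseEmpty() {
        light.handleEvent(Event::TURN_OFF);
        thermostat.handleEvent(Event::TURN_OFF);
        lock.handleEvent(Event::TURN_ON);
    }
};
```

Each machine still decides for itself whether to act on an event, so a device sitting in ERROR ignores the coordination request instead of being forced into an inconsistent state.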
Real-World Impact: A smart home hub reduced its state machine from 64 manually-coded states (unmaintainable) to 4 parallel state machines totaling 16 states (maintainable), while adding clearer coordination logic through explicit events. Code size dropped 40% and bugs fell by 60%.
75.10 What’s Next
| If you want to… | Read this |
|---|---|
| Study state machine fundamentals | State Machine Fundamentals |
| Build a state machine lab on ESP32 | ESP32 State Machine Lab |
| Explore process control and PID | Process Control and PID |
| Learn about specialized IoT architecture | Specialized Architecture |
| Study IoT reference architectures | IoT Reference Architectures |