26  Lab: TinyML Gestures

In 60 Seconds

TinyML enables on-device inference in microseconds, with milliwatt power budgets and no network dependency. Int8 quantization compresses a float32 model by 4x and speeds inference by 1.5-2x, typically with less than 2% accuracy loss. Pruning can safely remove 50-70% of weights, but beyond a threshold the “pruning cliff” causes a sharp accuracy collapse, so always test incrementally.

Lab execution time can be estimated before starting runs:

\[ T_{\text{total}} = N_{\text{runs}} \times (t_{\text{setup}} + t_{\text{run}} + t_{\text{review}}) \]

Worked example: With 5 runs and per-run times of 4 min setup, 6 min execution, and 3 min review, total lab time is \(5\times(4+6+3)=65\) minutes. This prevents under-scoping and helps schedule complete experimental cycles.

26.1 Learning Objectives

By the end of this lab, you will be able to:

  • Explain Neural Network Inference: Describe how data flows through fully-connected layers with activations and trace a forward pass step by step
  • Compare Quantization Effects: Measure memory/speed vs accuracy trade-offs between float32 and int8 and calculate compression ratios
  • Analyze Pruning Impact: Predict when weight removal significantly degrades model performance and identify the pruning cliff threshold
  • Interpret Softmax Output: Convert logits to probabilities and justify confidence-based decision thresholds
  • Calculate Inference Latency: Benchmark microsecond-level inference times on microcontrollers and evaluate real-time feasibility
  • Design Constrained ML Systems: Apply appropriate optimization techniques for target microcontroller hardware given flash, RAM, and power budgets

26.2 Introduction

Running machine learning on microcontrollers – often called TinyML – is one of the fastest-growing areas of IoT. Instead of streaming sensor data to a cloud server for classification, TinyML enables on-device inference in microseconds, with milliwatt power budgets and no network dependency. This lab lets you experience the core building blocks of TinyML hands-on: forward-pass inference, quantization, pruning, and softmax classification.

You will build a simulated gesture recognition system on an ESP32 that classifies accelerometer patterns into four gestures: shake, tap, tilt, and circle. Along the way, you will observe first-hand how model compression techniques (int8 quantization, weight pruning) trade accuracy for memory and speed – the central design tension of edge AI.

Minimum Viable Understanding

If you only remember three things from this lab:

  • On-device inference eliminates network latency (sub-millisecond response), bandwidth costs, and privacy risks by running ML directly on the microcontroller – no cloud connection required.
  • Int8 quantization compresses a float32 model by 4x and speeds inference by 1.5–2x, with typically less than 2% accuracy loss – this single technique makes microcontroller ML practical.
  • Pruning removes redundant weights (up to 50–70% safely), but beyond a threshold the “pruning cliff” causes sharp accuracy collapse – always test incrementally.

Sammy the Sensor says: “Imagine you teach a really small robot to recognize when you shake, tap, or tilt it – and the robot can figure it out all by itself, without asking the internet for help!”

Lila the LED explains: “It is like how you learn to catch a ball. At first your brain has to think really hard. But after practice, you catch it almost instantly. TinyML teaches a tiny computer chip to recognize patterns almost instantly too!”

Max the Microcontroller adds: “The cool part? This tiny brain uses less power than a small LED light. It can run for years on a single battery while making smart decisions about what it senses.”

Bella the Battery chimes in: “And because the tiny brain does the thinking right here on the chip, I do not have to waste energy sending data to the internet. That means I can keep going for months – or even years – on a single charge!”

Real-world example: Your smartwatch uses TinyML to detect when you raise your wrist to check the time. It runs a tiny neural network right on the watch chip – no phone or internet connection needed!


TinyML inference pipeline: accelerometer data flows through two hidden layers with ReLU activation to a softmax output producing gesture probabilities.

26.3 Lab Overview

Explore how TinyML enables machine learning inference on microcontrollers. This hands-on lab demonstrates the core concepts of edge AI without requiring specialized ML hardware or pre-trained models.

26.3.1 What You’ll Build

An ESP32-based TinyML simulator that demonstrates:

  1. Simulated Neural Network: A fully-connected network with configurable layers running inference on sensor patterns
  2. Gesture Recognition: Classify accelerometer patterns into gestures (shake, tap, tilt, circle)
  3. Quantization Comparison: Toggle between float32 and int8 inference to see memory/speed trade-offs
  4. Real-time Visualization: LED indicators and serial output showing classification results and confidence
  5. Model Pruning Demo: Visualize how removing weights affects inference

26.3.2 Hardware Requirements

For Wokwi Simulator (No Physical Hardware Needed):

  • ESP32 DevKit v1
  • OLED Display (SSD1306 128x64) - shows inference results
  • 4x LEDs (Red, Yellow, Green, Blue) - gesture indicators
  • Push button - trigger gesture input / mode switch
  • Potentiometer - adjust simulated sensor noise

For Real Hardware (Optional):

  • ESP32 DevKit v1
  • MPU6050 accelerometer/gyroscope module
  • SSD1306 OLED display (I2C)
  • 4x LEDs with 220 ohm resistors
  • Push button
  • Breadboard + jumper wires

26.3.3 Circuit Diagram

Wiring diagram showing ESP32 DevKit connected to an SSD1306 OLED display via I2C (SDA pin 21, SCL pin 22), four LEDs on GPIO pins 12-15 representing shake (red), tap (yellow), tilt (green), and circle (blue) gestures, a push button on GPIO 4 for mode switching, and a potentiometer on GPIO 34 for noise level adjustment.

Figure 26.1: Circuit connections for TinyML gesture recognition lab

26.3.4 Key Concepts Demonstrated

This lab illustrates several critical TinyML concepts:

Concept | What You’ll See | Real-World Application
Forward Pass | Watch data flow through network layers | Understanding the inference pipeline
Activation Functions | ReLU clipping negative values | Why non-linearity matters
Quantization | Float32 vs int8 memory comparison | Model compression for MCUs
Softmax Output | Probability distribution across classes | Confidence-based decisions
Pruning | Zeroed weights visualized | Model size reduction
Inference Latency | Microsecond timing measurements | Real-time constraints


TinyML workflow: model is trained and optimized on a PC or cloud, then exported as C arrays and flashed onto the ESP32 for on-device inference.

A forward pass is simply feeding data through the neural network from input to output, one layer at a time. Think of it like a series of filters:

  1. Input Layer: Raw sensor readings go in (12 numbers: 4 time samples of X, Y, Z acceleration)
  2. Hidden Layer 1: 16 neurons each compute a weighted sum of all 12 inputs, then apply ReLU (keep positive values, set negatives to zero)
  3. Hidden Layer 2: 8 neurons each compute a weighted sum of the 16 outputs from Layer 1, then apply ReLU again
  4. Output Layer: 4 neurons each compute a weighted sum, then softmax converts these into probabilities that sum to 1.0

The gesture with the highest probability is the prediction. No “learning” happens during inference – the weights are fixed. The network simply transforms input numbers into output probabilities using matrix multiplication and simple functions.

26.4 Wokwi Simulator

How to Use This Lab
  1. Copy the code below into the Wokwi editor (replace default code)
  2. Click Run to start the simulation
  3. Press the button to cycle through demo modes (gesture recognition, quantization comparison, pruning demo)
  4. Adjust the potentiometer to add noise to input patterns
  5. Watch the Serial Monitor for detailed inference statistics

26.5 Lab Code: TinyML Gesture Recognition Demo

/*
 * TinyML Gesture Recognition Simulator
 * =====================================
 *
 * This educational demo simulates a TinyML gesture recognition system
 * running on an ESP32 microcontroller. It demonstrates:
 *
 * 1. Neural network forward pass (fully-connected layers)
 * 2. Activation functions (ReLU, Softmax)
 * 3. Model quantization (float32 vs int8)
 * 4. Weight pruning visualization
 * 5. Real-time inference with confidence scoring
 *
 * Hardware:
 * - ESP32 DevKit v1
 * - SSD1306 OLED Display (I2C: SDA=21, SCL=22)
 * - 4x LEDs (GPIO 12-15) for gesture indicators
 * - Push button (GPIO 4) for mode switching
 * - Potentiometer (GPIO 34) for noise adjustment
 *
 * Author: IoT Educational Platform
 * License: MIT
 */

#include <Wire.h>
#include <math.h>

// ============================================================================
// PIN DEFINITIONS
// ============================================================================

#define PIN_SDA         21      // I2C SDA for OLED
#define PIN_SCL         22      // I2C SCL for OLED
#define PIN_LED_RED     12      // Shake gesture indicator
#define PIN_LED_YELLOW  13      // Tap gesture indicator
#define PIN_LED_GREEN   14      // Tilt gesture indicator
#define PIN_LED_BLUE    15      // Circle gesture indicator
#define PIN_BUTTON      4       // Mode switch button
#define PIN_POT         34      // Noise level potentiometer

// ============================================================================
// NEURAL NETWORK CONFIGURATION
// ============================================================================

// Network architecture: Input(12) -> Hidden1(16) -> Hidden2(8) -> Output(4)
// This simulates a small gesture recognition model
#define INPUT_SIZE      12      // 4 samples x 3 axes (X, Y, Z)
#define HIDDEN1_SIZE    16      // First hidden layer neurons
#define HIDDEN2_SIZE    8       // Second hidden layer neurons
#define OUTPUT_SIZE     4       // 4 gesture classes

// Gesture classes
#define GESTURE_SHAKE   0
#define GESTURE_TAP     1
#define GESTURE_TILT    2
#define GESTURE_CIRCLE  3

const char* gestureNames[] = {"SHAKE", "TAP", "TILT", "CIRCLE"};
const int gestureLEDs[] = {PIN_LED_RED, PIN_LED_YELLOW, PIN_LED_GREEN, PIN_LED_BLUE};

// Model weights (initialized in setup)
float weights_ih1[INPUT_SIZE][HIDDEN1_SIZE];
float weights_h1h2[HIDDEN1_SIZE][HIDDEN2_SIZE];
float weights_h2o[HIDDEN2_SIZE][OUTPUT_SIZE];
float bias_h1[HIDDEN1_SIZE];
float bias_h2[HIDDEN2_SIZE];
float bias_o[OUTPUT_SIZE];

// Quantized versions (int8)
int8_t weights_ih1_q[INPUT_SIZE][HIDDEN1_SIZE];
int8_t weights_h1h2_q[HIDDEN1_SIZE][HIDDEN2_SIZE];
int8_t weights_h2o_q[HIDDEN2_SIZE][OUTPUT_SIZE];

// Quantization scale factors
float scale_ih1 = 0.0f;
float scale_h1h2 = 0.0f;
float scale_h2o = 0.0f;

// Pruning mask
uint8_t pruning_mask_h1[INPUT_SIZE][HIDDEN1_SIZE];
float pruning_ratio = 0.0f;

// Gesture patterns (simulated accelerometer signatures)
float pattern_shake[INPUT_SIZE] = {
    0.8f, 0.2f, 0.1f,  -0.9f, 0.3f, 0.0f,
    0.7f, -0.2f, 0.1f, -0.8f, 0.1f, 0.0f
};

float pattern_tap[INPUT_SIZE] = {
    0.0f, 0.0f, 0.2f, 0.1f, 0.1f, 0.9f,
    0.0f, 0.0f, -0.3f, 0.0f, 0.0f, 0.1f
};

float pattern_tilt[INPUT_SIZE] = {
    0.1f, 0.7f, 0.3f, 0.3f, 0.5f, 0.3f,
    0.5f, 0.3f, 0.3f, 0.7f, 0.1f, 0.3f
};

float pattern_circle[INPUT_SIZE] = {
    0.0f, 0.7f, 0.0f, 0.7f, 0.0f, 0.0f,
    0.0f, -0.7f, 0.0f, -0.7f, 0.0f, 0.0f
};

float* gesturePatterns[] = {pattern_shake, pattern_tap, pattern_tilt, pattern_circle};

// State variables
enum DemoMode {
    MODE_GESTURE_RECOGNITION,
    MODE_QUANTIZATION_COMPARE,
    MODE_PRUNING_DEMO,
    MODE_LAYER_VISUALIZATION,
    MODE_COUNT
};

DemoMode currentMode = MODE_GESTURE_RECOGNITION;
int currentGestureDemo = 0;
unsigned long lastButtonPress = 0;
unsigned long lastInferenceTime = 0;
float noiseLevel = 0.0f;

// Layer activations for visualization
float activations_h1[HIDDEN1_SIZE];
float activations_h2[HIDDEN2_SIZE];
float activations_output[OUTPUT_SIZE];

// ============================================================================
// ACTIVATION FUNCTIONS
// ============================================================================

float relu(float x) {
    return (x > 0) ? x : 0;
}

void softmax(float* input, float* output, int size) {
    float maxVal = input[0];
    for (int i = 1; i < size; i++) {
        if (input[i] > maxVal) maxVal = input[i];
    }

    float sum = 0.0f;
    for (int i = 0; i < size; i++) {
        output[i] = exp(input[i] - maxVal);
        sum += output[i];
    }

    for (int i = 0; i < size; i++) {
        output[i] /= sum;
    }
}

// ============================================================================
// NEURAL NETWORK FORWARD PASS (Float32)
// ============================================================================

void forwardPass_float32(float* input, float* output, bool verbose) {
    // Hidden Layer 1
    for (int j = 0; j < HIDDEN1_SIZE; j++) {
        float sum = bias_h1[j];
        for (int i = 0; i < INPUT_SIZE; i++) {
            if (pruning_mask_h1[i][j]) {
                sum += input[i] * weights_ih1[i][j];
            }
        }
        activations_h1[j] = relu(sum);
    }

    // Hidden Layer 2
    for (int j = 0; j < HIDDEN2_SIZE; j++) {
        float sum = bias_h2[j];
        for (int i = 0; i < HIDDEN1_SIZE; i++) {
            sum += activations_h1[i] * weights_h1h2[i][j];
        }
        activations_h2[j] = relu(sum);
    }

    // Output Layer
    float logits[OUTPUT_SIZE];
    for (int j = 0; j < OUTPUT_SIZE; j++) {
        float sum = bias_o[j];
        for (int i = 0; i < HIDDEN2_SIZE; i++) {
            sum += activations_h2[i] * weights_h2o[i][j];
        }
        logits[j] = sum;
    }

    // Softmax
    softmax(logits, output, OUTPUT_SIZE);

    for (int i = 0; i < OUTPUT_SIZE; i++) {
        activations_output[i] = output[i];
    }

    if (verbose) {
        Serial.print("[FWD] Output probs: ");
        for (int i = 0; i < OUTPUT_SIZE; i++) {
            Serial.print(gestureNames[i]);
            Serial.print("=");
            Serial.print(output[i] * 100, 1);
            Serial.print("% ");
        }
        Serial.println();
    }
}

// ============================================================================
// WEIGHT INITIALIZATION
// ============================================================================

void initializeWeights() {
    Serial.println("\n[INIT] Initializing neural network weights...");
    randomSeed(42);

    for (int i = 0; i < INPUT_SIZE; i++) {
        for (int j = 0; j < HIDDEN1_SIZE; j++) {
            weights_ih1[i][j] = (random(-100, 100) / 100.0f) * 0.5f;
            int gestureIdx = j / 4;  // each group of 4 H1 neurons favors one gesture
            if (gestureIdx < OUTPUT_SIZE) {
                weights_ih1[i][j] += gesturePatterns[gestureIdx][i] * 0.3f;
            }
            pruning_mask_h1[i][j] = 1;
        }
    }

    for (int i = 0; i < HIDDEN1_SIZE; i++) {
        for (int j = 0; j < HIDDEN2_SIZE; j++) {
            weights_h1h2[i][j] = (random(-100, 100) / 100.0f) * 0.5f;
            int h1_gesture = i / 4;
            int h2_gesture = j / 2;
            if (h1_gesture == h2_gesture) {
                weights_h1h2[i][j] += 0.3f;
            }
        }
    }

    for (int i = 0; i < HIDDEN2_SIZE; i++) {
        for (int j = 0; j < OUTPUT_SIZE; j++) {
            weights_h2o[i][j] = (random(-100, 100) / 100.0f) * 0.3f;
            int h2_gesture = i / 2;
            if (h2_gesture == j) {
                weights_h2o[i][j] += 0.5f;
            }
        }
    }

    for (int i = 0; i < HIDDEN1_SIZE; i++) bias_h1[i] = (random(-50, 50) / 100.0f) * 0.1f;
    for (int i = 0; i < HIDDEN2_SIZE; i++) bias_h2[i] = (random(-50, 50) / 100.0f) * 0.1f;
    for (int i = 0; i < OUTPUT_SIZE; i++) bias_o[i] = 0.0f;

    Serial.println("[INIT] Float32 weights initialized");
}

// ============================================================================
// QUANTIZATION
// ============================================================================

void quantizeWeights() {
    Serial.println("\n[QUANT] Quantizing weights to INT8...");
    float maxVal;

    // Layer 1: input -> hidden1
    maxVal = 0.0f;
    for (int i = 0; i < INPUT_SIZE; i++)
        for (int j = 0; j < HIDDEN1_SIZE; j++)
            if (fabs(weights_ih1[i][j]) > maxVal) maxVal = fabs(weights_ih1[i][j]);
    scale_ih1 = maxVal / 127.0f;
    for (int i = 0; i < INPUT_SIZE; i++)
        for (int j = 0; j < HIDDEN1_SIZE; j++)
            weights_ih1_q[i][j] = (int8_t)(weights_ih1[i][j] / scale_ih1);

    // Layer 2: hidden1 -> hidden2
    maxVal = 0.0f;
    for (int i = 0; i < HIDDEN1_SIZE; i++)
        for (int j = 0; j < HIDDEN2_SIZE; j++)
            if (fabs(weights_h1h2[i][j]) > maxVal) maxVal = fabs(weights_h1h2[i][j]);
    scale_h1h2 = maxVal / 127.0f;
    for (int i = 0; i < HIDDEN1_SIZE; i++)
        for (int j = 0; j < HIDDEN2_SIZE; j++)
            weights_h1h2_q[i][j] = (int8_t)(weights_h1h2[i][j] / scale_h1h2);

    // Layer 3: hidden2 -> output
    maxVal = 0.0f;
    for (int i = 0; i < HIDDEN2_SIZE; i++)
        for (int j = 0; j < OUTPUT_SIZE; j++)
            if (fabs(weights_h2o[i][j]) > maxVal) maxVal = fabs(weights_h2o[i][j]);
    scale_h2o = maxVal / 127.0f;
    for (int i = 0; i < HIDDEN2_SIZE; i++)
        for (int j = 0; j < OUTPUT_SIZE; j++)
            weights_h2o_q[i][j] = (int8_t)(weights_h2o[i][j] / scale_h2o);

    // Biases stay float32 here; the weight matrices dominate the footprint.
    int totalParams = (INPUT_SIZE * HIDDEN1_SIZE) + HIDDEN1_SIZE +
                      (HIDDEN1_SIZE * HIDDEN2_SIZE) + HIDDEN2_SIZE +
                      (HIDDEN2_SIZE * OUTPUT_SIZE) + OUTPUT_SIZE;

    Serial.print("[QUANT] INT8 model size: ");
    Serial.print(totalParams);
    Serial.println(" bytes");
    Serial.println("[QUANT] Compression ratio: 4x");
}

// ============================================================================
// LED CONTROL
// ============================================================================

void setGestureLED(int gesture, bool state) {
    digitalWrite(gestureLEDs[gesture], state ? HIGH : LOW);
}

void clearAllLEDs() {
    for (int i = 0; i < OUTPUT_SIZE; i++) {
        digitalWrite(gestureLEDs[i], LOW);
    }
}

void showClassificationResult(int predictedClass, float confidence) {
    clearAllLEDs();
    setGestureLED(predictedClass, true);

    if (confidence < 0.7f) {
        delay(100);
        setGestureLED(predictedClass, false);
        delay(100);
        setGestureLED(predictedClass, true);
    }
}

// ============================================================================
// DEMO MODES
// ============================================================================

void generateGestureInput(int gestureType, float* output, float noise) {
    float* pattern = gesturePatterns[gestureType];
    for (int i = 0; i < INPUT_SIZE; i++) {
        float noiseVal = (random(-100, 100) / 100.0f) * noise;
        output[i] = constrain(pattern[i] + noiseVal, -1.0f, 1.0f);
    }
}

void runGestureRecognitionDemo() {
    Serial.println("\n========== GESTURE RECOGNITION MODE ==========");
    currentGestureDemo = (currentGestureDemo + 1) % OUTPUT_SIZE;

    Serial.print("\n[DEMO] Testing gesture: ");
    Serial.println(gestureNames[currentGestureDemo]);

    float input[INPUT_SIZE];
    generateGestureInput(currentGestureDemo, input, noiseLevel);

    Serial.print("[INPUT] Noise level: ");
    Serial.print(noiseLevel * 100, 0);
    Serial.println("%");

    float output[OUTPUT_SIZE];
    unsigned long startTime = micros();
    forwardPass_float32(input, output, true);
    unsigned long inferenceTime = micros() - startTime;

    int predictedClass = 0;
    float maxProb = output[0];
    for (int i = 1; i < OUTPUT_SIZE; i++) {
        if (output[i] > maxProb) {
            maxProb = output[i];
            predictedClass = i;
        }
    }

    Serial.println("\n[RESULT] --------------------------------");
    Serial.print("[RESULT] Predicted: ");
    Serial.print(gestureNames[predictedClass]);
    Serial.print(" (");
    Serial.print(maxProb * 100, 1);
    Serial.println("% confidence)");
    Serial.print("[RESULT] Correct: ");
    Serial.println(predictedClass == currentGestureDemo ? "YES" : "NO");
    Serial.print("[RESULT] Inference time: ");
    Serial.print(inferenceTime);
    Serial.println(" microseconds");

    showClassificationResult(predictedClass, maxProb);
}

// ============================================================================
// BUTTON HANDLER
// ============================================================================

void handleButton() {
    static bool lastButtonState = HIGH;
    bool buttonState = digitalRead(PIN_BUTTON);

    if (buttonState == LOW && lastButtonState == HIGH) {
        if (millis() - lastButtonPress > 300) {
            lastButtonPress = millis();
            currentMode = (DemoMode)((currentMode + 1) % MODE_COUNT);

            Serial.println("\n\n========================================");
            Serial.print("MODE CHANGED: ");
            switch (currentMode) {
                case MODE_GESTURE_RECOGNITION:
                    Serial.println("GESTURE RECOGNITION");
                    break;
                case MODE_QUANTIZATION_COMPARE:
                    Serial.println("QUANTIZATION COMPARISON");
                    break;
                case MODE_PRUNING_DEMO:
                    Serial.println("PRUNING VISUALIZATION");
                    break;
                case MODE_LAYER_VISUALIZATION:
                    Serial.println("LAYER ACTIVATION VIEW");
                    break;
                default:
                    break;
            }
            Serial.println("========================================\n");
            clearAllLEDs();
        }
    }
    lastButtonState = buttonState;
}

// ============================================================================
// SETUP
// ============================================================================

void setup() {
    Serial.begin(115200);
    delay(1000);

    Serial.println("\n\n");
    Serial.println("========================================");
    Serial.println("   TinyML Gesture Recognition Lab");
    Serial.println("========================================");

    pinMode(PIN_LED_RED, OUTPUT);
    pinMode(PIN_LED_YELLOW, OUTPUT);
    pinMode(PIN_LED_GREEN, OUTPUT);
    pinMode(PIN_LED_BLUE, OUTPUT);
    pinMode(PIN_BUTTON, INPUT_PULLUP);
    pinMode(PIN_POT, INPUT);

    Serial.println("[BOOT] Testing LEDs...");
    for (int i = 0; i < OUTPUT_SIZE; i++) {
        setGestureLED(i, true);
        delay(200);
        setGestureLED(i, false);
    }

    initializeWeights();
    quantizeWeights();

    Serial.println("\n========================================");
    Serial.println("INSTRUCTIONS:");
    Serial.println("1. Press BUTTON to cycle through modes");
    Serial.println("2. Turn POTENTIOMETER to adjust noise");
    Serial.println("3. Watch LEDs for classification results:");
    Serial.println("   RED=Shake, YELLOW=Tap, GREEN=Tilt, BLUE=Circle");
    Serial.println("========================================\n");

    Serial.println("[READY] Starting demo loop...\n");
}

// ============================================================================
// MAIN LOOP
// ============================================================================

void loop() {
    handleButton();

    int potValue = analogRead(PIN_POT);
    noiseLevel = potValue / 4095.0f;

    if (millis() - lastInferenceTime > 3000) {
        lastInferenceTime = millis();

        switch (currentMode) {
            case MODE_GESTURE_RECOGNITION:
                runGestureRecognitionDemo();
                break;
            default:
                // The remaining modes fall back to the gesture demo in this
                // condensed listing; extend this switch in the challenges.
                runGestureRecognitionDemo();
                break;
        }
    }

    delay(10);
}

26.6 Understanding the Code

The lab code demonstrates several key TinyML patterns. Let us walk through the most important sections before moving on to the challenges.

26.6.1 Network Architecture

The neural network uses a fully-connected (dense) architecture with three layers:

Input(12) --> Hidden1(16, ReLU) --> Hidden2(8, ReLU) --> Output(4, Softmax)
  • 12 inputs: 4 time-window samples, each with X, Y, Z accelerometer axes
  • 16 hidden neurons in Layer 1: extract low-level motion features
  • 8 hidden neurons in Layer 2: combine features into gesture-level patterns
  • 4 outputs: probability scores for shake, tap, tilt, circle

26.6.2 Memory Footprint Analysis

Understanding memory is critical for microcontroller deployment:

Component | Float32 Size | Int8 Size | Savings
Weights (Input to H1) | 12 x 16 x 4 = 768 bytes | 12 x 16 x 1 = 192 bytes | 4x
Weights (H1 to H2) | 16 x 8 x 4 = 512 bytes | 16 x 8 x 1 = 128 bytes | 4x
Weights (H2 to Output) | 8 x 4 x 4 = 128 bytes | 8 x 4 x 1 = 32 bytes | 4x
Biases | 28 x 4 = 112 bytes | 28 x 1 = 28 bytes | 4x
Total | 1,520 bytes | 380 bytes | 4x

For production models with millions of parameters, this 4x reduction is the difference between fitting on a microcontroller or not.

26.6.3 Quantization in Detail

The quantizeWeights() function implements symmetric (max-absolute-value) quantization:

  1. Find the maximum absolute value in each weight matrix
  2. Compute a scale factor: scale = max_abs / 127
  3. Map each float32 weight to an int8 value: int8_val = round(float_val / scale) (the lab code truncates instead of rounding, which is simpler but slightly noisier)
  4. During inference, dequantize: float_val = int8_val * scale

This introduces quantization error (the difference between the original float32 weight and its dequantized value), but for most IoT models this error is negligible.

Imagine you have a very precise kitchen scale that shows weight to three decimal places: 1.234 kg. Now imagine you only have a simple scale that shows whole numbers: 1 kg. You lose a tiny bit of detail, but the reading is still useful – and the simple scale is cheaper, smaller, and faster to read.

Quantization works the same way for neural networks:

  • Float32 (the precise scale): Each weight is stored as a 32-bit decimal number. Very accurate, but uses 4 bytes of memory per weight.
  • Int8 (the simple scale): Each weight is rounded to a whole number between -128 and +127. Less precise, but uses only 1 byte per weight.

Since a microcontroller like the ESP32 has limited memory (around 520 KB of RAM), fitting a model that uses 4 bytes per weight is much harder than one that uses 1 byte per weight. Quantization is the single most important trick for making ML fit on tiny chips.

The key insight: Neural networks are surprisingly tolerant of this rounding. A well-quantized model typically loses less than 2% accuracy – a small price for a 4x reduction in memory.


Quantization process: float32 weights are scaled and mapped to int8 values, achieving 4x memory compression.

26.6.4 Pruning and Sparsity

Weight pruning zeroes out small weights that contribute little to accuracy. The pruning_mask_h1 array stores which weights are active (1) or pruned (0). During inference, pruned weights are skipped:

if (pruning_mask_h1[i][j]) {
    sum += input[i] * weights_ih1[i][j];
}

At low sparsity levels (30–50%), accuracy is barely affected. Beyond 70–80% sparsity, accuracy degrades sharply – this is the pruning cliff that you will observe in the lab.

26.7 Knowledge Check

26.8 Question 1: Quantization Memory Savings

A neural network layer has 1,024 weights stored as float32. After int8 quantization, how much memory does this layer use?

  1. 512 bytes
  2. 1,024 bytes
  3. 2,048 bytes
  4. 4,096 bytes

Answer: option 2 (1,024 bytes). Each float32 weight uses 4 bytes (1,024 x 4 = 4,096 bytes total). Int8 quantization reduces each weight to 1 byte, so 1,024 weights x 1 byte = 1,024 bytes, a 4x reduction from the original 4,096 bytes.

26.9 Question 2: ReLU Activation

What does the ReLU activation function do to a neuron output of -0.35?

  1. Returns 0.35 (absolute value)
  2. Returns -0.35 (passes through unchanged)
  3. Returns 0 (clips negative values)
  4. Returns 0.65 (maps to 1 minus absolute value)

Answer: option 3 (returns 0, clipping negative values). ReLU (Rectified Linear Unit) is defined as max(0, x). For any negative input, it returns 0. For positive inputs, it returns the value unchanged. This simple non-linearity is critical: without it, stacking multiple layers would be equivalent to a single linear transformation.

26.10 Question 3: Softmax Interpretation

After the softmax layer, the output is [0.85, 0.08, 0.05, 0.02]. What does this mean?

  1. The model is 85% complete with training
  2. The model predicts the first gesture class with 85% confidence
  3. 85% of the weights point to the first class
  4. The first class has 85% of the neurons activated

Answer: option 2 (the model predicts the first gesture class with 85% confidence). Softmax converts raw logits (unbounded numbers) into a probability distribution that sums to 1.0. The value 0.85 means the model assigns 85% probability to the first class (shake). In production systems, you would typically require confidence above a threshold (e.g., 70%) before acting on a prediction.

26.11 Question 4: Pruning Trade-offs

A TinyML model achieves 94% accuracy at 0% pruning and 91% accuracy at 50% pruning. At 80% pruning, accuracy drops to 72%. What phenomenon explains this?

  1. The model is overfitting to the pruned weights
  2. The pruning cliff – beyond a threshold, removing weights destroys critical information
  3. The quantization error compounds with pruning
  4. The softmax function cannot normalize sparse outputs

Answer: option 2 (the pruning cliff). Neural networks have significant redundancy, so moderate pruning (30-50%) barely affects accuracy. But past a critical threshold, remaining weights cannot compensate for removed ones, causing a steep accuracy drop. The exact cliff point depends on model architecture and data complexity. For this lab’s small model, the cliff appears around 70-80% sparsity.

26.12 Question 5: Edge vs Cloud Inference

Why would you choose on-device TinyML inference over sending data to a cloud ML service?

  1. TinyML models are always more accurate than cloud models
  2. On-device inference eliminates network latency, reduces bandwidth, and preserves data privacy
  3. Cloud services cannot run neural networks
  4. Microcontrollers have more compute power than cloud GPUs

Answer: option 2 (on-device inference eliminates network latency, reduces bandwidth, and preserves data privacy). Cloud ML services generally offer larger, more accurate models. However, TinyML wins when you need sub-millisecond latency (real-time control), cannot guarantee network connectivity (remote sensors), must minimize data transmission (bandwidth/power), or must keep sensitive data on-device (privacy regulations). The trade-off is smaller model capacity and lower accuracy.

26.12.1 Choosing an Optimization Strategy

When deploying a trained model to a microcontroller, selecting the right optimization pipeline depends on your constraints. The following decision flow captures the practical reasoning that TinyML engineers apply:


Decision flow for TinyML model optimization: start with post-training quantization, add pruning if the model still exceeds flash memory, and resort to knowledge distillation or retraining only when simpler techniques are insufficient.

26.13 Challenge Exercises

Challenge 1: Add a New Gesture Class

Difficulty: Medium | Time: 20 minutes

Extend the model to recognize a fifth gesture: WAVE (back-and-forth motion in the X-axis with decreasing amplitude).

  1. Define a new pattern array pattern_wave[INPUT_SIZE] with decaying X-axis oscillation
  2. Add “WAVE” to the gestureNames array
  3. Connect a fifth LED (e.g., GPIO 16) for the wave indicator
  4. Update OUTPUT_SIZE to 5 and reinitialize weights

Success Criteria: The model correctly classifies the wave pattern with >70% confidence.

Challenge 2: Implement Early Exit

Difficulty: Medium | Time: 25 minutes

Add an “early exit” feature where inference stops at Hidden Layer 1 if confidence exceeds 90%, saving computation.

  1. Add a simple classifier after H1 (just 4 output neurons connected to first 16)
  2. Check confidence after H1 forward pass
  3. If max probability > 0.9, return early without computing H2 and output layers
  4. Track and display “early exit rate” (percentage of inferences that exit early)

Success Criteria: At least 30% of clean (low-noise) inputs should trigger early exit.
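The control flow can be sketched as follows. This is a minimal illustration, not the lab's firmware: `h1_layer`, `aux_head`, and `tail_layers` are hypothetical stand-ins for the real layer functions:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward_with_early_exit(x, h1_layer, aux_head, tail_layers, threshold=0.9):
    """Compute H1; if the small auxiliary 4-class head is already confident,
    skip H2 and the output layer entirely."""
    h1 = h1_layer(x)
    aux_probs = softmax(aux_head(h1))
    if max(aux_probs) > threshold:
        return aux_probs, True               # early exit taken
    return softmax(tail_layers(h1)), False   # full forward pass

# Toy stand-ins for the lab's layers (hypothetical):
h1_layer = lambda x: x                       # pretend H1 is identity
confident_head = lambda h: [6.0, 0.0, 0.0, 0.0]
tail = lambda h: [0.0, 0.0, 0.0, 1.0]

probs, exited = forward_with_early_exit([0.1] * 16, h1_layer, confident_head, tail)
```

Tracking the fraction of calls that return `True` gives the early exit rate.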

Challenge 3: Adaptive Quantization

Difficulty: Hard | Time: 30 minutes

Implement per-layer quantization with different bit widths:

  1. Keep the first layer at int8 (8-bit)
  2. Reduce the second layer to int4 (4-bit): modify the quantization to use only 16 levels (integer values -8 to 7)
  3. Compare accuracy vs memory savings

Success Criteria: Document the accuracy-memory trade-off. Can you achieve <5% accuracy loss with 50% additional memory savings?
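A generic symmetric quantizer (a sketch, not the lab's exact `quantizeWeights()`) makes the int4-vs-int8 precision gap concrete — the same weights land on far coarser levels at 4 bits:

```python
def quantize(weights, bits):
    """Symmetric linear quantization to signed `bits`-bit integers.
    int4 uses levels -8..7 (16 levels); int8 uses -128..127 (256 levels)."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    dequant = [v * scale for v in q]   # what the model actually computes with
    return q, dequant, scale

w = [0.8, -0.5, 0.05, -0.02]
q4, d4, s4 = quantize(w, 4)   # only 16 levels: small weights collapse to 0
q8, d8, s8 = quantize(w, 8)   # 256 levels: small weights survive
```

Note how at 4 bits the two small weights quantize to zero — exactly the precision loss the challenge asks you to measure.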

Challenge 4: Online Learning Simulation

Difficulty: Hard | Time: 40 minutes

Add a “calibration mode” that adjusts weights based on user feedback:

  1. Add a second button for “correct/incorrect” feedback
  2. When user indicates incorrect classification, slightly adjust output layer weights toward correct class
  3. Implement a simple learning rate (e.g., 0.01)
  4. Track improvement over 20 calibration iterations

Success Criteria: Demonstrate 5%+ accuracy improvement after calibration.
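Step 2's weight adjustment can be sketched as a perceptron-style update (a minimal illustration; `W_out` rows are output-class weight vectors and `h2` the hidden activations, both hypothetical names):

```python
def calibrate_output_weights(W_out, h2, predicted, correct, lr=0.01):
    """On negative feedback, nudge the correct class's weights toward the
    hidden activations and the wrong prediction's weights away from them."""
    if predicted == correct:
        return W_out  # positive feedback: no change
    for j, a in enumerate(h2):
        W_out[correct][j] += lr * a
        W_out[predicted][j] -= lr * a
    return W_out

# One feedback event: model said class 0, user says class 2
W = [[0.0, 0.0] for _ in range(4)]
W = calibrate_output_weights(W, [1.0, 0.5], predicted=0, correct=2)
```

Repeating this over ~20 feedback events slowly rotates the output layer toward the user's gestures without full backpropagation.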

26.14 Expected Outcomes

After completing this lab, you should be able to:

| Skill | Demonstration |
|---|---|
| Understand Forward Pass | Explain how data flows through fully-connected layers with activations |
| Compare Quantization | Articulate the memory/speed vs accuracy trade-off of int8 quantization |
| Analyze Pruning Effects | Predict when pruning will significantly degrade model performance |
| Interpret Softmax Output | Convert logits to probabilities and explain confidence scoring |
| Estimate Inference Time | Measure and compare microsecond-level inference latencies |
| Design for Constraints | Choose appropriate optimization techniques for target hardware |

26.14.1 Quantitative Observations

Record these measurements when running the lab:

| Metric | Float32 | Int8 (Quantized) | Improvement |
|---|---|---|---|
| Inference time | ~200-400 µs | ~100-200 µs | 1.5-2x speedup |
| Model memory | 1,520 bytes | 380 bytes | 4x compression |
| Accuracy (clean input) | Baseline | ~1-2% lower | Minimal loss |
| Accuracy (noisy input) | Varies with noise | Varies with noise | Comparable |

Pruning observations to record:

  • 0% pruning: baseline accuracy
  • 30% pruning: minimal accuracy loss (<1%)
  • 50% pruning: small accuracy loss (1-3%)
  • 70% pruning: noticeable loss (3-8%)
  • 90% pruning: severe degradation (>15% loss) – the pruning cliff
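The mechanism behind these observations is magnitude pruning: the smallest-magnitude weights carry the least signal, so zeroing them is initially harmless, until the sparsity level starts removing weights that matter. A minimal sketch:

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-|w| fraction of weights (magnitude pruning)."""
    n_prune = int(len(weights) * sparsity)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_prune]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
pruned = prune_by_magnitude(w, 0.5)  # the three smallest magnitudes are zeroed
```

At 50% sparsity only the near-zero weights go; pushing toward 90% forces large, load-bearing weights to zero — the cliff.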

Optimization trade-off map showing that quantization-aware training and light pruning offer the best compression-to-accuracy ratios, while heavy pruning risks significant accuracy loss.

Connecting to Real TinyML

The concepts in this simulation directly apply to production TinyML development:

| Simulation Concept | Real-World Equivalent |
|---|---|
| Hand-crafted weights | TensorFlow/PyTorch training, Edge Impulse |
| forwardPass_float32() | TensorFlow Lite Micro interpreter |
| applyPruning() | TF Model Optimization Toolkit |
| quantizeWeights() | Post-training quantization, QAT |
| Gesture patterns | Real accelerometer data from MPU6050/LSM6DS3 |

Next Steps for Real Hardware:

  1. Export a trained model from Edge Impulse as a C++ library
  2. Replace simulated input with real MPU6050 accelerometer readings
  3. Use the official TensorFlow Lite Micro inference engine
  4. Deploy to production with over-the-air model updates

26.15 Common Mistakes and Pitfalls

Pitfall 1: Quantizing Without Calibration Data

Applying post-training quantization without running representative data through the model produces poor scale factors. Always use a calibration dataset that covers the expected input distribution.

Symptom: Quantized model accuracy drops by 10%+ instead of the expected 1-2%.

Fix: Run 100-500 representative samples through the float32 model to determine the activation ranges, then use those ranges for quantization.
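The core of this fix — deriving activation ranges from representative samples — can be sketched in a few lines (a simplified illustration with toy numbers; real toolchains do this per layer):

```python
def calibrate_scale(activation_batches, bits=8):
    """Derive an asymmetric quantization scale from the min/max activation
    observed across representative calibration samples."""
    lo = min(min(batch) for batch in activation_batches)
    hi = max(max(batch) for batch in activation_batches)
    scale = (hi - lo) / (2 ** bits - 1)   # spread the range over 255 int8 steps
    return lo, hi, scale

# activations recorded while running 3 representative samples (toy numbers)
samples = [[-0.9, 0.2, 0.8], [-0.4, 0.1, 0.6], [-0.7, 0.0, 0.5]]
lo, hi, scale = calibrate_scale(samples)
```

If the calibration set is unrepresentative, `lo`/`hi` are wrong and every int8 step is wasted on values that never occur.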

Pitfall 2: Ignoring Inference Memory (Not Just Model Size)

Model size (weights) is only part of memory usage. During inference, intermediate activations consume additional RAM. For this lab’s model:

  • Weights: 380 bytes (int8)
  • Layer 1 activations: 16 × 4 = 64 bytes (float32)
  • Layer 2 activations: 8 × 4 = 32 bytes (float32)
  • Output activations: 4 × 4 = 16 bytes (float32)
  • Total runtime RAM: 380 + 112 = 492 bytes (not just the 380 bytes of weights)

Production models with larger layers can easily exceed microcontroller RAM limits even when weights fit in flash.
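The tally above generalizes to a one-line estimator (a hypothetical helper; set `act_bytes=1` if activations are quantized to int8 as well):

```python
def runtime_ram_bytes(weight_bytes, layer_sizes, act_bytes=4):
    """Inference RAM ~= weights + intermediate activations.
    act_bytes=4 for float32 activations, 1 for int8 activations."""
    return weight_bytes + sum(n * act_bytes for n in layer_sizes)

# Lab model: 380 B of int8 weights, float32 activations of sizes 16, 8, 4
total = runtime_ram_bytes(380, [16, 8, 4])
print(total)  # 492
```

Running the same estimate before choosing layer sizes catches RAM overruns before any firmware is written.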

Pitfall 3: Testing Only Clean Inputs

A model that works perfectly on clean gesture patterns may fail on real-world noisy data. Always test with the potentiometer at various noise levels (25%, 50%, 75%) to evaluate robustness. If accuracy drops sharply with moderate noise, the model needs more training data diversity or data augmentation.

Scenario: Deploy gesture recognition on a wearable device powered by a 2000 mAh 3.7V Li-ion battery. Target: 1 year battery life (365 days).

Power Budget Calculation:

Total Energy Available:

Battery: 2000 mAh × 3.7V = 7,400 mWh = 7.4 Wh

Daily Energy Allowance:

7,400 mWh / 365 days = 20.27 mWh/day
Average power: 20.27 mWh / 24 hours = 0.844 mW continuous

ESP32 Power Consumption (measured):

| State | Current @ 3.7 V | Power | Duration/Day | Energy/Day |
|---|---|---|---|---|
| Deep Sleep | 10 μA | 37 μW | 23.9 hours | 0.885 mWh |
| Wake + Sample | 80 mA | 296 mW | 100 samples × 50 ms = 5 s | 0.411 mWh |
| Inference | 120 mA | 444 mW | 100 inferences × 0.3 ms = 30 ms | 0.004 mWh |
| BLE Transmit (if used) | 150 mA | 555 mW | 10 transmissions × 2 s = 20 s | 3.083 mWh |
| Total | | | | 4.383 mWh/day |

Battery Life (without BLE):

7,400 mWh / (0.885 + 0.411 + 0.004) = 7,400 / 1.3 = 5,692 days = 15.6 years

Battery Life (with BLE transmit 10x/day):

7,400 mWh / 4.383 = 1,688 days = 4.6 years

Problem: BLE transmission dominates power budget (70% of daily energy)!
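The daily budget can be recomputed in a few lines; the mW·s → mWh conversion (divide by 3,600, not by seconds per day) is the step that is easiest to get wrong. Powers and durations come from the measured table above; nothing else is assumed:

```python
# (power_mW, active_seconds_per_day) from the measured ESP32 table
states = {
    "deep_sleep":   (0.037, 23.9 * 3600),  # 10 uA x 3.7 V
    "wake_sample":  (296.0, 5.0),          # 100 samples x 50 ms
    "inference":    (444.0, 0.030),        # 100 inferences x 0.3 ms
    "ble_transmit": (555.0, 20.0),         # 10 transmissions x 2 s
}

def daily_energy_mwh(power_mw, seconds):
    return power_mw * seconds / 3600.0     # mW.s -> mWh

total_mwh = sum(daily_energy_mwh(p, s) for p, s in states.values())
battery_days = 7400.0 / total_mwh          # 2000 mAh x 3.7 V battery
print(round(total_mwh, 2), int(battery_days))  # 4.38 1688
```

Swapping in different transmit schedules (or zeroing the BLE row) reproduces each scenario in the comparison below.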

Optimization Strategy:

Option 1: Reduce BLE Transmissions

  • Only transmit when a gesture is detected (no periodic keep-alive), sending a short event packet — assume ~0.2 s of radio time per event instead of the 2 s summary transfer
  • Assume 20 gestures/day
  • BLE energy: 20 × 0.2 s × 555 mW = 2,220 mWs = 0.62 mWh/day
  • New battery life: 7,400 / (1.3 + 0.62) = 3,854 days ≈ 10.6 years

Option 2: Switch to BLE 5.0 Long Range Mode

  • Lower data rate but 4x longer range, roughly 50% less transmit energy
  • BLE energy: 0.62 / 2 = 0.31 mWh/day
  • New battery life: 7,400 / (1.3 + 0.31) = 4,596 days ≈ 12.6 years

Option 3: Local Inference + Edge Aggregation

  • Run inference on-device (no transmission per gesture)
  • Transmit a summary once per day ("50 shakes, 30 taps, 10 tilts, 5 circles") as a single 2 s transfer
  • BLE energy: 1 × 2 s × 555 mW = 1,110 mWs = 0.31 mWh/day
  • New battery life: 7,400 / (1.3 + 0.31) = 4,596 days ≈ 12.6 years
  • Combining this with Option 2's long-range radio halves the BLE cost again (~0.15 mWh/day, ≈ 13.9 years)

Key Insight: On-device TinyML inference (0.004 mWh/day) consumes roughly 770x LESS energy than the baseline BLE transmission budget (3.083 mWh/day). Local processing dramatically extends battery life by avoiding wireless communication.

Comparison Table:

| Approach | BLE Transmits/Day | Daily Energy | Battery Life |
|---|---|---|---|
| Cloud ML (stream all data) | 100 samples | 30.83 mWh | 240 days (8 months) |
| Edge ML + frequent BLE | 10 summaries | 4.38 mWh | 1,688 days (4.6 years) |
| Edge ML + event-driven BLE | 20 short events | 1.92 mWh | 3,854 days (10.6 years) |
| Edge ML + daily summary | 1 summary | 1.61 mWh | 4,596 days (12.6 years) |

Once the power budget is settled, the remaining hardware constraints map to optimization techniques:

| Model Constraint | Primary Issue | Recommended Technique | Expected Result |
|---|---|---|---|
| Flash storage (512 KB limit) | Model >512 KB | Int8 quantization | 4x compression, fits in flash |
| Flash storage (256 KB limit) | Quantized model still >256 KB | Quantization + 50% pruning | 8x compression total |
| RAM (96 KB limit) | Activation memory >96 KB | Reduce hidden layer size OR use int8 activations | 2-4x RAM reduction |
| Inference latency | >100 ms (too slow for real-time) | Reduce model depth OR use a MobileNet-style architecture | 3-10x speedup |
| Accuracy | <90% from over-quantization or over-pruning | Quantization-aware training OR reduce pruning | +2-5% accuracy |
| Power budget | >50 mW average (battery dies in weeks) | Reduce inference frequency OR optimize wake time | 10-100x power reduction |

Optimization Pipeline (ordered by effort):

Step 1: Post-Training Quantization (easiest, always do this first)

  • Effort: Low (1-2 days with TensorFlow Lite)
  • Result: 4x model size reduction, 1.5-2x speedup
  • Accuracy loss: 0.5-2%
  • When to use: Always (standard practice for TinyML)

Step 2: Pruning (if Step 1 is insufficient)

  • Effort: Medium (1-2 weeks to find the optimal sparsity)
  • Result: 2-4x additional size reduction at 50-70% sparsity
  • Accuracy loss: 1-3% at 50% pruning, 5-10% at 70%
  • When to use: Model still doesn’t fit in flash after quantization

Step 3: Quantization-Aware Training (if accuracy degraded)

  • Effort: Medium-High (2-4 weeks, requires retraining)
  • Result: Same 4x compression as Step 1, but <0.5% accuracy loss
  • When to use: Post-training quantization lost >2% accuracy

Step 4: Knowledge Distillation (if the model is still too large)

  • Effort: High (4-8 weeks, requires training a student model)
  • Result: 5-20x compression (train a tiny student from a large teacher)
  • Accuracy loss: 2-5%
  • When to use: Steps 1-3 combined are still insufficient

Step 5: Architecture Search (last resort)

  • Effort: Very High (months, requires ML expertise)
  • Result: Custom architecture optimized for your constraints
  • Example: MobileNetV3 and EfficientNet-Lite were designed for mobile/edge
  • When to use: Standard techniques are insufficient for your use case

Real-World Example Decision:

Gesture Recognition Model:

  • Initial: 1.5MB float32, 92% accuracy, 450ms inference
  • Target: <256KB flash, <50ms inference, >88% accuracy

Step 1: Int8 Quantization

  • Result: 380KB, 90% accuracy (-2%), 250ms inference
  • Status: Still >256KB, need more ✗

Step 2: Add 60% Pruning

  • Result: 152KB (380 × 0.4), 87% accuracy (-5% total), 180ms inference
  • Status: Fits in flash ✓, accuracy 87% < 88% target ✗

Step 3: Quantization-Aware Training

  • Result: 152KB, 89.5% accuracy (-2.5% total), 180ms inference
  • Status: Meets all targets ✓✓✓

Decision: Stop at Step 3 (QAT + 60% pruning). No need for distillation.
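The size arithmetic in this walkthrough fits in a back-of-envelope estimator (a sketch; real converters add per-layer metadata, which is why the 380 KB above differs slightly from the ideal 1536 / 4 = 384 KB):

```python
def pipeline_size_kb(size_kb, quantize=True, prune_sparsity=0.0):
    """Back-of-envelope model size: int8 quantization ~ /4,
    pruning with sparse storage ~ x(1 - sparsity)."""
    if quantize:
        size_kb /= 4.0
    return size_kb * (1.0 - prune_sparsity)

step1 = pipeline_size_kb(1536)                      # ~384 KB (the text rounds to 380)
step2 = pipeline_size_kb(1536, prune_sparsity=0.6)  # ~154 KB (cf. the text's 380 x 0.4 = 152)
```

Running the estimator before training tells you whether quantization alone can meet the flash budget or whether pruning must be planned in from the start.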

Common Mistake: Quantizing Model Without Representative Data

The Error: Applying post-training quantization using random or synthetic data instead of real sensor data from target deployment.

Real Example:

  • Gesture recognition model trained on clean lab accelerometer data
  • Quantized using synthetic Gaussian noise (random -1 to +1 values)
  • Deployed to real wearable devices
  • Result: 78% accuracy in field vs 92% accuracy in lab (14% degradation)

Why This Happens:

Quantization Calibration requires representative data to compute activation ranges:

import numpy as np

# WRONG: synthetic calibration data
calibration_data = np.random.uniform(-1, 1, size=(100, 12))  # random noise

# Quantization then computes ranges from this data:
#   min_activation = -1.0, max_activation = +1.0
#   scale_factor = 2.0 / 255 = 0.0078

# Real-world gesture data after deployment:
#   Shake gesture produces activations in [-0.9, 0.8]  -> fits the range OK
#   BUT Circle produces [-0.3, 0.3] -> uses only 30% of the quantization range
#   Result: Circle gesture loses precision (quantization error ~3x higher)

The Fix: Use REAL data from target environment for calibration:

# CORRECT: real calibration data from the wearable deployment
real_gestures = load_field_data()  # 100 samples from actual users

# Quantization now computes the ACTUAL ranges:
#   Shake:  min=-0.92, max=0.85
#   Tap:    min=-0.15, max=0.95
#   Tilt:   min= 0.05, max=0.75
#   Circle: min=-0.35, max=0.32

# The scale factor is optimized per layer from the real activation
# distributions, using the full precision range for each gesture type.

Results Comparison:

| Calibration Data | Field Accuracy | Accuracy Loss | Why |
|---|---|---|---|
| Random synthetic | 78% | 14% | Poor range estimation, wasted quantization bits |
| Lab data only | 84% | 8% | Better, but misses field variations (user differences, orientations) |
| Field data (diverse users) | 90% | 2% | Accurate range estimation, optimal quantization |

How to Collect Representative Data:

  1. Deploy pilot devices (10-50 units) to real users for 1-2 weeks
  2. Collect telemetry: Log raw accelerometer data for all detected gestures
  3. Download 500-1,000 samples covering:
    • All gesture types (shake, tap, tilt, circle)
    • Different users (hand sizes, movement styles)
    • Different orientations (wrist up, down, sideways)
    • Different contexts (walking, sitting, standing)
  4. Use for quantization calibration: TensorFlow Lite converter accepts calibration dataset

Cost-Benefit:

  • Pilot deployment: $500 (10 devices) + 2 weeks of time
  • Accuracy improvement: 78% → 90% (12 percentage points)
  • Avoided field failure: prevents a product recall or an emergency firmware patch
  • ROI: a $500 investment prevents $50K+ in returns and reputation damage

Lesson: Quantization is only as good as its calibration data. Always use representative real-world data from your target deployment environment, not synthetic or lab-only data. The 2-week pilot deployment pays for itself many times over in avoided field failures.

26.16 Summary and Key Takeaways

This lab demonstrated the core building blocks of TinyML for IoT applications:

  1. Neural network inference on microcontrollers is practical: a 3-layer fully-connected network runs in hundreds of microseconds on an ESP32, classifying accelerometer gestures in real time.

  2. Quantization (float32 to int8) provides 4x memory compression and approximately 2x inference speedup with typically less than 2% accuracy loss – this is the single most important optimization for deploying ML on microcontrollers.

  3. Pruning removes redundant weights to further reduce model size. Up to 50-70% of weights can typically be pruned with minimal accuracy impact, but beyond this threshold a sharp “pruning cliff” causes severe degradation.

  4. Softmax output converts raw neural network outputs into calibrated probabilities, enabling confidence-based decision-making (e.g., only act when confidence exceeds 70%).

  5. On-device inference eliminates network latency, reduces bandwidth usage, and keeps sensitive sensor data private – three properties essential for production IoT deployments.

| Technique | Compression | Speed Gain | Accuracy Impact | Difficulty |
|---|---|---|---|---|
| Int8 Quantization (PTQ) | 4x | 1.5-2x | 0.5-2% loss | Easy |
| Quantization-Aware Training | 4x | 1.5-2x | <0.5% loss | Medium |
| Pruning (50%) | 2x | Variable | 1-3% loss | Medium |
| Knowledge Distillation | 5-20x | Proportional | 1-5% loss | Hard |
| Combined Pipeline | 10-100x | 3-10x | 2-5% loss | Hard |

26.17 Knowledge Check

Common Pitfalls

Lab gesture recognition models built and tested on the same person’s gestures achieve 95%+ accuracy in testing but drop to 60-70% in production when other users try them. Always collect training data from multiple users with diverse hand sizes, orientations, and speeds. Minimum viable dataset: 10+ users × 20 gestures × 5 repetitions each.

Converting a float32 model to int8 without a calibration dataset causes symmetric quantization to misplace scale factors, producing garbage outputs. Always validate quantized model accuracy against a held-out test set and confirm it stays within 2% of the float32 baseline before deploying to hardware.

Developing with an Arduino IDE-style upload-and-run workflow masks inference timing. On a Cortex-M4 at 80 MHz, a 30KB model takes 15-50ms per inference — fast enough for 20fps gesture detection but too slow for 100fps industrial inspection. Profile latency with a hardware timer before finalizing model architecture.

Setting confidence thresholds (e.g., “accept if score > 0.8”) based on lab testing creates false-negative storms in production where lighting, orientation, or user differences shift score distributions. Use a calibration dataset from target deployment conditions to set thresholds that balance precision and recall for the actual use case.

26.18 What’s Next

| Topic | Chapter | Description |
|---|---|---|
| Model Optimization | Model Optimization for Edge AI | Dive deeper into quantization math, knowledge distillation, and production optimization pipelines |
| Hardware Platforms | Edge AI/ML Hardware Platforms | Compare hardware accelerators (Coral TPU, Intel Movidius, NVIDIA Jetson) for different deployment scenarios |
| Fog/Edge Production | Fog/Edge Production and Review | Explore orchestration platforms and workload distribution across edge-fog-cloud tiers |

Hands-On Practice:

  • Deploy a TinyML model on Arduino or ESP32 using Edge Impulse
  • Build an edge AI application with Coral Edge TPU and TensorFlow Lite
  • Implement a predictive maintenance system using vibration sensors and anomaly detection
  • Compare cloud vs edge inference for a computer vision application (measure latency, bandwidth, cost)

Further Reading: