338 Edge AI and Machine Learning at the Edge
338.1 Overview
Edge AI brings machine learning to IoT devices, enabling real-time inference where data is created rather than sending everything to the cloud. This chapter series covers the techniques, hardware, and deployment patterns that make Edge AI possible.
Why Edge AI? Three critical drivers make edge AI essential for many IoT applications:
- Latency: 10-50ms local inference vs 100-500ms cloud round-trip (essential for safety-critical systems)
- Bandwidth: Process locally, send only alerts (99% reduction in data transfer)
- Privacy: Sensitive data never leaves the device (GDPR/HIPAA compliance by design)
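The bandwidth driver above is easy to sanity-check with back-of-envelope arithmetic. The sketch below compares streaming every frame to the cloud against sending only alert frames after local inference; all figures (frame size, alert rate, the `daily_uplink_mb` helper) are illustrative assumptions, not measurements.

```python
# Back-of-envelope uplink comparison: stream everything vs. edge
# inference with alert-only uplink. Numbers are assumptions for
# illustration, not measurements from a real deployment.

def daily_uplink_mb(frames_per_day: int, frame_kb: float, send_fraction: float) -> float:
    """Data sent upstream per day, in MB, when only a fraction of frames is sent."""
    return frames_per_day * frame_kb * send_fraction / 1024

frames = 24 * 60 * 60  # one 30 KB frame per second for a day (assumed)
cloud = daily_uplink_mb(frames, frame_kb=30.0, send_fraction=1.0)   # stream everything
edge = daily_uplink_mb(frames, frame_kb=30.0, send_fraction=0.01)   # send only ~1% (alerts)

print(f"cloud: {cloud:.0f} MB/day, edge: {edge:.1f} MB/day")
print(f"reduction: {(1 - edge / cloud):.0%}")
```

With these assumed numbers, sending only 1% of frames yields the 99% reduction cited above; the exact figure depends entirely on the alert rate of the workload.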
338.2 Chapter Series
This comprehensive topic is divided into focused chapters:
338.2.1 Edge AI Fundamentals
Why and when to use edge AI
- The business case: bandwidth savings, latency requirements, privacy compliance
- When edge AI is mandatory (the “Four Mandates”)
- Decision framework for edge vs cloud AI
- Real-world cost calculations and ROI analysis
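The decision framework covered in this chapter can be previewed as a simple rule over the three drivers. The `recommend_edge` function and its thresholds below are a hypothetical sketch for illustration; the chapter develops the full framework.

```python
# Hypothetical edge-vs-cloud decision rule based on the three drivers.
# Thresholds are illustrative assumptions, not prescriptive limits.

def recommend_edge(latency_budget_ms: float,
                   daily_data_gb: float,
                   data_is_sensitive: bool,
                   offline_operation_required: bool = False) -> bool:
    """Return True when any single driver makes edge inference the default choice."""
    if latency_budget_ms < 100:        # cloud round-trips typically cost 100-500 ms
        return True
    if daily_data_gb > 1:              # streaming >1 GB/day is costly on cellular links
        return True
    if data_is_sensitive:              # privacy: keep raw data on-device
        return True
    return offline_operation_required  # edge also works without connectivity

print(recommend_edge(latency_budget_ms=20, daily_data_gb=0.1, data_is_sensitive=False))  # True
```

A real framework would weigh cost and accuracy trade-offs rather than short-circuit on the first matching driver, but the structure is the same.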
338.2.2 TinyML: Machine Learning on Microcontrollers
Running ML on ultra-low-power devices
- Hardware platforms: Arduino Nano 33 BLE, ESP32-S3, STM32L4, Nordic nRF52840
- TensorFlow Lite Micro framework and deployment
- Edge Impulse for end-to-end TinyML development
- Memory budgeting and model size constraints
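Memory budgeting is the first feasibility check in any TinyML project. The sketch below, with assumed reserve sizes and a hypothetical `fits_mcu` helper, illustrates the split: model weights live in flash, while intermediate activation tensors (the runtime's tensor arena) live in RAM.

```python
# Rough TinyML memory-budget check: will a model fit on a given MCU?
# Reserve figures are assumptions; real budgets must also account for
# the RTOS, networking stack, and application code.

def fits_mcu(model_kb: float, arena_kb: float, flash_kb: int, ram_kb: int,
             flash_reserved_kb: int = 256, ram_reserved_kb: int = 64) -> bool:
    """Model weights go to flash; the tensor arena (activations) goes to RAM."""
    return (model_kb <= flash_kb - flash_reserved_kb and
            arena_kb <= ram_kb - ram_reserved_kb)

# ESP32-S3-class budget (assumed: 8 MB flash, 512 KB RAM)
print(fits_mcu(model_kb=300, arena_kb=120, flash_kb=8192, ram_kb=512))   # True
# STM32L4-class budget (assumed: 1 MB flash, 128 KB RAM)
print(fits_mcu(model_kb=300, arena_kb=120, flash_kb=1024, ram_kb=128))   # False
```

The second case fails on RAM, not flash: a 120 KB arena does not fit beside a 64 KB application reserve in 128 KB of RAM, which is why arena size, not model size, is often the binding constraint.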
338.2.3 Model Optimization Techniques
Compressing models 10-100x for edge deployment
- Quantization: float32 to int8 (4x size reduction, 2-4x speedup)
- Pruning: removing 70-90% of weights with minimal accuracy loss
- Knowledge distillation: teacher-student training
- Combined optimization pipelines and worked examples
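The 4x size claim for int8 quantization follows directly from storing one byte per weight instead of four. The minimal sketch below shows symmetric int8 quantization of a toy weight list; real toolchains such as TensorFlow Lite also calibrate activation ranges per layer, which this omits.

```python
# Minimal sketch of symmetric int8 quantization for one weight tensor.
# Toy weights only; real post-training quantization also calibrates
# activations layer by layer.

def quantize_int8(weights):
    """Map float weights to int8 values plus a single shared scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    return [max(-127, min(127, round(w / scale))) for w in weights], scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.82, -1.40, 0.03, 2.54, -0.91]   # toy float32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# 4 bytes per float32 weight vs 1 byte per int8 weight: the 4x reduction
print(f"float32: {4 * len(weights)} B, int8: {len(weights)} B")
print("max abs error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

The round-trip error is bounded by half the scale factor, which is why quantization typically costs little accuracy when the weight range is well behaved.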
338.2.4 Hardware Accelerators
Choosing NPUs, GPUs, TPUs, and FPGAs
- Neural Processing Units (NPUs): Coral Edge TPU, Intel Movidius, Apple Neural Engine
- Edge GPUs: NVIDIA Jetson family (Nano, Xavier NX, AGX Orin)
- FPGAs for custom operations and deterministic latency
- Hardware selection decision tree and TOPS vs GFLOPS comparison
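When comparing accelerators, raw TOPS figures are peak numbers; sustained throughput depends on how much of that peak a given model actually uses. The sketch below, with an assumed 30% utilization and made-up model cost, shows the back-of-envelope check the hardware chapter works through in detail.

```python
# Back-of-envelope throughput check: can an accelerator sustain a
# target frame rate for a model of known compute cost? All hardware
# figures are illustrative assumptions, not vendor benchmarks.

def max_fps(model_gops_per_inference: float, accelerator_tops: float,
            utilization: float = 0.3) -> float:
    """1 TOPS = 1000 GOPS; sustained utilization is well below peak (assumed 30%)."""
    effective_gops = accelerator_tops * 1000 * utilization
    return effective_gops / model_gops_per_inference

# MobileNet-class model (~1 GOP/inference, assumed) on a 4 TOPS NPU
print(f"{max_fps(1.0, 4.0):.0f} fps")  # prints 1200 fps at the assumed 30% utilization
```

The same arithmetic run in reverse (required fps times model cost, divided by realistic utilization) gives the minimum TOPS rating to shortlist.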
338.2.5 Edge AI Applications and Deployment Pipeline
Real-world use cases and end-to-end workflows
- Visual inspection for manufacturing quality control
- Predictive maintenance with vibration analysis
- Keyword spotting for always-on voice detection
- Smart parking deployment pipeline (data collection to production)
- Continuous learning and model retraining
338.2.6 Interactive Lab: TinyML Gesture Recognition
Hands-on practice with edge AI concepts
- ESP32-based TinyML gesture recognition simulator
- Neural network forward pass visualization
- Quantization and pruning demonstrations
- Challenge exercises for deeper learning
- Wokwi simulator for browser-based experimentation
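The forward-pass visualization in the lab boils down to repeated dense-layer arithmetic. As a preview, here is a toy 3-input, 2-unit dense layer with ReLU; the weights, biases, and input values are made-up numbers for illustration.

```python
# Toy dense-layer forward pass like the one the lab visualizes:
# y_j = relu(sum_i x_i * w[i][j] + b_j). All values are made up.

def dense_relu(x, weights, biases):
    """One fully connected layer followed by ReLU activation."""
    out = []
    for j in range(len(biases)):
        z = sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        out.append(max(0.0, z))      # ReLU clips negatives to zero
    return out

x = [0.5, -1.0, 0.25]                        # e.g. scaled accelerometer axes
w = [[0.2, -0.4], [0.7, 0.1], [-0.3, 0.5]]   # 3x2 weight matrix (toy values)
b = [0.8, 0.1]
print(dense_relu(x, w, b))
```

Note how one unit's pre-activation is negative and gets clipped to zero, which is exactly the behavior the lab's visualization highlights.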
338.3 Quick Reference
| Topic | Key Concept | Learn More |
|---|---|---|
| When to use Edge AI | Sub-100ms latency, >1GB/day data, privacy requirements | Fundamentals |
| TinyML platforms | ESP32, Arduino Nano 33 BLE, STM32 with 128-512 KB RAM | TinyML |
| Model compression | INT8 quantization = 4x smaller, pruning = 10x smaller | Optimization |
| Hardware selection | NPU for int8, GPU for custom models, FPGA for <10ms latency | Hardware |
| Deployment | Data collection -> training -> quantization -> deploy -> retrain | Applications |
| Hands-on | Gesture recognition on ESP32 with quantization demo | Lab |
338.4 Prerequisites
Before diving into this series, you should be familiar with:
- Edge and Fog Computing - Understanding where edge AI fits in the edge-fog-cloud hierarchy
- Data Analytics and ML Basics - Machine learning fundamentals
- Hardware Characteristics - IoT device resource constraints
338.5 What’s Next
Start with Edge AI Fundamentals to understand when and why to use edge AI, then progress through the series based on your learning goals:
- Quick start: Fundamentals -> TinyML -> Lab
- Deep dive: All chapters in sequence
- Hardware focus: Fundamentals -> Hardware -> Lab
- Production deployment: Fundamentals -> Optimization -> Applications