Problem: Predict bearing failure in industrial motors 2-4 weeks before catastrophic failure, avoiding $100K+ downtime.
315.3.1 Edge AI Solution
Hardware: Vibration sensor (accelerometer) + ESP32 microcontroller ($10)
Model: 1D CNN anomaly detection (30 KB TFLite)
Data: Vibration FFT features (frequency spectrum analysis)
Inference: 20ms per 1-second window, runs continuously
How It Works:
1. Accelerometer samples vibration at 10 kHz (10,000 samples/second)
2. Every 1 second, compute FFT (Fast Fourier Transform) to get frequency spectrum
3. Extract 64 frequency bins as features (e.g., energy in 10-100 Hz, 100-500 Hz, etc.); a sketch of steps 2-3 follows this list
4. CNN model classifies: Normal vs Early Warning vs Critical
5. Normal: Continue monitoring, Critical: Immediate alert to maintenance team
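A minimal sketch of steps 2-3, assuming a 1-second buffer of raw samples is already available; the log-spaced band edges, constants, and function names are illustrative choices, not the deployed firmware's exact feature set.

```python
import numpy as np

SAMPLE_RATE_HZ = 10_000   # step 1: 10 kHz accelerometer sampling
NUM_BINS = 64             # step 3: 64 frequency-band features

def extract_fft_features(window):
    """Turn 1 second of raw vibration samples into 64 band-energy features."""
    spectrum = np.abs(np.fft.rfft(window))                     # step 2: magnitude spectrum
    freqs = np.fft.rfftfreq(len(window), d=1 / SAMPLE_RATE_HZ)

    # Log-spaced band edges from 10 Hz up to the Nyquist limit (5 kHz)
    edges = np.logspace(np.log10(10), np.log10(SAMPLE_RATE_HZ / 2), NUM_BINS + 1)
    features = np.array([
        spectrum[(freqs >= lo) & (freqs < hi)].sum()           # energy per band
        for lo, hi in zip(edges[:-1], edges[1:])
    ])
    return np.log1p(features)                                   # compress dynamic range

# One 1-second window of sensor data -> 64 features -> the 30 KB CNN classifier
window = np.random.randn(SAMPLE_RATE_HZ)   # placeholder for real accelerometer samples
features = extract_fft_features(window)
```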
Training:
- Collect months of normal operation data (healthy baseline)
- Inject synthetic anomalies or use historical failure data
- Train an autoencoder or one-class SVM to detect "anything unusual" relative to the healthy baseline (an autoencoder sketch follows below)
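For the autoencoder route, a hedged sketch is shown below: it trains only on healthy-baseline feature windows and flags anything whose reconstruction error exceeds a 3-sigma threshold. The feature file name, layer sizes, and threshold are assumptions for illustration.

```python
import numpy as np
import tensorflow as tf

# Healthy-baseline feature matrix: N windows x 64 FFT features (hypothetical file)
healthy_features = np.load('healthy_fft_features.npy')

# Tiny autoencoder trained only on normal-operation data
autoencoder = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64,)),
    tf.keras.layers.Dense(16, activation='relu'),    # compress
    tf.keras.layers.Dense(64, activation='linear'),  # reconstruct
])
autoencoder.compile(optimizer='adam', loss='mse')
autoencoder.fit(healthy_features, healthy_features, epochs=50, batch_size=32)

# Anomaly threshold: mean + 3 sigma of reconstruction error on healthy data
recon = autoencoder.predict(healthy_features)
errors = np.mean((healthy_features - recon) ** 2, axis=1)
threshold = errors.mean() + 3 * errors.std()

def is_anomalous(features):
    """Flag any window whose reconstruction error exceeds the healthy baseline."""
    recon = autoencoder.predict(features[np.newaxis, :], verbose=0)[0]
    return float(np.mean((features - recon) ** 2)) > threshold
```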
315.3.2 Vibration Feature Engineering
Normal bearing:
- Peak frequency: 60 Hz (motor rotation speed)
- Harmonics: 120 Hz, 180 Hz (expected)
- Amplitude: stable within ±10%

Failing bearing (early stage):
- New frequencies appear: 237 Hz, 412 Hz (bearing defect frequencies)
- Amplitude increases: +30% in the high-frequency range (>1 kHz)
- Intermittent: not constant, appears under load

Failing bearing (critical):
- Broad-spectrum noise: energy across all frequencies
- Amplitude spikes: +200% peaks
- Constant: always present
The edge AI model detects these patterns in real time and raises an alert 2-4 weeks before failure (a rule-based sanity check of the same signatures is sketched below).
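As a companion to the learned model, the same signatures can be checked with simple band-energy rules against the healthy baseline. The sketch below is illustrative only; the band edges and multipliers are assumptions taken from the pattern descriptions above.

```python
import numpy as np

DEFECT_BANDS_HZ = [(230, 245), (405, 420)]   # around the 237 Hz and 412 Hz defect tones
HIGH_FREQ_BAND_HZ = (1_000, 5_000)           # broadband wear energy (>1 kHz)

def band_energy(spectrum, freqs, lo, hi):
    """Sum spectral magnitude between lo and hi Hz."""
    return spectrum[(freqs >= lo) & (freqs < hi)].sum()

def bearing_status(spectrum, freqs, baseline):
    """Compare current band energies against a dict of healthy-baseline energies."""
    for band in DEFECT_BANDS_HZ:
        if band_energy(spectrum, freqs, *band) > 3 * baseline[band]:
            return "early_warning"           # defect tones emerging
    if band_energy(spectrum, freqs, *HIGH_FREQ_BAND_HZ) > 1.3 * baseline[HIGH_FREQ_BAND_HZ]:
        return "early_warning"               # +30% energy above 1 kHz
    return "normal"
```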
315.3.3 Deployment Results
50 motors monitored continuously:
- False positive rate: 5% (2-3 false alarms per year)
- True positive rate: 95% (detected 19 of 20 actual failures)
- Lead time: Average 18 days before failure
- Cost savings: $2M/year avoided downtime (vs $10K hardware investment)
315.4 Voice and Audio: Keyword Spotting
Problem: Continuously listen for a wake word ("Hey Device") on a battery-powered smart speaker, using <1 mW of power.
315.4.1 Two-Stage Pipeline
1. Always-On Detector (ultra-low-power DSP):
- Runs 18 KB tiny model continuously
- Detects wake word with 85% accuracy, 5% false positive rate
- Power: 0.5-1 mW
2. Verification Stage (main CPU):
- Activates only when Stage 1 detects keyword
- Runs larger 200 KB model for confirmation (95% accuracy)
- Power: 50 mW for 2 seconds (then back to sleep)
Why Two Stages?
- Stage 1 runs 24/7 on tiny power budget
- Stage 2 only activates occasionally (1-2 times/hour) to filter false positives
- Average power: 1 mW + (50 mW x 2 s x 2 activations/hour / 3600 s/hour) ≈ 1.06 mW
- Battery life: a 1000 mAh battery at roughly 1 mA average draw lasts about 1000 hours ≈ 40 days (see the sketch below)
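The duty-cycle arithmetic above is easy to sanity-check. The short sketch below reproduces it, keeping the chapter's simplification that roughly 1 mW of average power corresponds to roughly 1 mA of average draw from the battery.

```python
# Back-of-the-envelope power budget for the two-stage keyword spotter
STAGE1_POWER_MW = 1.0          # always-on detector (upper bound of 0.5-1 mW)
STAGE2_POWER_MW = 50.0         # verification model on the main CPU
STAGE2_DURATION_S = 2.0        # per activation
ACTIVATIONS_PER_HOUR = 2       # false-positive wakeups handled by Stage 2

# Average power = always-on power + duty-cycled Stage 2 power
stage2_duty = STAGE2_DURATION_S * ACTIVATIONS_PER_HOUR / 3600
avg_power_mw = STAGE1_POWER_MW + STAGE2_POWER_MW * stage2_duty
print(f"Average power: {avg_power_mw:.2f} mW")     # ~1.06 mW

# Battery life, treating ~1 mW average power as ~1 mA average draw
BATTERY_MAH = 1000
hours = BATTERY_MAH / avg_power_mw
print(f"Battery life: {hours:.0f} h (~{hours / 24:.0f} days)")  # close to the ~40-day figure
```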
315.4.2 Audio Feature Extraction
Raw audio: 16 kHz sample rate, 16-bit PCM
Window: 1 second = 16,000 samples
Preprocessing:
1. Pre-emphasis filter (boost high frequencies)
2. Frame audio into 25ms windows with 10ms stride (100 frames/second)
3. Compute MFCC (Mel-Frequency Cepstral Coefficients):
- 40 MFCC coefficients per frame
- Captures phonetic content of speech
4. Stack 49 frames (490ms of audio context)
Input tensor: 40 MFCC x 49 frames = 1960 features -> CNN -> [Wake Word Probability]
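A minimal sketch of this front end using librosa, with the frame and coefficient counts matching the numbers above. An actual always-on detector would implement this in fixed-point DSP code rather than Python, and the audio file name here is hypothetical.

```python
import librosa
import numpy as np

SAMPLE_RATE = 16_000
N_MFCC = 40
WIN_LENGTH = int(0.025 * SAMPLE_RATE)   # 25 ms window = 400 samples
HOP_LENGTH = int(0.010 * SAMPLE_RATE)   # 10 ms stride = 160 samples
NUM_FRAMES = 49                          # 490 ms of context

def wake_word_features(audio):
    """Convert raw 16 kHz PCM into a (49, 40) MFCC tensor for the CNN."""
    emphasized = librosa.effects.preemphasis(audio)            # boost high frequencies
    mfcc = librosa.feature.mfcc(
        y=emphasized, sr=SAMPLE_RATE, n_mfcc=N_MFCC,
        n_fft=512, win_length=WIN_LENGTH, hop_length=HOP_LENGTH,
    )                                                          # shape: (40, num_frames)
    mfcc = mfcc[:, :NUM_FRAMES]                                # keep first 49 frames (490 ms)
    return mfcc.T.astype(np.float32)                           # (49 frames, 40 coefficients)

# 1 second of audio -> (49, 40) = 1960 features -> CNN -> wake-word probability
audio, _ = librosa.load('hey_device_sample.wav', sr=SAMPLE_RATE)  # hypothetical clip
features = wake_word_features(audio[:SAMPLE_RATE])
```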
315.5 Building an End-to-End Edge AI Pipeline
Scenario: Deploy a smart parking space detector using computer vision on a solar-powered edge device.
315.5.1 Step 1: Data Collection
Equipment:
- Raspberry Pi 4 + Camera Module v2 (8MP, $25)
- Mount camera above parking lot, capturing 4 spaces per camera
Data Collection Strategy:
- Capture 1 image every 10 seconds for 2 weeks (about 120,000 images); a capture-loop sketch follows this list
- Vary lighting conditions: morning, afternoon, night, rain, snow
- Capture different car types, angles, partial occupancy
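A hedged sketch of the capture loop on the Pi using OpenCV, as referenced in the list above; the camera index, output path, and file-naming scheme are assumptions.

```python
# Capture one frame every 10 seconds for the data-collection phase
import time
from datetime import datetime

import cv2

camera = cv2.VideoCapture(0)      # Camera Module, index 0 assumed
CAPTURE_INTERVAL_S = 10
OUTPUT_DIR = "/data/parking"      # hypothetical storage path

while True:
    ret, frame = camera.read()
    if ret:
        # Timestamped filenames make it easy to slice by time of day and weather later
        name = datetime.now().strftime("capture_%Y%m%d_%H%M%S.jpg")
        cv2.imwrite(f"{OUTPUT_DIR}/{name}", frame)
    time.sleep(CAPTURE_INTERVAL_S)
```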
Labeling:
- Use Label Studio or Roboflow to draw bounding boxes around cars
- Classes: [Empty, Occupied]
- 5,000 images manually labeled, 115,000 automatically using pre-trained model + manual review
315.5.2 Step 2: Model Training (Cloud)
```python
# Transfer learning with MobileNetV2
import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2

base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False  # Freeze pre-trained weights

model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(2, activation='softmax')  # Empty vs Occupied
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train for 20 epochs on 5,000 labeled images
history = model.fit(train_dataset, epochs=20, validation_data=val_dataset)
# Result: 97.5% validation accuracy
```
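The training script above assumes train_dataset and val_dataset already exist. A minimal sketch of building them with tf.keras.utils.image_dataset_from_directory follows; the parking_data/ layout with empty/ and occupied/ subfolders is an assumption, and the MobileNetV2 preprocessing step is a standard addition rather than part of the original script.

```python
# Build tf.data pipelines from a folder of labeled crops:
#   parking_data/empty/*.jpg and parking_data/occupied/*.jpg  (assumed layout)
import tensorflow as tf

train_dataset, val_dataset = tf.keras.utils.image_dataset_from_directory(
    'parking_data',
    labels='inferred',
    label_mode='categorical',        # matches the categorical_crossentropy loss
    image_size=(224, 224),
    batch_size=32,
    validation_split=0.2,
    subset='both',                   # returns (train, validation) datasets
    seed=42,
)

# MobileNetV2 expects pixels scaled to [-1, 1]
preprocess = tf.keras.applications.mobilenet_v2.preprocess_input
train_dataset = train_dataset.map(lambda x, y: (preprocess(x), y)).prefetch(tf.data.AUTOTUNE)
val_dataset = val_dataset.map(lambda x, y: (preprocess(x), y)).prefetch(tf.data.AUTOTUNE)
```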
315.5.3 Step 3: Quantization and Optimization
```python
# Post-training quantization to int8
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Provide representative dataset for calibration
def representative_dataset():
    for i in range(100):
        yield [train_images[i:i+1]]  # Sample calibration data

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

# Convert
tflite_model = converter.convert()

# Save quantized model
with open('parking_detector_int8.tflite', 'wb') as f:
    f.write(tflite_model)

# Output: Original ~14 MB -> Quantized 3.8 MB (3.7x smaller)
```
315.5.4 Step 4: Deploy to Edge Device
```python
# Raspberry Pi inference script
import time

import cv2
import numpy as np
import tflite_runtime.interpreter as tflite

# Load quantized model
interpreter = tflite.Interpreter(model_path="parking_detector_int8.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Define parking space ROIs (regions of interest)
spaces = [
    {"id": "A1", "bbox": (100, 200, 300, 400)},
    {"id": "A2", "bbox": (350, 200, 550, 400)},
    {"id": "A3", "bbox": (600, 200, 800, 400)},
    {"id": "A4", "bbox": (850, 200, 1050, 400)},
]

def check_parking_space(image, bbox):
    """Run inference on a cropped parking space."""
    x1, y1, x2, y2 = bbox
    crop = image[y1:y2, x1:x2]

    # Preprocess
    resized = cv2.resize(crop, (224, 224))
    input_data = np.expand_dims(resized, axis=0).astype(np.uint8)

    # Inference
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    output = interpreter.get_tensor(output_details[0]['index'])[0]

    # Classes: [Empty, Occupied]; dequantize the uint8 score back to a [0, 1] probability
    scale, zero_point = output_details[0]['quantization']
    confidence = (output[1] - zero_point) * scale  # Occupied probability
    return confidence > 0.7  # Threshold

# Camera setup (index 0 assumed for the Pi Camera Module)
camera = cv2.VideoCapture(0)

# Main loop
while True:
    ret, frame = camera.read()
    if not ret:
        continue

    # Check each parking space
    occupancy = {}
    for space in spaces:
        occupancy[space["id"]] = check_parking_space(frame, space["bbox"])

    # Update cloud dashboard (only when status changes)
    # Reduces bandwidth: 4 spaces x 10 bytes/status = 40 bytes vs a 500 KB image
    send_status_update(occupancy)  # defined elsewhere in the deployment code

    # Sleep 10 seconds (no need for 30 fps monitoring)
    time.sleep(10)
```
315.5.5 Step 5: Continuous Monitoring and Retraining
Production Monitoring:
- Log inference confidence scores to detect model drift (a logging sketch follows this list)
- Sample 1% of images for manual review (quality assurance)
- Track false positives (marked occupied but actually empty) and false negatives
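A hedged sketch of the confidence logging described above; the CSV log format and the "uncertain" probability band are illustrative choices.

```python
import csv
import time

UNCERTAIN_BAND = (0.4, 0.7)   # occupied-probability range worth sending for manual review

def log_prediction(space_id, occupied_prob, log_path="inference_log.csv"):
    """Append every prediction to a CSV log; flag uncertain ones as retraining candidates."""
    uncertain = UNCERTAIN_BAND[0] <= occupied_prob <= UNCERTAIN_BAND[1]
    with open(log_path, "a", newline="") as f:
        csv.writer(f).writerow([time.time(), space_id, f"{occupied_prob:.3f}", uncertain])
    return uncertain   # caller can save the image crop for labeling when True
```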
Model Retraining (every 3 months):
- Collect edge cases from production logs (e.g., motorcycles, trucks, snow-covered)
- Add 500-1000 new labeled images to training set
- Retrain model with expanded dataset
- A/B test: Deploy to 10% of cameras, compare accuracy vs old model
- Full rollout if accuracy improves by >1%
Result:
- Initial accuracy: 97.5%
- After 6 months of continuous learning: 98.9%
- False positive rate: <2%
315.6 Knowledge Check
Question 1. A smart parking deployment runs MobileNetV2 on a Raspberry Pi 4. Initial accuracy is 97.5%, but after 6 months it drops to 89% due to new car models, weather conditions, and lighting changes. What is the best strategy?
- Replace the Pi 4 with a Jetson Nano for more compute power
- Implement continuous learning: sample 1% of production images, retrain quarterly, and A/B test before rollout
- Switch to cloud AI for more powerful models
- Increase the model to ResNet-50 (100 MB) for better generalization

Answer: continuous learning. The accuracy drop is data drift, not a compute shortfall: sample production images, add them to the training set, retrain, A/B test, then roll out fully (97.5% -> 98.9% after 6 months in this deployment). More compute, cloud inference, or a larger model does not fix a data-distribution shift, and ResNet-50 exceeds the Pi 4's practical constraints.
Question 2. Compare edge AI architectures for smart-factory defect detection at 100 items/minute. A: Jetson Nano, local only (10 ms, $99). B: ESP32 + cloud (250 ms total, $8 plus cloud costs). C: Jetson Nano hybrid, local for 90% of items, cloud for uncertain cases (15 ms average). Which is optimal?
- Architecture A (pure edge): most cost-effective, no cloud costs
- Architecture B (pure cloud): centralized intelligence, faster model updates
- Architecture C (hybrid edge-cloud): combines edge speed with cloud learning for continuous improvement
- Architecture A initially, migrating to C after 6 months

Answer: Architecture C. The hybrid keeps average latency low (90% local at 10 ms plus 10% cloud at roughly 200 ms, about 15 ms on average), sends only 10% of traffic to the cloud, feeds uncertain edge cases back into retraining, and still works offline for most items. Pure cloud blows the 600 ms per-item budget and adds roughly 144 GB/day ($14.4/day) in bandwidth; pure edge, now or with a delayed migration, misses the cloud-assisted learning loop from day one.
315.7 Summary
Key Applications:
| Application | Hardware | Model Size | Latency | ROI |
|---|---|---|---|---|
| Visual Inspection | Jetson/Coral | 3.5 MB | 5-10 ms | Replaces a $60K/year inspector |
| Predictive Maintenance | ESP32 | 30 KB | 20 ms | $2M/year savings |
| Keyword Spotting | Low-power DSP | 18 KB | 20 ms | 40-day battery life |
Pipeline Best Practices:
1. Data Collection: Capture diverse conditions (lighting, weather, variations)
2. Transfer Learning: Start with a pretrained model (MobileNetV2, EfficientNet)
3. Quantization: INT8 for roughly 4x size reduction and a speedup
4. Continuous Learning: Sample production data, retrain quarterly
5. Hybrid Architecture: Handle ~90% of cases locally, send uncertain cases to the cloud
315.8 What's Next
Now that you understand edge AI applications and deployment, continue to: