```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#ecf0f1'}}}%%
flowchart LR
S1[1. Pre-emphasis<br/>Boost high freq] --> S2[2. Framing<br/>25ms windows]
S2 --> S3[3. Windowing<br/>Hamming]
S3 --> S4[4. FFT<br/>Spectrum]
S4 --> S5[5. Mel Filter<br/>26 bands]
S5 --> S6[6. Log<br/>Compress]
S6 --> S7[7. DCT<br/>Decorrelate]
S7 --> S8[8. MFCC<br/>13 coefficients]
style S1 fill:#2C3E50,color:#fff
style S4 fill:#16A085,color:#fff
style S8 fill:#27AE60,color:#fff
```
1265 Real-World Sensor Fusion Applications
Learning Objectives
After completing this chapter, you will be able to:
- Understand how smartphones use sensor fusion for screen rotation
- Apply feature extraction for activity recognition
- Implement audio feature extraction (MFCC) for IoT
- Design multi-sensor fusion for autonomous vehicles
1265.1 Smartphone Screen Rotation
When you rotate your phone from portrait to landscape, three sensors work together to make the screen flip smoothly.
1265.1.1 The Sensors and Their Raw Data
| Sensor | What It Measures | Portrait | Landscape | Weakness |
|---|---|---|---|---|
| Accelerometer | Gravity direction (m/s2) | X: 0, Y: 9.8, Z: 0 | X: 9.8, Y: 0, Z: 0 | Noisy (+/-0.5 m/s2) |
| Gyroscope | Rotation rate (deg/s) | 0 deg/s | 90 deg/s during rotation | Drifts (+0.1 deg/s) |
| Magnetometer | Magnetic field toward north (µT) | X: 20, Y: 40 | X: 40, Y: -20 | Metal interference |
1265.1.2 Without Fusion (Single Sensor Fails)
Accelerometer Only:
- Reading: “Gravity points right -> 90 deg rotation”
- Problem: Table bump causes vibration -> screen flickers!
- Error: +/-15 deg jitter from hand shaking
Gyroscope Only:
- Reading: “Rotating at 90 deg/s for 1 second -> 90 deg total”
- Problem: Drift accumulates. After 10 minutes, gyro thinks phone rotated 60 deg!
- Error: +6 deg/min drift
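A quick numeric sketch (using the numbers above: a constant 0.1 deg/s bias, integrated for 10 minutes) shows how a tiny bias becomes a large heading error:

```python
# Integrating a small constant gyro bias over time.
# Numbers taken from the text: 0.1 deg/s bias (~6 deg/min), 200 Hz sampling.
bias_deg_per_s = 0.1          # constant gyro bias
dt = 0.005                    # 200 Hz sample period
samples = int(10 * 60 / dt)   # 10 minutes of samples

heading = 0.0
for _ in range(samples):
    heading += bias_deg_per_s * dt   # integration accumulates the bias every step

print(f"Accumulated drift after 10 min: {heading:.1f} deg")  # ~60 deg
```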
1265.1.3 With Fusion (Complementary Filter)
orientation = 0.98 * (previous + gyro * dt) + 0.02 * accel_orientation
The gyroscope term dominates the short-term response (98%), while the small accelerometer term (2%) continuously pulls the estimate back toward the drift-free gravity reading.
| Time (s) | Gyro (deg/s) | Gyro Integrated | Accel Reading | Fused Result |
|---|---|---|---|---|
| 0.0 | 0 | 0 deg | 0 deg | 0 deg |
| 0.2 | 90 | 18 deg | 15 deg (noisy) | 17.9 deg |
| 0.4 | 90 | 36 deg | 38 deg (noisy) | 36.0 deg |
| 1.0 | 0 | 90 deg | 92 deg (noisy) | 90.0 deg |
| 10.0 | 0 | 90.6 deg (drift!) | 90 deg | 90.0 deg (corrected) |
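A minimal Python sketch of this filter (function and variable names are illustrative, not from any specific library):

```python
ALPHA = 0.98  # weight on the gyro path; (1 - ALPHA) corrects with the accelerometer

def complementary_filter(prev_deg, gyro_deg_per_s, accel_deg, dt):
    """One fusion step: integrate the gyro, then pull gently toward the accel angle."""
    return ALPHA * (prev_deg + gyro_deg_per_s * dt) + (1 - ALPHA) * accel_deg

# Reproduce the first two rows of the table above (dt = 0.2 s):
angle = complementary_filter(0.0, 90.0, 15.0, 0.2)    # -> ~17.9 deg
angle = complementary_filter(angle, 90.0, 38.0, 0.2)  # -> ~36.0 deg
```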
1265.1.4 Results
| Metric | Accel Only | Gyro Only | Magnetometer Only | Fused |
|---|---|---|---|---|
| Accuracy | +/-15 deg | +/-5 deg | +/-20 deg | +/-1 deg |
| Latency | 50 ms | 10 ms | 100 ms | 15 ms |
| Drift over 10 min | 0 deg | 60 deg | 10 deg | 0.5 deg |
Real implementation (Android SensorManager):
- Reads accelerometer + magnetometer at 50 Hz
- Reads gyroscope at 200 Hz
- Fuses using Extended Kalman Filter
- Result: Smooth, accurate within 1 deg, no drift
1265.2 Activity Recognition
Sensor fusion for activity recognition combines accelerometer magnitude (motion intensity) with gyroscope data (rotation patterns).
Key features from fused sensors:
- Accelerometer: Mean, std dev, max magnitude (motion intensity)
- Gyroscope: Mean, std dev of rotation (turning/spinning)
- Cross-sensor: Correlation between accel and gyro (coordination)
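A minimal sketch of window-level feature extraction (assuming NumPy arrays of per-sample accelerometer and gyroscope magnitudes for one analysis window):

```python
import numpy as np

def window_features(accel_mag, gyro_mag):
    """Compute per-window features from accelerometer and gyroscope magnitude arrays.

    accel_mag, gyro_mag: 1-D arrays covering one analysis window (e.g. 2 s at 50 Hz).
    """
    return {
        "accel_mean": float(np.mean(accel_mag)),   # motion intensity
        "accel_std":  float(np.std(accel_mag)),
        "accel_max":  float(np.max(accel_mag)),
        "gyro_mean":  float(np.mean(gyro_mag)),    # turning / spinning
        "gyro_std":   float(np.std(gyro_mag)),
        # Cross-sensor coordination: correlation between motion and rotation
        "accel_gyro_corr": float(np.corrcoef(accel_mag, gyro_mag)[0, 1]),
    }
```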
Simple activity classification (threshold-based):
| Activity | Accel Variance | Rotation |
|---|---|---|
| Stationary | <1.0 m/s2 | <0.1 rad/s |
| Walking | 1-2 m/s2 | Low-moderate |
| Running | 2-4 m/s2 | Moderate |
| High activity | >4 m/s2 | High |
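These rules translate directly into code. A sketch follows, treating "Accel Variance" as the variance of the accelerometer magnitude over the window (an assumption about the table's intent):

```python
import numpy as np

def classify_activity(accel_mag, gyro_mag):
    """Map one window of sensor magnitudes to an activity label using the thresholds above."""
    accel_var = float(np.var(accel_mag))          # variance of accel magnitude
    rotation = float(np.mean(np.abs(gyro_mag)))   # mean rotation rate, rad/s

    if accel_var < 1.0 and rotation < 0.1:
        return "stationary"
    elif accel_var < 2.0:
        return "walking"      # low variance with noticeable rotation also lands here
    elif accel_var < 4.0:
        return "running"
    else:
        return "high activity"
```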
Production systems: Replace thresholds with ML models (Random Forest, LSTM) trained on labeled data.
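As an illustration of that upgrade path (with synthetic placeholder data standing in for labeled feature windows built with something like `window_features()` above), a scikit-learn Random Forest can be trained as follows:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Placeholder data: in practice X comes from labeled sensor windows, y from annotations.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))        # 500 windows x 6 features
y = rng.integers(0, 4, size=500)     # 4 activity classes

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("Held-out accuracy:", clf.score(X_test, y_test))
```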
1265.3 Audio Feature Extraction: MFCC Pipeline
Many IoT applications process audio: voice assistants, acoustic monitoring, wildlife recognition, and security systems. The standard feature representation is Mel-Frequency Cepstral Coefficients (MFCC).
1265.3.1 The 8-Stage MFCC Pipeline
Stage 1: Pre-emphasis
Boost high frequencies to flatten spectrum:
y[n] = x[n] - 0.97 * x[n-1]
Stage 2: Framing
Split into 25ms overlapping windows (400 samples at 16kHz)
Stage 3: Windowing
Apply Hamming window to reduce spectral leakage
Stage 4: FFT
Compute power spectrum via Fast Fourier Transform
Stage 5: Mel Filter Bank
Apply triangular filters spaced on mel scale (mimics human hearing)
Stage 6: Log Compression
Take logarithm (mimics human loudness perception)
Stage 7: DCT
Discrete Cosine Transform decorrelates features
Stage 8: Output
First 13 MFCC coefficients (compact representation)
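A compact NumPy/SciPy sketch of the full pipeline follows. Parameter choices mirror the stages above; this is an illustrative implementation, not a drop-in replacement for production libraries such as librosa:

```python
import numpy as np
from scipy.fftpack import dct

def mfcc(signal, sr=16000, frame_len=0.025, frame_step=0.010,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix following the 8 stages above."""
    x = np.asarray(signal, dtype=np.float64)

    # Stage 1: pre-emphasis, y[n] = x[n] - 0.97 * x[n-1]
    x = np.append(x[0], x[1:] - 0.97 * x[:-1])

    # Stage 2: framing into 25 ms windows with 10 ms step (400 / 160 samples at 16 kHz)
    flen, fstep = int(frame_len * sr), int(frame_step * sr)
    n_frames = 1 + (len(x) - flen) // fstep          # assumes len(x) >= flen
    frames = np.stack([x[i * fstep : i * fstep + flen] for i in range(n_frames)])

    # Stage 3: Hamming window to reduce spectral leakage
    frames *= np.hamming(flen)

    # Stage 4: power spectrum via FFT
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # Stage 5: triangular mel filter bank (26 bands)
    hz2mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel2hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz2mel(0), hz2mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel2hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        fbank[i - 1, bins[i - 1]:bins[i]] = np.linspace(0, 1, bins[i] - bins[i - 1], endpoint=False)
        fbank[i - 1, bins[i]:bins[i + 1]] = np.linspace(1, 0, bins[i + 1] - bins[i], endpoint=False)

    # Stage 6: log compression of filter-bank energies
    log_mel = np.log(power @ fbank.T + 1e-10)

    # Stages 7-8: DCT to decorrelate, keep the first 13 coefficients
    return dct(log_mel, type=2, axis=1, norm='ortho')[:, :n_coeffs]

# Example: one second of audio at 16 kHz -> roughly 98 frames x 13 coefficients
features = mfcc(np.random.randn(16000))
```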
1265.3.2 Why MFCC for IoT
- Compact: 13 values per 25 ms frame (vs. 400 raw samples at 16 kHz)
- Robust: Speaker-independent, noise-tolerant
- Efficient: Edge devices can compute in real-time
- Proven: Standard for speech recognition, acoustic event detection
1265.4 Autonomous Vehicle Sensor Fusion
1265.5 Worked Example: Multi-Sensor Obstacle Detection
Scenario: Autonomous vehicle detects pedestrian 45m ahead using three sensors.
Sensors:
- Camera: 30 Hz, 0-150 m, 0.1 m resolution, fails in darkness
- LiDAR: 10 Hz, 0-200 m, 0.03 m resolution, works in darkness but degraded by fog and heavy rain
- Radar: 20 Hz, 0-250 m, 0.5 m resolution, provides velocity, works in all weather
Raw Measurements:
| Sensor | Position [x, y] (m) | Uncertainty (x / y) | Notes |
|---|---|---|---|
| Camera | [44.8, 2.3] | 1.0m/0.45m | Color, classification |
| LiDAR | [45.1, 1.9] | 0.3m/0.2m | Precise range |
| Radar | [45.5, 2.1] | 0.5m/0.6m | Velocity: -1.2 m/s |
Fusion Process:
- Association: Match detections across sensors (spatial proximity)
- Sequential Kalman updates: Camera -> LiDAR -> Radar
- Covariance-weighted fusion: Each sensor contributes by reliability
Result:
| Stage | Position [x, y] (m) | Uncertainty (m) |
|---|---|---|
| Camera only | [44.8, 2.3] | 0.36m |
| + LiDAR | [45.04, 1.96] | 0.12m |
| + Radar | [45.04, 1.98] | 0.04m |
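The weighting behind this shrinking uncertainty can be sketched with a simplified inverse-variance (scalar Kalman) update for a static position. The table's exact numbers come from a full Kalman filter with a motion model, so this sketch illustrates the covariance weighting rather than reproducing them precisely; the per-axis standard deviations are a reading of the measurement table above:

```python
import numpy as np

def fuse_1d(mean_a, var_a, mean_b, var_b):
    """Covariance-weighted (inverse-variance) fusion of two estimates, element-wise.

    This is the scalar Kalman measurement update for a static state:
    the more certain estimate receives the larger weight.
    """
    k = var_a / (var_a + var_b)          # Kalman gain
    mean = mean_a + k * (mean_b - mean_a)
    var = (1 - k) * var_a
    return mean, var

# Per-axis means and variances taken from the measurement table above
camera = (np.array([44.8, 2.3]), np.array([1.0, 0.45]) ** 2)
lidar  = (np.array([45.1, 1.9]), np.array([0.3, 0.2]) ** 2)
radar  = (np.array([45.5, 2.1]), np.array([0.5, 0.6]) ** 2)

mean, var = camera
for meas_mean, meas_var in (lidar, radar):   # sequential updates: camera -> LiDAR -> radar
    mean, var = fuse_1d(mean, var, meas_mean, meas_var)

print("Fused position:", mean, "std dev:", np.sqrt(var))
```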
Key Decisions:
- Sequential Kalman updates (numerically stable)
- Covariance-based weighting (each sensor contributes by reliability)
- Dynamic confidence adjustment (reduce camera weight in darkness)
- Graceful degradation (system works if sensors fail)
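The last two decisions can be sketched on top of the `fuse_1d()` helper above; sensor names, the darkness flag, and the inflation factor are illustrative choices, not a specific production API:

```python
def fuse_detections(detections, dark=False):
    """Sequentially fuse whatever sensors reported, inflating camera uncertainty in darkness.

    detections: dict of sensor name -> (mean, variance) arrays, or None if the sensor failed.
    Uses the fuse_1d() sketch above.
    """
    estimate = None
    for name, det in detections.items():
        if det is None:                      # graceful degradation: skip failed sensors
            continue
        mean, var = det
        if name == "camera" and dark:        # dynamic confidence: trust the camera less at night
            var = var * 4.0
        estimate = (mean, var) if estimate is None else fuse_1d(*estimate, mean, var)
    return estimate
```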
1265.6 CMU Sensing Systems Research Examples
1265.6.1 Multi-Sensor Wearable Activity Recognition
Wearable systems combine multiple sensor channels for robust activity classification:
- Proximity sensors: Detect face touching
- Gyroscopes: Capture rotation during eating, drinking
- Accelerometers: Motion intensity patterns
- Audio spectrograms: Environmental context
Key Insight: Each activity produces a unique “fingerprint” across sensor channels. No single sensor could reliably distinguish all activities alone.
1265.6.2 Smart Glasses Platform
Multi-modal sensor integration in wearable form factor:
- Camera (vision)
- Microphone (audio)
- Proximity (gesture)
- IMU (motion)
By fusing all modalities, the system understands user context far better than any single sensor. This is the hardware foundation enabling sensor fusion algorithms.
1265.7 Summary
Real-world sensor fusion applications demonstrate the power of combining multiple sensors:
- Smartphone rotation: Gyro + accelerometer + magnetometer for 1 deg accuracy
- Activity recognition: Feature-level fusion of motion sensors
- Audio processing: MFCC pipeline for compact audio features
- Autonomous vehicles: Multi-sensor fusion for safe navigation
1265.8 What’s Next
- Best Practices - Common pitfalls to avoid
- Exercises - Practice implementing fusion systems