Apply Nyquist Theorem: Calculate appropriate sampling rates for different sensor types
Implement Data Reduction Techniques: Use aggregation, compression, event-based reporting, and delta encoding
Select Compression Algorithms: Choose optimal algorithms based on data type and edge device constraints
Avoid Common Pitfalls: Prevent sampling aliasing, buffer overflow, and rate mismatch errors
In 60 Seconds
Adaptive sampling and on-device compression are the two most powerful techniques for reducing IoT data volume at the edge — adaptive sampling matches collection frequency to signal dynamics, while compression exploits redundancy to shrink the data that is transmitted. Together they can reduce bandwidth requirements by 90–99% without meaningful loss of analytical value.
For Beginners: Edge Sampling and Compression
Edge sampling and compression reduce the amount of data IoT devices need to transmit. Think of sending a friend the highlights of a movie instead of the entire film. By transmitting only important changes or compressed summaries, devices save battery power and network bandwidth while preserving the information that matters most.
47.2 Prerequisites
Before diving into this chapter, you should be familiar with:
Basic signal processing concepts: Familiarity with frequency and time-domain representations
Python programming: Code examples use Python for data processing
Minimum Viable Understanding: Data Reduction at the Edge
Core Concept: Transform raw sensor data into actionable information at the source - send summaries, statistics, and alerts rather than every reading.
Why It Matters: Transmitting data costs 10-100x more energy than processing it locally. A sensor sending 1000 samples/minute to the cloud uses 100x more bandwidth than one sending minute-averages - with identical analytical value for most applications.
Key Takeaway: Apply the 90% rule - if 90% of your data is “normal” readings that will never be analyzed individually, aggregate them locally. Send statistical summaries (min, max, mean, std) at lower frequency, and only transmit raw data when anomalies are detected. This extends battery life from days to years.
47.3 Nyquist Sampling Rate
Time: ~8 min | Difficulty: Intermediate | Reference: P10.C08.U03
Key Concepts
Adaptive sampling: Dynamically adjusting the sensor sampling rate based on signal variance or event rate — increasing frequency when the signal changes rapidly and decreasing it during quiet periods.
Nyquist-Shannon theorem: The fundamental sampling principle stating that a signal must be sampled at least twice its highest frequency component to be reconstructed accurately — the minimum sampling rate for any IoT sensor.
Delta encoding: A compression technique transmitting only the change between consecutive readings rather than absolute values, highly effective for slowly varying sensors.
Run-length encoding (RLE): Compressing sequences of identical values into a count-value pair — very effective for binary event streams or sensors with frequent identical readings.
Lossless compression: Compression that allows perfect reconstruction of the original data — required for financial billing data, safety-critical readings, and regulatory compliance.
Lossy compression: Compression that discards some information to achieve higher compression ratios — acceptable for analytics workloads where small accuracy loss is tolerable.
To accurately capture a signal, the sampling rate must be at least twice the highest frequency component of interest:
\[f_{sample} \geq 2 \times f_{max}\]
Practical examples:
| Signal Type | Max Frequency | Min Sample Rate | Typical Rate |
|---|---|---|---|
| Temperature | 0.1 Hz (slow changes) | 0.2 Hz | 1 sample/minute |
| Vibration | 500 Hz | 1 kHz | 2-5 kHz |
| Audio | 20 kHz | 40 kHz | 44.1 kHz |
| Motion (IMU) | 50 Hz | 100 Hz | 100-200 Hz |
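The table's "Min Sample Rate" column follows directly from the Nyquist inequality. A minimal helper (the function name and the 2.5x default safety margin are illustrative choices, not from a standard library):

```python
def min_sample_rate(f_max_hz: float, margin: float = 2.0) -> float:
    """Return the minimum sampling rate for a signal whose highest
    frequency component is f_max_hz. margin=2.0 is the Nyquist bound;
    2.5-3.0 leaves headroom for anti-aliasing filter roll-off."""
    return margin * f_max_hz

# Reproduce the "Min Sample Rate" column above
for sensor, f_max in [("Temperature", 0.1), ("Vibration", 500.0),
                      ("Audio", 20_000.0), ("Motion (IMU)", 50.0)]:
    print(f"{sensor}: sample at >= {min_sample_rate(f_max):g} Hz")
```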
Common Pitfall: Sampling Aliasing
The mistake: Sampling signals below the Nyquist rate (2x the highest frequency), causing phantom patterns (aliasing) that don’t exist in the original signal.
Symptoms:
Motor speed readings fluctuate despite constant RPM
Temperature data shows oscillations that don’t match physical reality
Frequency analysis reveals false peaks at wrong frequencies
Bearing fault detection produces false positives
Why it happens: Engineers apply “common sense” sampling rates without frequency analysis. Underestimating signal bandwidth - a 60 Hz motor generates harmonics at 120 Hz, 180 Hz, etc. Cost pressure drives lower sampling rates. Copy-pasting configurations between different sensor types.
The fix: Always sample at >2x the highest frequency of interest:
| Signal Type | Max Frequency | Minimum Sample Rate | Recommended |
|---|---|---|---|
| Room temperature | 0.01 Hz | 0.02 Hz | 1/minute |
| HVAC response | 0.1 Hz | 0.2 Hz | 1/second |
| Motor vibration | 500 Hz | 1 kHz | 2.5 kHz |
| Bearing analysis | 5 kHz | 10 kHz | 25 kHz |
Prevention: Perform frequency analysis on representative signals before deployment. Use anti-aliasing filters (low-pass hardware filters) before the ADC. When in doubt, oversample then downsample digitally with proper filtering.
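The aliasing failure mode is easy to reproduce numerically. In this sketch (illustrative, assuming NumPy is available), an 80 Hz cosine sampled at 100 Hz produces exactly the same sample sequence as a 20 Hz cosine - the phantom pattern this pitfall describes:

```python
import numpy as np

fs = 100.0               # sampling rate (Hz); Nyquist limit is 50 Hz
t = np.arange(200) / fs  # 2 seconds of sample instants

f_true = 80.0            # real signal, above the Nyquist limit
f_alias = abs(f_true - fs)  # predicted alias: |80 - 100| = 20 Hz

samples_true = np.cos(2 * np.pi * f_true * t)
samples_alias = np.cos(2 * np.pi * f_alias * t)

# Sample-for-sample identical: the 80 Hz signal is indistinguishable
# from a phantom 20 Hz oscillation at this sampling rate.
print(np.allclose(samples_true, samples_alias))  # True
```

No amount of post-processing can undo this - the 80 Hz information is gone, which is why anti-aliasing must happen in hardware before the ADC.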
Putting Numbers to It
What sampling rate is needed to detect bearing faults in a 1800 RPM motor?
Given:
Motor speed: 1800 RPM = 30 Hz (revolutions per second)
Bearing has 8 rolling elements
Bearing fault frequency: each of the 8 rolling elements strikes the defect once per revolution = \(8 \times 30 = 240\) Hz
Harmonics: 2nd harmonic at 480 Hz, 3rd at 720 Hz
Highest frequency of interest: 3rd harmonic = 720 Hz
Answer: minimum sampling rate = \(2 \times 720 = 1{,}440\) Hz; in practice, sample at 2-2.5 kHz to leave headroom for anti-aliasing filter roll-off
47.4 Data Reduction Techniques
Time: ~12 min | Difficulty: Intermediate | Reference: P10.C08.U03b
Before transmitting data to the cloud, edge devices can apply several reduction strategies:
Aggregation: Compute statistics over time windows (mean, min, max, variance)
Compression: Apply lossless (ZIP) or lossy (threshold-based) compression
Event-based reporting: Only transmit when values exceed thresholds
Delta encoding: Send only changes from previous values
```python
# Example: Edge aggregation for temperature sensor
class EdgeAggregator:
    def __init__(self, window_size=60):  # 60 samples = 1 minute at 1 Hz
        self.window_size = window_size
        self.buffer = []

    def add_sample(self, value):
        self.buffer.append(value)
        if len(self.buffer) >= self.window_size:
            return self.compute_summary()
        return None

    def compute_summary(self):
        summary = {
            "min": min(self.buffer),
            "max": max(self.buffer),
            "mean": sum(self.buffer) / len(self.buffer),
            "samples": len(self.buffer),
        }
        self.buffer = []
        return summary

# Usage: Send 1 summary per minute instead of 60 raw samples
aggregator = EdgeAggregator(window_size=60)
for temp_reading in sensor_stream:
    summary = aggregator.add_sample(temp_reading)
    if summary:
        transmit_to_cloud(summary)  # 60x bandwidth reduction
```
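Delta encoding, the fourth strategy listed above, is similarly compact for slowly varying sensors. A minimal sketch - the function names and the 0.01-unit quantization are illustrative choices, not a standard API:

```python
def delta_encode(values, scale=100):
    """Return (first_value, deltas), with readings quantized to 1/scale
    units so successive differences become small integers."""
    q = [round(v * scale) for v in values]
    return q[0], [b - a for a, b in zip(q, q[1:])]

def delta_decode(first, deltas, scale=100):
    """Rebuild the original readings from the first value and deltas."""
    out = [first]
    for d in deltas:
        out.append(out[-1] + d)
    return [v / scale for v in out]

# Slowly varying temperature: each delta fits in a single byte
readings = [22.50, 22.51, 22.51, 22.52, 22.50]
first, deltas = delta_encode(readings)
print(deltas)                                   # [1, 0, 1, -2]
assert delta_decode(first, deltas) == readings  # lossless at 0.01 resolution
```

The deltas stay near zero for stable signals, which also makes the stream highly compressible by a downstream lossless coder.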
Try It: Edge Aggregation Data Reduction
Adjust the sensor sampling rate and aggregation window size to see how edge aggregation reduces data volume. The widget calculates the bandwidth reduction and shows what information is preserved versus lost.
Common Pitfall: Sampling Rate Mismatch
The mistake: Combining data from sensors with different sampling rates without proper resampling, leading to incorrect correlations and temporal misalignment.
Symptoms:
Correlation analysis shows unexpected null or spurious relationships
Merged datasets have many NaN/missing values at certain timestamps
Time-series plots show “jagged” or misaligned signals
ML models perform poorly despite good individual sensor data
Why it happens: Teams often assume all sensors operate at the same rate. A temperature sensor at 1 Hz combined with a vibration sensor at 100 Hz creates 99 missing values per temperature reading. Naive timestamp matching drops 99% of vibration data.
The fix: Use proper resampling/interpolation before combining:
```python
import pandas as pd

# Resample high-frequency data to match low-frequency
vibration_1hz = vibration_100hz.resample('1S').mean()

# Or upsample low-frequency with interpolation
temp_100hz = temp_1hz.resample('10ms').interpolate(method='linear')

# Then merge on aligned timestamps
merged = pd.merge_asof(vibration_1hz, temp_1hz, on='timestamp',
                       tolerance=pd.Timedelta('500ms'))
```
Prevention: Document sampling rates in sensor metadata. Create a data alignment layer that resamples all sources to a common time base before analysis.
Common Pitfall: Edge Buffer Overflow
The mistake: Configuring edge device buffers without considering worst-case scenarios, causing data loss during network outages or traffic spikes.
Symptoms:
Gaps in time-series data after network recovery
“Buffer full, dropping oldest data” warnings in device logs
Critical events missing during high-activity periods
Why it happens: Buffer sizes calculated for average conditions, not peak loads. Network outage duration underestimated. Sensor burst rates during events (motion, vibration) exceed steady-state assumptions. Memory constraints on edge devices force small buffers.
The fix: Size buffers for worst-case, not average:
Calculate the required buffer size for your edge device based on sampling rate, expected outage duration, and available memory. See how tiered retention can help when memory is constrained.
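As a rough sizing rule, multiply sample rate, sample size, and worst-case outage duration, then add margin for bursts. This sketch is illustrative - the function name and the 2x safety factor are assumptions, not from the text:

```python
def required_buffer_bytes(sample_rate_hz, bytes_per_sample,
                          outage_seconds, safety_factor=2.0):
    """Worst-case buffer: every sample accumulated during the longest
    expected outage, times a safety factor for event-driven bursts."""
    return int(sample_rate_hz * bytes_per_sample * outage_seconds * safety_factor)

# Illustrative: 10 Hz sensor, 8 bytes/sample, 1-hour worst-case outage
size = required_buffer_bytes(10, 8, 3600)
print(size)  # 576000 bytes (~563 KB) - already more than an ESP32's
             # 520 KB of RAM, which is when tiered retention becomes necessary
```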
Prevention: Monitor buffer utilization as a health metric. Alert at 70% capacity. Implement graceful degradation (reduce resolution before dropping data). Test with simulated network outages lasting 2x your expected maximum.
47.5 Compression Algorithms Deep Dive
Time: ~20 min | Difficulty: Advanced | Reference: P10.C08.U03c
Deep Dive: Edge Data Compression Algorithms Comparison
Edge devices face a fundamental trade-off: transmit less data (save power, bandwidth, cost) while preserving information needed for downstream analytics. This deep dive compares compression techniques across three dimensions: compression ratio, computational cost, and information preservation.
47.5.1 Compression Algorithm Categories
| Category | Compression Ratio | CPU Cost | Information Loss | Best For |
|---|---|---|---|---|
| Lossless | 2:1 - 4:1 | Medium | None | Critical data, audit logs |
| Lossy Statistical | 10:1 - 100:1 | Low | Controlled | Trend analysis, dashboards |
| Lossy Transform | 50:1 - 500:1 | High | Controlled | Pattern detection, ML features |
| Semantic | 100:1 - 1000:1 | Low-Medium | Significant | Event detection, alerts |
47.5.2 Lossless Compression: DEFLATE/GZIP
Standard lossless compression works well for structured IoT data:
```python
import gzip
import json

def compress_batch(readings: list[dict]) -> bytes:
    """
    Compress a batch of sensor readings losslessly.
    Typical compression: 3-5x for JSON sensor data.
    """
    json_str = json.dumps(readings)
    compressed = gzip.compress(json_str.encode('utf-8'), compresslevel=6)
    return compressed

# Example: 100 temperature readings
readings = [{"ts": 1704067200 + i, "v": 22.5 + (i % 10) * 0.1} for i in range(100)]
raw_size = len(json.dumps(readings).encode())    # ~4,500 bytes
compressed_size = len(compress_batch(readings))  # ~1,200 bytes
# Compression ratio: 3.75:1
```
Try It: GZIP Compression Estimator
Explore how batch size, data format, and compression level affect lossless GZIP compression of IoT sensor readings.
47.5.3 Lossy Statistical: Window Aggregation
Configure the sensor rate and aggregation window to see how statistical summaries compress continuous data while preserving trend and anomaly detection capability.
When to use: Temperature, humidity, air quality - any slowly changing signal where trends matter more than exact samples.
47.5.4 Lossy Transform: FFT-Based Compression
Transform to frequency domain, keep only significant components:
```python
import numpy as np
from dataclasses import dataclass

@dataclass
class FFTCompressed:
    timestamp: int
    sample_rate: float
    duration: float
    frequencies: list[float]  # Top N frequency components
    magnitudes: list[float]   # Corresponding magnitudes
    phases: list[float]       # Phase angles for reconstruction

def fft_compress(samples: np.ndarray, sample_rate: float,
                 timestamp: int, top_n: int = 10) -> FFTCompressed:
    """
    Compress time-series data using FFT, keeping top N frequency components.
    Typical compression: 100:1 to 500:1 depending on signal complexity.
    Best for: Vibration, audio, periodic signals.
    """
    # Compute FFT
    fft_result = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), 1 / sample_rate)

    # Get magnitudes and phases, excluding the DC component
    magnitudes = np.abs(fft_result[1:])
    phases = np.angle(fft_result[1:])
    freqs = freqs[1:]

    # Select top N by magnitude
    top_indices = np.argsort(magnitudes)[-top_n:]

    return FFTCompressed(
        timestamp=timestamp,
        sample_rate=sample_rate,
        duration=len(samples) / sample_rate,
        frequencies=freqs[top_indices].tolist(),
        magnitudes=magnitudes[top_indices].tolist(),
        phases=phases[top_indices].tolist(),
    )

def fft_decompress(compressed: FFTCompressed, num_samples: int) -> np.ndarray:
    """Reconstruct signal from FFT components (lossy reconstruction)."""
    # rfft magnitudes are unnormalized: a unit-amplitude sinusoid over N
    # samples yields |X| ~= N/2, so scale by 2/N to recover amplitudes.
    n_orig = int(round(compressed.duration * compressed.sample_rate))
    t = np.linspace(0, compressed.duration, num_samples, endpoint=False)
    signal = np.zeros(num_samples)
    for freq, mag, phase in zip(compressed.frequencies,
                                compressed.magnitudes,
                                compressed.phases):
        signal += (2 * mag / n_orig) * np.cos(2 * np.pi * freq * t + phase)
    return signal

# Example: Vibration sensor, 1 second at 1000 Hz
t = np.linspace(0, 1, 1000, endpoint=False)
samples = np.sin(2 * np.pi * 50 * t)          # 50 Hz signal
samples += 0.3 * np.sin(2 * np.pi * 150 * t)  # 150 Hz harmonic

# Input:  1000 samples x 4 bytes = 4000 bytes
# Output: 10 freq-mag-phase tuples x 12 bytes = 120 bytes + 20 bytes metadata
# Compression ratio: ~30:1
compressed = fft_compress(samples, 1000.0, 1704067200, top_n=10)
reconstructed = fft_decompress(compressed, 1000)
# Reconstruction is accurate here because both tones fall on exact FFT bins
# Bearing fault detection: Still works (frequency peaks preserved)
```
Try It: FFT Compression Explorer
Explore how FFT-based compression trades off between keeping more frequency components (higher fidelity) and achieving higher compression. Adjust the signal parameters and number of retained components.
When to use: Vibration analysis, acoustic monitoring, any signal where frequency content matters more than exact waveform.
47.5.5 Semantic Compression: Event Extraction
Highest compression, but requires domain knowledge:
```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class EventType(Enum):
    THRESHOLD_EXCEEDED = "threshold_exceeded"
    ANOMALY_DETECTED = "anomaly_detected"
    STATE_CHANGE = "state_change"
    PERIODIC_SUMMARY = "periodic_summary"

@dataclass
class SemanticEvent:
    timestamp: int
    device_id: str
    event_type: EventType
    value: float
    context: dict  # Additional info (threshold, previous state, etc.)

class SemanticCompressor:
    def __init__(self, device_id: str, threshold_high: float,
                 threshold_low: float, anomaly_std_factor: float = 3.0):
        self.device_id = device_id
        self.threshold_high = threshold_high
        self.threshold_low = threshold_low
        self.anomaly_std_factor = anomaly_std_factor
        self.history: list[float] = []
        self.last_state: Optional[str] = None
        self.summary_count = 0
        self.summary_sum = 0.0

    def process_sample(self, timestamp: int, value: float) -> list[SemanticEvent]:
        """
        Process a sample and return events (if any).
        Most samples produce NO events - that's the compression.
        """
        events = []

        # Update history for anomaly detection
        self.history.append(value)
        if len(self.history) > 100:
            self.history.pop(0)

        # Track for periodic summary
        self.summary_count += 1
        self.summary_sum += value

        # Check threshold crossing
        current_state = "normal"
        if value > self.threshold_high:
            current_state = "high"
        elif value < self.threshold_low:
            current_state = "low"

        if current_state != self.last_state and self.last_state is not None:
            events.append(SemanticEvent(
                timestamp=timestamp,
                device_id=self.device_id,
                event_type=EventType.STATE_CHANGE,
                value=value,
                context={
                    "previous_state": self.last_state,
                    "new_state": current_state,
                }
            ))
        self.last_state = current_state

        # Check for statistical anomaly
        if len(self.history) >= 20:
            mean = sum(self.history) / len(self.history)
            std = (sum((x - mean) ** 2 for x in self.history) / len(self.history)) ** 0.5
            if std > 0 and abs(value - mean) > self.anomaly_std_factor * std:
                events.append(SemanticEvent(
                    timestamp=timestamp,
                    device_id=self.device_id,
                    event_type=EventType.ANOMALY_DETECTED,
                    value=value,
                    context={
                        "mean": mean,
                        "std": std,
                        "z_score": (value - mean) / std,
                    }
                ))
        return events

    def get_periodic_summary(self, timestamp: int) -> SemanticEvent:
        """Call every N minutes to send a heartbeat/summary."""
        avg = self.summary_sum / self.summary_count if self.summary_count > 0 else 0
        event = SemanticEvent(
            timestamp=timestamp,
            device_id=self.device_id,
            event_type=EventType.PERIODIC_SUMMARY,
            value=avg,
            context={
                "sample_count": self.summary_count,
                "period_seconds": 300,  # 5 minutes
            }
        )
        self.summary_count = 0
        self.summary_sum = 0.0
        return event

# Example: Temperature sensor, 1 sample/second
# Normal operation: 0 events per sample
# State change: 1 event (~100 bytes)
# 5-minute summary: 1 event (~80 bytes)
#
# Input:  300 samples x 8 bytes = 2400 bytes per 5 minutes
# Output: 1 summary + maybe 0-2 events = 80-280 bytes
# Compression ratio: 10:1 to 30:1 (varies by activity)
```
Try It: Semantic Event Compression Simulator
Configure thresholds and signal behavior to see how semantic compression extracts only meaningful events from a continuous sensor stream. Most samples produce zero events – that is the compression.
When to use: Monitoring systems where “nothing happening” is the common case. Alarm systems, threshold monitoring, sparse event streams.
47.5.6 Algorithm Selection Decision Tree
Figure 47.1: Edge Data Compression Algorithm Selection Decision Tree
47.5.7 Benchmark Results: ESP32 Edge Device
Real measurements on ESP32-WROOM-32 (240 MHz, 520KB RAM):
| Algorithm | Input Size (1000 samples) | Compress Time | Output Size | Power (mJ) |
|---|---|---|---|---|
| Raw JSON | - | 15 ms (serialize) | 28,000 bytes | 2.4 |
| GZIP-6 | 28,000 bytes | 85 ms | 8,200 bytes | 8.5 |
| Window Agg | 8 bytes/sample | 2 ms | 48 bytes | 0.4 |
| FFT Top-10 | 4 bytes/sample | 45 ms | 140 bytes | 5.0 |
| Semantic | 8 bytes/sample | 3 ms | 0-100 bytes | 0.5 |
Key insight: For battery-powered edge devices, window aggregation offers the best power efficiency. FFT is valuable when frequency content matters, but the CPU cost is significant. Semantic compression is ideal for sparse event streams.
47.5.8 Memory Constraints on Edge Devices
Compression algorithms have memory overhead. Consider carefully on constrained devices:
| Algorithm | RAM Required | Notes |
|---|---|---|
| GZIP | 32-64 KB | Sliding window + Huffman tables |
| Window Agg | <1 KB | Just buffer for current window |
| FFT (1024 pt) | 16 KB | Complex float buffer + twiddle factors |
| FFT (4096 pt) | 64 KB | May not fit on small MCUs |
| Semantic | 2-4 KB | History buffer + state |
ESP32 recommendation: Use window aggregation or semantic compression as primary strategy. Reserve FFT for specific signals where frequency analysis is required.
47.6 Common Compression Pitfalls
Pitfall: Over-Aggressive Lossy Compression
The Mistake: Applying high compression ratios uniformly across all sensor data without understanding which information is critical for downstream analytics, permanently destroying signals needed for root cause analysis.
Why It Happens: Bandwidth costs drive aggressive compression targets. Teams optimize for average case without considering anomaly detection requirements. Compression algorithms are chosen based on benchmark performance rather than domain-specific information preservation. The “we can always collect more data later” assumption fails for non-reproducible events.
The Fix: Profile your analytics requirements before choosing compression. For predictive maintenance, preserve frequency-domain information (use FFT compression, not just statistics). For threshold alerting, min/max preservation is critical. For trend analysis, mean and standard deviation suffice. Implement tiered compression: full resolution for anomalies detected locally, heavy compression for steady-state readings. Always retain enough information to answer “why did this alert trigger?” after the fact.
Pitfall: Compression Without Metadata
The Mistake: Compressing sensor data without preserving the metadata needed to decompress or interpret it correctly, creating files that cannot be decoded weeks or months later.
Why It Happens: Metadata seems redundant during development when context is fresh. Schema documentation is maintained separately and drifts over time. Edge device memory constraints pressure developers to strip every unnecessary byte. Compression parameters are hardcoded rather than embedded in output.
The Fix: Always include compression metadata in the payload or use self-describing formats. For FFT compression, include sample rate, window size, and which frequency bins are transmitted. For statistical aggregation, include sample count, window duration, and timestamp precision. Use envelope formats that version the compression scheme: {"compression": "fft-v2", "params": {...}, "data": [...]}. Maintain a compression schema registry that maps version identifiers to decompression algorithms.
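The envelope idea can be sketched in a few lines. This is illustrative, not a prescribed library API - the `pack`/`unpack` names and field choices are assumptions:

```python
import gzip
import json

def pack(data, compression="agg-v1", params=None):
    """Self-describing payload: the envelope records which compression
    scheme and parameters are needed to interpret `data` later."""
    envelope = {"compression": compression, "params": params or {}, "data": data}
    return gzip.compress(json.dumps(envelope).encode("utf-8"))

def unpack(blob):
    """Recover the envelope; the consumer dispatches on 'compression'."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))

blob = pack([{"ts": 1704067200, "mean": 22.6}], params={"window_s": 60})
restored = unpack(blob)
print(restored["compression"], restored["params"])  # agg-v1 {'window_s': 60}
```

A schema registry then only needs to map `"agg-v1"`-style identifiers to decoder implementations, so payloads remain decodable even after the firmware that produced them has moved on.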
Pitfall: Ignoring Compression Computation Cost
The Mistake: Selecting compression algorithms based purely on compression ratio without accounting for CPU time and energy cost on battery-powered edge devices, resulting in net-negative energy savings.
Why It Happens: Compression benchmarks on desktop hardware show impressive ratios with negligible CPU time. The 1000x difference in computational efficiency between an ESP32 and a laptop is underestimated. Energy cost of computation versus transmission varies by network type (Wi-Fi is cheap to transmit, LoRa is expensive). Algorithm selection copied from cloud/server contexts.
The Fix: Measure end-to-end energy consumption: E_total = E_compute + E_transmit. For LoRaWAN devices where transmission costs 100+ mJ per packet, aggressive compression (even expensive algorithms) saves energy. For Wi-Fi devices where transmission costs 1-5 mJ per packet, simple aggregation beats complex compression. Profile specific algorithms on your target MCU: GZIP on ESP32 consumes 8.5 mJ for 1000 samples versus 0.4 mJ for window aggregation. Choose the algorithm that minimizes total energy, not just bytes transmitted.
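The E_total comparison can be made concrete. This sketch reuses the raw-JSON and GZIP-6 figures from the ESP32 benchmark table above; the per-byte radio costs are illustrative assumptions, not measured values:

```python
def total_energy_mj(compute_mj, payload_bytes, radio_mj_per_byte):
    """E_total = E_compute + E_transmit for one batch."""
    return compute_mj + payload_bytes * radio_mj_per_byte

WIFI_MJ_PER_BYTE = 0.00015  # assumed: Wi-Fi is cheap to transmit
LORA_MJ_PER_BYTE = 0.005    # assumed: LoRa is expensive to transmit

# Benchmark figures: raw JSON (2.4 mJ, 28,000 B) vs GZIP-6 (8.5 mJ, 8,200 B)
for radio, cost in [("Wi-Fi", WIFI_MJ_PER_BYTE), ("LoRa", LORA_MJ_PER_BYTE)]:
    raw = total_energy_mj(2.4, 28_000, cost)
    gz = total_energy_mj(8.5, 8_200, cost)
    print(f"{radio}: raw={raw:.1f} mJ, gzip={gz:.1f} mJ")
# Under these assumptions, compression loses on Wi-Fi but wins on LoRa:
# the expensive radio flips the conclusion.
```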
47.7 Understanding Check: Industrial Edge Data Pipeline Design
Scenario: Factory Vibration Monitoring System
Your manufacturing plant monitors 50 critical machines using vibration sensors to detect bearing failures before catastrophic breakdown. Each sensor must detect frequencies up to 200 Hz (bearing defects manifest at 50-200 Hz harmonics).
System constraints:
Sensor: MEMS accelerometer (+/-16g range)
Edge compute: ESP32 gateway with 4MB flash, 520KB RAM
Network: 4G cellular with 10 GB/month data cap ($0.10/GB overage)
Requirement: Detect anomalies within 1 minute, minimize bandwidth costs
Current naive approach:
Sample at 500 Hz (meets Nyquist: 2 x 200 Hz)
Stream raw data to cloud continuously
Result: 500 samples/sec x 2 bytes x 50 sensors = 50 KB/s = 129 GB/month ($11.90 overage!)
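The monthly figure above can be checked directly (a quick sanity-check sketch; a 30-day month is assumed):

```python
bytes_per_sec = 500 * 2 * 50              # samples/s x bytes/sample x sensors
gb_per_month = bytes_per_sec * 86_400 * 30 / 1e9
overage_usd = (gb_per_month - 10) * 0.10  # $0.10/GB beyond the 10 GB cap

print(f"{gb_per_month:.1f} GB/month, ${overage_usd:.2f} overage")
# 129.6 GB/month, $11.96 overage
```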
47.7.1 Think About: Data Reduction Strategy Trade-offs
Which edge processing strategy best balances anomaly detection accuracy, bandwidth costs, and latency?
| Strategy | Data Transmitted | Bandwidth Cost | Detection Latency | Information Loss |
|---|---|---|---|---|
| A. Raw streaming | 5000 samples/10s | 129 GB/month ($11.90) | Real-time (<1s) | None (full fidelity) |
| B. Downsample to 100 Hz | 1000 samples/10s | 26 GB/month ($1.60) | Real-time (<1s) | Loses everything above 50 Hz - the 50-200 Hz fault band aliases |
| C. Time-domain stats | 6 values/10s (min/max/mean/std/peak/RMS) | 0.15 GB/month ($0) | 10 seconds | Loses frequency info (can't detect bearing harmonics) |
| D. FFT + compression | 10 FFT bins/10s | 0.26 GB/month ($0) | 10 seconds | Preserves frequency info (50-200 Hz) |
47.7.2 Key Insight: Edge FFT for Bandwidth Reduction
Option D (FFT + compression) achieves 500x bandwidth reduction while preserving anomaly detection capability:
How it works:
```python
import numpy as np

# Edge processing pipeline (runs on ESP32 every 10 seconds)
def vibration_pipeline():
    # 1. Collect 10 seconds of data
    samples = collect_samples(rate=500, duration=10)  # 5000 samples

    # 2. Apply FFT (frequency-domain analysis)
    fft_result = np.fft.rfft(samples)  # -> 2501 frequency bins

    # 3. Extract critical frequency bins
    #    Bin width = 500 Hz / 5000 samples = 0.1 Hz per bin
    #    Bin index = frequency / bin_width
    bin_width = 500.0 / 5000  # 0.1 Hz per bin
    target_freqs = [50, 80, 110, 140, 170, 200]  # Hz
    bins = [fft_result[int(f / bin_width)] for f in target_freqs]
    # bins at indices [500, 800, 1100, 1400, 1700, 2000]
    # + 4 more bins for comprehensive coverage

    # 4. Transmit 10 values instead of 5000
    transmit_to_cloud(bins)  # 20 bytes vs 10,000 bytes
    return bins

# Data reduction: 5000 samples -> 10 FFT bins = 500x compression
```
Try It: Edge FFT Pipeline Data Reduction
Adjust the vibration sensor parameters and FFT settings to see how edge FFT compression reduces bandwidth for factory monitoring.
Bearing failure signatures live in frequency domain: Healthy bearing = smooth spectrum. Failing bearing = spikes at harmonics (80 Hz, 160 Hz for 2400 RPM machine).
No information loss for anomaly detection: the cloud ML model is trained on FFT bins, not raw waveforms. Accuracy: 94% with FFT bins vs 96% with raw samples - a 2-point gap that does not justify a 500x bandwidth cost.
Latency acceptable: 10-second aggregation + 2-second transmission = 12 seconds total (well under 1-minute requirement).
Cost savings: 0.26 GB/month stays under data cap! (vs $11.90 overage for raw streaming).
Worked Example: Vibration Sensor Sampling Rate Calculation for Bearing Fault Detection
Scenario: A manufacturing facility wants to detect bearing faults in motors running at 1800 RPM. Bearing defects produce vibration frequencies at harmonics of the motor speed. You need to determine the minimum sampling rate to capture fault signatures.
Given:
Motor speed: 1800 RPM = 30 Hz (revolutions per second)
Bearing fault frequencies:
1x: 30 Hz (fundamental, imbalance)
2x: 60 Hz (misalignment)
3x: 90 Hz (looseness)
5x: 150 Hz (bearing outer race defect)
7x: 210 Hz (bearing inner race defect)
Highest frequency of interest: 210 Hz (7th harmonic)
Question: What is the minimum sampling rate required, and what sampling rate should you actually use in practice?
Solution:
Step 1: Apply Nyquist theorem
Minimum sampling rate = 2 × highest frequency
f_sample_min = 2 × 210 Hz = 420 Hz
Step 2: Calculate practical sampling rate with safety margin
Industry practice: Use 2.5x to 3x the Nyquist minimum to allow for anti-aliasing filter roll-off, giving roughly 1,050-1,260 Hz here
Round to a convenient value: 1,280 Hz, or 1,000 Hz (about 2.4x the 420 Hz minimum) when bandwidth is constrained
Step 3: Verify no aliasing occurs
Check if any harmonic would alias into the measurement band: - At 1,000 Hz sampling, Nyquist frequency = 500 Hz - All fault frequencies (30-210 Hz) are below 500 Hz ✓ - No aliasing! All harmonics are correctly captured.
Step 4: Calculate data volume
Single sensor: - Sample rate: 1,000 Hz - Data size: 2 bytes per sample (16-bit ADC) - Data rate: 1,000 × 2 = 2,000 bytes/sec = 2 KB/sec - Daily data: 2 KB/sec × 86,400 sec = 172.8 MB/day
Key Insight: For vibration analysis, sample at 2.5-3x Nyquist (not just 2x minimum) to allow for anti-aliasing filter roll-off. Then apply edge FFT compression by transmitting only the frequency bins of interest (harmonics), achieving 1,000-10,000x data reduction while preserving all fault detection capability. The edge gateway does the heavy computation; the cloud receives only the diagnostic features.
Decision Framework: Compression Algorithm Selection for Edge Devices
Choose the appropriate compression strategy based on signal characteristics, edge compute capabilities, and analytical requirements:
Common Mistake: Undersampling Harmonics in Vibration Monitoring
The Mistake: Sampling vibration data at only 2x the motor’s fundamental frequency, missing critical high-frequency bearing fault signatures that appear at 5x-7x harmonics.
Real-World Example: A factory deployed vibration sensors with 100 Hz sampling on 30 Hz motors (thinking “2x motor speed is enough”):
Motor: 30 Hz
Bearing outer race fault frequency: 5 × 30 = 150 Hz
At 100 Hz sampling:
- Nyquist = 50 Hz
- 150 Hz aliases to |150 - round(150/100) × 100| = |150 - 200| = 50 Hz
- At exactly the Nyquist frequency, the signal is severely distorted
The bearing fault appeared as an unreliable artifact at the Nyquist boundary.
ML model missed 8 bearing failures before they became catastrophic.
One failure caused $500K in downtime.
Correct Implementation:
```python
def calculate_vibration_sampling_rate(motor_rpm, bearing_type="ball"):
    """
    Calculate sampling rate for bearing fault detection.
    Accounts for all possible fault harmonics.
    """
    motor_hz = motor_rpm / 60

    # Bearing fault frequency multipliers
    fault_harmonics = {
        "ball": [1, 2, 3, 4, 5, 6, 7, 8],  # Ball bearings: up to 8x
        "roller": [1, 2, 3, 4, 5],         # Roller bearings: up to 5x
        "sleeve": [1, 2, 3],               # Sleeve bearings: up to 3x
    }
    highest_harmonic = max(fault_harmonics[bearing_type])
    highest_frequency = motor_hz * highest_harmonic

    # Safety factor: 3x Nyquist for anti-aliasing filter
    recommended_rate = 3 * 2 * highest_frequency

    return {
        'motor_hz': motor_hz,
        'highest_fault_hz': highest_frequency,
        'nyquist_min': 2 * highest_frequency,
        'recommended': recommended_rate,
    }

# Example usage:
rate_info = calculate_vibration_sampling_rate(motor_rpm=1800, bearing_type="ball")
# Motor: 30 Hz | Highest fault: 240 Hz | Recommended: 1440 Hz (3x Nyquist)
```
Try It: Vibration Sampling Rate by Bearing Type
Adjust motor RPM and bearing type to see how bearing fault harmonics determine the required sampling rate.
Warning Signs: Vibration analysis shows only the fundamental frequency with no harmonics. Bearing failures occur with “no warning” despite continuous monitoring. FFT spectrum looks suspiciously clean.
Prevention: Always analyze the FULL harmonic series for rotating machinery. Sample at 3x Nyquist (not just 2x) to allow for anti-aliasing filter roll-off. Verify by plotting the FFT spectrum and confirming all expected fault harmonics are visible.
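The verification step above can be sketched with a synthetic signal: sample a known fundamental plus a fault harmonic at the recommended rate, then confirm both appear in the FFT spectrum. The signal amplitudes and the 10% peak threshold are illustrative assumptions, not part of the chapter's method.

```python
import numpy as np

# Sketch: confirm expected fault harmonics survive at the chosen sampling rate.
fs = 1440  # recommended rate from the calculation above (3x Nyquist for 240 Hz)
t = np.arange(0, 1.0, 1 / fs)
motor_hz = 30

# Synthetic signal: 30 Hz fundamental plus a 5x outer-race fault tone at 150 Hz
signal = np.sin(2 * np.pi * motor_hz * t) + 0.3 * np.sin(2 * np.pi * 150 * t)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), 1 / fs)

# Keep any bin above 10% of the strongest peak (illustrative threshold)
peaks = freqs[spectrum > 0.1 * spectrum.max()]
print(peaks)  # peaks at 30 Hz and 150 Hz -- both components are visible
```

If the fault harmonic were above Nyquist, it would either fold onto a wrong bin or vanish behind the anti-aliasing filter, which is exactly the "suspiciously clean spectrum" warning sign.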
47.8 Knowledge Check
Quiz: Sampling and Compression
47.9 Practice Exercises
Exercise 1: Sampling Rate Selection
Objective: Determine optimal sampling rates for different sensor types.
Tasks:
Identify signal characteristics for 4 sensors: temperature (max 0.1 Hz), vibration (max 500 Hz), audio (max 20 kHz), motion IMU (max 50 Hz)
Validate your choices: can you still detect a 1 °C temperature spike at each sampling rate you selected?
Expected Outcome: Understand the trade-offs between sampling rate, data volume, and information preservation.
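A starting point for Task 1 can be sketched as follows. The helper name and the 2.5x margin are assumptions (the chapter recommends 2.5-3x Nyquist for anti-aliasing filter roll-off); the four maximum frequencies come from the task itself.

```python
def sampling_plan(f_max_hz, margin=2.5):
    """Minimum (Nyquist) and practical sampling rates for a signal whose
    highest frequency of interest is f_max_hz.

    margin: multiple of Nyquist left for anti-aliasing filter roll-off.
    """
    nyquist = 2 * f_max_hz
    return {"nyquist_min": nyquist, "practical": margin * nyquist}

# The four sensors from Task 1:
for name, f_max in [("temperature", 0.1), ("vibration", 500),
                    ("audio", 20_000), ("motion_imu", 50)]:
    print(name, sampling_plan(f_max))
```

Notice the spread: temperature needs well under 1 sample/s while audio needs tens of kilosamples/s, which is why a one-size-fits-all sampling rate wastes either information or bandwidth.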
For Kids: Meet the Sensor Squad!
Sammy learns the Nyquist rule!
Sammy the Sensor was monitoring a washing machine’s vibrations. He was taking measurements really slowly – only once every 5 seconds.
“Something is wrong!” said Max the Microcontroller. “The machine is shaking like crazy, but your readings look perfectly calm!”
Lila the LED, who loved science, explained: “Sammy, you are measuring too slowly! The machine vibrates hundreds of times per second, but you only check every 5 seconds. It is like trying to watch a hummingbird by opening your eyes once every minute – you would never see its wings move!”
“There is a rule called the Nyquist rule,” Lila continued. “You need to measure at LEAST twice as fast as the thing is changing. If the machine vibrates 100 times per second, you need to measure at least 200 times per second!”
Sammy sped up his measurements, and suddenly the vibration patterns appeared clearly. But now Bella the Battery was worried: “That is SO much data! I cannot send all of it to the cloud!”
Max had the solution: “We can COMPRESS the data! Instead of sending every single measurement, let us calculate a summary – the average vibration, the biggest shake, and the overall pattern. We send 10 numbers instead of 5,000!”
The lesson: Sample fast enough to catch what is happening (Nyquist rule), then compress smartly to save energy and bandwidth!
Key Takeaway
Always sample at 2x or higher than the highest frequency of interest (Nyquist theorem) to avoid aliasing artifacts. Then apply edge data reduction – aggregation for slow-changing signals, FFT compression for vibration analysis, or semantic event extraction for sparse event streams – to reduce bandwidth by 10-1000x while preserving the information needed for downstream analytics.
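The aggregation step in this takeaway (the same trick Max proposed in the story) can be sketched as a window summarizer. The function name and window contents are illustrative assumptions:

```python
import statistics

def summarize_window(samples):
    """Reduce a window of raw samples to four summary statistics (aggregation).

    Transmitting these 4 numbers instead of len(samples) raw readings gives a
    len(samples)/4 reduction while preserving trend-level information.
    """
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.fmean(samples),
        "std": statistics.pstdev(samples),
    }

# 60 s of a 100 Hz sensor = 6,000 raw readings -> 4 numbers (1,500x reduction)
window = [20.0 + 0.01 * i for i in range(6000)]
print(summarize_window(window))
```

When an anomaly is detected, the device can fall back to transmitting the raw window, so the summary path costs nothing in diagnostic capability for the rare cases that matter.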
Interactive Quiz: Match Concepts
Interactive Quiz: Sequence the Steps
Label the Diagram
Code Challenge
47.10 Summary
Edge data acquisition requires careful balance between data fidelity and resource constraints:
Nyquist compliance: Sample at 2x or higher than your highest frequency of interest to avoid aliasing
Reduction techniques: Aggregation (10-50x), FFT compression (50-500x), and semantic extraction (100-1000x) each serve different use cases
Algorithm selection: Match compression to downstream analytics needs - lossless for audit, statistical for trends, FFT for vibration, semantic for events
Resource awareness: Consider CPU time and memory on constrained edge devices, not just compression ratio
47.11 Concept Relationships
Sampling and compression determine data fidelity, bandwidth, and power consumption trade-offs:
Sampling Theory (This chapter):
Nyquist theorem: sample at 2x highest frequency to avoid aliasing (vibration monitoring: sample at 2.5-3x Nyquist for anti-aliasing filter)
Under-sampling causes phantom patterns (150 Hz bearing fault aliased to 30 Hz when sampled at 60 Hz)
Compression Strategies (This chapter):
Lossless (GZIP): 2-4x reduction, preserves all data (audit trails, compliance)
FFT-based: 50-500x reduction, preserves frequency spectrum (vibration analysis)
Semantic (event extraction): 100-1000x reduction, preserves state changes (threshold monitoring)
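The semantic strategy above can be sketched as a threshold-crossing detector with hysteresis. The function name, event tuple shape, and hysteresis value are assumptions for illustration:

```python
def extract_events(readings, threshold, hysteresis=0.5):
    """Semantic compression: report threshold crossings, not every sample.

    hysteresis keeps a noisy signal hovering near the threshold from
    generating a flood of alternating on/off events.
    """
    events, armed = [], True
    for i, value in enumerate(readings):
        if armed and value > threshold:
            events.append(("exceeded", i, value))
            armed = False
        elif not armed and value < threshold - hysteresis:
            events.append(("recovered", i, value))
            armed = True
    return events

# 10,000 samples with 2 state changes -> 2 messages instead of 10,000
readings = [20.0] * 5000 + [30.0] * 2000 + [20.0] * 3000
print(extract_events(readings, threshold=25.0))
# [('exceeded', 5000, 30.0), ('recovered', 7000, 20.0)]
```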
Power Impact:
Edge Data Acquisition: Power and Gateways - Compression reduces transmission frequency, directly extending battery life (factory case: 14,400x reduction enables LoRa vs cellular)
Architecture Context:
Edge Data Acquisition: Architecture - Device category determines compression need (cameras need heavy compression; temperature sensors need aggregation)
Edge Compute Patterns - Edge ML requires compressed features (FFT bins, not raw waveforms)
Key Insight: Compression algorithm selection depends on analytics requirements, not just compression ratio. Vibration monitoring needs FFT compression (preserves frequency info for bearing fault detection) even though aggregation yields higher ratios. Choosing wrong compression permanently destroys the signal needed for analysis.
47.12 What’s Next
If you want to…
Read this
Understand the acquisition architecture that applies these strategies