15 Signal Processing & Filtering
Learning Objectives
After completing this chapter, you will be able to:
- Calculate minimum sampling rates using the Nyquist theorem and diagnose aliasing artefacts
- Apply digital filters (moving average, EMA, median) to sensor data
- Implement hardware and software signal conditioning
- Detect and compensate for sensor calibration drift
- Select appropriate filters for different noise types and justify your choice
For Beginners: Signal Processing and Filtering
Raw sensor readings are almost always noisy – they jump around even when nothing is changing, like a bathroom scale that flickers between 69 and 71 kg while you stand still. Signal processing cleans up this noise using techniques like averaging (take several readings and use their mean) and filtering (ignore sudden spikes that are clearly wrong). These simple techniques turn unreliable raw data into smooth, trustworthy measurements.
15.1 Prerequisites
- Sensor Introduction: Basic sensor concepts
- Sensor Specifications: Accuracy, precision, resolution
Cross-Hub Connections
Learning Resources:
- Quizzes Hub - Test your signal processing knowledge with interactive questions on sampling, filtering, and calibration
- Simulations Hub - Explore the Signal Processing Workbench to visualize filter effects in real-time
- Labs Hub - Practice ADC sampling and digital filtering on ESP32 with Wokwi simulations
- Knowledge Gaps Hub - Address common misconceptions about sampling rates and filter selection
- Knowledge Map - See how signal processing connects to sensor calibration, ADC fundamentals, and data quality
Key Takeaway
In one sentence: Signal processing transforms noisy raw sensor data into clean, reliable measurements through proper sampling, filtering, and calibration.
Remember this rule: Sample at >2x your highest frequency of interest, filter with the right algorithm for your noise type (moving average for Gaussian, median for outliers), and recalibrate sensors periodically to maintain accuracy.
Sensor Squad: Signal Processing for Kids!
Meet the Clean-Up Crew!
Imagine you’re trying to hear your friend whisper in a noisy cafeteria. That’s what sensors deal with every day - they’re trying to measure the real temperature or distance, but there’s lots of “noise” getting in the way!
The Signal Processing Superheroes:
- Sammy the Sampler takes pictures of the signal really fast - like taking 100 photos per second of a spinning wheel so you can see each spoke clearly!
- Fiona the Filter is like a picky eater who only keeps the good readings and throws away the weird ones
- Cal the Calibrator makes sure your ruler starts at exactly zero, not at 1 inch!
Why does this matter?
Think of your thermometer app. Without signal processing:
- It might jump from 70 to 250 and back to 72 (scary but not real!)
- It might show a temperature that happened 5 minutes ago
- It might always read 3 degrees too high
Real-world example: When you use a fitness tracker, it filters out the noise from your wrist moving around and only counts actual steps. That’s signal processing in action!
Fun Fact: Your phone’s screen has a filter that ignores accidental touches when you’re holding it - that’s why it doesn’t go crazy when your palm touches the screen!
15.2 Sensor Data Acquisition Pitfalls
These common mistakes cause incorrect sensor readings, false alarms, and wasted development time. Understanding each pitfall – and its fix – will save you hours of debugging.
15.2.1 Ignoring Nyquist Sampling Rate (Aliasing)
The Mistake
Sampling signals below twice their highest frequency component, causing false low-frequency patterns to appear.
Symptoms:
- False patterns appear in data that don’t exist in the real signal
- High-frequency events (vibrations, transients) are missed entirely
- Frequency analysis shows phantom signals at incorrect frequencies
Why it happens: Developers choose sample rates based on storage/bandwidth constraints rather than signal characteristics. The Nyquist theorem requirement (sample at >2x the highest frequency) isn’t understood.
How to diagnose:
- Identify the highest frequency component in your signal of interest
- Compare your sample rate to 2x that frequency
- Look for patterns that vary with sample rate changes
- Use an oscilloscope or high-speed capture to see the true signal
The fix: Sample faster than the signal you care about, then block frequencies above the Nyquist limit before they reach the ADC.
Optional Aliasing Pattern
```python
import time

# WRONG: sampling a 60 Hz vibration at 50 Hz
# Result: a false 10 Hz signal appears (aliasing)
while True:
    reading = sensor.read()
    time.sleep(0.02)  # 50 Hz sample rate - causes aliasing!

# CORRECT: sample at more than 2x the highest frequency of interest
# For a 60 Hz signal, sample above 120 Hz (200 Hz recommended)
while True:
    reading = sensor.read()
    time.sleep(0.005)  # 200 Hz sample rate - safe

# BEST: use an anti-aliasing filter before the ADC
# A hardware low-pass filter removes frequencies above Nyquist
# (pseudocode: the filtering happens in analog hardware, not in software)
filtered_signal = low_pass_filter(raw_signal, cutoff=sampling_rate / 2)
adc_value = adc.read(filtered_signal)
```
Prevention: Sample at >2x the highest frequency of interest (4-10x is better practice). Use hardware anti-aliasing filters (low-pass RC filter) before the ADC.
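The false 10 Hz figure in the aliasing example can be checked with the usual frequency-folding relation: a sampled tone appears at its distance from the nearest multiple of the sample rate. The sketch below is illustrative only; `alias_frequency` is a hypothetical helper name, not part of any library.

```python
def alias_frequency(signal_hz, sample_rate_hz):
    """Return the apparent frequency of a pure tone after sampling.

    Frequencies fold around multiples of the sample rate, so the observed
    frequency is the distance to the nearest multiple of sample_rate_hz.
    """
    nearest_multiple = round(signal_hz / sample_rate_hz) * sample_rate_hz
    return abs(signal_hz - nearest_multiple)


# 60 Hz vibration sampled at 50 Hz (undersampled) and at 200 Hz (safe)
print(alias_frequency(60, 50))
print(alias_frequency(60, 200))
```

Sampled at 50 Hz, the 60 Hz tone shows up at 10 Hz. Sampled at 200 Hz, it is reported at its true 60 Hz.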
Worked Example: Calculating Sampling Rate for Vibration Monitoring
Scenario: You are designing a predictive maintenance system for industrial motors. Bearing failures produce vibration signatures between 10 Hz (slow bearing wear) and 5 kHz (bearing race defects). What sampling rate do you need?
Step 1: Identify Highest Frequency Component
- Lowest frequency of interest: 10 Hz for slow wear patterns.
- Highest frequency of interest: 5000 Hz for bearing race defects.
Step 2: Apply Nyquist Theorem
The Nyquist-Shannon sampling theorem says the sampling rate must exceed two times the highest frequency, so the theoretical floor is 2 x 5000 Hz = 10,000 Hz, or 10 kHz.
Putting Numbers to It
The Nyquist theorem requires the sampling rate to be greater than two times the highest frequency. For bearing defects at 5 kHz, the minimum is 2 x 5000 Hz = 10 kHz.
Real anti-aliasing filters have gradual rolloff. A 2nd-order Butterworth filter at a 5 kHz cutoff attenuates by only 12 dB at 10 kHz, still passing about 25% of the signal amplitude. Sampling at 20 kHz moves the Nyquist frequency to 10 kHz, giving the filter more room to reject unwanted high-frequency energy. This is why practical systems use 4x to 10x the highest frequency, not only the mathematical 2x minimum.
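The 12 dB and 25% figures follow from the magnitude response of an ideal Butterworth low-pass filter, \(|H(f)| = 1 / \sqrt{1 + (f / f_c)^{2n}}\). A minimal sketch of that check, assuming an ideal filter (real components will deviate) and using hypothetical helper names:

```python
import math


def butterworth_gain(freq_hz, cutoff_hz, order):
    """Amplitude gain of an ideal low-pass Butterworth filter at freq_hz."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** (2 * order))


def to_db(gain):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(gain)


# 2nd-order filter, 5 kHz cutoff, evaluated at 10 kHz
gain = butterworth_gain(10_000, 5_000, order=2)
print(f"gain = {gain:.3f}, attenuation = {to_db(gain):.1f} dB")
```

The gain comes out near 0.24 (about -12.3 dB), which is where the "about 25% of the signal amplitude" estimate comes from.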
Step 3: Add Safety Margin
In practice, sample at 4-10× the highest frequency of interest for anti-aliasing margin:
Recommended sampling rate: 4 x 5000 Hz = 20 kHz.
Why the margin? Real-world anti-aliasing filters have gradual rolloff (a 2nd-order filter rolls off at 40 dB/decade, or 12 dB/octave). Sampling at exactly 2× the highest frequency leaves no margin for the filter’s transition band, allowing aliased energy to leak through.
Step 4: Choose ADC
| ADC | Max Sample Rate | Resolution | Cost | Suitable? |
|---|---|---|---|---|
| ESP32 built-in | 200 kSPS | 12-bit | $5 | Yes, overkill |
| ADS1115 (I2C) | 860 SPS | 16-bit | $8 | No, too slow |
| ADS1256 (SPI) | 30 kSPS | 24-bit | $20 | Yes, high precision |
| MCP3008 (SPI) | 200 kSPS | 10-bit | $3 | Yes, budget |
Choice: ESP32 built-in ADC at 20 kHz sampling rate
Step 5: Calculate Data Storage Requirements
- Sample rate: 20,000 samples per second.
- Sample size: 2 bytes per sample, storing a 12-bit ADC value in 16 bits.
- Data rate: `20,000 x 2 = 40,000 bytes/s`, or about `40 KB/s`.
- Per hour: `40 KB/s x 3600 s = 144 MB`.
- Per day: `144 MB/h x 24 h = 3.5 GB`.
Optimization: Store only when vibration exceeds threshold (event-driven recording):
If the machine is above the vibration threshold only 1% of the time, storage falls from 3.5 GB/day to about 35 MB/day.
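The storage arithmetic in this step can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the worked example (2 bytes per sample, decimal megabytes and gigabytes); `storage_bytes_per_day` and `duty_fraction` are hypothetical names.

```python
SECONDS_PER_DAY = 86_400


def storage_bytes_per_day(sample_rate_hz, bytes_per_sample, duty_fraction=1.0):
    """Bytes written per day; duty_fraction is the share of time spent recording."""
    return sample_rate_hz * bytes_per_sample * SECONDS_PER_DAY * duty_fraction


continuous = storage_bytes_per_day(20_000, 2)
event_driven = storage_bytes_per_day(20_000, 2, duty_fraction=0.01)
print(f"continuous:   {continuous / 1e9:.2f} GB/day")
print(f"event-driven: {event_driven / 1e6:.1f} MB/day")
```

Continuous recording works out to roughly 3.46 GB/day and the 1% duty case to roughly 34.6 MB/day, consistent with the rounded figures above.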
Step 6: Power Consumption
Continuous 20 kHz sampling is power-intensive:
- ESP32 ADC active current: about `80 mA`.
- Daily energy: `80 mA x 24 h = 1920 mAh/day`.
- A `2000 mAh` battery lasts roughly `2000 / 1920 = 1.04 days`, which is impractical.
Solution: Sample in bursts (1 second every 60 seconds):
- Active energy per minute: about `0.022 mAh` (80 mA for 1 second is `80 / 3600 mAh`).
- Sleep energy for the remaining 59 seconds is negligible.
- Daily energy: `0.022 x 1440 = 31.7 mAh/day`.
- Battery life: `2000 / 31.7 = 63 days`, much better than continuous sampling.
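The battery estimates above can be generalized to any duty cycle. The sketch below assumes an ideal battery and, like the worked example, treats sleep current as zero by default; the function name and parameters are hypothetical.

```python
def battery_life_days(capacity_mah, active_ma, active_s, period_s, sleep_ma=0.0):
    """Estimate battery life for a sensor that is active for active_s out of
    every period_s seconds and asleep for the rest."""
    sleep_s = period_s - active_s
    average_ma = (active_ma * active_s + sleep_ma * sleep_s) / period_s
    return capacity_mah / (average_ma * 24.0)  # mAh / (mAh per day)


continuous = battery_life_days(2000, 80, active_s=60, period_s=60)
burst = battery_life_days(2000, 80, active_s=1, period_s=60)
print(f"continuous: {continuous:.2f} days, burst: {burst:.1f} days")
```

Without intermediate rounding the burst case gives 62.5 days, in line with the roughly 63 days quoted above. Passing a realistic `sleep_ma` from the board's datasheet will shorten that estimate.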
Real-World Implementation:
```cpp
void loop() {
  captureBurst(ADC_PIN, 20000, 20000);  // 1 s at 20 kHz
  float rms = calculateRMS(buffer);
  if (rms > THRESHOLD) sendAlert(rms);
  sleepSeconds(59);
}
```
Key Insights:
- Always sample at 4-10× the highest frequency — not just 2× — to allow for anti-aliasing filter rolloff
- High sample rates generate massive data — use threshold-based recording or edge processing (FFT) to reduce storage
- Continuous high-speed ADC drains batteries fast — use burst sampling with sleep intervals for battery-powered deployments
- FFT on-device saves bandwidth — transmit vibration frequency spectrum (100 bytes) instead of raw waveform (40 KB/second)
15.2.1.1 Nyquist Sampling Rate Calculator
Use this calculator to determine the minimum and recommended sampling rates for your signal.
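For readers without access to the interactive calculator, the same calculation can be sketched in a few lines. The function name and the default 4x margin are illustrative choices taken from the 4-10x guidance in this chapter.

```python
def sampling_rates(max_freq_hz, margin=4.0):
    """Return (Nyquist minimum, recommended rate) in Hz.

    margin is the practical oversampling multiple; the chapter suggests 4-10x.
    """
    if max_freq_hz <= 0:
        raise ValueError("max_freq_hz must be positive")
    if margin <= 2:
        raise ValueError("margin must be greater than 2 to exceed the Nyquist rate")
    return 2.0 * max_freq_hz, margin * max_freq_hz


nyquist_min, recommended = sampling_rates(5_000)
print(f"minimum: above {nyquist_min:.0f} Hz, recommended: {recommended:.0f} Hz")
```

For the 5 kHz bearing-defect example this returns a 10 kHz minimum and a 20 kHz recommended rate.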
15.2.2 Ignoring Sensor Calibration Drift
The Mistake
Using factory calibration forever, assuming sensors maintain accuracy over time.
Symptoms:
- Gradual accuracy degradation over weeks/months
- Systematic bias in readings (consistently high or low)
- False anomaly detections as sensors drift past thresholds
Why it happens: Sensors drift due to aging, temperature cycling, humidity exposure, and contamination. Factory calibration is a snapshot that degrades over time.
The fix: Treat calibration as a repeating maintenance process, not a one-time factory setting.
Optional Drift-Correction Pattern
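```python
from datetime import datetime

def read_temperature():
    raw = adc.read()
    days = (datetime.now() - last_calibration).days
    # Predicted drift since the last calibration
    # (DRIFT_C_PER_DAY > 0 means the sensor reads high over time)
    drift = DRIFT_C_PER_DAY * days
    return raw * scale + offset - drift  # subtract the predicted drift

def recalibrate(reference_c):
    """Reset the offset against a trusted reference temperature."""
    global offset, last_calibration
    offset = reference_c - adc.read() * scale
    last_calibration = datetime.now()
```
Prevention: Implement periodic recalibration procedures (quarterly for many sensors). Use reference sensors for cross-validation.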
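Cross-validation against a reference sensor can be as simple as comparing paired readings and flagging a persistent offset. The following is a minimal sketch, not a complete procedure: the helper names and the 0.5 °C tolerance are assumptions, and the readings are made-up illustration data.

```python
from statistics import mean


def estimate_offset_drift(sensor_readings, reference_readings):
    """Mean difference between sensor and reference readings taken together."""
    diffs = [s - r for s, r in zip(sensor_readings, reference_readings)]
    return mean(diffs)


def needs_recalibration(sensor_readings, reference_readings, tolerance_c=0.5):
    """True when the average offset exceeds the allowed tolerance."""
    return abs(estimate_offset_drift(sensor_readings, reference_readings)) > tolerance_c


sensor = [21.2, 22.1, 23.3]     # device under test (illustrative values)
reference = [20.5, 21.4, 22.6]  # trusted reference sensor (illustrative values)
print(estimate_offset_drift(sensor, reference))
print(needs_recalibration(sensor, reference))
```

Averaging several paired readings keeps random noise from triggering a recalibration that only a systematic bias should cause.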
15.2.3 Using Raw Sensor Data Without Filtering
The Mistake
Using raw ADC values directly for decisions, displays, and storage without noise filtering.
Symptoms:
- Noisy, jumpy dashboards and charts
- False anomaly alerts from noise spikes
- Incorrect trend analysis due to noise obscuring patterns
- Unstable control loops that oscillate
The fix: Choose a filter based on the kind of noise you see. Use a moving average for random noise, an exponential average for low memory, and a median filter for spikes.
Optional Filter Function Patterns
```python
from collections import deque

window = deque(maxlen=10)       # moving-average history
spike_window = deque(maxlen=5)  # median-filter history
ema = None

def moving_average(raw):
    window.append(raw)
    return sum(window) / len(window)

def exponential_average(raw, alpha=0.1):
    global ema
    # Seed with the first reading, then blend each new reading in
    ema = raw if ema is None else alpha * raw + (1 - alpha) * ema
    return ema

def median_filter(raw):
    spike_window.append(raw)
    ordered = sorted(spike_window)
    return ordered[len(ordered) // 2]
```
15.3 Digital Filter Selection Guide
Choosing the right filter depends on your noise characteristics and application requirements:
Quick Reference Table:
| Noise Type | Best Filter | Parameters | Use Case |
|---|---|---|---|
| Random Gaussian | Moving Average | Window = 10-20 | Temperature, humidity |
| Spikes/Outliers | Median | Window = 5-7 | Distance sensors, IR |
| High-frequency | IIR Low-pass | fc = max signal freq | Vibration, audio |
| Known statistics | Kalman | Q, R from data | IMU, tracking |
| 50/60 Hz interference | Notch filter | fn = 50 or 60 Hz | Analog sensors near AC |
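To see why the table pairs spikes with a median filter rather than a moving average, it helps to run both over the same data. The sketch below uses only the standard library; `sliding` is a hypothetical helper and the readings are made-up values with a single 250 cm spike.

```python
from statistics import mean, median


def sliding(values, size, reducer):
    """Apply reducer to each full window of `size` consecutive values."""
    return [reducer(values[i:i + size]) for i in range(len(values) - size + 1)]


readings = [5, 5, 250, 5, 6, 5, 5]  # one outlier spike

print("moving average:", sliding(readings, 5, mean))
print("median:        ", sliding(readings, 5, median))
```

Every averaged window that contains the spike is pulled far above the true distance, while the median of each window stays at 5.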
15.3.1 EMA Smoothing Factor Explorer
Adjust the EMA alpha parameter to see how it affects the time constant and responsiveness. A lower alpha gives smoother output but slower response to real changes.
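The trade-off between alpha and responsiveness can also be computed directly. For an EMA updated once per sample period \(T_s\), the time constant is \(\tau = -T_s / \ln(1 - \alpha)\). A minimal sketch, with hypothetical helper names and a step response that starts from zero for simplicity:

```python
import math


def ema_time_constant(alpha, sample_period_s):
    """Time for the EMA output to cover about 63% of a step change."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1 (exclusive)")
    return -sample_period_s / math.log(1.0 - alpha)


def ema_step_response(alpha, n_samples):
    """EMA output after each of n_samples inputs of 1.0, starting from 0.0."""
    state, outputs = 0.0, []
    for _ in range(n_samples):
        state = alpha * 1.0 + (1.0 - alpha) * state
        outputs.append(state)
    return outputs


for alpha in (0.05, 0.1, 0.5):
    tau = ema_time_constant(alpha, sample_period_s=1.0)
    print(f"alpha = {alpha}: time constant = {tau:.1f} s at 1 Hz sampling")
```

With alpha = 0.1 at 1 Hz sampling, the time constant is about 9.5 seconds, so the output needs roughly ten samples to cover two-thirds of a real change.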
15.4 Signal Conditioning Chain
The complete signal conditioning chain for a sensor transforms raw physical measurements into clean digital data:
Stage-by-Stage Explanation:
| Stage | Purpose | Key Parameters |
|---|---|---|
| Raw Sensor | Physical measurement | Sensitivity, range |
| Amplification | Scale signal to ADC range | Gain (1x-1000x) |
| Analog Filter | Remove frequencies above Nyquist | Cutoff frequency |
| Sample & Hold | Freeze signal during conversion | Acquisition time |
| ADC | Convert to digital | Resolution (bits), sample rate |
| Digital Filter | Remove noise, smooth data | Filter type, window size |
| Decimation | Reduce data rate | Decimation factor |
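The final decimation stage can be illustrated with block averaging, which smooths and reduces the data rate in one step. This is one simple approach among several, shown as a sketch; the function name is hypothetical.

```python
def decimate_by_averaging(samples, factor):
    """Average each complete block of `factor` samples into one output value.

    A trailing partial block is dropped so every output covers the same span.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor for i in range(0, usable, factor)]


print(decimate_by_averaging([1, 2, 3, 4, 5, 6, 7], 3))
```

With a decimation factor of 3, seven input samples become two outputs and the leftover seventh sample is discarded.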
Real-Time Implementation for Microcontrollers:
For small microcontrollers, keep filter state tiny:
- EMA needs one state variable and one shift value. The update is `state += (input - state) >> shift`.
- A five-sample median filter needs only a five-value ring buffer, then sorts a copy and returns the middle value.
- Avoid floating point if the MCU is slow; fixed-point integer arithmetic is usually enough for sensor smoothing.
Optional Fixed-Point EMA Pattern
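```c
#include <stdint.h>

static int32_t state = 0;    // filtered value
static const int shift = 3;  // example value: alpha = 1 / 2^shift = 0.125

int32_t ema_update(int32_t input) {
    state += (input - state) >> shift;  // assumes arithmetic right shift for negative differences
    return state;
}
```
Performance Comparison (ESP32, 240MHz):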
| Filter | Execution Time | RAM Usage | Latency (samples) |
|---|---|---|---|
| Moving Average (N=10) | 0.8 us | 40 bytes | 5 |
| EMA (fixed-point) | 0.2 us | 8 bytes | ~3 |
| Median (N=5) | 1.5 us | 10 bytes | 2 |
| IIR Low-pass | 0.3 us | 8 bytes | ~2 |
| Kalman (1D) | 2.0 us | 16 bytes | ~1 |
15.4.1 Voltage Level Mismatch
Pitfall: Voltage Level Mismatch Between Sensor and Microcontroller
The Mistake: Connecting 5V sensor outputs directly to 3.3V microcontroller inputs, or powering 3.3V sensors from 5V rails.
The Fix: Always verify voltage compatibility and use level shifting when needed:
- 5V sensor to 3.3V MCU: Voltage divider (10k in series and 20k to ground gives about 3.3V from 5V) or bidirectional level shifter (BSS138-based)
- 3.3V sensor to 5V MCU: Usually OK for digital, but use level shifter for reliable operation
- I2C level shifting: Use dedicated I2C level shifters (PCA9306, TXS0102) – avoid TXB0104 which is push-pull only and incompatible with I2C open-drain signaling
Specific examples:
- ESP32 GPIO absolute max: 3.6V. 5V input = instant damage
- Raspberry Pi GPIO: 3.3V max. 5V input damages SOC
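- Arduino Uno: 5V logic, so 5V inputs are safe, but the default analog reference is also 5V, so a 3.3V sensor output uses only about two-thirds of the ADC range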
15.4.1.1 Voltage Divider Calculator
Design a resistive voltage divider for level shifting. The output voltage is \(V_{out} = V_{in} \times \frac{R_2}{R_1 + R_2}\).
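The divider formula translates directly into code. The sketch below uses hypothetical helper names and assumes an unloaded divider, where the input being driven draws negligible current.

```python
def divider_output(v_in, r1_ohm, r2_ohm):
    """Output voltage of an unloaded divider: R1 in series, R2 to ground."""
    return v_in * r2_ohm / (r1_ohm + r2_ohm)


def r2_for_target(v_in, v_out, r1_ohm):
    """R2 value that produces v_out from v_in for a chosen R1."""
    if not 0 < v_out < v_in:
        raise ValueError("v_out must be between 0 and v_in")
    return r1_ohm * v_out / (v_in - v_out)


print(f"{divider_output(5.0, 10_000, 20_000):.2f} V")
print(f"{r2_for_target(5.0, 3.3, 10_000):.0f} ohm")
```

With R1 = 10 kΩ and R2 = 20 kΩ, a 5 V input gives about 3.33 V. An exact 3.3 V target with the same R1 needs roughly 19.4 kΩ, which is why the standard 20 kΩ value is the usual choice.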
15.4.2 Self-Heating Errors
Pitfall: Sensor Self-Heating Causing Temperature Errors
The Mistake: Continuously powering temperature sensors and taking rapid readings without accounting for self-heating.
The Fix: Implement duty-cycled sensing with thermal recovery time:
- DHT22: Power consumption 1.5 mW during measurement. Allow minimum 2 seconds between readings (datasheet requirement). Self-heating error: ~0.3 °C with continuous polling
- DS18B20: 1.5 mA active current at 5 V = 7.5 mW. Use 750 ms conversion time, then power down. Self-heating: ~0.1 °C with 1 Hz sampling
- NTC Thermistors: Self-heating = \(I^2 \times R\). With 10 k\(\Omega\) thermistor at 100 \(\mu\)A: P = 0.1 mW (negligible). At 1 mA: P = 10 mW (significant)
Best practice: Power sensor only during measurement. If continuous monitoring needed, use 10-second intervals minimum for temperature sensors.
15.4.2.1 Self-Heating Power Calculator
Calculate the self-heating power dissipation for resistive sensors (thermistors, RTDs, strain gauges).
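The \(I^2 \times R\) calculation is easy to script. In the sketch below, the helper names are hypothetical, and the dissipation constant (in mW/°C, normally taken from the sensor datasheet) is an illustrative assumption rather than a value for any specific part.

```python
def self_heating_mw(current_a, resistance_ohm):
    """Power dissipated in a resistive sensor, in milliwatts (P = I^2 * R)."""
    return current_a ** 2 * resistance_ohm * 1000.0


def temperature_rise_c(power_mw, dissipation_mw_per_c):
    """Approximate self-heating error given the datasheet dissipation constant."""
    return power_mw / dissipation_mw_per_c


for current_a in (100e-6, 1e-3):
    power = self_heating_mw(current_a, 10_000)
    # 2.0 mW/°C is an assumed, illustrative dissipation constant
    rise = temperature_rise_c(power, dissipation_mw_per_c=2.0)
    print(f"{current_a * 1e6:.0f} uA: {power:.2f} mW, about {rise:.2f} °C rise")
```

The power figures match the thermistor example above: 0.1 mW at 100 µA and 10 mW at 1 mA for a 10 kΩ thermistor. The temperature rise depends entirely on the dissipation constant you supply.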
15.5 Summary
Key signal processing takeaways:
| Concept | Rule | Related To | Why It Matters |
|---|---|---|---|
| Sampling | Sample at >2x highest frequency | Aliasing | Prevents false low-frequency patterns |
| Outlier Rejection | Use median filter (window 5-7) | Spike noise | Sorts values, picks middle, rejects isolated spikes |
| Noise Smoothing | Use moving average or EMA | Gaussian noise | Reduces random noise while preserving trends |
| Calibration | Recalibrate periodically (quarterly) | Sensor aging | Temperature cycling and contamination degrade accuracy |
| Voltage Matching | Use level shifters for 5V to 3.3V | GPIO protection | BSS138 or resistor divider prevents MCU damage |
Critical Mistakes to Avoid
- Never sample below Nyquist rate – You will see phantom frequencies that do not exist
- Never use raw ADC values for control – Noise causes oscillation and false triggers
- Never connect 5V sensors to 3.3V MCUs without protection – Instant, permanent damage
- Never assume factory calibration lasts forever – Drift is inevitable; recalibrate quarterly
15.6 Try It Yourself
Test your understanding by implementing a filter from scratch:
Exercise: Implement a Median Filter
Challenge: Write a median filter function that removes outlier spikes from ultrasonic distance sensor readings.
Given: An ultrasonic sensor outputs: [5, 5, 250, 5, 6, 5, 180, 5, 6, 5] cm
Your task: Implement a median filter with window size 5 that removes the 250cm and 180cm spikes.
Solution:
```python
from collections import deque

class MedianFilter:
    def __init__(self, window_size=5):
        self.buffer = deque(maxlen=window_size)

    def filter(self, value):
        self.buffer.append(value)
        # Until the window is full, pass the raw value through
        if len(self.buffer) < self.buffer.maxlen:
            return value
        ordered = sorted(self.buffer)
        return ordered[len(ordered) // 2]

readings = [5, 5, 250, 5, 6, 5, 180, 5, 6, 5]
mf = MedianFilter(window_size=5)
for reading in readings:
    print(reading, "->", mf.filter(reading))
```
Once the window is full, the filter sorts [5, 5, 250, 5, 6] into [5, 5, 5, 6, 250] and returns the middle value (5), rejecting the 250 cm outlier. During the fill-up phase, unfiltered values pass through (here the first four readings, including the 250 cm spike itself); production code should wait until the buffer is full before acting on readings.
Common Pitfalls
1. Filter Window Too Large for Fast-Changing Signals
A moving average window sized for a slow temperature sensor will blur rapid vibration or impact events into meaningless flat lines. Match the filter window to the expected rate of change of the measured quantity — use a short window (3-5 samples) for fast signals and a longer window (10-50 samples) for slow, noisy signals.
2. Aliasing from Insufficient Sampling Rate
Sampling a 50 Hz vibration signal at 60 Hz (below the 2x = 100 Hz Nyquist minimum) creates a false 10 Hz component in the output that does not exist in the real signal. Always sample at more than 2x the highest frequency component present in the sensor signal, and use an anti-aliasing filter before the ADC.
3. Calibrating After Filtering When the Opposite is Required
Some calibration algorithms (like two-point linear calibration) should be applied to raw ADC values before filtering, while others work better on filtered values. Define and document which stage of the processing pipeline calibration occurs at, and be consistent between calibration capture and normal operation.
4. Applying Median Filter to Streaming Data Without a Buffer
The median filter requires storing a window of recent samples to find the middle value. Implementing it without a properly sized ring buffer causes it to compare only newly arrived samples, defeating its outlier rejection purpose. Pre-allocate the full filter window buffer and initialize it before beginning normal sensor operation.
15.7 What’s Next
Now that you can apply signal processing techniques to sensor data: