545 Sensor Signal Processing and Filtering

Learning Objectives

After completing this chapter, you will be able to:

Understand the Nyquist sampling theorem and avoid aliasing
Apply digital filters (moving average, EMA, median) to sensor data
Implement hardware and software signal conditioning
Detect and compensate for sensor calibration drift
Choose appropriate filters for different noise types

545.1 Prerequisites

Sensor Introduction: Basic sensor concepts
Sensor Specifications: Accuracy, precision, resolution

545.2 Sensor Data Acquisition Pitfalls

These common mistakes cause incorrect sensor readings, false alarms, and wasted development time.

545.3 Common Pitfall: Ignoring Nyquist Sampling Rate (Aliasing)

The Mistake

Sampling a signal at less than twice its highest frequency component, which makes false low-frequency patterns appear in the data.

Symptoms:

False patterns appear in the data that do not exist in the real signal
High-frequency events (vibrations, transients) are missed entirely
Frequency analysis shows phantom signals at incorrect frequencies

Why it happens: Developers often choose sample rates based on storage or bandwidth constraints rather than signal characteristics, overlooking the Nyquist requirement that the sample rate exceed twice the highest frequency present in the signal.

How to diagnose:

1. Identify the highest frequency component in your signal of interest
2. Compare your sample rate to 2x that frequency
3. Look for patterns that vary with sample rate changes
4. Use an oscilloscope or high-speed capture to see the true signal
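The comparison in the diagnosis steps can be made concrete with a small helper that predicts where a pure tone will appear after sampling. This is a minimal sketch; the function name `alias_frequency` is illustrative and not part of any library.

```python
def alias_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling (folded into 0..fs/2)."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    folded = signal_hz % sample_rate_hz          # position within one sampling band
    return min(folded, sample_rate_hz - folded)  # reflect into the 0..fs/2 range


print(alias_frequency(60, 50))    # 60 Hz tone sampled at 50 Hz  -> 10
print(alias_frequency(60, 200))   # 60 Hz tone sampled at 200 Hz -> 60
```

If the predicted frequency differs from the true one, the chosen sample rate is too low for that signal.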

The fix:

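import time

# NOTE: `sensor` stands for your platform's sensor driver object.

# WRONG: Sampling a 60 Hz vibration at 50 Hz
# Result: a false 10 Hz signal appears (aliasing: |60 - 50| = 10 Hz)
while True:
    reading = sensor.read()
    time.sleep(0.02)  # ~50 Hz sample rate - causes aliasing!

# CORRECT: Sample at more than 2x the highest frequency of interest
# For a 60 Hz signal, sample above 120 Hz (200 Hz recommended)
while True:
    reading = sensor.read()
    time.sleep(0.005)  # ~200 Hz sample rate (sleep-based timing is approximate)

# BEST: Add an anti-aliasing filter in hardware, before the ADC.
# An analog low-pass filter (e.g. an RC stage) with its cutoff below the
# Nyquist frequency (sampling_rate / 2) removes content that would alias.
# Aliasing cannot be undone in software after sampling, so the code is unchanged:
while True:
    reading = sensor.read()  # signal is already band-limited by the analog filter
    time.sleep(0.005)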

Prevention: Sample at >2x the highest frequency of interest (4-10x is better practice). Use hardware anti-aliasing filters (low-pass RC filter) before the ADC.
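To size the RC anti-aliasing filter mentioned above, the standard first-order relation fc = 1 / (2πRC) can be used. The sketch below assumes a single-stage passive RC filter; the helper names and the component values in the example are illustrative only.

```python
import math


def rc_cutoff_hz(resistance_ohm, capacitance_farad):
    """-3 dB cutoff frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_farad)


def capacitor_for_cutoff(resistance_ohm, cutoff_hz):
    """Capacitance (farads) that gives the requested cutoff with a given resistor."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * cutoff_hz)


# Example values (assumed): 10 kOhm with 100 nF
print(round(rc_cutoff_hz(10e3, 100e-9), 1))  # about 159.2 Hz
```

A first-order RC stage rolls off at only 20 dB per decade, so it helps to place the cutoff well below half the sample rate rather than exactly at it.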

Knowledge Check

Question: A vibration sensor on an industrial motor detects bearing defects at 120 Hz. You're sampling at 100 Hz and seeing strange 20 Hz patterns in your data that don't match the motor's actual behavior. What's happening?

A. The sensor is broken and needs replacement
Incorrect. The sensor is likely working fine. The problem is in your sampling approach, not the hardware.

B. Aliasing - the 120 Hz signal appears as 20 Hz because you're sampling below Nyquist
Correct. When you sample a 120 Hz signal at 100 Hz (below the 240 Hz Nyquist rate), aliasing creates a phantom signal at |120-100| = 20 Hz. You need to sample at least 240 Hz (and ideally 480+ Hz) to capture the true signal.

C. The motor has a secondary vibration at 20 Hz
Incorrect. While motors can have multiple vibration frequencies, the 20 Hz pattern appearing specifically when sampling at 100 Hz is the signature of aliasing, not a real physical phenomenon.

D. Electrical interference from the motor's power supply
Incorrect. Power supply interference typically appears at 50/60 Hz (mains frequency) or its harmonics, not at 20 Hz. This is classic aliasing behavior.

Difficulty: hard

545.4 Common Pitfall: Ignoring Sensor Calibration Drift

The Mistake

Using factory calibration forever, assuming sensors maintain accuracy over time.

Symptoms:

Gradual accuracy degradation over weeks/months
Systematic bias in readings (consistently high or low)
False anomaly detections as sensors drift past thresholds

Why it happens: Sensors drift due to aging, temperature cycling, humidity exposure, and contamination. Factory calibration is a snapshot that degrades over time.

The fix:

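from datetime import datetime

# NOTE: `adc`, FACTORY_SCALE and FACTORY_OFFSET come from your platform code.

# WRONG: Using factory calibration forever
def read_temperature():
    return adc.read() * FACTORY_SCALE + FACTORY_OFFSET

# CORRECT: Track and compensate for drift
class CalibratedSensor:
    def __init__(self, drift_rate_per_day=0.01):
        # drift_rate_per_day: correction added per day since calibration
        # (the negative of the drift you measured for this sensor)
        self.last_calibration = datetime.now()
        self.drift_rate = drift_rate_per_day
        self.scale = FACTORY_SCALE
        self.offset = FACTORY_OFFSET

    def read_raw(self):
        return adc.read()

    def read_temperature(self):
        raw = self.read_raw()
        # Use fractional days; timedelta.days would truncate to whole days
        elapsed = datetime.now() - self.last_calibration
        days_since_cal = elapsed.total_seconds() / 86400
        drift_compensation = self.drift_rate * days_since_cal
        return raw * self.scale + self.offset + drift_compensation

    def recalibrate(self, reference_value):
        """Call while the sensor is held at a known reference temperature."""
        actual = self.read_raw()
        self.offset = reference_value - (actual * self.scale)
        self.last_calibration = datetime.now()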

Prevention: Implement periodic recalibration procedures (quarterly for many sensors). Use reference sensors for cross-validation.
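Cross-validation against a reference sensor can be as simple as comparing paired readings and flagging the sensor when the average difference exceeds a tolerance. The following is a rough sketch; the function name, the sample readings, and the 0.3 degree tolerance are assumptions chosen for illustration.

```python
def drift_check(sensor_readings, reference_readings, tolerance):
    """Return (mean_offset, needs_recalibration) from paired readings."""
    if len(sensor_readings) != len(reference_readings) or not sensor_readings:
        raise ValueError("need equal-length, non-empty reading lists")
    diffs = [s - r for s, r in zip(sensor_readings, reference_readings)]
    mean_offset = sum(diffs) / len(diffs)
    return mean_offset, abs(mean_offset) > tolerance


# Sensor under test reads about 0.5 degrees above the reference
offset, needs_recal = drift_check([25.6, 25.4, 25.5], [25.0, 25.0, 25.0], tolerance=0.3)
print(round(offset, 2), needs_recal)
```

Averaging several pairs keeps random noise in either sensor from triggering an unnecessary recalibration.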

545.5 Common Pitfall: Using Raw Sensor Data Without Filtering

The Mistake

Using raw ADC values directly for decisions, displays, and storage without noise filtering.

Symptoms:

Noisy, jumpy dashboards and charts
False anomaly alerts from noise spikes
Incorrect trend analysis due to noise obscuring patterns
Unstable control loops that oscillate

The fix:

from collections import deque

# WRONG: Using raw ADC values directly
while True:
    temp = adc.read()
    if temp > THRESHOLD:
        alert()  # Noise spike causes false alert!
    display(temp)  # Noisy, jumpy display

# CORRECT: Apply moving average filter
readings = deque(maxlen=10)

def read_filtered():
    raw = adc.read()
    readings.append(raw)
    return sum(readings) / len(readings)

# BETTER: Exponential moving average (stores one value; alpha trades smoothing for responsiveness)
class EMAFilter:
    def __init__(self, alpha=0.1):
        self.alpha = alpha  # 0.1 = smooth, 0.5 = responsive
        self.ema = None

    def filter(self, value):
        if self.ema is None:
            self.ema = value
        else:
            self.ema = self.alpha * value + (1 - self.alpha) * self.ema
        return self.ema

# BEST: Median filter for outlier rejection
def median_filter(new_value, buffer, size=5):
    buffer.append(new_value)
    if len(buffer) > size:
        buffer.pop(0)
    return sorted(buffer)[len(buffer) // 2]
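To see how the three filters differ on a single spike, the short comparison below runs them over the same five readings. It is a standalone sketch using Python's `statistics` module; the readings and `alpha = 0.1` are example values.

```python
from statistics import mean, median

window = [5, 5, 250, 5, 5]   # cm; one spurious reading among steady ones

print(mean(window))    # moving average over the window -> 54
print(median(window))  # median over the same window    -> 5

# EMA seeded with the first reading, alpha = 0.1
alpha = 0.1
ema = window[0]
ema_trace = []
for x in window:
    ema = alpha * x + (1 - alpha) * ema
    ema_trace.append(round(ema, 2))
print(ema_trace)       # jumps at the spike, then decays slowly
```

The median ignores the spike entirely, while both averaging filters carry part of it into their output.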

545.6 Digital Filter Selection Guide

Noise Type	Best Filter	Parameters	Use Case
Random Gaussian	Moving Average	Window = 10-20	Temperature, humidity
Spikes/Outliers	Median	Window = 5-7	Distance sensors, IR
High-frequency	IIR Low-pass	fc just above the signal bandwidth	Vibration, audio
Known statistics	Kalman	Q, R from data	IMU, tracking
50/60 Hz interference	Notch filter	fn = 50 or 60 Hz	Analog sensors near AC
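The Kalman row refers to a process noise variance Q and a measurement noise variance R. For a single slowly varying quantity, the filter reduces to a few lines. This is a minimal sketch under a random-walk model; the class name and the example values for `q`, `r`, and the initial estimate are assumptions.

```python
class Kalman1D:
    """Scalar Kalman filter for a slowly varying quantity (random-walk model)."""

    def __init__(self, q, r, initial_estimate=0.0, initial_variance=1.0):
        self.q = q                    # process noise variance (Q)
        self.r = r                    # measurement noise variance (R)
        self.x = initial_estimate     # current state estimate
        self.p = initial_variance     # variance of the estimate

    def update(self, measurement):
        self.p += self.q                          # predict: uncertainty grows
        gain = self.p / (self.p + self.r)         # Kalman gain, between 0 and 1
        self.x += gain * (measurement - self.x)   # correct toward the measurement
        self.p *= (1.0 - gain)                    # uncertainty shrinks after update
        return self.x


kf = Kalman1D(q=1e-4, r=0.25, initial_estimate=20.0)
for _ in range(200):
    estimate = kf.update(25.0)
print(round(estimate, 2))
```

A larger R relative to Q gives heavier smoothing; a larger Q lets the estimate follow changes more quickly.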

Knowledge Check

Question: Your ultrasonic distance sensor occasionally produces wildly incorrect readings (e.g., 5cm, 5cm, 250cm, 5cm, 5cm) due to acoustic reflections. Which filter is BEST for removing these outliers while preserving true distance changes?

A. Moving average filter with window size 10
Incorrect. Moving average includes the outlier (250cm) in the calculation, which would significantly shift the average. For example, average of [5,5,250,5,5] = 54cm - way off from the true 5cm.

B. Median filter with window size 5
Correct. Median filter sorts the values and takes the middle one. For [5,5,250,5,5], the sorted values are [5,5,5,5,250], median = 5cm. The outlier is completely rejected without affecting the result. This is why median filters are ideal for spike/outlier rejection.

C. Exponential moving average with alpha = 0.1
Incorrect. EMA would still be influenced by the outlier: the 250cm spike would push the filtered value upward in one step, and the value would then recover only slowly, creating a distorted response.

D. Low-pass filter with 1 Hz cutoff
Incorrect. A low-pass filter attenuates the spike but smears it across several output samples instead of rejecting it, so the filtered distance is still pulled away from the true 5cm.

Difficulty: medium

545.7 Signal Conditioning Chain

The complete signal conditioning chain for a sensor:

Raw Sensor -> Amplification -> Filtering -> ADC -> Digital Filtering -> Output
              (gain)          (analog)           (decimation, noise removal)
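The decimation step in the digital filtering stage can be sketched as block averaging: oversample, then replace each group of samples with its mean. This is one simple approach among several; the function name and the choice to discard an incomplete final block are assumptions of this sketch.

```python
def decimate_by_averaging(samples, factor):
    """Average each full block of `factor` samples into one output sample."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    usable = len(samples) - len(samples) % factor   # drop the incomplete tail block
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


print(decimate_by_averaging([1, 3, 5, 7, 9, 11, 13], 2))  # [2.0, 6.0, 10.0]
```

Averaging before reducing the rate acts as a crude low-pass filter, which limits aliasing introduced by the rate reduction itself.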

Real-Time Implementation for Microcontrollers:

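// ESP32/Arduino optimized filter implementations
#include <stdint.h>
#include <string.h>  // memcpy, used by the median filter

// Fixed-point EMA (no floating point)
typedef struct {
    int32_t state;  // filtered value scaled by 2^shift (start at first_sample << shift)
    uint8_t shift;  // alpha = 1/(2^shift), e.g., shift=3 -> alpha=0.125
} ema_filter_t;

int32_t ema_filter_update(ema_filter_t *f, int32_t input) {
    // Keeping state scaled by 2^shift avoids the truncation stall of
    // state += (input - state) >> shift, which stops converging once
    // |input - state| < 2^shift. Assumes input << shift fits in int32_t.
    f->state += input - (f->state >> f->shift);
    return f->state >> f->shift;
}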

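// Ring buffer median filter
// Zero-initialize the struct before use; the first 4 outputs still include
// the initial zeros until the buffer has filled with real samples.
typedef struct {
    int16_t buffer[5];
    uint8_t index;
} median_filter_t;

int16_t median_filter_update(median_filter_t *f, int16_t input) {
    f->buffer[f->index++] = input;
    if (f->index >= 5) f->index = 0;

    // Sort a copy (a simple exchange sort is adequate for 5 elements)
    int16_t sorted[5];
    memcpy(sorted, f->buffer, sizeof(sorted));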
    for (int i = 0; i < 4; i++) {
        for (int j = i+1; j < 5; j++) {
            if (sorted[i] > sorted[j]) {
                int16_t tmp = sorted[i];
                sorted[i] = sorted[j];
                sorted[j] = tmp;
            }
        }
    }
    return sorted[2];  // Middle element
}

Performance Comparison (ESP32, 240MHz):

Filter	Execution Time	RAM Usage	Latency (samples)
Moving Average (N=10)	0.8 us	40 bytes	5
EMA (fixed-point)	0.2 us	8 bytes	~3
Median (N=5)	1.5 us	10 bytes	2.5
IIR Low-pass	0.3 us	8 bytes	~2
Kalman (1D)	2.0 us	16 bytes	~1

545.8 Voltage Level Mismatch Protection

Pitfall: Voltage Level Mismatch Between Sensor and Microcontroller

The Mistake: Connecting 5V sensor outputs directly to 3.3V microcontroller inputs, or powering 3.3V sensors from 5V rails.

The Fix: Always verify voltage compatibility and use level shifting when needed:

5V sensor to 3.3V MCU: Voltage divider (10k in series, 20k to ground gives about 3.3V from 5V) or bidirectional level shifter (BSS138-based)
3.3V sensor to 5V MCU: Usually OK for digital, but use level shifter for reliable operation
I2C level shifting: Use level shifters designed for open-drain buses (PCA9306, TXS0102). Push-pull translators such as the TXB0104 are not suited to I2C

Specific examples:

ESP32 GPIO absolute max: 3.6V. A 5V input can permanently damage the pin
Raspberry Pi GPIO: 3.3V max. A 5V input can damage the SoC
Arduino Uno: 5V logic, so 5V sensor outputs connect directly; the default analog reference is also 5V
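The divider figure quoted above follows from the usual relation Vout = Vin × R_bottom / (R_top + R_bottom). A quick check, with an illustrative helper name, confirms that the 10k/20k pair keeps a 5V signal under the 3.6V limit:

```python
def divider_output(v_in, r_top_ohm, r_bottom_ohm):
    """Unloaded output voltage of a two-resistor divider."""
    return v_in * r_bottom_ohm / (r_top_ohm + r_bottom_ohm)


v_out = divider_output(5.0, 10_000, 20_000)
print(round(v_out, 2))   # 3.33
print(v_out <= 3.6)      # True: below the ESP32 absolute maximum
```

The calculation ignores the load of the MCU input, which is usually acceptable for high-impedance GPIO pins but slows fast edges because of the added series resistance.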

545.9 Self-Heating Errors

Pitfall: Sensor Self-Heating Causing Temperature Errors

The Mistake: Continuously powering temperature sensors and taking rapid readings without accounting for self-heating.

The Fix: Implement duty-cycled sensing with thermal recovery time:

DHT22: Power consumption 1.5 mW during measurement. Allow a minimum of 2 seconds between readings (datasheet requirement). Self-heating error: ~0.3°C with continuous polling
DS18B20: 1.5 mA active current at 5 V = 7.5 mW. Use the 750 ms conversion time, then power down. Self-heating: ~0.1°C with 1 Hz sampling
NTC thermistors: Self-heating power = I² × R. With a 10 kΩ thermistor at 100 µA: P = 0.1 mW (negligible). At 1 mA: P = 10 mW (significant)
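The power figures above can be reproduced with two one-line formulas, P = I² × R for a resistive sensor and P = V × I for a powered device. The helper names below are illustrative.

```python
def resistive_power_mw(current_a, resistance_ohm):
    """Power dissipated in a resistive sensor, in milliwatts (P = I^2 * R)."""
    return current_a ** 2 * resistance_ohm * 1000.0


def supply_power_mw(current_a, voltage_v):
    """Power drawn from the supply, in milliwatts (P = V * I)."""
    return current_a * voltage_v * 1000.0


print(round(resistive_power_mw(100e-6, 10e3), 3))  # 10 kOhm thermistor at 100 uA -> 0.1
print(round(resistive_power_mw(1e-3, 10e3), 3))    # same thermistor at 1 mA      -> 10.0
print(round(supply_power_mw(1.5e-3, 5.0), 3))      # DS18B20 at 1.5 mA, 5 V       -> 7.5
```

Because dissipation grows with the square of the current, a tenfold increase in excitation current raises thermistor self-heating a hundredfold.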

Best practice: Power the sensor only during measurement. If continuous monitoring is needed, use intervals of at least 10 seconds for temperature sensors.
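One way to enforce a minimum interval in software is to wrap the sensor read so that calls arriving too early return the previous value instead of triggering a new measurement. This is a sketch only: `DutyCycledSensor` and `read_fn` are hypothetical names, and powering the sensor up and down is left to whatever `read_fn` does on your hardware.

```python
import time


class DutyCycledSensor:
    """Limit how often the underlying sensor is actually read."""

    def __init__(self, read_fn, min_interval_s=10.0, clock=time.monotonic):
        self._read_fn = read_fn              # function that powers up and reads the sensor
        self._min_interval_s = min_interval_s
        self._clock = clock                  # injectable for testing
        self._last_time = None
        self._last_value = None

    def read(self):
        now = self._clock()
        due = self._last_time is None or now - self._last_time >= self._min_interval_s
        if due:
            self._last_value = self._read_fn()   # real measurement
            self._last_time = now
        return self._last_value                  # cached value if called too soon


sensor = DutyCycledSensor(lambda: 21.5, min_interval_s=10.0)
print(sensor.read())   # first call always measures
print(sensor.read())   # immediate second call returns the cached value
```

Using `time.monotonic` rather than wall-clock time keeps the interval correct even if the system clock is adjusted.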

545.10 Summary

Key signal processing takeaways:

Sample at more than 2x the highest frequency of interest to avoid aliasing
Use median filters for outlier rejection
Use moving average or EMA for general noise smoothing
Calibrate regularly to compensate for drift
Match voltage levels between sensors and microcontrollers

545.11 What’s Next

Now that you understand signal processing:

To learn sensor types: Sensor Classification - Categories and output types
To understand calibration: Calibration Techniques - How to calibrate sensors
To avoid mistakes: Common Mistakes - Top 10 pitfalls
