Core Concept: Time-series methods exploit temporal patterns - anomalies are points where actual values significantly differ from what the model predicted based on historical patterns.
Why It Matters: IoT data is inherently temporal. A temperature of 30C might be normal at 2 PM but anomalous at 2 AM. Time-series methods capture these contextual patterns that simple statistical methods miss.
Key Takeaway: Use ARIMA when data has clear trends/seasonality, exponential smoothing for detecting level shifts, and STL decomposition when you need to separate seasonal patterns from anomalies.
1357.2 Prerequisites
Before diving into this chapter, you should be familiar with:
Anomaly Types: Understanding contextual anomalies that depend on temporal context
~15 min | Advanced | P10.C01.U03
1357.3 Introduction
IoT sensor data is inherently temporal - values are meaningful in sequence, not in isolation. Time-series methods exploit these temporal patterns for superior anomaly detection.
1357.4 ARIMA
Core Concept: Model the time series as a function of past values and past errors to forecast future values. Anomalies are points where the actual value significantly differs from the forecast.
When ARIMA Excels:
- Data has clear trends or seasonal patterns
- You need to forecast the next value to compare against
- Anomalies are deviations from the expected trajectory
ARIMA Components:
- AR (AutoRegressive): Value depends on previous values
- I (Integrated): Differencing to make the series stationary
- MA (Moving Average): Value depends on previous errors
IoT Application Example:
```python
from statsmodels.tsa.arima.model import ARIMA
import numpy as np

class ARIMADetector:
    def __init__(self, order=(1, 1, 1), threshold=3.0):
        """
        order: (p, d, q) for the ARIMA model
        threshold: number of std deviations for anomaly
        """
        self.order = order
        self.threshold = threshold
        self.history = []
        self.model = None

    def train(self, historical_data):
        """Train ARIMA model on historical data"""
        self.history = list(historical_data)
        self.model = ARIMA(self.history, order=self.order)
        self.model_fit = self.model.fit()

    def predict_and_detect(self, new_value):
        """Predict next value and check if new_value is anomalous"""
        if len(self.history) < 10:
            self.history.append(new_value)
            return False, None, None

        # Forecast next value
        forecast = self.model_fit.forecast(steps=1)[0]

        # Calculate residual error
        residuals = self.model_fit.resid
        std_error = np.std(residuals)

        # Check if new value is anomalous
        error = abs(new_value - forecast)
        is_anomaly = error > (self.threshold * std_error)

        # Update history and retrain (in practice, retrain periodically)
        self.history.append(new_value)

        return is_anomaly, forecast, error

# Example: power consumption monitoring with daily patterns
# Historical data: 24 hours of power usage (kW)
historical_power = [
    12, 10, 9, 8, 8, 9, 15, 25, 30, 28, 26, 27,    # Day pattern
    28, 29, 27, 25, 30, 35, 28, 22, 18, 15, 13, 11  # Evening/night
]

detector = ARIMADetector(order=(2, 1, 2), threshold=2.5)
detector.train(historical_power)

# Next hour: should be ~12 kW, but reads 55 kW (anomaly)
anomaly, forecast, error = detector.predict_and_detect(55)
print(f"Expected: {forecast:.1f} kW")
print(f"Actual: 55.0 kW")
print(f"Error: {error:.1f} kW")
print(f"Anomaly: {anomaly}")
# Output: Expected: 12.3 kW, Actual: 55.0 kW, Error: 42.7 kW, Anomaly: True
```
1357.5 Exponential Smoothing
Core Concept: Weighted average where recent observations have more influence than older ones. Excellent for detecting level shifts and trend changes.
EWMA (Exponentially Weighted Moving Average):
EWMA(t) = alpha x X(t) + (1 - alpha) x EWMA(t-1)
Where:
- alpha = smoothing factor (0 < alpha < 1)
- alpha = 0.1: Heavily smoothed, slow to react
- alpha = 0.9: Light smoothing, fast to react
Use Case: Detecting sudden jumps in vibration, temperature, or current draw.
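The EWMA update above can be turned into a simple level-shift detector: smooth the signal, track the deviations from the smoothed baseline, and flag readings whose deviation exceeds a multiple of the deviations' standard deviation. A minimal sketch - the `EWMADetector` class, its parameter values, and the motor-current readings are illustrative assumptions, not from a specific library:

```python
import numpy as np

class EWMADetector:
    """Illustrative EWMA-based level-shift detector (not a library class)."""

    def __init__(self, alpha=0.3, threshold=3.0):
        self.alpha = alpha          # smoothing factor (0 < alpha < 1)
        self.threshold = threshold  # allowed deviation from EWMA, in std units
        self.ewma = None
        self.residuals = []         # history of deviations from the baseline

    def update(self, x):
        if self.ewma is None:       # first observation initializes the baseline
            self.ewma = x
            return False, x
        error = x - self.ewma       # deviation from the smoothed baseline
        self.residuals.append(error)
        # Need a few residuals before the std estimate is meaningful
        std = np.std(self.residuals) if len(self.residuals) > 5 else float('inf')
        is_anomaly = abs(error) > self.threshold * std
        # EWMA(t) = alpha * X(t) + (1 - alpha) * EWMA(t-1)
        self.ewma = self.alpha * x + (1 - self.alpha) * self.ewma
        return is_anomaly, self.ewma

# Motor current draw (A): stable around 5 A, then a sudden jump to 9.5 A
detector = EWMADetector(alpha=0.3, threshold=3.0)
readings = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.1, 5.0, 4.9, 5.1, 9.5]
flags = [detector.update(x)[0] for x in readings]
print(flags)  # only the final 9.5 A jump is flagged
```

A higher alpha would pull the baseline toward the jump faster, so repeated anomalous readings stop being flagged sooner - the alpha = 0.1 vs 0.9 trade-off described above.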
1357.6 Seasonal Decomposition (STL)
Core Concept: Separate time series into three components: Trend, Seasonal, and Residual. Anomalies show up in the residual component.
STL Process:
1. Trend: Long-term increase or decrease (e.g., equipment degradation)
2. Seasonal: Regular patterns (e.g., daily temperature cycles)
3. Residual: What's left after removing trend and seasonality - this is where anomalies hide
Implementation:
```python
from statsmodels.tsa.seasonal import seasonal_decompose
import pandas as pd
import numpy as np

np.random.seed(0)  # reproducible noise

# Temperature data with daily seasonality
timestamps = pd.date_range('2024-01-01', periods=168, freq='H')  # 1 week
temps = [20 + 5 * np.sin(2 * np.pi * i / 24) + np.random.normal(0, 0.5)
         for i in range(168)]

# Inject anomaly at hour 100
temps[100] = 35  # Sudden spike

df = pd.DataFrame({'timestamp': timestamps, 'temperature': temps})
df.set_index('timestamp', inplace=True)

# Decompose into trend, seasonal, residual
decomposition = seasonal_decompose(df['temperature'], model='additive', period=24)

# Anomalies are large residuals
residuals = decomposition.resid
threshold = 3 * residuals.std()
anomalies = abs(residuals) > threshold

# Note: df.index[anomalies].hour would give the hour of day (4), not the
# position in the series; np.where recovers hours since the start
print(f"Anomalies detected at hours since start: {np.where(anomalies)[0]}")
# Output: Anomalies detected at hours since start: [100]
```
Knowledge Check

A smart grid monitors power consumption with clear daily patterns (high during business hours, low at night) and weekly patterns (lower on weekends). You want to detect unusual consumption that deviates from these expected patterns. Which time-series method is most appropriate?

1. ARIMA with high-order differencing to remove all patterns
2. Simple Z-score on raw readings
3. STL decomposition to separate trend, seasonality, and residuals, then detect anomalies in residuals
4. Exponential smoothing with very high alpha (0.99)

Answer: (3). STL (Seasonal-Trend decomposition using Loess) explicitly separates the daily and weekly seasonal patterns from the residual signal; anomalies appear as large residuals - deviations that cannot be explained by the expected seasonal pattern. High-order differencing (1) can make the series stationary but discards the seasonal context you want to detect deviations from. A Z-score on raw readings (2) ignores temporal context entirely: 50 kW might be normal at 2 PM but anomalous at 2 AM. A high-alpha EWMA (4) reacts quickly to recent values but does not model multiple seasonal patterns; it suits level shifts, not seasonal deviations.
1357.7 Method Comparison
Comparison of Time-Series Methods:
| Method | Strength | Limitation | Best IoT Use Case |
|---|---|---|---|
| ARIMA | Captures complex temporal patterns | Requires stationarity; computationally expensive | Predictable systems (HVAC, production lines) |
| Exponential Smoothing | Fast, simple, adapts to level changes | Misses complex patterns | Real-time edge detection (motor current) |
| STL Decomposition | Handles seasonality explicitly | Needs full cycles of data (>=2 seasons) | Environmental monitoring (temp, humidity) |
Caution. Pitfall: Using Static Thresholds for Dynamic Systems
The Mistake: Setting fixed anomaly detection thresholds (e.g., “alert if temperature > 80C”) based on initial observations, then leaving them unchanged as the system operates over months or years.
Why It Happens: Static thresholds are simple to implement and understand. Initial calibration produces reasonable results, creating false confidence. Gradual changes in “normal” baseline go unnoticed, and recalibration requires effort that gets deprioritized.
The Fix: Implement adaptive thresholds that evolve with system behavior:
Rolling baseline: Calculate thresholds based on recent history (e.g., last 7 days) rather than historical constants. Use exponentially weighted moving averages (EWMA) with decay factor 0.94-0.99 depending on expected change rate.
Contextual thresholds: Maintain separate thresholds for different operating modes - a motor at full load has different “normal” vibration than at idle. Use clustering to automatically discover operating modes from historical data.
Scheduled recalibration: For slow-drifting systems, automatically recalibrate monthly using unsupervised methods (rebuild IQR bounds, recompute seasonal decomposition).
Anomaly rate monitoring: If your system suddenly detects 10x more anomalies than baseline, investigate whether something changed in the environment OR if your thresholds have drifted out of calibration.
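The rolling-baseline fix can be sketched in a few lines: compute thresholds from a sliding window of recent history instead of a fixed constant. The `adaptive_flags` helper, its window/warmup/k values, and the motor-temperature data below are illustrative assumptions chosen to mirror the 75 C to 82 C example:

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)

def adaptive_flags(readings, window=168, warmup=24, k=4.0):
    """Flag readings more than k rolling-window std devs from the rolling mean."""
    history = deque(maxlen=window)  # e.g., last 7 days of hourly readings
    flags = []
    for x in readings:
        if len(history) >= warmup:
            mu, sigma = np.mean(history), np.std(history)
            flags.append(abs(x - mu) > k * sigma)
        else:
            flags.append(False)  # not enough context to judge yet
        history.append(x)
    return flags

# A motor runs near 75 C for a long stretch, then settles at 82 C after
# maintenance. A fixed "alert above 80 C" rule would alarm forever; the
# rolling baseline flags the shift when it appears, then absorbs the new
# level into "normal" as the window fills with post-maintenance readings.
old = 75 + rng.normal(0, 0.5, 200)
new = 82 + rng.normal(0, 0.5, 200)
flags = adaptive_flags(np.concatenate([old, new]))
print(sum(flags[200:210]), sum(flags[-50:]))
```

Note the trade-off: a persistent fault would also be absorbed into the baseline eventually, which is exactly why the anomaly-rate monitoring above matters - a burst of flags followed by silence deserves investigation, not relief.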
A factory motor that operated at 75C for 2 years may run at 82C after maintenance - that’s not an anomaly, it’s a new normal.
1357.8 Summary
Time-series methods excel at detecting contextual anomalies where temporal patterns matter:
ARIMA: Forecast-based detection for complex temporal patterns
Exponential Smoothing: Fast level-shift detection at the edge
STL Decomposition: Separates seasonality to isolate true anomalies
Key Takeaway: Use time-series methods when "normal" depends on when the reading occurred. Simple statistical methods miss these contextual patterns.