50 Edge Computing: Processing Patterns
50.1 Learning Objectives
By the end of this chapter, you will be able to:
- Apply Four Edge Processing Patterns: Implement Filter, Aggregate, Infer, and Store-Forward patterns for different IoT scenarios
- Select Optimal Patterns: Match processing patterns to application requirements based on latency, bandwidth, and reliability needs
- Evaluate Trade-offs: Compare edge ML inference versus cloud ML, and batch processing versus real-time streaming
- Avoid Common Mistakes: Understand when edge processing saves costs versus when it adds unnecessary complexity
Key Concepts
- In-network processing: Executing data transformations within the network infrastructure (on switches, gateways, or routers) rather than at endpoints, reducing end-to-end latency and backbone bandwidth usage.
- Stream operator: A discrete processing step in a streaming pipeline (filter, map, aggregate, join) that transforms one stream into another; composing stream operators builds a complete edge processing pipeline.
- Event windowing: Grouping stream events into finite sets for batch processing based on time (tumbling windows, sliding windows) or event count (count windows) before applying aggregation or detection operators.
- Stateful processing: Stream processing that maintains state between events (running averages, counters, session trackers) in contrast to stateless operations (format conversion, field extraction) that treat each event independently.
- Processing latency budget: The maximum time allowed for all edge processing steps combined before data must be acted upon or forwarded, constraining the complexity and number of processing operators that can be applied.
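As a minimal sketch of event windowing and stateful processing (class and method names here are illustrative, not from any standard library), a count-based tumbling window buffers events and emits one aggregate per full window:

```python
from dataclasses import dataclass, field

@dataclass
class TumblingWindow:
    """A stateful stream operator: groups events into fixed-size count windows."""
    size: int
    events: list = field(default_factory=list)

    def add(self, value):
        """Buffer an event; return the window's average when it fills, else None."""
        self.events.append(value)
        if len(self.events) == self.size:
            avg = sum(self.events) / self.size
            self.events.clear()  # tumbling: windows never overlap
            return avg
        return None

window = TumblingWindow(size=3)
results = [window.add(v) for v in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]]
print(results)  # [None, None, 2.0, None, None, 5.0] - one average per full window
```

A stateless operator (say, unit conversion) would need no `events` list at all; the buffered state is exactly what distinguishes stateful processing.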
For Beginners: Edge Data Processing
Edge data processing means analyzing IoT data right where it is collected, before sending anything to the cloud. Think of a store manager who handles routine customer questions on the spot instead of calling headquarters for every decision. This makes your IoT system faster, cheaper, and more reliable.
50.2 Prerequisites
Before diving into this chapter, you should be familiar with:
- Edge Compute Patterns Overview: Introduction to edge computing concepts
- IoT Reference Model: Understanding the seven-level architecture and Level 3 processing
50.3 Edge Processing Patterns
Edge computing employs four primary data processing patterns, each optimized for different IoT scenarios:
Alternative View: Pattern Selection by Use Case — match your primary constraint (bandwidth, latency, reliability, or privacy) to the optimal edge processing pattern.
Alternative View: Edge Processing Timeline — edge processing pipelines execute in milliseconds, while cloud synchronization happens asynchronously.
50.4 Pattern Selection Guide
| Pattern | Best For | Bandwidth Savings | Use Case Example |
|---|---|---|---|
| Filter | Threshold monitoring | 99%+ | Send alerts only when temperature exceeds safe limits |
| Aggregate | Trend analysis | 99%+ | Send hourly averages instead of per-second readings |
| Infer | Anomaly detection | 95%+ | Visual inspection - send defect alerts, not all images |
| Store & Forward | Intermittent connectivity | N/A | Remote sites with satellite links - sync when online |
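The selection table above can be encoded as a simple lookup — a sketch only; the constraint labels are illustrative, not a standard taxonomy:

```python
# Illustrative mapping from primary constraint to edge pattern,
# following the Section 50.4 selection table.
PATTERN_BY_CONSTRAINT = {
    "threshold_monitoring": "filter",
    "trend_analysis": "aggregate",
    "anomaly_detection": "infer",
    "intermittent_connectivity": "store_and_forward",
}

def select_pattern(primary_constraint):
    """Map a deployment's dominant requirement to the recommended pattern.

    Falling back to store_and_forward when unsure is an assumption here:
    buffering is the least destructive default, since no data is dropped."""
    return PATTERN_BY_CONSTRAINT.get(primary_constraint, "store_and_forward")

print(select_pattern("anomaly_detection"))  # -> infer
```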
50.4.1 Pattern 1: Filter at Edge
The filter pattern applies simple threshold checks to determine which data warrants transmission:
```python
# Example: Temperature threshold filter
# (now() stands in for a platform timestamp helper)
def filter_reading(temp_celsius, threshold=80.0):
    if temp_celsius > threshold:
        return {"alert": True, "value": temp_celsius, "timestamp": now()}
    return None  # Don't transmit normal readings
```

Best for:
- Alarm/alert systems where only exceptions matter
- High-frequency sensors where most readings are routine
- Bandwidth-constrained links where every byte counts
50.4.2 Pattern 2: Aggregate at Edge
The aggregate pattern computes statistics locally and sends summaries:
```python
# Example: Statistical aggregation
# (now() stands in for a platform timestamp helper)
import statistics

def aggregate_window(readings, window_minutes=60):
    return {
        "min": min(readings),
        "max": max(readings),
        "avg": sum(readings) / len(readings),
        "stddev": statistics.stdev(readings),
        "count": len(readings),
        "window_end": now()
    }
```
Putting Numbers to It
Consider 200 temperature sensors sampling at 1 Hz:
Without aggregation (transmit every reading): \[\text{Per sensor} = 1\text{ Hz} \times 4\text{ bytes} = 4\text{ bytes/s}\] \[\text{Fleet total} = 4\text{ bytes/s} \times 200 = 800\text{ bytes/s}\] \[\text{Daily bandwidth} = 800\text{ bytes/s} \times 86{,}400\text{ s} = 69.1\text{ MB/day}\]
With 1-hour aggregation (transmit min/max/avg/stddev every hour): \[\text{Per sensor/hour} = 4\text{ values} \times 4\text{ bytes} = 16\text{ bytes}\] \[\text{Per sensor/day} = 16\text{ bytes} \times 24\text{ hours} = 384\text{ bytes}\] \[\text{Fleet daily bandwidth} = 384\text{ bytes} \times 200 = 76{,}800\text{ bytes} = 75\text{ KB/day}\]
Reduction ratio: \[\frac{69.1\text{ MB} - 0.075\text{ MB}}{69.1\text{ MB}} = 99.89\% \text{ reduction}\]
Aggregation compresses 3,600 readings into 4 statistical values, achieving near-perfect bandwidth savings while preserving trend information for analytics.
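The arithmetic above is easy to reproduce; the constants come straight from the example (200 sensors, 1 Hz sampling, 4-byte readings, hourly 4-value summaries):

```python
SENSORS, BYTES_PER_VALUE = 200, 4

# Without aggregation: every 1 Hz reading is transmitted
raw_daily = SENSORS * 1 * BYTES_PER_VALUE * 86_400   # bytes/day
# With aggregation: 4 statistics (min/max/avg/stddev) per sensor per hour
agg_daily = SENSORS * 4 * BYTES_PER_VALUE * 24       # bytes/day

reduction = 1 - agg_daily / raw_daily
print(f"raw: {raw_daily / 1e6:.1f} MB/day")          # raw: 69.1 MB/day
print(f"aggregated: {agg_daily / 1e3:.1f} KB/day")   # aggregated: 76.8 KB/day
print(f"reduction: {reduction:.2%}")                 # reduction: 99.89%
```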
Best for:
- Trend analysis where individual readings are less important than patterns
- Environmental monitoring (temperature, humidity, air quality)
- Capacity planning and historical analysis
50.4.3 Pattern 3: Infer at Edge
The infer pattern runs machine learning models locally:
```python
# Example: Anomaly detection at edge
# (load_tflite_model stands in for a TFLite loading helper)
model = load_tflite_model("anomaly_detector.tflite")

def infer_anomaly(sensor_data):
    prediction = model.predict(sensor_data)
    if prediction["anomaly_score"] > 0.85:
        return {"anomaly": True, "score": prediction["anomaly_score"],
                "features": sensor_data}
    return None  # Normal - don't transmit
```

Best for:
- Visual inspection (defect detection in manufacturing)
- Predictive maintenance (vibration analysis)
- Safety systems requiring immediate response
50.4.4 Pattern 4: Store and Forward
The store-and-forward pattern handles intermittent connectivity:
```python
# Example: Store-and-forward buffer
# (network_available() and upload_batch() stand in for connectivity helpers)
class EdgeBuffer:
    def __init__(self, max_size_mb=100):
        self.buffer = []
        self.max_size = max_size_mb * 1024 * 1024

    def store(self, reading):
        self.buffer.append(reading)
        self.compact_if_needed()

    def compact_if_needed(self):
        # Evict oldest readings first (FIFO) when the buffer grows too large;
        # each reading is assumed here to occupy roughly 1 KB
        while len(self.buffer) * 1024 > self.max_size:
            self.buffer.pop(0)

    def forward_when_connected(self):
        if network_available():
            batch = self.buffer.copy()
            if upload_batch(batch):
                self.buffer.clear()
```

Best for:
- Remote sites (oil rigs, agricultural fields, offshore platforms)
- Mobile assets (vehicles, shipping containers, drones)
- Any deployment with unreliable connectivity
50.5 Trade-off Analysis
50.5.1 Edge ML Inference vs Cloud ML Inference
Tradeoff: Edge ML Inference vs Cloud ML Inference
Option A (Edge ML): Deploy 2-10 MB quantized models on gateways (TensorFlow Lite, ONNX Runtime), achieving 10-50 ms inference latency with 85-92% accuracy for classification tasks on Cortex-M4/ESP32 class devices.
Option B (Cloud ML): Run 100 MB - 1 GB full-precision models in cloud (TensorFlow Serving, AWS SageMaker), achieving 200-500 ms round-trip latency with 95-99% accuracy using GPUs for complex pattern recognition.
Decision Factors: Choose edge ML when latency requirements are under 100 ms (safety systems, real-time control), privacy mandates data cannot leave premises (healthcare, industrial IP), or connectivity is unreliable (remote assets, mobile equipment). Choose cloud ML when model complexity requires GPU acceleration (video analytics, NLP), training data continuously improves models (recommendation systems), or centralized management simplifies updates. Hybrid architectures run simple detection at edge (anomaly flags, threshold checks) while cloud handles deep analysis (root cause diagnosis, long-term forecasting). A factory safety system needs 20 ms edge response, but weekly predictive maintenance reports can use cloud-trained models updated monthly.
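The decision factors above can be condensed into a rough placement heuristic — the function and its inputs are illustrative, mirroring the numbers in this section:

```python
def choose_inference_tier(latency_budget_ms, data_must_stay_local,
                          connectivity_reliable, needs_gpu_scale_model):
    """Rough edge-vs-cloud ML placement heuristic from this section's factors."""
    if latency_budget_ms < 100 or data_must_stay_local or not connectivity_reliable:
        return "edge"    # safety/control loops, privacy mandates, unreliable links
    if needs_gpu_scale_model:
        return "cloud"   # video analytics, NLP, large full-precision models
    return "hybrid"      # edge flags anomalies, cloud does deep analysis

print(choose_inference_tier(20, False, True, False))  # factory safety -> edge
```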
50.5.2 Batch Processing vs Real-Time Streaming
Tradeoff: Batch Processing vs Real-Time Streaming at Edge
Option A (Batch Processing): Collect sensor data locally for 1-60 minutes, process in batches, upload aggregated results. Reduces compute cycles by 80-95%, extends battery life 3-5x on constrained devices, but introduces 1-60 minute detection latency.
Option B (Real-Time Streaming): Process each sensor reading immediately as it arrives with sub-second latency. Enables instant anomaly detection and immediate control responses, but requires 5-10x more edge compute power and continuous network connectivity for cloud integration.
Decision Factors: Choose batch processing for delay-tolerant analytics (hourly environmental reports, daily asset utilization), bandwidth-constrained links (satellite, cellular metered), and battery-powered devices where duty cycling extends deployment life from weeks to years. Choose real-time streaming for safety-critical monitoring (gas leaks, machine failures requiring less than 1 s response), control systems with tight feedback loops (HVAC, robotics), and applications where stale data has no value (live tracking, interactive systems). A smart meter can batch hourly readings, but a pipeline pressure sensor must stream in real-time to detect ruptures within seconds.
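As a sketch of Option A, a minimal micro-batcher (class name and flush interval are illustrative) collects readings locally and wakes the radio once per flush instead of once per reading — which is where the battery savings come from:

```python
import time

class MicroBatcher:
    """Collect readings locally and upload them in one batch (Option A).

    flush_interval_s is illustrative; real deployments batch 1-60 minutes."""
    def __init__(self, flush_interval_s=60.0, transmit=print):
        self.flush_interval_s = flush_interval_s
        self.transmit = transmit
        self.pending = []
        self.last_flush = time.monotonic()

    def add(self, reading):
        self.pending.append(reading)
        if time.monotonic() - self.last_flush >= self.flush_interval_s:
            self.flush()

    def flush(self):
        if self.pending:
            self.transmit(self.pending)  # one radio wake-up for many readings
            self.pending = []
        self.last_flush = time.monotonic()
```

Option B would simply call `transmit` inside `add` on every reading — simpler, lower latency, but one radio transaction per sample.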
50.6 Cost Analysis: When Edge Saves Money (and When It Doesn’t)
Common Misconception: “Edge Always Reduces Costs”
The Myth: “Processing at the edge always saves money compared to cloud processing.”
The Reality: Edge computing reduces bandwidth costs but introduces hardware and maintenance costs that can exceed cloud savings in many scenarios.
Real-World Example: Agricultural Soil Monitoring
A precision agriculture company deployed 10,000 soil moisture sensors across 5,000 acres:
Cloud-Only Approach (Initial Design):
- 10,000 sensors x 1 reading/hour x 24 hours x 30 days = 7.2M readings/month
- Data size: 7.2M readings x 50 bytes = 360 MB/month
- Cloud egress cost: 360 MB / 1,024 MB/GB x $0.09/GB = $0.03/month
- Cloud compute/storage: ~$50/month
- Total: ~$50/month
Edge Gateway Approach (Actual Deployment):
- 50 edge gateways ($200 each): $10,000 upfront capital
- Edge aggregation reduces cloud traffic by 90%: 36 MB/month
- Cloud costs: $5/month (minimal compute)
- Gateway maintenance: $100/month (cellular data, power, repairs)
- Amortized gateway cost over 3 years: $278/month
- Total: $383/month
The Hidden Costs:
- Hardware depreciation: $10,000 / 36 months = $278/month
- Cellular connectivity: 50 gateways x $2/month = $100/month
- Maintenance visits: 1 failed gateway/month x $150/visit = $150/month (later reduced with better hardware)
- Software updates: Edge devices require OTA update infrastructure
When Edge Actually Saved Money (Year 2): After initial deployment issues were resolved and maintenance costs dropped to $50/month:
- Edge total: $278 (amortized) + $100 (cellular) + $50 (maintenance) + $5 (cloud) = $433/month
- Cloud total: $50/month
Edge remained more expensive until the company expanded to 100,000 sensors in Year 3. At that scale, cloud compute costs grew to ~$500/month while edge gateways were already provisioned with spare capacity, making the per-sensor cost of edge processing lower than cloud.
Key Takeaway: Edge computing provides the greatest cost savings when:
- High data volumes overwhelm bandwidth costs (>10 GB/month)
- Hardware is amortized over multi-year deployments
- Maintenance is minimal (reliable hardware, remote updates)
- Latency/privacy requirements justify the investment regardless of cost
Decision Framework:
- Small deployments (<1,000 sensors, <1 GB/month): Cloud is usually cheaper
- Medium deployments (1,000-10,000 sensors): Hybrid (edge aggregation + cloud) often optimal
- Large deployments (>10,000 sensors, >10 GB/month): Edge pays for itself within 6-12 months
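The Year-2 arithmetic from the agricultural example can be checked with a short script; all dollar figures come from this section, and the function name is illustrative:

```python
def edge_monthly_cost(gateways, gateway_price, amortize_months,
                      cellular_per_gw, maintenance, cloud_residual):
    """Total monthly cost of the edge-gateway approach."""
    amortized = gateways * gateway_price / amortize_months
    return amortized + gateways * cellular_per_gw + maintenance + cloud_residual

# Year-2 numbers from the soil-monitoring example
edge = edge_monthly_cost(gateways=50, gateway_price=200, amortize_months=36,
                         cellular_per_gw=2, maintenance=50, cloud_residual=5)
cloud_only = 50  # $/month, cloud-only baseline
print(f"edge ~= ${edge:.0f}/month vs cloud-only ${cloud_only}/month")  # $433 vs $50
```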
50.6.1 Worked Example: Danfoss Supermarket Refrigeration Edge Strategy
Scenario: Danfoss, a Danish climate technology company, manages refrigeration controllers in 40,000 European supermarkets. Each store has 12 display cases with 3 sensors each (temperature, defrost status, compressor current), reporting every 30 seconds.
Given:
- 40,000 stores x 12 cases x 3 sensors = 1,440,000 sensors
- 30-second intervals = 86,400 s / 30 s = 2,880 readings/sensor/day
- Raw payload: 8 bytes per reading (timestamp + value)
- Cellular backhaul at EUR 0.50/MB
Step 1: Calculate raw data volume
| Metric | Value |
|---|---|
| Daily readings (total) | 1,440,000 x 2,880 = 4.15 billion |
| Daily raw data | 4.15 x 10^9 x 8 bytes = 33.2 GB |
| Daily cellular cost | 33,200 MB x EUR 0.50/MB = EUR 16,600 |
Step 2: Apply edge patterns per sensor type
| Sensor | Pattern | Logic | Reduction |
|---|---|---|---|
| Temperature | Filter | Transmit only if outside -22 to -18 C (freezer) or 2 to 4 C (chiller) | 97% (normal 97% of time) |
| Defrost status | Filter | Transmit only on state change (start/stop) | 99.5% (2 events/day vs 2,880) |
| Compressor current | Aggregate | Transmit 5-minute RMS + peak | 90% (12 summaries/hour vs 120 raw) |
Step 3: Calculate optimized volumes
Each sensor type accounts for one-third of the 1,440,000 sensors (480,000 sensors each):
| Metric | Raw | After Edge | Savings |
|---|---|---|---|
| Temperature data/day | 11.1 GB | 333 MB | 97% |
| Defrost data/day | 11.1 GB | 55 MB | 99.5% |
| Compressor data/day | 11.1 GB | 1.1 GB | 90% |
| Total daily | 33.2 GB | 1.49 GB | 95.5% |
| Cellular cost/day | EUR 16,600 | EUR 745 | EUR 15,855 saved |
Result: Edge processing reduces daily cellular costs from EUR 16,600 to EUR 745 – a 95.5% reduction. The fleet-wide gateway investment (40,000 gateways x EUR 120 = EUR 4.8 million) recovers in about 10 months through cellular savings of EUR 15,855 per day.
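The Danfoss volumes can be reproduced from the givens, with the reduction factors taken from the table above:

```python
SENSORS_PER_TYPE = 480_000        # 1,440,000 sensors split across 3 types
READINGS_PER_DAY = 86_400 // 30   # 2,880 at 30-second intervals
BYTES_PER_READING = 8
EUR_PER_MB = 0.50

raw_per_type_gb = SENSORS_PER_TYPE * READINGS_PER_DAY * BYTES_PER_READING / 1e9
reductions = {"temperature": 0.97, "defrost": 0.995, "compressor": 0.90}

raw_total_gb = raw_per_type_gb * 3
after_edge_gb = sum(raw_per_type_gb * (1 - r) for r in reductions.values())
print(f"raw: {raw_total_gb:.1f} GB/day, after edge: {after_edge_gb:.2f} GB/day")
print(f"daily cost after edge: EUR {after_edge_gb * 1000 * EUR_PER_MB:,.0f}")
```

This yields 33.2 GB/day raw and 1.49 GB/day after edge processing; the chapter's EUR 745 figure rounds the 1.49 GB before multiplying by the per-MB rate.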
Key Insight: The three sensor types in the same deployment use three different edge patterns. Temperature uses Filter (only exceptions matter), defrost uses Filter (only state changes matter), and compressor current uses Aggregate (trend data matters). Pattern selection is per-sensor, not per-deployment.
50.7 Summary
- Four edge processing patterns address different IoT requirements: Filter (threshold alerts), Aggregate (statistical summaries), Infer (ML-based detection), and Store-Forward (intermittent connectivity)
- Pattern selection depends on primary priority: bandwidth reduction, real-time response, reliability, or privacy
- Edge ML trade-offs balance latency (10-50 ms edge vs 200-500 ms cloud) against accuracy (85-92% edge vs 95-99% cloud)
- Batch vs streaming trade-offs balance power efficiency against detection latency
- Cost analysis shows edge saves money only at scale (>10,000 sensors) or when latency/privacy requirements justify hardware investment
- Hybrid architectures typically provide the best balance: edge for time-critical decisions, cloud for complex analytics
Key Takeaway
The four edge processing patterns – Filter, Aggregate, Infer, and Store-Forward – each solve a different problem. Filter reduces bandwidth by sending only exceptions. Aggregate computes local statistics for trend analysis. Infer runs ML models for intelligent detection. Store-Forward handles intermittent connectivity. Hybrid architectures combining edge speed with cloud analytics consistently outperform either approach alone, but edge only saves money at scale (>10,000 sensors) or when latency and privacy requirements justify hardware investment.
For Kids: Meet the Sensor Squad!
“The Four Superpowers of Edge Computing!”
The Sensor Squad was protecting a giant farm with thousands of soil sensors. But there was a problem – sending ALL that data to the Cloud was like trying to push an elephant through a garden hose!
“I have an idea!” said Max the Microcontroller. “I have four superpowers we can use!”
Superpower 1: The Filter! “Sammy, you check the soil temperature every second, but I only need to know when it gets dangerously hot or cold. So I will only pass along the alarm readings – that is like reading 1,000 pages of a book but only sending the 2 most exciting pages!”
Superpower 2: The Aggregator! “Instead of sending every single moisture reading, I will calculate the average, the highest, and the lowest for each hour. Three numbers instead of 3,600 – that is a 1,200x data diet!”
Superpower 3: The Brain! “I have a tiny AI brain that can look at sensor patterns and spot trouble. If the soil looks sick, I will send an alert. If everything is normal, I stay quiet.”
Superpower 4: The Memory! Bella the Battery added, “And when the internet goes down on the farm – which happens a lot – Max stores everything in his memory and sends it all when the connection comes back. No data lost!”
Lila the LED flashed in four colors: “Filter, Aggregate, Infer, Store-Forward – the four superpowers of edge computing!”
What did the Squad learn? There are four ways to handle data at the edge, and the best choice depends on what you need: saving bandwidth, spotting problems, running AI, or surviving without internet!
50.8 Concept Check
50.9 Concept Relationships
How This Concept Connects
Builds on:
- Edge Compute Patterns Overview - Foundation concepts
- Edge IoT Reference Model - Level 3 processing context
Four Patterns in Detail:
- Filter Pattern - 99%+ bandwidth reduction by sending only threshold violations
- Aggregate Pattern - Statistical summaries (min/max/avg) for trend analysis
- Infer Pattern - ML models run locally, send anomaly alerts only
- Store-Forward Pattern - Buffers data during outages, syncs when reconnected
Real-World Examples:
- Danfoss Supermarkets - 95.5% data reduction across 40,000 stores using three different patterns
- Agricultural Soil Monitoring - Cost analysis shows cloud-only cheaper until very large scale
50.10 See Also
Related Resources
Pattern Implementation:
- Edge Patterns Practical Guide - Interactive tools and worked examples
- Edge Cyber-Foraging - Opportunistic compute offloading
Architecture Context:
- Edge, Fog, and Cloud Overview - Three-tier architecture
- Multi-Sensor Data Fusion - Combining patterns for sensor fusion
Case Studies:
- Danfoss Case Study (in chapter) - Three patterns for three sensor types
- Agricultural Cost Analysis (in chapter) - When edge investment does not pay off
50.11 Try It Yourself
Hands-On Exercise: Implement the Four Edge Patterns
Scenario: Industrial motor monitoring with vibration sensors at 1 kHz sampling.
Pattern 1: Filter
```python
def filter_pattern(vibration_g, threshold=5.0):
    """Send only threshold violations"""
    if vibration_g > threshold:
        return {"alert": True, "value": vibration_g, "pattern": "filter"}
    return None  # Don't transmit normal readings

# Test: 8 samples, 3 exceed threshold (5.0)
readings = [2.1, 3.5, 2.8, 6.2, 4.1, 7.8, 3.2, 5.9]
alerts = [a for a in map(filter_pattern, readings) if a]
print(f"Filter: {len(readings)} readings -> {len(alerts)} alerts")
```

Pattern 2: Aggregate
```python
import statistics

def aggregate_pattern(readings_1sec):
    """Compute 1-second statistics from 1 kHz samples"""
    return {
        "min": min(readings_1sec),
        "max": max(readings_1sec),
        "avg": statistics.mean(readings_1sec),
        "stddev": statistics.stdev(readings_1sec),
        "pattern": "aggregate"
    }

# Test: 1000 samples -> 4 values = 250x reduction
samples_1khz = [2.1 + (i * 0.01) for i in range(1000)]
summary = aggregate_pattern(samples_1khz)
print("Aggregate: 1000 samples -> 4 values = 250x reduction")
```

Pattern 3: Infer (simplified)
```python
def infer_pattern(vibration_rms, threshold_trained=4.5):
    """Simplified ML inference: anomaly score based on trained threshold"""
    anomaly_score = vibration_rms / threshold_trained
    if anomaly_score > 1.2:  # 20% above normal
        return {"anomaly": True, "score": round(anomaly_score, 2),
                "pattern": "infer"}
    return None  # Normal operation

# Test: 5.8 g RMS is above the trained baseline of 4.5 g
result = infer_pattern(5.8)
print(f"Infer: {result}")
```

Pattern 4: Store-Forward
```python
class StoreForwardBuffer:
    def __init__(self, max_size_mb=100):
        self.buffer = []
        self.max_size = max_size_mb * 1024 * 1024

    def store(self, reading):
        self.buffer.append(reading)
        if len(self.buffer) > 1000:  # Simplified FIFO eviction
            self.buffer.pop(0)

    def forward(self, network_available):
        if network_available and self.buffer:
            print(f"Forwarding {len(self.buffer)} buffered readings")
            self.buffer.clear()
            return True
        return False

# Test: buffer during outage, then forward
buffer = StoreForwardBuffer()
for i in range(50):
    buffer.store({"reading": i})
buffer.forward(network_available=False)  # No sync
print(f"Buffer size: {len(buffer.buffer)} readings")
buffer.forward(network_available=True)   # Sync successful
print(f"After sync: {len(buffer.buffer)} readings")
```

What to Observe:
- Filter drastically reduces transmissions (99%+) but loses trend information
- Aggregate preserves statistics while achieving 250x reduction
- Infer requires trained model but enables intelligent detection
- Store-Forward ensures no data loss during outages
Extension Challenge: Combine patterns: Filter + Aggregate. Send hourly aggregates for normal operation, immediate alerts for threshold violations.
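One possible solution sketch for the extension challenge (class name, thresholds, and window size are illustrative): transmit violations immediately and fold normal readings into a rolling summary.

```python
import statistics

class FilterAggregate:
    """Combined pattern: alert on violations, aggregate normal readings."""
    def __init__(self, threshold=5.0, window_size=3600):
        self.threshold = threshold
        self.window_size = window_size
        self.normal = []

    def process(self, value):
        if value > self.threshold:
            return {"alert": True, "value": value}   # transmit immediately
        self.normal.append(value)
        if len(self.normal) == self.window_size:     # window full: summarize
            summary = {"avg": statistics.mean(self.normal),
                       "max": max(self.normal), "count": len(self.normal)}
            self.normal = []
            return summary
        return None                                  # hold for aggregation

combo = FilterAggregate(threshold=5.0, window_size=4)
out = [combo.process(v) for v in [2.0, 6.5, 3.0, 4.0, 1.0]]
print(out)  # one immediate alert (6.5), one summary once 4 normals accumulate
```

Note that the alert path bypasses the window entirely, so detection latency for violations stays at per-reading speed while routine data still gets the aggregation bandwidth savings.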
50.12 What’s Next
| Next Topic | Description |
|---|---|
| Cyber-Foraging and Caching | Opportunistic compute offloading and caching strategies |
| Edge Patterns Practical Guide | Interactive tools, worked examples, and common pitfalls |