1326  Edge Computing: Processing Patterns

1326.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Apply Four Edge Processing Patterns: Implement Filter, Aggregate, Infer, and Store-Forward patterns for different IoT scenarios
  • Select Optimal Patterns: Match processing patterns to application requirements based on latency, bandwidth, and reliability needs
  • Evaluate Trade-offs: Compare edge ML inference versus cloud ML, and batch processing versus real-time streaming
  • Avoid Common Mistakes: Understand when edge processing saves costs versus when it adds unnecessary complexity

1326.2 Prerequisites

Before diving into this chapter, you should be familiar with:

1326.3 Edge Processing Patterns

Edge computing employs four primary data processing patterns, each optimized for different IoT scenarios:

%% fig-alt: "Four edge processing patterns shown as parallel pipelines. Filter at Edge: 100 raw readings per second pass a threshold check so only about 5 alerts per hour reach the cloud, a 99.8 percent reduction. Aggregate at Edge: 1000 samples per minute are reduced to one min/max/avg/stddev summary per minute, a 99.9 percent reduction. Infer at Edge: a local ML model screens video, audio, or time-series streams and forwards anomalies only. Store and Forward: all readings are buffered locally and batch-uploaded when connectivity returns."
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#ecf0f1'}}}%%
flowchart TB
    subgraph Pattern1["1. Filter at Edge"]
        P1A[Raw Sensor Data<br/>100 readings/sec]
        P1B[Edge Filter<br/>Threshold Check]
        P1C[Cloud<br/>5 alerts/hour]

        P1A -->|"Only send if temp > 80C"| P1B
        P1B -->|"99.8% reduction"| P1C
    end

    subgraph Pattern2["2. Aggregate at Edge"]
        P2A[Raw Data<br/>1000 samples/min]
        P2B[Edge Aggregator<br/>Compute Statistics]
        P2C[Cloud<br/>1 summary/min]

        P2A -->|"Min, Max, Avg, StdDev"| P2B
        P2B -->|"99.9% reduction"| P2C
    end

    subgraph Pattern3["3. Infer at Edge"]
        P3A[Sensor Stream<br/>Video/Audio/Time-series]
        P3B[Edge ML Model<br/>Anomaly Detection]
        P3C[Cloud<br/>Anomalies Only]

        P3A -->|"Run inference locally"| P3B
        P3B -->|"Alert on defects"| P3C
    end

    subgraph Pattern4["4. Store and Forward"]
        P4A[Continuous Data<br/>All Readings]
        P4B[Edge Storage<br/>Local Buffer]
        P4C[Cloud<br/>Batch Upload]

        P4A -->|"Store locally"| P4B
        P4B -->|"Sync when connected"| P4C
    end

    style P1B fill:#16A085,stroke:#2C3E50,color:#fff
    style P2B fill:#16A085,stroke:#2C3E50,color:#fff
    style P3B fill:#E67E22,stroke:#2C3E50,color:#fff
    style P4B fill:#2C3E50,stroke:#16A085,color:#fff

Figure 1326.1: Four Edge Processing Patterns: Filter, Aggregate, Infer, Store-Forward

This view helps select the right edge pattern based on your specific requirements:

%% fig-alt: "Edge pattern selection guide organized by use case priority. For bandwidth reduction priority, choose Filter or Aggregate patterns. For real-time response priority, choose Infer pattern with local ML. For reliability priority in intermittent connectivity, choose Store-and-Forward pattern. For privacy priority when data cannot leave premises, choose local processing with any pattern. Shows decision criteria and expected outcomes for each selection path."
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#ecf0f1'}}}%%
flowchart TD
    Start[Select Edge Pattern] --> Priority{Primary<br/>Priority?}

    Priority -->|Bandwidth<br/>Reduction| BW[Filter + Aggregate<br/>99%+ data reduction]
    Priority -->|Real-time<br/>Response| RT[Infer at Edge<br/>Local ML models<br/>Sub-50ms latency]
    Priority -->|Reliability| REL[Store and Forward<br/>Offline operation<br/>Eventual consistency]
    Priority -->|Privacy| PRIV[All Local Processing<br/>Data never leaves<br/>premises]

    BW --> Example1[Smart Meters<br/>Environmental Sensors]
    RT --> Example2[Safety Systems<br/>Visual Inspection]
    REL --> Example3[Remote Sites<br/>Mobile Assets]
    PRIV --> Example4[Healthcare<br/>Industrial Secrets]

    style Start fill:#2C3E50,stroke:#16A085,color:#fff
    style BW fill:#16A085,stroke:#2C3E50,color:#fff
    style RT fill:#E67E22,stroke:#2C3E50,color:#fff
    style REL fill:#7F8C8D,stroke:#2C3E50,color:#fff
    style PRIV fill:#2C3E50,stroke:#16A085,color:#fff

Match your primary constraint to the optimal edge processing pattern.

This view shows how data flows through edge processing over time:

%% fig-alt: "Timeline view of edge data processing showing temporal progression. At T0 sensor generates raw data. At T+5ms edge filters irrelevant readings. At T+10ms remaining data is aggregated into statistics. At T+15ms local ML model performs inference. At T+20ms decision made locally if urgent or queued for cloud sync. At T+100ms batch upload of summaries to cloud. Shows how edge processing reduces latency for critical decisions while optimizing bandwidth for cloud transmission."
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#ecf0f1'}}}%%
flowchart LR
    subgraph T0["T+0ms"]
        S[Sensor<br/>Raw Data<br/>100 readings/s]
    end

    subgraph T5["T+5ms"]
        F[Filter<br/>Threshold Check<br/>5 relevant/s]
    end

    subgraph T10["T+10ms"]
        A[Aggregate<br/>Compute Stats<br/>1 summary/min]
    end

    subgraph T15["T+15ms"]
        I[Infer<br/>ML Model<br/>Anomaly?]
    end

    subgraph T20["T+20ms"]
        D{Decision}
        D -->|Urgent| Local[Local Action<br/>Immediate]
        D -->|Normal| Queue[Queue for<br/>Cloud Sync]
    end

    subgraph T100["T+100ms"]
        C[Cloud Upload<br/>Batch Summary]
    end

    S --> F --> A --> I --> D
    Queue --> C

    style S fill:#E67E22,stroke:#2C3E50,color:#fff
    style F fill:#16A085,stroke:#2C3E50,color:#fff
    style A fill:#16A085,stroke:#2C3E50,color:#fff
    style I fill:#E67E22,stroke:#2C3E50,color:#fff
    style Local fill:#27AE60,stroke:#2C3E50,color:#fff
    style C fill:#2C3E50,stroke:#16A085,color:#fff

Edge processing pipelines execute in milliseconds while cloud sync happens asynchronously.

Four Edge Processing Patterns: Filter pattern sends only threshold-exceeding events (99.8% reduction); Aggregate pattern computes statistics locally (99.9% reduction); Infer pattern runs ML models at edge and sends anomalies only; Store-and-forward pattern buffers data during outages and syncs when reconnected.
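The patterns also compose: a filter stage can feed an aggregation stage, as in the pipeline timeline above. A minimal sketch, assuming an illustrative 80 °C threshold (the function name and window handling are not from a specific framework):

```python
import statistics

def run_pipeline(readings, threshold=80.0):
    """Filter-then-aggregate: drop routine readings, summarize the rest."""
    relevant = [r for r in readings if r > threshold]   # Filter stage
    if not relevant:
        return None  # Nothing noteworthy this window; transmit nothing
    return {                                            # Aggregate stage
        "count": len(relevant),
        "max": max(relevant),
        "avg": sum(relevant) / len(relevant),
        "stddev": statistics.stdev(relevant) if len(relevant) > 1 else 0.0,
    }

summary = run_pipeline([72.0, 85.5, 79.9, 91.2, 83.0])
# Only the three readings above 80C reach the aggregation stage
```

Composing the two stages compounds the savings: the filter discards routine readings, and the aggregator collapses the survivors into a single summary record per window.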

1326.4 Pattern Selection Guide

| Pattern | Best For | Bandwidth Savings | Use Case Example |
|---|---|---|---|
| Filter | Threshold monitoring | 99%+ | Send alerts only when temperature exceeds safe limits |
| Aggregate | Trend analysis | 99%+ | Send hourly averages instead of per-second readings |
| Infer | Anomaly detection | 95%+ | Visual inspection - send defect alerts, not all images |
| Store & Forward | Intermittent connectivity | N/A | Remote sites with satellite links - sync when online |

1326.4.1 Pattern 1: Filter at Edge

The filter pattern applies simple threshold checks to determine which data warrants transmission:

# Example: Temperature threshold filter
from datetime import datetime, timezone

def filter_reading(temp_celsius, threshold=80.0):
    """Transmit only readings that exceed the alert threshold."""
    if temp_celsius > threshold:
        return {"alert": True, "value": temp_celsius,
                "timestamp": datetime.now(timezone.utc).isoformat()}
    return None  # Don't transmit normal readings

Best for:

  • Alarm/alert systems where only exceptions matter
  • High-frequency sensors where most readings are routine
  • Bandwidth-constrained links where every byte counts

1326.4.2 Pattern 2: Aggregate at Edge

The aggregate pattern computes statistics locally and sends summaries:

# Example: Statistical aggregation over a window of readings
import statistics
from datetime import datetime, timezone

def aggregate_window(readings):
    """Reduce a window of raw readings to a single summary record."""
    return {
        "min": min(readings),
        "max": max(readings),
        "avg": sum(readings) / len(readings),
        "stddev": statistics.stdev(readings) if len(readings) > 1 else 0.0,
        "count": len(readings),
        "window_end": datetime.now(timezone.utc).isoformat()
    }

Best for:

  • Trend analysis where individual readings are less important than patterns
  • Environmental monitoring (temperature, humidity, air quality)
  • Capacity planning and historical analysis

1326.4.3 Pattern 3: Infer at Edge

The infer pattern runs machine learning models locally:

# Example: Anomaly detection at edge
# load_tflite_model() stands in for an application helper that wraps
# the TensorFlow Lite interpreter; the model path is illustrative.
model = load_tflite_model("anomaly_detector.tflite")

def infer_anomaly(sensor_data):
    prediction = model.predict(sensor_data)
    if prediction["anomaly_score"] > 0.85:  # tunable confidence threshold
        return {"anomaly": True, "score": prediction["anomaly_score"],
                "features": sensor_data}
    return None  # Normal - don't transmit

Best for:

  • Visual inspection (defect detection in manufacturing)
  • Predictive maintenance (vibration analysis)
  • Safety systems requiring immediate response

1326.4.4 Pattern 4: Store and Forward

The store-and-forward pattern handles intermittent connectivity:

# Example: Store-and-forward buffer
class EdgeBuffer:
    """Buffers readings locally; uploads in batches when connectivity returns."""

    def __init__(self, max_size_mb=100):
        self.buffer = []
        self.max_size = max_size_mb * 1024 * 1024

    def store(self, reading):
        self.buffer.append(reading)
        self.compact_if_needed()  # e.g., drop oldest entries past max_size

    def forward_when_connected(self):
        # network_available() and upload_batch() are application-supplied
        if network_available():
            batch = self.buffer.copy()
            if upload_batch(batch):   # clear only after a confirmed upload
                self.buffer.clear()

Best for:

  • Remote sites (oil rigs, agricultural fields, offshore platforms)
  • Mobile assets (vehicles, shipping containers, drones)
  • Any deployment with unreliable connectivity

1326.5 Trade-off Analysis

1326.5.1 Edge ML Inference vs Cloud ML Inference

Tradeoff: Edge ML Inference vs Cloud ML Inference

Option A (Edge ML): Deploy 2-10MB quantized models on gateways (TensorFlow Lite, ONNX Runtime), achieving 10-50ms inference latency with 85-92% accuracy for classification tasks on Cortex-M4/ESP32 class devices.

Option B (Cloud ML): Run 100MB-1GB full-precision models in cloud (TensorFlow Serving, AWS SageMaker), achieving 200-500ms round-trip latency with 95-99% accuracy using GPUs for complex pattern recognition.

Decision Factors: Choose edge ML when latency requirements are under 100ms (safety systems, real-time control), privacy mandates data cannot leave premises (healthcare, industrial IP), or connectivity is unreliable (remote assets, mobile equipment). Choose cloud ML when model complexity requires GPU acceleration (video analytics, NLP), training data continuously improves models (recommendation systems), or centralized management simplifies updates. Hybrid architectures run simple detection at edge (anomaly flags, threshold checks) while cloud handles deep analysis (root cause diagnosis, long-term forecasting). A factory safety system needs 20ms edge response, but weekly predictive maintenance reports can use cloud-trained models updated monthly.
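The hybrid split described above can be sketched as a simple router; the function name, the `anomaly_score` field, and the 0.85 threshold are illustrative assumptions, not a specific framework API:

```python
def route_reading(reading, edge_threshold=0.85):
    """Hybrid routing: act locally on urgent anomalies, defer the rest.

    `reading` is assumed to carry an 'anomaly_score' from a lightweight
    edge model. Scores above the threshold trigger immediate local action;
    everything else is queued for deeper cloud-side analysis.
    """
    if reading["anomaly_score"] > edge_threshold:
        return "local_action"   # e.g., stop a machine within milliseconds
    return "cloud_queue"        # batched later for root-cause diagnosis
```

This keeps the latency-critical decision on the gateway while still giving the cloud model the full event stream it needs for long-term forecasting.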

1326.5.2 Batch Processing vs Real-Time Streaming

Tradeoff: Batch Processing vs Real-Time Streaming at Edge

Option A (Batch Processing): Collect sensor data locally for 1-60 minutes, process in batches, upload aggregated results. Reduces compute cycles by 80-95%, extends battery life 3-5x on constrained devices, but introduces 1-60 minute detection latency.

Option B (Real-Time Streaming): Process each sensor reading immediately as it arrives with sub-second latency. Enables instant anomaly detection and immediate control responses, but requires 5-10x more edge compute power and continuous network connectivity for cloud integration.

Decision Factors: Choose batch processing for delay-tolerant analytics (hourly environmental reports, daily asset utilization), bandwidth-constrained links (satellite, cellular metered), and battery-powered devices where duty cycling extends deployment life from weeks to years. Choose real-time streaming for safety-critical monitoring (gas leaks, machine failures requiring <1s response), control systems with tight feedback loops (HVAC, robotics), and applications where stale data has no value (live tracking, interactive systems). A smart meter can batch hourly readings, but a pipeline pressure sensor must stream in real-time to detect ruptures within seconds.
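The batch side of this trade-off can be sketched as a buffer with count and time flush triggers; the class name and default values are illustrative:

```python
import time

class BatchUploader:
    """Accumulate readings; flush when a count or time threshold is hit."""

    def __init__(self, flush_count=60, flush_seconds=3600.0):
        self.batch = []
        self.flush_count = flush_count
        self.flush_seconds = flush_seconds
        self.last_flush = time.monotonic()

    def add(self, reading):
        """Buffer one reading; return a batch to upload, or None."""
        self.batch.append(reading)
        age = time.monotonic() - self.last_flush
        if len(self.batch) >= self.flush_count or age >= self.flush_seconds:
            flushed, self.batch = self.batch, []
            self.last_flush = time.monotonic()
            return flushed   # caller uploads this batch in one request
        return None          # keep buffering; the radio can stay asleep
```

A streaming design would instead invoke the upload or control path on every `add`, trading the battery savings of duty cycling for sub-second detection.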

1326.6 Cost Analysis: When Edge Saves Money (and When It Doesn’t)

Common Misconception: “Edge Always Reduces Costs”

The Myth: “Processing at the edge always saves money compared to cloud processing.”

The Reality: Edge computing reduces bandwidth costs but introduces hardware and maintenance costs that can exceed cloud savings in many scenarios.

Real-World Example: Agricultural Soil Monitoring

A precision agriculture company deployed 10,000 soil moisture sensors across 5,000 acres:

Cloud-Only Approach (Initial Design):

  • 10,000 sensors x 1 reading/hour x 24 hours x 30 days = 7.2M readings/month
  • Data size: 7.2M readings x 50 bytes = 360 MB/month
  • Cloud egress cost: 360 MB x $0.09/GB = $0.03/month
  • Cloud compute/storage: ~$50/month
  • Total: $50.03/month

Edge Gateway Approach (Actual Deployment):

  • 50 edge gateways ($200 each): $10,000 upfront capital
  • Edge aggregation reduces cloud traffic by 90%: 36 MB/month
  • Cloud costs: $5/month (minimal compute)
  • Gateway maintenance: $100/month (cellular data, power, repairs)
  • Amortized gateway cost over 3 years: $278/month
  • Total: $383/month

The Hidden Costs:

  • Hardware depreciation: $10,000 / 36 months = $278/month
  • Cellular connectivity: 50 gateways x $2/month = $100/month
  • Maintenance visits: 1 failed gateway/month x $150/visit = $150/month (later reduced with better hardware)
  • Software updates: edge devices require OTA update infrastructure

When Edge Actually Saved Money (Year 2): After initial deployment issues were resolved and maintenance costs dropped to $50/month:

  • Edge total: $278 (amortized) + $100 (cellular) + $50 (maintenance) + $5 (cloud) = $433/month
  • Cloud total: $50/month compute + $0.03/month bandwidth = $50.03/month

Edge remained more expensive until the company expanded to 100,000 sensors in Year 3:

  • Cloud scaling: 3,600 MB/month x $0.09/GB = $0.32/month bandwidth + $500/month compute = ~$500/month
  • Edge scaling: added 450 gateways, but cellular costs were negotiated down to $0.50/gateway = $475/month

Key Takeaway: Edge computing provides the greatest cost savings when:

  1. High data volumes overwhelm bandwidth costs (>10 GB/month)
  2. Hardware is amortized over multi-year deployments
  3. Maintenance is minimal (reliable hardware, remote updates)
  4. Latency/privacy requirements justify the investment regardless of cost

Decision Framework:

  • Small deployments (<1,000 sensors, <1 GB/month): cloud is usually cheaper
  • Medium deployments (1,000-10,000 sensors): hybrid (edge aggregation + cloud) is often optimal
  • Large deployments (>10,000 sensors, >10 GB/month): edge pays for itself within 6-12 months
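The break-even arithmetic above can be packaged as a rough calculator. The default rates are the illustrative figures from this example (not general cloud or hardware pricing), and the function names are ours:

```python
def monthly_cloud_cost(sensors, readings_per_day=24, bytes_per_reading=50,
                       egress_per_gb=0.09, compute_base=50.0):
    """Cloud-only monthly cost: flat compute plus metered egress."""
    gb_per_month = sensors * readings_per_day * 30 * bytes_per_reading / 1024**3
    return compute_base + gb_per_month * egress_per_gb

def monthly_edge_cost(gateways, gateway_price=200.0, amortize_months=36,
                      cellular_per_gateway=2.0, maintenance=50.0,
                      cloud_residual=5.0):
    """Edge-gateway monthly cost: amortized hardware, connectivity, upkeep."""
    amortized = gateways * gateway_price / amortize_months
    return (amortized + gateways * cellular_per_gateway
            + maintenance + cloud_residual)
```

With the Year 2 figures (50 gateways, $50/month maintenance) this reproduces roughly $433/month for edge versus about $50/month for the 10,000-sensor cloud-only design, matching the worked example above.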

1326.7 Summary

  • Four edge processing patterns address different IoT requirements: Filter (threshold alerts), Aggregate (statistical summaries), Infer (ML-based detection), and Store-Forward (intermittent connectivity)
  • Pattern selection depends on primary priority: bandwidth reduction, real-time response, reliability, or privacy
  • Edge ML trade-offs balance latency (10-50ms edge vs 200-500ms cloud) against accuracy (85-92% edge vs 95-99% cloud)
  • Batch vs streaming trade-offs balance power efficiency against detection latency
  • Cost analysis shows edge saves money only at scale (>10,000 sensors) or when latency/privacy requirements justify hardware investment
  • Hybrid architectures typically provide the best balance: edge for time-critical decisions, cloud for complex analytics

1326.8 What’s Next

Continue exploring edge computing patterns: