54  Edge Architecture Review

In 60 Seconds

The Edge-Fog-Cloud continuum processes IoT data across multiple tiers, each with different latency, bandwidth, and cost trade-offs. The Seven-Level IoT Reference Model maps processing capabilities from physical devices (Level 1) through edge computing (Level 3) to cloud analytics (Level 7), enabling data reductions of 14,000x or more before data reaches the cloud.

54.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the Edge-Fog-Cloud Continuum: Describe how data flows through multiple processing tiers with increasing latency but decreasing bandwidth requirements
  • Apply the Seven-Level IoT Reference Model: Map processing capabilities, latency characteristics, and use cases to each level
  • Identify Processing Trade-offs: Evaluate latency, bandwidth, processing power, and cost at each architectural tier
  • Design Tiered Architectures: Apply the golden rule of edge computing to determine optimal processing placement

Key Concepts

  • Edge processing hierarchy: The ordered set of processing capabilities from sensor (simplest) to microcontroller to gateway to fog server to cloud (most powerful), each tier handling more complex tasks.
  • Resource provisioning: Allocating sufficient memory, CPU, and network bandwidth at each edge tier for the processing tasks assigned to it, with safety margins for unexpected load spikes.
  • Edge orchestration: Automated management of which tasks run on which edge nodes, including dynamic reallocation when nodes fail or become overloaded, typically using Kubernetes at the edge (K3s, KubeEdge).
  • Stateless vs stateful edge processing: Stateless edge operations (format conversion, threshold check) can be run on any node without coordination; stateful operations (running averages, anomaly models) require state management and careful placement.
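The stateless/stateful distinction drives orchestration decisions, so a minimal Python sketch may help (the function and class names are hypothetical, not from any framework): a stateless threshold check can run on any node, while a running average carries state that must be pinned to, or migrated with, a specific node.

```python
# Stateless: each reading is evaluated independently -- any node can run this.
def threshold_check(reading_c: float, limit_c: float = 50.0) -> bool:
    """Return True if the reading exceeds the alert threshold."""
    return reading_c > limit_c

# Stateful: the running average depends on every prior reading, so the
# operator's state must be managed and placed deliberately.
class RunningAverage:
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, reading: float) -> float:
        self.count += 1
        self.total += reading
        return self.total / self.count

avg = RunningAverage()
for r in (48.0, 52.0, 50.0):
    current = avg.update(r)
print(round(current, 1))       # 50.0
print(threshold_check(52.0))   # True
```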

54.2 Prerequisites

Before studying this chapter, complete:

Think of edge computing like a postal system:

  • Edge (Level 1-3): Your local post office - handles urgent letters quickly, sorts mail before sending
  • Fog (Level 4-5): Regional distribution center - accumulates mail from multiple local offices, makes routing decisions
  • Cloud (Level 6-7): National headquarters - handles complex logistics, long-term records, cross-country coordination

Data flows the same way: urgent processing happens locally, while complex analysis travels to centralized systems.

54.3 Edge-Fog-Cloud Architecture Overview

The following diagram illustrates the complete edge computing continuum, showing how data flows from sensors through multiple processing tiers to the cloud, with increasing latency but decreasing bandwidth requirements at each level.

Figure 54.1: Edge-Fog-Cloud Computing Continuum showing data flow from sensors through multiple processing tiers. The Edge Layer (Level 1-2) provides sub-millisecond latency for raw sensor data. The Edge Gateway Layer (Level 3) performs downsampling and aggregation with 1-10ms latency. The Fog Layer (Level 4-5) conducts analytics and ML inference at 10-100ms latency. The Cloud Layer (Level 6-7) handles deep analytics and long-term storage at 100-500ms latency. The architecture demonstrates fundamental trade-offs: latency increases from edge to cloud, bandwidth requirements decrease through data reduction at each tier, processing complexity grows from simple filtering to advanced ML, and cost shifts from distributed low-cost edge devices to centralized high-capability cloud infrastructure.

54.4 Seven-Level IoT Reference Model

The following table summarizes the seven-level IoT reference model, mapping each level to its processing capabilities, latency characteristics, and typical use cases.

| Level | Name | Processing Capabilities | Latency | Data Volume | Typical Use Cases |
|---|---|---|---|---|---|
| 1 | Physical Devices & Controllers | Raw sensing, basic actuation, signal conditioning | <1 ms | Very High (GB/hour) | Sensor sampling, emergency shutoffs, real-time control |
| 2 | Connectivity | Protocol translation, network routing, device addressing | <1 ms | Very High | Data transmission, network protocols, device communication |
| 3 | Edge Computing (Fog) | Evaluation (filtering), Formatting (standardization), Distillation (aggregation), Assessment (thresholds) | 1-10 ms | High (MB/hour) | Data reduction (100-1000x), downsampling, statistical aggregation, anomaly detection |
| 4 | Data Accumulation | Time-series storage, buffer management, data persistence | 10-100 ms | Medium | Local databases, recent data cache, query processing |
| 5 | Data Abstraction | Data modeling, semantic integration, format conversion | 10-100 ms | Medium | Data normalization, schema mapping, API abstraction |
| 6 | Application | Business logic, analytics, ML inference | 100-500 ms | Low (MB/day) | Dashboards, reporting, predictive models |
| 7 | Collaboration & Processes | Cross-system integration, workflow automation, enterprise services | 100-500 ms | Very Low | ERP integration, business processes, multi-tenant services |
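As an illustration only (the function name and record shapes below are invented for this sketch, not part of the reference model), the four Level 3 operations named in the table – Evaluation, Formatting, Distillation, Assessment – can be chained on a batch of readings:

```python
import statistics

def level3_pipeline(readings, lo=-40.0, hi=125.0, alert=50.0):
    """Sketch of the four Level 3 operations from the table above.

    readings: list of (sensor_id, raw_value) tuples.
    Returns a statistical summary plus any threshold alerts.
    """
    # 1. Evaluation (filtering): drop out-of-range / implausible samples.
    valid = [(sid, v) for sid, v in readings if lo <= v <= hi]

    # 2. Formatting (standardization): normalize into a common record shape.
    records = [{"sensor": sid, "celsius": float(v)} for sid, v in valid]

    # 3. Distillation (aggregation): many records -> one compact summary.
    values = [r["celsius"] for r in records]
    summary = {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
    }

    # 4. Assessment (thresholds): flag records needing immediate attention.
    alerts = [r for r in records if r["celsius"] > alert]
    return summary, alerts

summary, alerts = level3_pipeline([("a", 21.5), ("b", 999.0), ("c", 55.2)])
print(summary["count"], len(alerts))  # 2 1 (one reading filtered, one alert)
```

Only the summary and the alerts travel upstream; the raw batch never leaves Level 3.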

54.4.1 Key Data Reduction Example

Scenario: 500 vibration sensors, 1 kHz sampling, 16-byte readings

| Processing Stage | Data Rate | Cumulative Reduction | Operations Applied |
|---|---|---|---|
| Raw Sensors (Level 1) | 28.8 GB/hour | Baseline | None |
| After Downsampling (Level 3) | 288 MB/hour | 100x | Frequency: 1 kHz to 10 Hz |
| After Aggregation (Level 3) | 2 MB/hour | ~14,400x | Spatial grouping (100 sensors per group) + statistical summarization at 1 Hz |

Cost Impact: ~$25,000/year savings in cloud ingress costs (@$0.10/GB)

The ~14,400x data reduction ratio comes from chaining two processing stages at the edge gateway:

\[\text{Total Reduction} = \text{Downsampling Ratio} \times \text{Aggregation Ratio}\]

For the vibration sensor example:

Stage 1 - Temporal Downsampling (100x):

\[500 \text{ sensors} \times 1000 \text{ Hz} \times 16 \text{ B} = 8{,}000{,}000 \text{ B/s} = 28.8 \text{ GB/hr}\]

Downsample each sensor from 1000 Hz to 10 Hz:

\[500 \text{ sensors} \times 10 \text{ Hz} \times 16 \text{ B} = 80{,}000 \text{ B/s} = 288 \text{ MB/hr}\]

Stage 2 - Spatial Aggregation (~144x):

Group the 500 sensors into 5 groups of 100. For each group, compute a statistical summary (min, max, mean, std dev, median, peak frequency, count – 7 metrics at 16 bytes each = 112 bytes) once per second:

\[5 \text{ groups} \times 1 \text{ Hz} \times 112 \text{ B} = 560 \text{ B/s} \approx 2 \text{ MB/hr}\]

Each group reduces 100 sensors at 10 Hz (16,000 B/s input) to a 112-byte summary (112 B/s output), a ~143x reduction per group.

\[R_{total} = 100 \times 143 \approx 14{,}300\times\]

(The table's ~14,400x figure corresponds to 28.8 GB/hr ÷ 2 MB/hr; the small gap from ~14,300x reflects rounding of the 112-byte summary size.)

Annual cost impact at $0.10/GB cloud ingress:

  • Without edge: \(28.8 \text{ GB/hr} \times 24 \text{ hr} \times 365 \text{ days} \times \$0.10 = \$25{,}228/\text{year}\)
  • With edge: \(0.002 \text{ GB/hr} \times 24 \text{ hr} \times 365 \text{ days} \times \$0.10 = \$1.75/\text{year}\)
  • Savings: ~$25,227/year (99.99% cost reduction)

54.4.2 Edge Data Reduction Calculator

Use this interactive calculator to explore how different sensor configurations affect data reduction and cost savings.
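For readers without access to the interactive version, a rough Python equivalent is sketched below (`edge_reduction` and its parameters are hypothetical names for this sketch; pricing assumes the chapter's flat $0.10/GB ingress rate):

```python
def edge_reduction(sensors, hz_raw, bytes_per_reading,
                   hz_down, groups, summary_bytes, summary_hz=1,
                   ingress_per_gb=0.10):
    """Chain temporal downsampling and spatial aggregation, then
    price the resulting cloud ingress (illustrative model only)."""
    raw_bps = sensors * hz_raw * bytes_per_reading      # Level 1 output
    down_bps = sensors * hz_down * bytes_per_reading    # after downsampling
    agg_bps = groups * summary_hz * summary_bytes       # after aggregation

    def gb_per_year(bps):
        return bps * 3600 * 24 * 365 / 1e9

    return {
        "downsample_ratio": raw_bps / down_bps,
        "reduction": raw_bps / agg_bps,
        "raw_cost_yr": gb_per_year(raw_bps) * ingress_per_gb,
        "edge_cost_yr": gb_per_year(agg_bps) * ingress_per_gb,
    }

# The chapter's vibration-sensor scenario:
r = edge_reduction(sensors=500, hz_raw=1000, bytes_per_reading=16,
                   hz_down=10, groups=5, summary_bytes=112)
print(f"{r['reduction']:.0f}x, saves "
      f"${r['raw_cost_yr'] - r['edge_cost_yr']:,.0f}/yr")
# 14286x, saves $25,227/yr
```

Varying `hz_down`, `groups`, or `summary_bytes` shows how sensitive the savings are to each stage of the pipeline.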

54.4.3 Processing Trade-off Summary

| Factor | Edge Layer | Fog Layer | Cloud Layer |
|---|---|---|---|
| Latency | <1 ms (best) | 10-100 ms (moderate) | 100-500 ms (worst) |
| Bandwidth demand | Very High (worst) | Medium | Low (best) |
| Processing Power | Limited | Moderate | Effectively unlimited |
| Data Retention | Seconds-minutes | Hours-days | Unlimited |
| Cost per Node | Low | Medium | High (centralized) |
| Scalability | Distributed | Regional | Global |
| Use Cases | Real-time control, safety | Analytics, ML inference | Training, long-term storage |

54.5 Architecture Design Principle

The Golden Rule of Edge Computing: Process data as close to the source as possible, but only as close as necessary.

  • Edge (Level 1-3): Latency-critical operations, data reduction, real-time decisions
  • Fog (Level 4-5): Regional analytics, ML inference, medium-term storage
  • Cloud (Level 6-7): Deep analytics, model training, historical analysis, global coordination

54.6 Knowledge Check: Architecture Concepts

## Edge-Cloud Symbiosis: Why Hybrid Architectures Win {#edge-review-arch-hybrid}

A common misconception is that edge computing means devices process data completely independently without cloud connectivity. In reality, edge computing is about intelligent data reduction and latency-critical processing, not replacing the cloud.

The Misconception: Many students believe edge computing means fully autonomous devices that never need the cloud.

Reality – Hybrid Edge-Cloud Model:

  • Edge processing: ~95% of raw data volume (filtered and aggregated locally)
  • Cloud transmission: ~5% of data (summaries, anomalies, important events)
  • Cloud computation: ~80% of ML training workload (requires historical fleet-wide data)
  • Edge inference: ~20% of ML workload (simple threshold-based and lightweight model decisions)

When to Use Each:

  • Edge: Real-time safety shutdowns (<10ms), data reduction (100-1000x), privacy filtering
  • Cloud: ML model training, historical analytics, cross-site correlation, firmware updates
  • Wrong approach: Trying to do all ML training on edge devices, or sending all raw sensor data to cloud

Cost Impact of Misconception: Companies over-investing in edge infrastructure waste $50,000-$200,000 per deployment site, while companies under-utilizing edge spend $25,000-$80,000/year in unnecessary cloud ingress costs.

The edge-cloud relationship is symbiotic, not competitive. The following table illustrates how hybrid approaches outperform either tier alone:

| Capability | Edge Strength | Cloud Strength | Hybrid Approach |
|---|---|---|---|
| Anomaly Detection | Rules-based, <1 ms latency | ML-based, learns new patterns | Edge: immediate alerts; Cloud: model retraining |
| Model Accuracy | Fixed rules, ~85% accuracy | Continuous learning, ~98% accuracy | Edge runs cloud-trained models |
| Failure Modes | Detects known signatures only | Discovers unknown patterns | Cloud identifies new failure types, pushes to edge |
| Resource Usage | 2 GB RAM, 10 W power | 128 GB RAM, 500 W GPUs | Edge inference, cloud training |
| Data Requirements | Single site | Fleet-wide (1000+ sites) | Edge sends summaries, cloud correlates |

Phase 1 – Edge-Only Deployment:

A manufacturing company deployed edge-only predictive maintenance:

  • Edge gateway: Rule-based anomaly detection
  • 6 months operation: Detected 18 known failure modes successfully
  • Problem: Missed 3 novel failure types (bearing cage fracture, new vibration signature)
  • Cost of missed failures: $220,000 (unplanned downtime)

Phase 2 – Adding Cloud-in-the-Loop:

  • Edge: Continues real-time detection (<1ms)
  • Cloud: Trains on 12 months of fleet data (50 factories)
  • Discovers 5 new failure signatures
  • Pushes updated models to all edge gateways monthly
  • Year 2: Detected 23 failure modes (5 discovered by cloud ML)
  • Prevented failures: $180,000 saved
  • Cloud training cost: $12,000/year
  • Net benefit: $168,000/year

Quantified Benefits of Hybrid Edge-Cloud:

| Metric | Edge-Only | Hybrid Edge-Cloud | Improvement |
|---|---|---|---|
| Detection latency | <1 ms | <1 ms | No change (edge handles) |
| False positive rate | 15% | 4% | 73% reduction (cloud training) |
| Novel failure detection | 0% | 85% | New capability |
| Model staleness | Permanent | 30-day refresh | Continuous improvement |
| Development cost | $80K (rules) | $120K (ML pipeline) | +$40K |
| Annual value | $200K | $380K | +$180K |

Architecture Best Practices:

  1. Edge handles time-critical decisions: Sub-second response using current model
  2. Cloud handles model improvement: Train on months of data, push updates monthly/quarterly
  3. Edge sends representative samples: Not all data, just anomalies + 1% baseline
  4. Cloud correlates across sites: Patterns invisible at single-site level
  5. Graceful degradation: Edge continues working if cloud offline, using last-known-good model
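Best practice 5 can be sketched in a few lines (a simplified illustration: `EdgeModelManager` and `fetch_from_cloud` are hypothetical names, and a production version would also persist the model to local storage so it survives reboots):

```python
import time

class EdgeModelManager:
    """Serve the last-known-good model when the cloud is unreachable."""

    def __init__(self, fetch_from_cloud, refresh_s=30 * 24 * 3600):
        self.fetch = fetch_from_cloud   # callable returning a model
        self.refresh_s = refresh_s      # ~monthly refresh, per the text
        self.model = None
        self.fetched_at = 0.0

    def current_model(self):
        stale = time.time() - self.fetched_at > self.refresh_s
        if self.model is None or stale:
            try:
                self.model = self.fetch()      # cloud reachable: update
                self.fetched_at = time.time()
            except ConnectionError:
                pass                           # cloud offline: keep last-known-good
        if self.model is None:
            raise RuntimeError("no model available yet")
        return self.model

def flaky_cloud():
    raise ConnectionError("uplink down")

mgr = EdgeModelManager(fetch_from_cloud=lambda: "model-v1")
assert mgr.current_model() == "model-v1"   # first fetch succeeds
mgr.fetch = flaky_cloud
mgr.fetched_at = 0.0                       # force a refresh attempt
print(mgr.current_model())                 # still serves model-v1
```

The edge node keeps making sub-millisecond decisions with the stale model; it simply stops improving until connectivity returns.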

The Lesson: Edge computing is not about replacing the cloud – it is about intelligently dividing labor. Edge handles real-time decisions with local data. Cloud handles complex analysis with global data. Together, they create systems smarter than either alone.

Scenario: A city deploys IoT for real-time traffic management across 200 intersections.

Requirements:

  • Traffic light control: <100ms response time
  • Pedestrian detection: Real-time video processing
  • Traffic flow analytics: Historical pattern analysis
  • Incident detection: Automated alerts for accidents/congestion

Data Generation:

  • 200 intersections x 4 cameras = 800 cameras
  • Each camera: 1080p @ 30 fps = ~8 Mbps (H.264 compressed)
  • Total raw video: 800 x 8 Mbps = 6,400 Mbps = 6.4 Gbps
  • Per day: 6.4 Gbps x 86,400 s / 8 bits per byte = 69,120 GB/day = ~69 TB/day

Architecture by Level:

Level 1 - Physical Devices:

  • 800 cameras + 200 traffic light controllers
  • Edge compute modules (NVIDIA Jetson) at each intersection
  • Local object detection (pedestrians, vehicles)
  • Raw data: ~69 TB/day

Level 2 - Connectivity:

  • Intersection-to-gateway: Gigabit Ethernet
  • Gateway-to-fog: Fiber optic (100 Gbps backbone)
  • Fog-to-cloud: Internet uplink (10 Gbps)

Level 3 - Edge Computing (Per Intersection):

  • Vehicle counting and classification
  • Pedestrian detection
  • License plate reading (when needed)
  • Local traffic light optimization
  • Data reduction: 8 Mbps (1 MB/s) video per camera reduced to 2 KB/s metadata
  • Reduction factor per camera: ~500x
  • Output: 800 cameras x 2 KB/s = 1.6 MB/s = 138 GB/day

Level 4 - Fog Node (5 Regional Hubs):

  • Each hub aggregates 40 intersections
  • Traffic flow optimization across corridors
  • Incident detection via pattern matching
  • 7-day data retention for local queries
  • Further aggregation: 138 GB/day reduced to 15 GB/day sent to cloud
  • Reduction factor: ~9.2x

Level 5 - Data Abstraction (Fog):

  • Normalize traffic data formats
  • Spatial-temporal correlation
  • Generate traffic congestion heat maps
  • API layer for city dashboards

Level 6 - Cloud Application:

  • City-wide traffic analytics
  • ML model training on months of data
  • Long-term trend analysis
  • Integration with public transit systems
  • Final data: 15 GB/day (down from ~69 TB/day)
  • Total reduction: ~4,600x

Level 7 - Collaboration:

  • Integration with emergency services (ambulance routing)
  • Public APIs for traffic apps
  • Inter-city coordination for highway traffic

Latency by Tier:

| Decision Type | Processing Tier | Latency | Example |
|---|---|---|---|
| Traffic light change | Edge (Level 1) | <50 ms | Pedestrian button pressed |
| Corridor optimization | Fog (Level 4) | 1-2 seconds | Adjust 10-light timing |
| City-wide rerouting | Cloud (Level 6) | 30-60 seconds | Major incident detected |

Cost Analysis:

| Component | Cost per Unit | Quantity | Total |
|---|---|---|---|
| Edge compute (Jetson Xavier) | $600 | 200 | $120,000 |
| Fog servers (regional hubs) | $15,000 | 5 | $75,000 |
| Network infrastructure | $50,000 | 1 | $50,000 |
| Cloud services (annual) | $180,000/year | - | $180,000 |
| Total (Year 1) | - | - | $425,000 |

Savings vs Cloud-Only Architecture:

If all video sent to cloud:

  • Bandwidth: 69,120 GB/day x 30 days x $0.05/GB = $103,680/month ≈ $1.24M/year
  • Cloud processing: equivalent compute for video analysis at scale ≈ $600,000/year
  • Total cloud-only cost: ~$1.84M/year

With edge-fog architecture:

  • Edge hardware (amortized over 5 years): $245,000 / 5 = $49,000/year
  • Cloud costs (reduced data): $180,000/year
  • Total hybrid cost: $229,000/year

Annual savings: $1.84M - $229K = $1.61M (an ~87.5% reduction)
Payback period: ~2 months (the $245K edge investment recovered from ~$134K/month in savings)
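The arithmetic behind this comparison can be checked in a few lines of Python (figures taken from the tables above; `traffic_costs` is an illustrative helper, not a real pricing model – ingress is priced at $0.05/GB in 30-day months, and edge hardware is amortized over 5 years):

```python
def traffic_costs():
    """Re-derive the cloud-only vs hybrid comparison above."""
    raw_gb_day = 800 * 8e6 / 8 * 86_400 / 1e9      # 800 cameras @ 8 Mbps each
    bandwidth_yr = raw_gb_day * 30 * 0.05 * 12     # cloud-only ingress bill
    cloud_only_yr = bandwidth_yr + 600_000         # + video-analysis compute

    edge_capex = 120_000 + 75_000 + 50_000         # Jetsons + fog hubs + network
    hybrid_yr = edge_capex / 5 + 180_000           # amortized capex + cloud svc
    return raw_gb_day, cloud_only_yr, hybrid_yr

raw, cloud_only, hybrid = traffic_costs()
print(f"raw video: {raw / 1000:.1f} TB/day")
print(f"cloud-only: ${cloud_only / 1e6:.2f}M/yr vs hybrid: ${hybrid / 1e3:.0f}K/yr")
# raw video: 69.1 TB/day
# cloud-only: $1.84M/yr vs hybrid: $229K/yr
```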

Use this decision framework to determine optimal processing placement:

Step 1: Assess Latency Requirements

| Latency Requirement | Recommended Tier | Rationale |
|---|---|---|
| <10 ms | Edge (Level 1-3) | Safety-critical, real-time control |
| 10-100 ms | Fog (Level 4-5) | Local coordination, multi-device |
| 100 ms-1 s | Fog or Cloud | Real-time analytics |
| 1-10 s | Cloud (Level 6-7) | Complex analytics |
| >10 s | Cloud | Historical analysis, ML training |

Step 2: Evaluate Data Volume

| Daily Data Volume | Action | Reason |
|---|---|---|
| >1 TB/day per site | Edge processing mandatory | Bandwidth costs prohibitive |
| 100 GB-1 TB | Fog aggregation beneficial | Regional optimization |
| 10-100 GB | Hybrid approach | Balance cost and capability |
| <10 GB | Cloud-first acceptable | Bandwidth costs manageable |

Step 3: Consider Processing Complexity

| Processing Type | Recommended Tier | Example |
|---|---|---|
| Simple thresholds | Edge | Temperature >50 °C alert |
| Statistical aggregation | Edge/Fog | Hourly min/max/avg |
| Pattern recognition (rules) | Fog | Known anomaly signatures |
| ML inference (small model) | Edge/Fog | Classification, 1-10 classes |
| ML inference (large model) | Cloud | NLP, complex vision tasks |
| ML training | Cloud | Requires fleet-wide data |

Step 4: Evaluate Connectivity

| Connectivity Profile | Architecture | Notes |
|---|---|---|
| Always connected, reliable | Cloud-first | Leverage cloud scale |
| Intermittent (hourly drops) | Edge-fog hybrid | Buffer at edge |
| Frequently offline (daily) | Edge-primary | Local autonomy required |
| Rarely connected (weekly+) | Edge-only | Full local processing |

Decision Algorithm (Scoring):

```python
def recommend_processing_tier(latency_ms, data_volume_gb_day,
                              processing_complexity, connectivity):
    """
    Recommend optimal processing tier based on requirements.

    Returns: ("EDGE", "FOG", "CLOUD", or "HYBRID") with justification
    """
    score_edge = 0
    score_fog = 0
    score_cloud = 0

    # Latency scoring
    if latency_ms < 10:
        score_edge += 10
    elif latency_ms < 100:
        score_edge += 5
        score_fog += 5
    elif latency_ms < 1000:
        score_fog += 5
        score_cloud += 3
    else:
        score_cloud += 10

    # Data volume scoring
    if data_volume_gb_day > 1000:
        score_edge += 10
        score_fog += 5
    elif data_volume_gb_day > 100:
        score_edge += 5
        score_fog += 5
        score_cloud += 2
    elif data_volume_gb_day > 10:
        score_fog += 3
        score_cloud += 5
    else:
        score_cloud += 10

    # Processing complexity
    complexity_map = {
        'simple': {'edge': 10, 'fog': 5, 'cloud': 2},
        'moderate': {'edge': 5, 'fog': 8, 'cloud': 5},
        'complex': {'edge': 2, 'fog': 5, 'cloud': 10}
    }
    scores = complexity_map.get(processing_complexity, {})
    score_edge += scores.get('edge', 0)
    score_fog += scores.get('fog', 0)
    score_cloud += scores.get('cloud', 0)

    # Connectivity
    connectivity_map = {
        'always': {'edge': 2, 'fog': 5, 'cloud': 10},
        'intermittent': {'edge': 7, 'fog': 8, 'cloud': 3},
        'rare': {'edge': 10, 'fog': 5, 'cloud': 1}
    }
    scores = connectivity_map.get(connectivity, {})
    score_edge += scores.get('edge', 0)
    score_fog += scores.get('fog', 0)
    score_cloud += scores.get('cloud', 0)

    # Determine recommendation
    max_score = max(score_edge, score_fog, score_cloud)
    if abs(score_edge - score_fog) < 5 and \
       abs(score_edge - score_cloud) < 5:
        return ("HYBRID",
                f"Mixed requirements (E:{score_edge} "
                f"F:{score_fog} C:{score_cloud})")
    if score_edge == max_score:
        return ("EDGE",
                f"Best fit (E:{score_edge} "
                f"F:{score_fog} C:{score_cloud})")
    elif score_fog == max_score:
        return ("FOG",
                f"Best fit (E:{score_edge} "
                f"F:{score_fog} C:{score_cloud})")
    else:
        return ("CLOUD",
                f"Best fit (E:{score_edge} "
                f"F:{score_fog} C:{score_cloud})")

# Example: Industrial predictive maintenance
tier, reason = recommend_processing_tier(
    latency_ms=50,
    data_volume_gb_day=800,
    processing_complexity='moderate',
    connectivity='intermittent'
)
print(f"Recommendation: {tier}")
print(f"Reasoning: {reason}")
# Output: Recommendation: FOG
# Reasoning: Best fit (E:22 F:26 C:10)
```

54.6.1 Workload Placement Calculator

54.7 Chapter Summary

  • The Edge-Fog-Cloud Continuum provides progressive data processing where latency increases (under 1ms at edge to 100-500ms at cloud) while bandwidth requirements decrease dramatically through data reduction at each tier.

  • The Seven-Level Reference Model guides processing decisions: Levels 1-2 handle physical sensing and connectivity, Level 3 performs edge computing (filtering, aggregation, format standardization), Levels 4-5 provide fog-layer storage and abstraction, and Levels 6-7 enable cloud analytics and enterprise integration.

  • Processing trade-offs must balance latency requirements, bandwidth constraints, processing power availability, data retention needs, and cost considerations when determining where to place computation.

  • The Golden Rule states: process data as close to the source as possible, but only as close as necessary, based on latency requirements and processing complexity.

  • Edge-cloud symbiosis means edge computing reduces data volume and handles real-time decisions, while the cloud provides model training, historical analytics, and cross-site correlation. Neither tier replaces the other.

Key Takeaway

The Edge-Fog-Cloud continuum follows a golden rule: process data as close to the source as possible, but only as close as necessary. Edge handles latency-critical operations and data reduction (achieving ~14,000x compression), fog provides regional analytics and ML inference, and cloud delivers deep analytics and global coordination. This is a hybrid model – edge computing reduces data volume, not replaces cloud computing.

“The Relay Race of Data!”

Sammy the Sensor was collecting vibration readings from a factory floor – thousands every second! “I’m making SO much data!” Sammy cheered.

Max the Microcontroller, sitting in the edge gateway nearby, shook his head. “Sammy, you can’t send ALL of that to the cloud. That’s like trying to pour a swimming pool through a garden hose!”

“So what do we do?” asked Lila the LED, blinking nervously.

“It’s like a relay race!” Max explained. “Sammy runs the first leg – collecting data super fast. I run the second leg – I take Sammy’s thousands of readings and squeeze them into a tiny summary. Then the cloud runs the final leg – it takes my summaries and does the really brainy stuff, like predicting when a machine might break.”

“So each runner does what they’re best at?” Sammy asked.

“Exactly! I’m close to you so I can react in a flash – under one millisecond! The cloud is far away but super smart. Together, we’re an unstoppable team!”

Bella the Battery smiled. “And because Max filters out 99% of the data before sending, we don’t waste energy transmitting things nobody needs. That means I last WAY longer!”

The Sensor Squad learned: Edge, fog, and cloud each have their own superpower. The trick is giving each one the right job!

54.8 Concept Relationships

Edge architecture builds on:

Edge architecture enables:

  • Edge Data Reduction - Level 3 EFR pipeline achieves ~14,000x compression through the architectural framework
  • Gateway Security - Gateway placement at Level 3 creates security perimeter for non-IP devices
  • Power Optimization - Architecture decisions determine device power profiles

Parallel concepts:

  • Edge-Fog-Cloud continuum and Tiered storage (hot/warm/cold): Both use hierarchical placement based on access patterns
  • Golden rule of edge computing and Workload placement algorithms: Process as close to source as necessary

54.9 See Also

Related review chapters:

Foundational chapters:

Interactive tools:

54.10 What’s Next

| Current | Next |
|---|---|
| Edge Architecture Review | Edge Review: Data Reduction |

Related chapters in this review series:

| Chapter | Focus |
|---|---|
| Edge Review: Gateway and Security | Protocol translation and fail-closed security |
| Edge Review: Power Optimization | Deep sleep and battery life calculations |
| Edge Review: Storage and Economics | Tiered storage and ROI analysis |

Common Pitfalls

An architecture review that uses ‘edge’ to mean both microcontrollers and gateways conflates systems differing by 3–4 orders of magnitude in resources. Always specify which tier and its exact resource constraints.

Edge deployments experience node failures, network partitions, and load spikes. Reviewing only the steady-state architecture without analysing failure scenarios leaves critical operational questions unanswered.

The data path (sensor → cloud) is typically well-reviewed, but the management path (cloud → device configuration, firmware, diagnostics) is equally critical and equally complex. Include both in any architecture review.

Lab conditions with stable power, Ethernet connectivity, and controlled temperature do not represent field conditions with intermittent wireless, temperature extremes, and physical vibration. Architecture reviews must include a field conditions stress test plan.