7  Edge and Fog Computing: Architecture

In 60 Seconds

Fog computing architecture distributes processing across three tiers – edge devices (1-10ms latency, MCUs with KB of RAM), fog nodes (10-100ms, gateways with GB of RAM), and cloud data centers (100-500ms, elastic resources) – each optimized for distinct functions. The critical design insight is that data flows upward with 80-99% reduction at the fog layer through aggregation and filtering, while control flows downward with increasing specificity as ML models become actuator commands. A smart building with 2,000 sensors generating 200 KB/s raw data can reduce cloud-bound traffic to just 8 KB/s through fog-tier processing, achieving 96% bandwidth savings while maintaining 50ms local control latency.

MVU: Minimum Viable Understanding

The three tiers in one picture – data flows up with progressive reduction (90-99% at the fog layer) while control flows down with increasing specificity (ML models become actuator commands):

Edge (1-10ms)  --[raw data]--> Fog (10-100ms) --[5-10% filtered]--> Cloud (100-500ms)
Edge <--[commands]-- Fog <--[models, policies]-- Cloud

The architecture decision checklist:

| Decision | Edge Tier | Fog Tier | Cloud Tier |
|----------|-----------|----------|------------|
| Process here if… | Sub-10ms response needed | Multi-sensor aggregation | Cross-site analytics |
| Store here if… | Seconds of buffer | Hours to days of history | Months to years of archive |
| Typical hardware | MCU, SoC ($5-50) | Gateway, edge server ($200-5,000) | Data center (elastic) |
| Data reduction | 0-20% (basic filtering) | 80-99% (aggregation) | N/A (receives reduced data) |

Common mistake: Designing “edge-only” or “cloud-only” architectures. Most IoT systems need all three tiers working together, with each tier handling what it does best.
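The checklist rows above can be read as a small decision function. A minimal sketch – the function name, parameters, and thresholds are illustrative simplifications of the table, not part of any standard API:

```python
def place_workload(latency_ms, scope):
    """Suggest a processing tier from simplified requirements.

    latency_ms: maximum tolerable response time
    scope: data visibility needed ('device', 'site', or 'global')
    """
    if latency_ms < 10 and scope == "device":
        return "edge"    # sub-10ms control loops on single-device data
    if latency_ms < 100 or scope == "site":
        return "fog"     # multi-sensor aggregation, hours-to-days of history
    return "cloud"       # cross-site analytics, long-term archive

print(place_workload(5, "device"))     # edge
print(place_workload(50, "site"))      # fog
print(place_workload(5000, "global"))  # cloud
```

A real offloading decision would also weigh bandwidth cost and available compute, as noted under Offloading Decision in the Key Concepts.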

Read on for detailed architecture patterns and worked examples, or jump to Knowledge Check: Three-Tier Architecture to test your understanding.

Key Concepts
  • Three-Tier Architecture: Hierarchical model with edge devices (sensors/actuators), fog nodes (local gateways/servers), and cloud (central data center) each handling distinct processing roles
  • Fog Node: Intermediate compute device (industrial PC, router with compute, SBC) that aggregates data from multiple edge devices and performs local analytics
  • Offloading Decision: Logic determining which computations run locally vs. remotely, based on latency requirements, available compute, and bandwidth cost
  • Horizontal Scalability: Fog nodes can be added regionally without restructuring cloud infrastructure, distributing load geographically
  • Data Aggregation: Combining readings from multiple sensors at a fog node before cloud transmission, reducing bandwidth by aggregating many low-value events into high-value summaries
  • Service Placement: Deciding which microservices or ML models run at which tier, balancing compute requirements against latency and connectivity constraints
  • Southbound Interface: Communication channel between fog node and edge devices (Zigbee, BLE, Modbus, MQTT)
  • Northbound Interface: Communication channel between fog node and cloud tier (HTTPS, AMQP, MQTT over TLS)

7.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Design three-tier architectures: Structure edge, fog, and cloud layers with appropriate resource allocation
  • Distinguish fog node capabilities: Compare computation, storage, networking, and security functions at each tier
  • Analyze data flow: Map bidirectional data paths through hierarchical processing with quantified reduction ratios
  • Implement processing pipelines: Design collection, aggregation, and forwarding stages with concrete latency budgets
  • Evaluate architectural patterns: Select and justify appropriate configurations for different deployment scenarios
  • Calculate bandwidth savings: Quantify the data reduction achieved by fog-layer processing

Edge and fog computing architecture is the blueprint for distributing work between IoT devices, nearby processors, and the cloud. Think of a chain of command in an organization: local decisions are made quickly on the spot (edge), regional coordination happens at the branch office (fog), and company-wide strategy is set at headquarters (cloud).

Fog Computing Architecture is like a school with three levels of helpers that make everything run super smoothly!

7.1.1 The Sensor Squad Adventure: The Three-Level Helper System

Sammy the Temperature Sensor was having a busy day at Smart School. “I’ve been measuring the temperature every second!” said Sammy. “That’s 3,600 readings every hour! But does the Principal really need ALL of those numbers?”

Lila the Light Sensor agreed. “I measure brightness 100 times a second! If we send everything to Cloud City, the mailbox will overflow!”

Max the Motion Detector had a brilliant idea. “What if we organize helpers into THREE LEVELS, like a school?”

So they created the Three-Level Helper System:

Level 1 – The Desk Buddy (Edge): Right at your desk! Sammy’s desk buddy checks each temperature reading instantly. “Is it too hot? Too cold? Same as before? If nothing changed, I don’t even write it down!” The desk buddy is super fast (answers in 1 second!) but can only handle simple yes-or-no questions.

Level 2 – The Classroom Teacher (Fog): The teacher collects reports from ALL the desk buddies in the room. “Okay, Sammy says it’s warm, Lila says lights are bright, Max saw someone walk by, and Bella’s door sensor says the door is open. Putting it all together… someone just entered a warm, bright room! I’ll turn the AC up a little and dim the lights.” The teacher is smart enough to combine information and make good decisions!

Level 3 – The Principal’s Office (Cloud): The Principal doesn’t need to know every tiny detail. The teacher sends a daily summary: “Room 101 was used 6 hours today, average temp 72F, energy usage was normal.” The Principal uses these summaries from ALL classrooms to plan for next month – maybe Room 101 needs a bigger AC unit!

Bella the Button summed it up: “The desk buddy handles the quick stuff, the teacher handles the classroom, and the principal handles the whole school! Everyone does what they’re best at!”

Remember: In fog computing architecture, the Edge is your desk buddy (fast, simple), the Fog is your teacher (smart, combines info), and the Cloud is the principal (big picture, long-term planning)!

7.2 Architecture of Fog Computing

Fog computing architectures organize computing resources across multiple tiers, each optimized for specific functions and constraints. The OpenFog Reference Architecture (published by the OpenFog Consortium in 2017, now part of the Industrial Internet Consortium) formalizes this approach with eight pillars: security, scalability, openness, autonomy, reliability, agility, hierarchy, and programmability.

7.2.1 Three-Tier Fog Architecture

The three-tier architecture distributes processing based on latency constraints. Total system latency is \(L_{total} = L_{edge} + L_{fog} + L_{cloud}\), with typical contributions of 1-10 ms at the edge, 10-100 ms at the fog, and 100-500 ms at the cloud tier. Data reduction at each tier follows \(D_{out} = D_{in} \times (1 - r_{tier})\), where \(r_{tier}\) is that tier's reduction ratio.

Worked example: A smart building with 2,000 sensors generates \(D_{edge} = 2000 \times 0.1 \text{ KB/s} = 200\) KB/s raw. Fog tier reduces by 96% (\(r_{fog}=0.96\)): \(D_{fog} = 200 \times 0.04 = 8\) KB/s. Cloud receives only \(8 \times 86400 = 691\) MB/day instead of 17.28 GB/day, achieving 96% bandwidth savings while maintaining fog latency \(L_{fog} \approx 50\)ms for HVAC control.
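The two formulas are straightforward to encode. A short sketch that reproduces the smart-building numbers (sensor count, per-sensor rate, and reduction ratio taken from the worked example):

```python
def tier_output_rate(d_in_kbps, reduction_ratio):
    """D_out = D_in * (1 - r_tier): data rate leaving a tier after reduction."""
    return d_in_kbps * (1 - reduction_ratio)

sensors = 2000
per_sensor_kbps = 0.1                   # 0.1 KB/s raw per sensor
d_edge = sensors * per_sensor_kbps      # 200 KB/s raw at the edge
d_fog = tier_output_rate(d_edge, 0.96)  # 8 KB/s after 96% fog reduction

seconds_per_day = 86_400
cloud_mb_per_day = d_fog * seconds_per_day / 1000       # ~691 MB/day
raw_gb_per_day = d_edge * seconds_per_day / 1_000_000   # 17.28 GB/day

print(f"fog output: {d_fog:.0f} KB/s")
print(f"cloud ingest: {cloud_mb_per_day:.0f} MB/day vs {raw_gb_per_day:.2f} GB/day raw")
```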


Tier 1: Edge Devices (Things Layer)

  • IoT sensors and actuators
  • Smart devices and appliances
  • Wearables and mobile devices
  • Embedded systems (microcontrollers, SoCs)

Characteristics:

| Property | Typical Value |
|----------|---------------|
| Processing power | 8-bit to 32-bit MCU (16-240 MHz) |
| RAM | 2 KB – 512 KB |
| Storage | 32 KB – 16 MB flash |
| Power | Battery (months to years) or harvested |
| Latency contribution | 1-10ms |
| Typical protocols | Bluetooth LE, Zigbee, LoRa, GPIO |

Tier 2: Fog Nodes (Fog Layer)

  • Gateways and routers (e.g., Cisco IOx, Dell Edge Gateway)
  • Base stations and access points
  • Micro data centers and cloudlets
  • Edge servers (e.g., NVIDIA Jetson, Intel NUC)

Characteristics:

| Property | Typical Value |
|----------|---------------|
| Processing power | Multi-core ARM/x86 (1-3 GHz), optional GPU |
| RAM | 1-32 GB |
| Storage | 32 GB – 2 TB SSD |
| Power | Mains-powered (10-150W) |
| Latency contribution | 10-100ms |
| Typical protocols | Wi-Fi, Ethernet, MQTT, CoAP, HTTP |

Tier 3: Cloud Data Centers (Cloud Layer)

  • Large-scale data centers (AWS, Azure, GCP)
  • Virtually unlimited resources (elastic scaling)
  • Global reach and availability
  • Advanced analytics, ML training, and long-term storage

Characteristics:

| Property | Typical Value |
|----------|---------------|
| Processing power | Thousands of CPU/GPU cores |
| RAM | Terabytes (elastic) |
| Storage | Petabytes (elastic) |
| Power | Megawatts (data center level) |
| Latency contribution | 100-500ms (WAN round-trip) |
| Typical protocols | HTTPS, WebSocket, gRPC |

Three-Tier Architecture Summary:

| Tier | Components | Functions | Data Flow |
|------|------------|-----------|-----------|
| Cloud | Data centers | Unlimited compute/storage, global analytics, ML training | Receives aggregated insights, sends commands/ML models |
| Fog | Gateways, edge servers, base stations | Data aggregation, filtering, protocol translation | Receives raw data (90–99% reduction), sends to cloud |
| Edge | Sensors, actuators, smart devices | Data collection, simple filtering, threshold detection | Sends raw/filtered data via Bluetooth/Zigbee |

Data Flow: Edge --[raw data]--> Fog --[90-99% reduction]--> Cloud
Control Flow: Cloud --[ML models, commands]--> Fog --[<10ms response]--> Edge

Detailed three-tier fog computing architecture with bidirectional data flow: Tier 1 Edge (navy, 1-10ms) with sensors, actuators, simple filtering, battery-powered devices sends raw data upward to Tier 2 Fog (teal, 10-100ms) with gateways, edge servers, local analytics, 90-99% data reduction sending 5-10% filtered data to Tier 3 Cloud (gray, 100-300ms) with unlimited compute, ML training, long-term storage which sends ML models downward to fog layer that sends commands (under 10ms) back to edge tier

Figure 7.1: Three-tier fog computing architecture detailing components, functions, and metrics at each layer. Tier 1 (Edge) provides 1-10ms responses with battery-powered sensors performing simple filtering. Tier 2 (Fog) offers 10-100ms local analytics with 90-99% data reduction at gateways and edge servers. Tier 3 (Cloud) delivers unlimited compute with 100-300ms latency for global ML training and long-term storage. Bidirectional data flow shows raw data flowing upward (5-10% selection) while ML models and control commands flow downward.

7.2.2 Fog Node Capabilities

Fog nodes serve as the critical middle tier, providing four key capability categories:

Fog node internal architecture showing compute, storage, and networking components

Computation:

  • Data preprocessing and filtering (e.g., moving average, threshold detection)
  • Local analytics and decision-making (e.g., rule engines, event processing)
  • Machine learning inference (e.g., TensorFlow Lite, ONNX Runtime on edge servers)
  • Event detection and correlation (e.g., combining motion + door sensor = “person entered”)

Storage:

  • Temporary data buffering (minutes to hours for burst handling)
  • Caching frequently accessed data (recent sensor readings, ML model parameters)
  • Local databases for recent history (SQLite, InfluxDB on gateway)
  • Offline operation support (store-and-forward when cloud connection is lost)

Networking:

  • Protocol translation (e.g., Zigbee to IP, BLE to MQTT, Modbus to HTTP)
  • Data aggregation from multiple sensors (spatial and temporal fusion)
  • Load balancing and traffic management (distributing queries across edge servers)
  • Quality of Service (QoS) enforcement (prioritizing safety-critical data)

Security:

  • Local authentication and authorization (device identity verification)
  • Data encryption/decryption (TLS termination at gateway)
  • Intrusion detection (anomalous traffic pattern detection)
  • Privacy-preserving processing (anonymization before cloud upload)

Pitfall 1: “Fog node as dumb pipe.” Deploying gateways that merely forward data without local processing wastes the fog tier’s potential. If your fog node only does protocol translation, you are paying for hardware that provides minimal value – at minimum, add data aggregation and threshold alerting.

Pitfall 2: “Single point of failure.” Deploying exactly one fog node per site without redundancy. When that node fails, all edge devices lose their processing and cloud connectivity. Design with at least N+1 redundancy for critical deployments.

Pitfall 3: “Storing everything locally.” Fog nodes have limited storage. If you buffer all raw data waiting for cloud connectivity, you will fill the disk within hours during an outage. Implement data retention policies: keep 1-hour raw, 24-hour aggregated, 7-day summaries.
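One way to make that retention policy concrete is a small in-memory sketch – class and field names here are invented for illustration; a real gateway would persist to disk (e.g. SQLite or InfluxDB, as mentioned earlier):

```python
import time

# Illustrative retention windows from the pitfall above, in seconds.
RETENTION = {"raw": 3600, "aggregated": 24 * 3600, "summary": 7 * 24 * 3600}

class FogStore:
    """Minimal tiered buffer: records expire based on their tier's window."""
    def __init__(self):
        self.records = []  # list of (timestamp, tier, payload)

    def add(self, tier, payload, now=None):
        self.records.append((now if now is not None else time.time(), tier, payload))

    def prune(self, now=None):
        """Drop every record older than its tier's retention window."""
        now = now if now is not None else time.time()
        self.records = [(t, tier, p) for (t, tier, p) in self.records
                        if now - t <= RETENTION[tier]]

store = FogStore()
store.add("raw", 21.5, now=0)
store.add("aggregated", {"avg": 21.4}, now=0)
store.prune(now=7200)      # two hours later: raw expired, aggregate kept
print(len(store.records))  # 1
```

Running `prune` periodically bounds disk usage even during a long cloud outage, at the cost of losing old raw samples.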

Pitfall 4: “Ignoring the return path.” Architects focus on data flowing up (edge to cloud) but neglect the control path flowing down (cloud to edge). ML model updates, configuration changes, and firmware updates must be designed into the architecture from the start. Without this, fog nodes become stale and drift from intended behavior.

Pitfall 5: “One-size-fits-all fog nodes.” Not all fog nodes need the same capabilities. A fog node for a camera network needs GPU acceleration, while one for environmental sensors needs multi-protocol support. Right-size your fog hardware to the workload.

7.3 Worked Example: Designing a Smart Building Fog Architecture

Scenario: A 10-floor commercial building has 2,000 IoT sensors:

  • 500 temperature/humidity sensors (1 sample/minute = 8.3 samples/sec total)
  • 200 occupancy sensors (PIR motion, 10 events/sec total)
  • 100 air quality sensors (CO2, PM2.5; 1 sample/30sec = 3.3 samples/sec total)
  • 800 lighting sensors (1 sample/minute = 13.3 samples/sec total)
  • 200 energy meters (1 sample/15sec = 13.3 samples/sec total)
  • 200 door/window contact sensors (event-driven, ~2 events/sec total)

Step 1: Calculate raw data rate

Each sensor reading is approximately 50 bytes (timestamp + sensor ID + value + metadata):

  • Total raw rate: ~50.2 samples/sec x 50 bytes = 2,510 bytes/sec = ~2.5 KB/s
  • Daily raw volume: 2.5 KB/s x 86,400 sec = 216 MB/day
  • With overhead (headers, acknowledgments): ~300 MB/day
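Step 1 can be reproduced directly from the sensor inventory. A short sketch – the rates per class come from the list above, and the totals differ from the text's ~50.2 samples/sec and ~216 MB/day only by rounding:

```python
# (count, samples per second per sensor) for each class in the inventory
SENSORS = {
    "temp_humidity": (500, 1 / 60),
    "occupancy":     (200, 10 / 200),  # 10 events/sec across all 200 sensors
    "air_quality":   (100, 1 / 30),
    "lighting":      (800, 1 / 60),
    "energy":        (200, 1 / 15),
    "door_window":   (200, 2 / 200),   # ~2 events/sec across all 200 sensors
}
BYTES_PER_READING = 50  # timestamp + sensor ID + value + metadata

samples_per_sec = sum(count * rate for count, rate in SENSORS.values())
raw_bps = samples_per_sec * BYTES_PER_READING
print(f"{samples_per_sec:.1f} samples/s, {raw_bps / 1000:.2f} KB/s, "
      f"{raw_bps * 86400 / 1e6:.0f} MB/day raw")
```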

Step 2: Design fog tier

Deploy one fog gateway per floor (10 gateways total), each handling ~200 sensors:

  • Hardware per floor: Raspberry Pi 4 or Intel NUC (4 GB RAM, 128 GB SSD) – cost: ~$200-500
  • Protocol handling: BLE for temperature/occupancy, Zigbee for lighting, Wi-Fi for energy meters
  • Local processing per gateway:
    • Aggregate temperature readings: report floor average + min/max every 5 minutes (instead of per-sensor per-minute)
    • Occupancy: convert raw PIR triggers into “occupied/unoccupied” state per zone (10 zones/floor)
    • Air quality: alert only if CO2 > 1,000 ppm or PM2.5 > 35 ug/m3
    • Lighting: report changes only (not repeated steady-state values)
    • Energy: compute 15-minute interval totals per floor

Step 3: Calculate fog data reduction

| Data Type | Raw Rate | After Fog Processing | Reduction |
|-----------|----------|----------------------|-----------|
| Temperature | 50/floor/min | 1 summary/floor/5min | 99.6% |
| Occupancy | ~1 event/sec/floor | State change/zone (~0.1/sec) | 90% |
| Air quality | 10/floor/30sec | Alert only (rare) | ~99% |
| Lighting | 80/floor/min | Change events only (~5/min) | 94% |
| Energy | 20/floor/15sec | 1 total/floor/15min | 98.5% |

Overall reduction: ~96% – cloud receives ~12 MB/day instead of 300 MB/day.

Step 4: Design cloud tier

  • Store aggregated data in time-series database (InfluxDB Cloud)
  • Run daily ML models: predict next-day energy consumption, detect HVAC anomalies
  • Serve dashboard for building managers
  • Monthly bandwidth cost: ~360 MB/month (cloud uplink) vs. ~9 GB/month (without fog) – 96% cost savings

Step 5: Design control flow (cloud to edge)

  • Cloud trains comfort optimization model weekly, pushes to fog gateways
  • Fog gateways apply model locally: adjust HVAC setpoints, lighting schedules
  • Edge actuators receive commands from fog within 50ms

Architecture decision record: This design uses 10 fog gateways at ~$300 each ($3,000 total) to reduce monthly cloud bandwidth costs from ~$50/month to ~$2/month and enable local HVAC control with 50ms latency instead of 200-500ms cloud round-trip. ROI from bandwidth savings alone: 5+ years. ROI including energy optimization: 6-12 months.

7.4 Applications of Fog Computing

Building on the use cases below, these deployment patterns show where fog-tier processing is essential and how they connect to other parts of the module.

  • Real-time rail monitoring: Fog nodes along tracks analyze vibration and axle temperature locally to flag anomalies within milliseconds. See also Predictive Maintenance.
  • Pipeline optimization: Gateways near pumps and valves aggregate high-frequency pressure/flow signals, run anomaly detection, and stream compressed alerts upstream
  • Wind farm operations: Turbine controllers optimize blade pitch at the edge; fog aggregators coordinate farm-level balancing
  • Smart home orchestration: Gateways fuse motion, environmental, and camera signals to automate lighting/HVAC without WAN dependency; cloud receives summaries and model updates. See also Smart Home Use Cases.

7.4.1 Hierarchical Processing

Edge computing layer with on-device real-time processing for IoT sensor data

Data Flow (upward):

  1. Edge devices collect raw data at source-specific sampling rates
  2. Fog nodes filter, aggregate, and process locally (90-99% data reduction)
  3. Refined data, alerts, and insights forwarded to cloud
  4. Cloud performs global analytics, ML training, and long-term storage

Control Flow (downward):

  1. Cloud pushes updated ML models, configuration changes, and policies to fog
  2. Fog nodes apply models locally, generate commands for edge actuators
  3. Edge actuators receive and execute commands within the local latency budget

Processing Distribution:

  • Time-Critical (sub-100ms): Processed at fog layer – safety alerts, actuator control, anomaly response
  • Local Scope (minutes): Handled by fog nodes – zone-level aggregation, local optimization
  • Global Analytics (hours to days): Sent to cloud – cross-site correlation, trend analysis, model training
  • Long-Term Storage (months to years): Cloud repositories – compliance archives, historical baselines

7.5 Working of Fog Computing

Understanding the operational flow of fog computing systems illustrates how distributed components collaborate to deliver responsive, efficient IoT services. The following sections trace a data packet through the complete lifecycle.

7.5.1 Data Collection Phase

  1. Sensing:
    • Edge devices continuously or periodically sense environment
    • Data includes temperature, motion, images, location, vibration, etc.
    • Sampling rates vary by application: 1/hour (weather) to 100 kHz (vibration)
    • Each reading includes timestamp, sensor ID, value, and quality indicator
  2. Local Processing (Device Level):
    • Basic filtering and validation (range checks, stuck-at detection)
    • Analog-to-digital conversion (12-bit ADC typical for IoT sensors)
    • Initial compression or feature extraction (delta encoding, min/max/avg)
    • Energy-efficient operation (duty cycling between samples)
  3. Communication:
    • Transmission to nearby fog nodes via short-range protocols
    • Protocol selection based on range, power, and data rate needs:
      • Bluetooth LE: Wearables, indoor proximity (10-30m, ~1 Mbps)
      • Zigbee: Mesh sensor networks (10-100m, 250 kbps)
      • Wi-Fi: High-bandwidth devices (50m, 50+ Mbps)
      • LoRa: Long-range, low-power (2-15 km, 0.3-50 kbps)
    • Energy-efficient due to proximity to fog nodes

7.5.2 Fog Processing Phase

  1. Data Aggregation:
    • Combining data from multiple sensors (spatial fusion)
    • Time synchronization (aligning timestamps from different sources)
    • Spatial correlation (associating readings from the same physical area)
    • Redundancy elimination (deduplicating overlapping sensor coverage)
  2. Preprocessing:
    • Noise filtering and smoothing (moving average, Kalman filter)
    • Outlier detection and correction (statistical bounds, physics constraints)
    • Data normalization and formatting (unit conversion, schema alignment)
    • Missing value handling (interpolation, last-known-good substitution)
  3. Local Analytics:
    • Pattern recognition (time-series pattern matching)
    • Anomaly detection (threshold violations, statistical deviations)
    • Event classification (categorizing detected events by type and severity)
    • Threshold monitoring (multi-variable compound conditions)
  4. Decision Making:
    • Rule-based responses (if temperature > 80C AND rising, trigger cooling)
    • Local control commands (actuator setpoints, valve positions)
    • Alert generation (severity-graded notifications)
    • Adaptive behavior (adjusting thresholds based on context)
  5. Selective Forwarding:
    • Sending only relevant data to cloud (alerts, summaries, anomalies)
    • Summaries and statistics instead of raw data (5-minute averages, daily min/max)
    • Triggered transmission on significant events (state changes, threshold crossings)
    • Bandwidth optimization (batch uploads during off-peak, compression)
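The fog phases above – aggregation, local analytics, and selective forwarding – can be condensed into a single batch-processing function. A minimal sketch, using the CO2 alert threshold (1,000 ppm) from the smart-building example; the field names are invented for illustration:

```python
from statistics import mean

CO2_ALERT_PPM = 1000  # alert threshold from the smart-building example

def fog_process(readings):
    """One fog cycle: aggregate a batch, detect anomalies, forward selectively.

    readings: list of dicts like {"sensor": "co2-3", "value": 850.0}
    Returns the (much smaller) message destined for the cloud.
    """
    values = [r["value"] for r in readings]
    summary = {                 # aggregation: many readings -> one summary
        "count": len(values),
        "avg": round(mean(values), 1),
        "min": min(values),
        "max": max(values),
    }
    alerts = [r for r in readings if r["value"] > CO2_ALERT_PPM]  # local analytics
    if alerts:                  # selective forwarding: only anomalies ride along
        summary["alerts"] = alerts
    return summary

batch = [{"sensor": f"co2-{i}", "value": v}
         for i, v in enumerate([640, 710, 1250, 690])]
msg = fog_process(batch)
print(msg["count"], msg["avg"], len(msg.get("alerts", [])))  # 4 822.5 1
```

Four raw readings become one summary plus one alert – the per-batch version of the 90-99% reduction described above.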

7.5.3 Cloud Processing Phase

  1. Global Analytics:
    • Cross-location correlation (comparing patterns across sites)
    • Long-term trend analysis (seasonal patterns, degradation curves)
    • Complex machine learning (deep learning training, ensemble models)
    • Predictive modeling (failure prediction, demand forecasting)
  2. Storage:
    • Long-term archival (compliance-mandated retention: 1-10 years)
    • Historical databases (time-series with downsampled resolution)
    • Data lake creation (raw + processed data for future analysis)
    • Backup and redundancy (geo-replicated for disaster recovery)
  3. Coordination:
    • Multi-site orchestration (coordinating actions across locations)
    • Resource allocation (balancing workloads across fog nodes)
    • Software updates distribution (firmware, ML models, configurations)
    • Configuration management (centralized policy enforcement)

7.5.4 Action Phase

  1. Local Response (Fog Level):
    • Immediate actuator control (sub-10ms for safety systems)
    • Real-time alerts (push notifications to operators)
    • Emergency responses (automatic shutdown, isolation)
    • Automatic adjustments (PID control loops, setpoint optimization)
  2. Global Response (Cloud Level):
    • Strategic decisions (production scheduling, maintenance planning)
    • Resource optimization across sites (load balancing, energy trading)
    • Long-term planning (capacity upgrades, equipment replacement)
    • Policy updates (new rules pushed to fog nodes)

7.6 Context Awareness and Location

7.6.1 Location Awareness

Proximity-Based Processing: Fog nodes leverage knowledge of device locations for intelligent data routing and processing. This enables spatial queries (“which sensors are in Building A, Floor 3?”), location-based rules (“if motion detected near exit AND after hours, alert security”), and geofenced processing (“process all data from Zone B locally, never send to cloud”).

Example: Smart parking system knows which sensors are in which parking lot, enabling lot-specific availability calculations without cloud involvement. The fog node at each parking garage maintains a real-time occupancy map: “Level 1: 45/50 occupied, Level 2: 30/50 occupied, Level 3: 12/50 occupied.” Drivers querying availability get sub-100ms responses from the local fog node rather than waiting for a cloud round-trip.

7.6.2 Environmental Context

Local weather, traffic, events, and conditions provide context for intelligent interpretation of sensor data. Without context, a temperature reading of 35C might be normal (summer afternoon) or alarming (winter server room).

Example: Smart city fog node near intersection combines:

  • Traffic camera data (vehicle count, speed, queue length)
  • Inductive loop sensors (vehicle presence, classification)
  • Local event calendar (football match ending at 10 PM)
  • Weather conditions (rain detected, reduced visibility)
  • Time of day (rush hour vs. off-peak)

Result: Optimizes traffic light timing based on complete local context. During the football match end, extends green phases on stadium exit roads by 40%. During rain, increases all-red clearance intervals for safety. This multi-source fusion is only practical at the fog tier because it requires low-latency access to diverse local sensors.

7.6.3 Data Gravity

Concept: Large datasets have “gravity” – moving them is costly in time, bandwidth, and money. The larger the dataset, the stronger its gravitational pull on computation.

Implication: Bringing computation to data (fog) is often more efficient than bringing data to computation (cloud). This is especially true when:

  • Data volume is high (video, audio, high-frequency sensors)
  • Only a small fraction of data contains actionable information
  • Privacy regulations prohibit moving raw data off-premises
  • Network bandwidth is limited or expensive

Example – Video Surveillance:

| Metric | Cloud-Only | Fog Processing |
|--------|------------|----------------|
| Data generated | 1 TB/day per camera | 1 TB/day per camera |
| Data transmitted to cloud | 1 TB/day | 1 GB/day (motion events only) |
| Cloud bandwidth cost (at $0.09/GB) | $92/day per camera | $0.09/day per camera |
| Alert latency | 2-5 seconds (upload + process) | 100-500ms (local process) |
| Privacy exposure | Full video leaves premises | Only metadata leaves premises |

With 100 cameras, fog processing saves ~$9,200/day in bandwidth costs alone, while delivering 10x faster alerts and better privacy compliance.

7.7 Edge Computing Architecture: GigaSight Framework

GigaSight represents an exemplary edge computing framework designed for large-scale video analytics, illustrating practical fog computing architecture patterns. Originally developed at Carnegie Mellon University, GigaSight demonstrates how cloudlet-based architectures handle the massive data volumes and low-latency requirements of citywide video analytics.

7.7.1 Architecture Overview

Problem: Real-time video processing from thousands of cameras generates petabytes of data with latency requirements incompatible with cloud-only processing. A single 1080p camera at 30 fps generates ~5 Mbps; 1,000 cameras produce 5 Gbps – far exceeding typical WAN capacity.

Solution: Hierarchical edge computing architecture distributing processing across three tiers, where each tier handles what it does best.

7.7.2 GigaSight Three-Tier Architecture

Three-tier architecture showing data flow distribution across edge, fog, and cloud layers

Tier 1: Camera Edge Devices

  • Smart cameras with embedded processors (NVIDIA Jetson Nano or similar)
  • Perform basic video preprocessing (noise reduction, stabilization)
  • Motion detection and key frame extraction (discard static periods: ~60-80% of footage)
  • H.264/H.265 video compression (reduce bandwidth by 100-1000x vs. raw)

Tier 2: Edge Servers (Cloudlets)

  • Deployed near camera clusters (e.g., one per building or city block)
  • GPU-accelerated video analytics (NVIDIA T4 or similar)
  • Object detection and tracking (YOLO, SSD models at 30+ fps)
  • Face recognition and classification (with privacy-preserving local processing)
  • Event extraction (identifying interesting events from continuous streams)

Tier 3: Cloud Data Center

  • Long-term video clip and metadata storage
  • Cross-location analytics (city-wide traffic patterns, crowd flow)
  • Model training and updates (retrain YOLO on new labeled data, push to cloudlets)
  • Dashboard and user interfaces for operators and analysts

7.7.3 Processing Pipeline

  1. Capture: Cameras capture video streams at 1080p/30fps (~5 Mbps each)
  2. Filter: Motion detection filters static periods (removes 60-80% of footage)
  3. Extract: Key frames and events extracted (further 50-70% reduction)
  4. Analyze: Cloudlet GPU runs ML models (YOLO for objects, CNNs for classification)
  5. Index: Metadata and events indexed (object type, location, timestamp, confidence)
  6. Store: Relevant clips and metadata stored locally on cloudlet (7-30 day retention)
  7. Forward: Only summaries, alerts, and queried clips sent to cloud (~1% of original data)
  8. Query: Users query metadata index, retrieve specific clips on demand
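The cumulative effect of the filter and extract stages can be checked by chaining per-stage retention fractions. A sketch using the midpoints of the ranges quoted above (the exact fractions vary with scene activity):

```python
# Fraction of data surviving each stage, midpoints of the ranges in the text.
STAGES = [
    ("motion filter",   0.30),  # keeps 20-40% (removes 60-80% of footage)
    ("key-frame/event", 0.40),  # keeps 30-50% (further 50-70% reduction)
]
CLOUD_FORWARD = 0.01            # ~1% of original data ultimately sent to cloud

mbps_per_camera = 5.0           # 1080p/30fps stream
remaining = mbps_per_camera
for name, keep in STAGES:
    remaining *= keep
    print(f"after {name}: {remaining:.2f} Mbps")  # cloudlet-side load

print(f"to cloud: {mbps_per_camera * CLOUD_FORWARD * 1000:.0f} Kbps")
```

The ~1% forwarded fraction is what yields the 50 Kbps average per camera reported in the benefits table below.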

7.7.4 Benefits Demonstrated

| Metric | Cloud-Only Architecture | GigaSight (Fog Architecture) | Improvement |
|--------|-------------------------|------------------------------|-------------|
| Alert latency | 2-10 seconds | 100-500ms | 10-20x faster |
| Bandwidth per camera | 5 Mbps continuous | 50 Kbps average | 99% reduction |
| Privacy | Full video in cloud | Video stays local | Much stronger |
| Cameras per cloudlet | N/A | 20-50 cameras | Scales linearly |
| Cost per camera/month | ~$150 (bandwidth + cloud compute) | ~$15 (cloudlet amortized) | 90% savings |

7.8 Fog Architecture Patterns

Beyond the basic three-tier model, several architectural patterns address specific deployment needs:

7.8.1 Pattern 1: Hierarchical Fog (Multi-Level)

For large-scale deployments spanning buildings, campuses, or cities, a single fog tier may not suffice. Hierarchical fog adds intermediate levels:

Edge --> Fog-L1 (room/floor) --> Fog-L2 (building) --> Fog-L3 (campus) --> Cloud

When to use: Deployments with 10,000+ devices spanning multiple physical locations. Each fog level aggregates further, reducing data volume by an additional 80-90% at each stage.

7.8.2 Pattern 2: Peer Fog (Collaborative)

Fog nodes at the same level communicate directly for tasks requiring cross-zone coordination:

Fog-A <--peer--> Fog-B <--peer--> Fog-C
  |                |                |
Edge-A           Edge-B           Edge-C

When to use: Applications where events in one zone affect decisions in adjacent zones (e.g., coordinated traffic lights, building HVAC zones, manufacturing assembly lines).

7.8.3 Pattern 3: Mobile Fog (Vehicular/Drone)

Fog nodes are not fixed – vehicles, drones, or mobile robots carry fog processing capability:

When to use: Connected vehicle platooning, drone swarm coordination, mobile healthcare monitoring. The fog node moves with the user or asset, maintaining low-latency processing regardless of location.

7.8.4 Pattern 4: Federated Fog (Multi-Tenant)

Multiple organizations share fog infrastructure while maintaining data isolation:

When to use: Smart buildings with multiple tenants, shared industrial parks, multi-operator telecom edge. Each tenant’s data is processed in isolated containers on shared fog hardware, reducing per-tenant infrastructure costs.

7.9 Worked Example: Manufacturing Plant Fog Architecture

Scenario: A manufacturing plant has 500 CNC machines, each with 20 sensors (vibration, temperature, current draw, tool wear), generating 100 readings/second per sensor.

Step 1: Calculate raw data volume

  • Total sensors: 500 machines × 20 sensors = 10,000 sensors
  • Sampling rate: 100 Hz per sensor
  • Data per reading: 16 bytes (timestamp + sensor_id + value + status)
  • Raw data rate: 10,000 × 100 × 16 = 16 MB/s = 1.38 TB/day

Step 2: Design Edge Tier (Machine Level)

Each CNC machine controller (PLC or embedded PC) performs:

  • Threshold detection: Flag any sensor exceeding operational limits (temp >85°C, vibration >5g)
  • Delta encoding: Only transmit readings that differ from the previous value by >2%
  • Buffer management: Store the last 60 seconds locally for fault analysis
  • Data reduction: 100 readings/second → ~5 events/second (only when values change)
  • Edge bandwidth per machine: 20 sensors × 5 events/s × 16 bytes = 1.6 KB/s
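
The threshold-plus-delta filtering can be sketched in a few lines; the limits (85°C, 5g, 2%) come from this example, while the function and field names are illustrative:

```python
# Sketch of edge-tier filtering: threshold detection plus delta encoding.
# Limits come from the worked example; names are illustrative.

TEMP_LIMIT_C = 85.0
VIBRATION_LIMIT_G = 5.0
DELTA_THRESHOLD = 0.02  # transmit only on >2% change

def should_transmit(sensor, value, last_sent):
    """Decide whether an edge reading is worth sending to the fog tier."""
    # Always flag out-of-limit readings (safety events)
    if sensor == "temp" and value > TEMP_LIMIT_C:
        return True
    if sensor == "vibration" and value > VIBRATION_LIMIT_G:
        return True
    # Nothing sent yet: establish a baseline
    if last_sent is None:
        return True
    # Delta encoding: skip readings within 2% of the last transmitted value
    return abs(value - last_sent) > DELTA_THRESHOLD * abs(last_sent)

# A stable 100 Hz stream: only the first sample is transmitted
last, sent = None, 0
for value in [24.0, 24.1, 24.2, 23.9, 24.0]:
    if should_transmit("temp", value, last):
        last = value
        sent += 1
print(sent)  # 1 - subsequent readings stay within 2% of 24.0
```

This is exactly how 100 readings/second collapse to a handful of events per second on a machine operating in steady state.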

Step 3: Design Fog Tier (Factory Floor Level)

Deploy 5 fog gateways (one per 100 machines):

  • Hardware: Industrial PC (Intel i5, 16 GB RAM, 500 GB SSD)
  • Processing:
    • Aggregate 100 machines × 1.6 KB/s = 160 KB/s per gateway
    • Run FFT analysis on vibration data for predictive maintenance
    • Detect cross-machine patterns (multiple machines on the same line showing similar wear)
    • Store 24 hours of aggregated data locally for offline operation
  • Fog output per gateway: 10 KB/s of anomaly alerts + hourly summaries
  • Total fog → cloud bandwidth: 5 gateways × 10 KB/s = 50 KB/s = 4.3 GB/day

Step 4: Calculate data reduction

| Tier | Data Volume | Reduction from Previous Tier |
|------|-------------|------------------------------|
| Edge (raw) | 1.38 TB/day | – |
| Fog (filtered) | 4.3 GB/day | 99.7% reduction |
| Cloud (insights) | 4.3 GB/day | 0% (all fog data reaches cloud) |

Overall edge-to-cloud reduction: 99.7%.

Step 5: Cost comparison (monthly)

Cloud-Only Approach:

  • Bandwidth: 1.38 TB/day × 30 days = 41.4 TB/month × $0.09/GB = $3,726/month
  • Cloud ingestion: 41.4 TB × $0.05/GB = $2,070/month
  • Cloud processing (time-series DB, analytics): $1,500/month
  • Total cloud-only: $7,296/month = $87,552/year

Three-Tier Approach:

  • Fog hardware (5 gateways): $15,000 ÷ 60 months = $250/month
  • Fog power/cooling: $100/month
  • Cloud bandwidth (4.3 GB/day): 129 GB/month × $0.09 = $11.61/month
  • Cloud services (reduced load): $200/month
  • Total three-tier: $561.61/month = $6,739/year

Savings: $87,552 - $6,739 = $80,813/year (92% reduction)
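
The cost comparison can be reproduced with a short script; all rates ($0.09/GB bandwidth, $0.05/GB ingestion, flat processing fees) are the example's assumptions, not vendor pricing:

```python
# Reproduces the monthly cost comparison from the worked example.
# All rates are the example's assumptions, not vendor quotes.

GB_PER_TB = 1000

def cloud_only_monthly(tb_per_day):
    gb_month = tb_per_day * 30 * GB_PER_TB
    bandwidth = gb_month * 0.09    # egress from site to cloud
    ingestion = gb_month * 0.05    # cloud ingestion fee
    processing = 1500              # time-series DB + analytics
    return bandwidth + ingestion + processing

def three_tier_monthly(gb_per_day, gateway_capex, amortize_months=60):
    hardware = gateway_capex / amortize_months  # 5-year amortization
    power = 100                                 # fog power/cooling
    bandwidth = gb_per_day * 30 * 0.09
    cloud_services = 200                        # reduced cloud load
    return hardware + power + bandwidth + cloud_services

cloud = cloud_only_monthly(1.38)        # ~$7,296/month
fog = three_tier_monthly(4.3, 15_000)   # ~$562/month
print(f"annual savings: ${12 * (cloud - fog):,.0f}")  # annual savings: $80,813
```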

Key Architectural Decisions:

  1. Edge tier handles real-time anomaly detection (<10 ms) for safety shutdowns
  2. Fog tier provides factory-floor coordination and 24-hour offline capability
  3. Cloud tier performs cross-factory analytics and long-term trend analysis
  4. Bidirectional flow: cloud pushes updated ML models weekly to fog gateways

Fog Node Placement and Sizing Guidelines:

| Factor | Guideline | Calculation Example |
|--------|-----------|---------------------|
| Geographic Scope | One fog node per physical location requiring coordinated processing | Factory floor (100 machines), building (200 sensors), retail store (50 cameras) |
| Device Density | Target 50-500 devices per fog node; split if exceeding 1,000 | 10,000 sensors ÷ 100-200 per gateway = 50-100 fog nodes |
| Data Volume | Size fog node to handle 10-50 MB/s sustained ingestion | 500 sensors × 100 Hz × 20 bytes = 1 MB/s → entry-level fog gateway OK |
| Processing Complexity | CPU-only for aggregation; add GPU if running ML inference | Computer vision (YOLO, pose estimation) → Jetson Xavier; sensor analytics → Raspberry Pi 4 |
| Latency Budget | Place fog nodes within 10 ms network latency of edge devices | Fiber propagation ≈ 5 μs/km, so a 10 ms budget allows up to 2,000 km (but typically <1 km for LAN) |
| Redundancy | Deploy N+1 fog nodes for critical sites | Factory with 500 machines → 5 active gateways + 1 standby |

Sizing Calculation Example (smart building with 1,000 IoT devices):

Input Parameters:

  • 800 temperature/humidity sensors: 1 reading/minute, 50 bytes each
  • 150 occupancy sensors: 10 events/hour, 100 bytes each
  • 50 air quality sensors: 1 reading/5 minutes, 200 bytes each

Data Rate Calculation:

Temperature: (800 × 50 bytes) ÷ 60 s = 667 bytes/s
Occupancy: (150 × 10 × 100 bytes) ÷ 3600 s = 42 bytes/s
Air Quality: (50 × 200 bytes) ÷ 300 s = 33 bytes/s
Total sustained: ~740 bytes/s (trivial for any fog node)
Peak burst (all devices report simultaneously): 800×50 + 150×100 + 50×200 bytes = 65 KB
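
The same arithmetic as a short script, with device counts and reporting intervals taken from the input parameters above:

```python
# Sizing calculation for the smart-building example.
# Each tuple is (count, bytes_per_reading, interval_seconds).
device_classes = [
    (800, 50, 60),    # temperature/humidity: 1 reading/minute
    (150, 100, 360),  # occupancy: 10 events/hour -> one every 360 s
    (50, 200, 300),   # air quality: 1 reading/5 minutes
]

sustained = sum(n * size / interval for n, size, interval in device_classes)
peak_burst = sum(n * size for n, size, _ in device_classes)

print(f"sustained: {sustained:.0f} bytes/s")      # sustained: 742 bytes/s
print(f"peak burst: {peak_burst / 1000:.0f} KB")  # peak burst: 65 KB
```

Parameterizing the calculation this way makes it easy to re-run for your own device mix before choosing fog hardware.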

Fog Node Selection:

  • Entry-level (Raspberry Pi 4, 4 GB RAM): Handles up to 5 MB/s → adequate for this deployment
  • Mid-tier (Intel NUC i3, 8 GB RAM): Handles 20 MB/s → overkill unless adding video analytics
  • High-end (Jetson Xavier NX): Only needed if running computer vision on 50+ camera streams

Decision: Deploy 2 Raspberry Pi 4 (8 GB) in active-standby configuration ($150 total hardware cost).

Common Mistake: Single Fog Node as Single Point of Failure

The Mistake: Deploying exactly one fog gateway per site without redundancy or failover planning. When that gateway experiences hardware failure, power loss, or software crash, all edge devices lose their local coordinator and the entire site goes offline.

Real-World Consequence: A manufacturing plant deploys a single fog gateway managing 200 machines. Six months into operation, the gateway’s power supply fails on a Friday evening. The factory operates 24/7, but IT staff are off-site until Monday morning. For more than 60 hours, the plant operates without predictive maintenance alerts, production monitoring, or quality control analytics. Three machines suffer avoidable failures due to missing vibration alerts. Cost of downtime + repairs: $47,000. Cost of a redundant gateway: $1,200.

Why It Happens: Architects treat fog gateways like cloud services (“cloud is always available, fog should be too”) without recognizing that fog nodes are physical hardware subject to failures. Budget constraints lead to cost-cutting on “redundant” infrastructure.

The Fix (implement all three layers):

Layer 1: Hardware Redundancy

  • Deploy N+1 fog gateways (minimum 2 per site for critical applications)
  • Active-standby configuration: standby node monitors active via heartbeat
  • Automatic failover when active node misses 3 consecutive heartbeats (15-30 seconds)
  • Shared persistent storage (NAS or replicated SSD) so standby has same state
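
A minimal sketch of the standby node's heartbeat logic, using the 3-missed-beats rule above. Function names are illustrative, and production deployments typically use tools such as Keepalived/VRRP rather than hand-rolled loops:

```python
# Minimal sketch of heartbeat-driven failover on the standby node.
# The 3-missed-beats rule comes from the text; names are illustrative.

MISSED_BEATS_BEFORE_FAILOVER = 3

def standby_loop(receive_heartbeat, promote_to_active):
    """Promote the standby node once the active node misses 3 heartbeats."""
    missed = 0
    while True:
        if receive_heartbeat():          # True if a heartbeat arrived in time
            missed = 0                   # active node is healthy: reset count
        else:
            missed += 1
            if missed >= MISSED_BEATS_BEFORE_FAILOVER:
                promote_to_active()      # e.g. take over the virtual IP
                return

# Simulated run: two good beats, then the active node goes silent
beats = iter([True, True, False, False, False])
standby_loop(lambda: next(beats),
             lambda: print("failover: standby is now active"))
```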

Layer 2: Edge Autonomy

  • Program edge devices with basic autonomous behavior for fog outages:

    if fog_connection_lost():
        # Continue operating with cached rules
        apply_local_threshold_alerts()
        buffer_data_locally(max_hours=24)
        attempt_reconnect_every(minutes=5)
  • Local data buffering: each edge device stores 24-48 hours of data in circular buffer

  • Emergency shutdown rules: critical safety logic must run at edge, not fog
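
The local buffering described above maps naturally onto a fixed-size circular buffer; a minimal Python sketch, assuming one reading per second (sizes are illustrative):

```python
# 24-hour circular buffer for edge readings at 1 Hz (sizes illustrative).
from collections import deque

BUFFER_HOURS = 24
buffer = deque(maxlen=BUFFER_HOURS * 3600)  # oldest entries drop automatically

def record(reading):
    buffer.append(reading)

for i in range(100_000):   # more samples than 24 hours can hold
    record(i)
print(len(buffer))  # 86400 - capped at exactly 24 hours of 1 Hz data
```

When the fog connection returns, the device drains this buffer upward; `deque(maxlen=...)` guarantees bounded memory even through long outages.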

Layer 3: Graceful Degradation

  • Define degraded-mode capabilities when fog is unavailable:
    • Tier 1 (safety): Always available at edge (emergency stops, threshold alerts)
    • Tier 2 (monitoring): Buffered locally, synced when fog returns (production metrics)
    • Tier 3 (analytics): Unavailable until fog restores (predictive maintenance, optimization)

Cost-Benefit Analysis:

  • Single fog node failure:
    • Downtime cost (manufacturing): $2,000-10,000/hour
    • Average MTTR (mean time to repair): 4-24 hours
    • Expected cost per failure: $8,000-240,000
  • Redundancy cost:
    • Second fog gateway: $500-2,000 (one-time)
    • Shared storage (NAS): $300-1,000 (one-time)
    • Automatic failover software: $0 (open-source, e.g., Keepalived + VRRP)
    • Total redundancy investment: $800-3,000
  • Break-even: Redundancy pays for itself after preventing just one fog outage

Monitoring and Alerting:

  • Fog node health metrics to track:
    • CPU/RAM utilization (alert if >80% for 5+ minutes)
    • Disk space (alert if <20% free)
    • Network latency to edge devices (alert if >50 ms)
    • Process uptime (alert if core services restart)
  • Implement predictive alerts: “Gateway CPU trending toward 100%, add capacity or failover in 2 hours”
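
The alert thresholds above can be expressed as a simple health check; the metric names and dictionary structure are illustrative (real deployments would feed these from a monitoring agent such as Telegraf or node_exporter):

```python
# Sketch of the fog-node health thresholds listed above.
# Metric names and structure are illustrative.

def check_fog_node(metrics):
    """Return a list of alert strings for a fog node's health metrics."""
    alerts = []
    if metrics["cpu_pct"] > 80 and metrics["cpu_high_minutes"] >= 5:
        alerts.append("CPU/RAM utilization above 80% for 5+ minutes")
    if metrics["disk_free_pct"] < 20:
        alerts.append("Disk space below 20% free")
    if metrics["edge_latency_ms"] > 50:
        alerts.append("Network latency to edge devices above 50 ms")
    if metrics["service_restarts"] > 0:
        alerts.append("Core services restarted")
    return alerts

print(check_fog_node({
    "cpu_pct": 91, "cpu_high_minutes": 7,
    "disk_free_pct": 45, "edge_latency_ms": 12,
    "service_restarts": 0,
}))  # ['CPU/RAM utilization above 80% for 5+ minutes']
```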

7.9 Summary

Fog computing architecture provides a structured approach to distributing computation across edge, fog, and cloud tiers. Each tier has distinct capabilities and responsibilities, working together to deliver responsive, efficient, and resilient IoT systems. The architecture is not merely a theoretical model – deployed systems like GigaSight demonstrate that fog computing achieves 90–99%+ bandwidth reduction, 10–20x latency improvement, and significant cost savings compared to cloud-only approaches.

Key Takeaways:

  • Three-tier architecture (Edge, Fog, Cloud) provides hierarchical processing where each tier handles what it does best
  • Fog nodes offer four capability categories: computation, storage, networking, and security – properly utilizing all four is essential for effective architecture
  • Data flows upward with progressive filtering and aggregation (typically 90–99% reduction at the fog tier)
  • Control flows downward with ML models trained in cloud deployed for inference at fog, and commands pushed to edge actuators
  • Context awareness (location, environment, data gravity) enables intelligent local processing that would be impossible or impractical in the cloud
  • Architecture patterns (hierarchical, peer, mobile, federated) address different deployment scales and coordination needs
  • GigaSight demonstrates practical edge architecture achieving 99% bandwidth reduction and sub-second alerts for video analytics

Connecting the Dots
  • Architecture foundations: See IoT Reference Architectures for how fog fits into the broader IoT architecture landscape
  • Advantages and trade-offs: Continue to Advantages and Challenges for a detailed analysis of fog computing benefits and limitations
  • Real-world deployments: See Use Cases for domain-specific fog architecture implementations
  • Hands-on practice: Try the Edge-Fog Simulator to experiment with tier allocation interactively
  • Decision support: Use the Decision Framework to evaluate whether fog computing is right for your application

7.10 Concept Relationships

| Concept | Relates To | Relationship Type | Why It Matters |
|---------|------------|-------------------|----------------|
| Three-Tier Architecture | Cloud Computing | Hierarchical extension | Edge and fog tiers distribute processing that would otherwise overwhelm cloud bandwidth and latency |
| Fog Node Capabilities | Gateway Devices | Implementation layer | Gateways physically implement fog nodes, performing protocol translation and aggregation |
| Data Reduction (90-99%) | Bandwidth Optimization | Economic enabler | Without fog-layer filtering, cloud-only costs become economically unviable ($50K/month vs $2K/month) |
| GigaSight Framework | Video Analytics | Real-world case study | Demonstrates how fog computing achieves 99% bandwidth reduction and 10-20x latency improvement for video |
| Hierarchical Processing | WSN Clustering | Parallel concept | WSN cluster heads perform similar aggregation to fog nodes, reducing multi-hop transmission costs |
| Store-and-Forward | Offline Operation | Resilience pattern | Fog nodes buffer data during cloud outages, enabling autonomous operation critical for industrial systems |
| Context Awareness | Location Services | Intelligence enhancement | Proximity-based processing enables spatial queries and geofenced rules that cloud cannot provide efficiently |

7.11 See Also

Explore related chapters to deepen your understanding of fog computing architecture:

  • Edge-Fog Decision Framework - Systematic approach to choosing where to process data across the three tiers
  • IoT Reference Architectures - Foundational architectural patterns that fog computing builds upon
  • WSN Routing Fundamentals - Similar hierarchical aggregation patterns in wireless sensor network clustering protocols
  • Edge Compute Patterns - Data processing design patterns optimized for edge and fog tiers
  • Gateway Integration - Protocol translation and device management patterns at the fog layer

7.12 How It Works: Fog-Based Smart Building HVAC Optimization

The Challenge: A 50-floor commercial building with 2,000 HVAC zones generates 5 MB/s of temperature, occupancy, and airflow sensor data. Cloud-only processing would cost $25,000/month in bandwidth and introduce 200ms+ control latency, causing occupant discomfort.

The Solution: Three-tier fog architecture implementing local intelligence.

Edge Tier (Per-Zone Controller): Each HVAC zone has an ESP32 microcontroller running PID control:

// Edge: Real-time zone temperature control (loop runs every 10 seconds)
float currentTemp = readTemperatureSensor();
float setpoint = 72.0; // Degrees Fahrenheit
float error = setpoint - currentTemp;
float controlSignal = pidController.compute(error);
setDamperPosition(controlSignal);

// Report to fog only once per 5 minutes (~50 bytes per message)
if (millis() - lastReportMs >= 300000UL) { // lastReportMs: static state
    sendToFog(currentTemp, setpoint, error);
    lastReportMs = millis();
}

What happens here: The edge controller adjusts HVAC dampers every 10 seconds based on local PID algorithm. No cloud latency. Sends zone status to fog only once per 5 minutes (99% data reduction).

Fog Tier (Floor Gateway - Raspberry Pi 4): Aggregates 40 zones per floor, performs multi-zone optimization:

# Fog: Floor-level coordination and predictive control
from statistics import mean

def optimize_floor(zones):
    # Aggregate zone data reported by the edge controllers
    avg_temp = mean(z.temp for z in zones)
    total_occupancy = sum(z.occupied for z in zones)

    # Predictive model (trained weekly in cloud, runs locally);
    # ml_model, time_of_day, and outdoor_temp come from the gateway runtime
    predicted_load = ml_model.predict(time_of_day, total_occupancy, outdoor_temp)

    # Relax setpoints in unoccupied zones that are over-cooled
    for zone in zones:
        if not zone.occupied and zone.temp > avg_temp + 2:
            zone.send_setpoint_adjustment(-2)  # Save energy

    # Send hourly summary to cloud
    send_to_cloud({
        'floor': floor_id,
        'avg_temp': avg_temp,
        'energy_used_kwh': calculate_energy(),
        'occupancy_pattern': [z.occupied for z in zones],
    })

What happens here: Fog gateway coordinates 40 zones, redistributes cooling across occupied vs unoccupied areas, runs locally cached ML model for next-hour load prediction. Sends only hourly summaries to cloud (96% reduction from edge data).

Cloud Tier (Azure IoT Hub + Analytics): Receives hourly summaries from 50 floors and performs building-wide optimization:

  • Historical trend analysis: “Floor 23 uses 15% more energy than similar floors – investigate insulation”
  • Occupancy pattern learning: “Friday afternoons have 40% lower occupancy – adjust schedules”
  • Predictive maintenance: “HVAC unit 17 showing degrading efficiency – schedule service”
  • Model retraining: Update floor-level ML models weekly based on actual vs. predicted loads

The Result:

  • Comfort: Sub-1s response to occupant adjustments (edge PID), no cloud latency
  • Efficiency: 28% energy savings through fog coordination ($74,200/year saved)
  • Cost: Bandwidth reduced from 12.96 TB/month to 259 GB/month ($25,000/month → $2,072/month)
  • Resilience: Building continues operation during internet outages (edge + fog autonomous)

Key Insight: Each tier does what it does best - edge provides instant response, fog coordinates locally, cloud learns globally. The system operates correctly even if any single tier fails.

7.13 Try It Yourself


Exercise 1: Calculate Your Fog Gateway ROI

Use your own IoT deployment (or use these example values) to calculate payback period:

Given:

  • Number of sensors: 500
  • Sampling rate: 10 Hz (10 readings/second)
  • Bytes per reading: 50 bytes (timestamp + sensor ID + value + metadata)
  • Cloud bandwidth cost: $8/GB (typical cellular IoT rate)
  • Fog gateway cost: $5,000 (one-time hardware investment)

Tasks:

  1. Calculate monthly raw data volume if all data sent to cloud (show work in GB/month)
  2. Calculate monthly cloud bandwidth cost
  3. Apply 98% fog-layer reduction (typical for threshold filtering + aggregation)
  4. Calculate new monthly bandwidth cost after fog processing
  5. Compute monthly savings
  6. Determine payback period in months (gateway cost / monthly savings)

Hint: Data per second = sensors × rate × bytes. Then convert seconds → month (86,400 sec/day × 30 days).

Expected result: If you did it correctly, you should find payback period under 6 months.

Exercise 2: Design a Three-Tier Architecture

Scenario: Smart parking garage with 200 parking spots, each with:

  • Ultrasonic occupancy sensor (detects car presence)
  • LED indicator light (green = available, red = occupied)

Plus cameras at each entrance and exit for license plate recognition.

Your task: Design the three-tier architecture by answering:

  1. Edge tier: What processing happens at each parking spot? What data is sent upward?
  2. Fog tier: What does the garage-level gateway do? How does it reduce data before cloud?
  3. Cloud tier: What analytics/services run in the cloud that can’t run locally?
  4. Bonus: If the internet goes down, what functionality continues vs stops?

Hints:

  • Edge should handle real-time indicator light control (no cloud latency)
  • Fog should aggregate spot availability: “Level 2: 47/50 occupied”
  • Cloud should handle billing, historical trends, mobile app integration
  • Consider what MUST work offline (driver guidance) vs what can wait (monthly billing reports)

Exercise 3: Implement Data Filtering Logic

Write pseudocode for an edge device that filters temperature sensor data before transmission:

Requirements:

  • Read temperature every 1 second
  • Send to fog only if: (a) temperature changed by >0.5°C since last sent, OR (b) 5 minutes elapsed since last send (heartbeat)
  • Track: total readings taken vs total readings sent

Starting template:

last_sent_temp = None
last_sent_time = 0
readings_taken = 0
readings_sent = 0

while True:
    current_temp = read_temperature()
    current_time = get_time_seconds()
    readings_taken += 1

    should_send = False
    # YOUR CODE HERE: Implement the filtering logic

    if should_send:
        send_to_fog(current_temp)
        last_sent_temp = current_temp
        last_sent_time = current_time
        readings_sent += 1

    sleep(1)  # Wait 1 second

Test your logic: After 1 hour (3,600 readings) in a stable 22°C room, how many readings were sent? (Answer should be ~12 - just the heartbeats, since temperature didn’t change)

7.14 What’s Next

| Topic | Chapter | Description |
|-------|---------|-------------|
| Advantages and Challenges | Advantages and Challenges | Quantified trade-offs for latency, bandwidth, cost, and operational complexity in fog deployments |
| Use Cases | Use Cases | Domain-specific fog architecture implementations across smart cities, healthcare, and industry |
| Decision Framework | Decision Framework | Systematic approach to evaluating whether fog computing is right for your application |