%% fig-alt: "Sensor data aggregation architecture showing multiple sensor stations (measuring temperature, humidity, soil moisture, light, and wind) feeding data into an edge gateway that combines readings by geographic location and bundles them hourly with metadata before transmitting to cloud storage. The architecture demonstrates a 60x reduction in transmissions by aggregating 60 individual sensor readings per hour into a single bundled transmission."
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#ecf0f1'}}}%%
flowchart LR
subgraph Sensors["Sensor Stations (100s)"]
S1[Station 1<br/>Temp, Humidity]
S2[Station 2<br/>Soil Moisture]
S3[Station N<br/>Light, Wind]
end
subgraph Gateway["Edge Gateway"]
Combine[Combine by<br/>Geographic<br/>Location]
Bundle[Bundle Hourly<br/>with Metadata]
end
subgraph Cloud["Cloud"]
Store[Aggregated<br/>Results<br/>1 transmission/hr]
end
S1 & S2 & S3 --> Combine
Combine --> Bundle
Bundle -->|60x reduction| Store
style S1 fill:#7F8C8D,stroke:#2C3E50,color:#fff
style S2 fill:#7F8C8D,stroke:#2C3E50,color:#fff
style S3 fill:#7F8C8D,stroke:#2C3E50,color:#fff
style Combine fill:#2C3E50,stroke:#16A085,color:#fff
style Bundle fill:#16A085,stroke:#2C3E50,color:#fff
style Store fill:#27AE60,stroke:#2C3E50,color:#fff
1321 Edge Review: Data Reduction Calculations
1321.1 Learning Objectives
By the end of this chapter, you will be able to:
- Calculate Data Reduction Ratios: Compute bandwidth savings from downsampling, aggregation, and filtering
- Apply Multi-Stage Processing: Design data pipelines that combine multiple reduction techniques
- Evaluate Cost Savings: Quantify cloud ingress savings from edge processing
- Solve Real-World Scenarios: Apply mathematical models to factory, agriculture, and smart city deployments
1321.2 Prerequisites
Before studying this chapter, complete:
- Edge Review: Architecture and Reference Model - Reference model context
- Edge Compute Patterns - Processing patterns
- Basic understanding of data rates and network bandwidth
Imagine you’re writing a summary of a book:
- Raw data: The entire 500-page book (huge)
- Downsampling: Reading every 10th page (10x smaller)
- Aggregation: Combining chapter summaries into one paragraph each (another 10x smaller)
- Filtering: Removing chapters that don’t matter to your summary (removes noise)
Edge computing does the same thing with sensor data: it can reduce terabytes of raw readings to megabytes before anything is sent to the cloud.
1321.3 Data Reduction Fundamentals
1321.3.1 Reduction Techniques at Level 3
Level 3 edge computing applies four key operations:
| Operation | Description | Typical Reduction |
|---|---|---|
| Evaluation | Filter out bad/invalid data | 10-30% |
| Formatting | Standardize data structures | 0% (no size change) |
| Distillation | Downsample frequency | 10-100x |
| Aggregation | Combine multiple streams | 10-100x |
1321.3.2 Compound Reduction Formula
When multiple techniques are applied sequentially:
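Total Reduction = Downsample Ratio x Aggregation Ratio / (1 - Filter Fraction)
Filtering removes data, so the fraction that survives (1 - Filter Fraction) divides the ratio: discarding more readings makes the overall reduction larger, not smaller.
Example: 100x downsampling x 100x aggregation / 0.8 (20% filtered) = 12,500x reduction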
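The compound calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name and parameters are hypothetical, and it assumes filtering discards a fraction of the readings, so the volume still transmitted is raw x (1 - filter fraction) / (downsample ratio x aggregation ratio).

```python
def compound_reduction(downsample_ratio: float,
                       aggregation_ratio: float,
                       filter_fraction: float = 0.0) -> float:
    """Return raw volume divided by transmitted volume."""
    if downsample_ratio < 1 or aggregation_ratio < 1:
        raise ValueError("ratios must be >= 1")
    if not 0.0 <= filter_fraction < 1.0:
        raise ValueError("filter_fraction must be in [0, 1)")
    # Fraction of the raw volume still sent after all three stages.
    retained = (1.0 - filter_fraction) / (downsample_ratio * aggregation_ratio)
    return 1.0 / retained


ratio = compound_reduction(100, 100, 0.20)
print(f"{ratio:,.0f}x")  # prints 12,500x
```

Working from the retained fraction rather than multiplying ratios directly makes the direction of the filter term hard to get wrong: anything that removes data shrinks the retained fraction and therefore raises the ratio.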
1321.4 Industrial Vibration Monitoring Example
1321.4.1 Scenario Parameters
| Parameter | Value |
|---|---|
| Number of sensors | 500 |
| Sampling frequency | 1 kHz (1000 Hz) |
| Reading size | 16 bytes |
| Downsampling target | 10 Hz |
| Aggregation group size | 100 sensors |
| Summary record size | 200 bytes |
1321.4.2 Step-by-Step Calculation
Step 1: Raw Data Rate (Level 1)
Raw rate = sensors x frequency x bytes
= 500 x 1000 x 16
= 8,000,000 bytes/second
= 8 MB/s
Per hour = 8 MB/s x 3600 s = 28,800 MB = 28.8 GB/hour
Step 2: After Downsampling (Level 3)
Downsample ratio = 1000 Hz / 10 Hz = 100x
Downsampled rate = 500 x 10 x 16 = 80,000 bytes/s = 80 KB/s
Per hour = 80 KB/s x 3600 = 288,000 KB = 288 MB/hour
Step 3: After Aggregation (Level 3)
Number of groups = 500 sensors / 100 per group = 5 groups
Aggregated rate = 5 groups x 10 Hz x 200 bytes
= 10,000 bytes/s = 10 KB/s
Per hour = 10 KB/s x 3600 = 36,000 KB = 36 MB/hour
Total Reduction: 28,800 MB / 36 MB = 800x (100x from downsampling x 8x from aggregation)
1321.4.3 Cost Savings Analysis
| Metric | Without Edge | With Edge | Savings |
|---|---|---|---|
| Hourly data | 28.8 GB | 36 MB | 800x |
| Daily data | 691 GB | 864 MB | 800x |
| Monthly data (30 days) | 20.7 TB | 25.9 GB | 800x |
| Annual cloud cost | $25,229 | $31.54 | $25,197/year |
Assuming $0.10/GB cloud ingress
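The three steps and the cost comparison can be reproduced with a short script. This is a sketch under stated assumptions: the function and key names are hypothetical, units are decimal (1 GB = 10^9 bytes), a year is 365 days, and each group emits one summary record per downsampled tick, as in Step 3.

```python
import math

HOURS_PER_YEAR = 24 * 365


def vibration_pipeline(sensors=500, raw_hz=1000, reading_bytes=16,
                       target_hz=10, group_size=100, summary_bytes=200,
                       usd_per_gb=0.10):
    """Byte rates per stage, overall reduction, and annual ingress cost."""
    raw_bps = sensors * raw_hz * reading_bytes        # Step 1: raw rate
    down_bps = sensors * target_hz * reading_bytes    # Step 2: downsampled
    groups = math.ceil(sensors / group_size)
    agg_bps = groups * target_hz * summary_bytes      # Step 3: aggregated

    def annual_cost(bytes_per_second):
        gb_per_year = bytes_per_second * 3600 * HOURS_PER_YEAR / 1e9
        return gb_per_year * usd_per_gb

    return {
        "raw_mb_per_hour": raw_bps * 3600 / 1e6,
        "downsampled_mb_per_hour": down_bps * 3600 / 1e6,
        "aggregated_mb_per_hour": agg_bps * 3600 / 1e6,
        "reduction": raw_bps / agg_bps,
        "annual_cost_raw": annual_cost(raw_bps),
        "annual_cost_edge": annual_cost(agg_bps),
    }


result = vibration_pipeline()
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

With the scenario parameters this yields 28,800 MB/hour raw, 288 MB/hour after downsampling, and 36 MB/hour after aggregation. Changing `summary_bytes` or the summary rate is the quickest way to see how sensitive the total is to the aggregation stage.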
1321.5 Knowledge Check: Data Reduction Calculations
Question: A smart agriculture system has 50 sensor stations, each with temperature (5 bytes), soil moisture (8 bytes), and metadata (20 bytes). Current design: each sensor transmits individually every minute. Proposed: bundle at gateway and transmit once per hour. Assuming LoRa transmission costs 1 mAh per 10 KB transmitted, what is the monthly power savings for the transmission subsystem?
Explanation: This bundling strategy demonstrates Level 3 Edge Computing data aggregation.
Current Design - Individual Transmissions:
- 50 stations x 60 minutes/hour x 24 hours x 30 days = 2,160,000 transmissions/month
- Power per transmission approximately 0.00367 mAh
- Total: 2,160,000 x 0.00367 = 7,920 mAh/month
Proposed Design - Hourly Bundling:
- 50 stations x 24 hours x 30 days = 36,000 transmissions/month
- Power: 36,000 x 0.00367 = 132 mAh/month (treating each hourly bundle as costing the same as one individual transmission)
Reduction: (7,920 - 132) / 7,920 = 98.3%
Bundling benefits:
- Reduced transmission count: 60 transmissions/hour to 1 transmission/hour (60x reduction)
- Lower protocol overhead: One header for bundle vs. individual headers
- Power savings: Radio transmit is 120 mA vs. 0.01 mA sleep (12,000x difference)
- Network efficiency: Fewer packets = less network congestion
Battery life impact: 7,920 / 132 = 60x less transmission energy, so up to 60x longer battery life when the radio dominates the energy budget
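The transmission counts and savings above can be checked with a small script. It is only a sketch: the helper name is hypothetical, the 0.00367 mAh per-transmission figure is taken from the explanation as given, and, like the worked answer, it assumes an hourly bundle costs the same as a single transmission. If the cost were strictly per byte and the bundle carried all 60 readings unchanged, most of the saving would come only from shared headers and metadata.

```python
def monthly_tx_energy(stations, tx_per_hour, mah_per_tx, days=30):
    """Return (transmissions per month, mAh per month) for the radio only."""
    transmissions = stations * tx_per_hour * 24 * days
    return transmissions, transmissions * mah_per_tx


MAH_PER_TX = 0.00367  # per-transmission figure quoted in the explanation

current = monthly_tx_energy(50, 60, MAH_PER_TX)   # one transmission per minute
proposed = monthly_tx_energy(50, 1, MAH_PER_TX)   # one bundle per hour
savings = 1 - proposed[1] / current[1]

print(f"current:  {current[0]:,} tx, {current[1]:,.0f} mAh/month")
print(f"proposed: {proposed[0]:,} tx, {proposed[1]:,.0f} mAh/month")
print(f"savings:  {savings:.1%}")
```

The script gives about 7,927 mAh/month for the current design; the explanation rounds this to 7,920. The savings percentage is unaffected because it depends only on the 60:1 ratio of transmission counts.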
1321.6 Sensor Aggregation Architecture
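The sensor aggregation diagram (its Mermaid source appears at the top of this page) shows how agricultural sensor stations feed an edge gateway, which combines readings by geographic location and bundles them hourly with metadata before sending a single transmission per hour to the cloud.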
1321.7 Fog Node Downsampling Example
Another common scenario from the reference material:
Scenario: 100 fog nodes with 5 sensors each and 100-byte readings, downsampling from 10 readings/second to 1 reading/minute
| Stage | Calculation | Data Volume |
|---|---|---|
| Raw | 100 nodes x 5 sensors x 10 Hz x 100 bytes | 500 KB/s = 43.2 GB/day |
| Downsampled | 100 nodes x 5 sensors x 1/60 Hz x 100 bytes | 833 B/s = 72 MB/day |
Reduction: 43,200 MB / 72 MB = 600x
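The table can be verified with exact arithmetic. This sketch uses a hypothetical helper name and Python's `fractions.Fraction` so that the 1/60 Hz rate introduces no floating-point rounding; units are decimal (1 MB = 10^6 bytes).

```python
from fractions import Fraction

SECONDS_PER_DAY = 86_400


def daily_bytes(nodes, sensors_per_node, readings_per_second, reading_bytes):
    """Bytes generated per day at a given per-sensor reading rate."""
    return (nodes * sensors_per_node * readings_per_second
            * reading_bytes * SECONDS_PER_DAY)


raw = daily_bytes(100, 5, 10, 100)                      # 10 readings/second
downsampled = daily_bytes(100, 5, Fraction(1, 60), 100)  # 1 reading/minute
reduction = raw / downsampled

print(f"raw:         {raw / 10**9} GB/day")
print(f"downsampled: {float(downsampled) / 10**6} MB/day")
print(f"reduction:   {int(reduction)}x")
```

Because downsampling is the only stage here, the reduction equals the ratio of the two sampling rates: 10 Hz divided by 1/60 Hz is 600.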
1321.8 Quality-Aware Data Filtering
Not all data is equal. Level 3 edge gateways can apply quality scoring to make filtering decisions.
1321.8.1 Quality Score Components
| Factor | Weight | Scoring Method |
|---|---|---|
| Battery Voltage | 33% | voltage / max_voltage |
| Signal Strength | 33% | (dBm + 90) / 30 |
| Data Freshness | 34% | 1 - (age / max_age) |
1321.8.2 Example Calculation
Reading parameters:
- Battery: 3.0V (rated 2.0-3.3V)
- Signal: -75 dBm (range -90 to -60)
- Age: 1800 seconds (decay over 3600 seconds)
Scores:
- Battery: 3.0 / 3.3 = 0.909
- Signal: (-75 + 90) / 30 = 0.500
- Freshness: 1 - (1800 / 3600) = 0.500
Overall Quality: (0.909 + 0.500 + 0.500) / 3 = 0.636, approximately 0.64 (applying the 33/33/34% weights gives 0.635, effectively the same)
1321.8.3 Quality-Based Filtering Actions
| Score Range | Quality | Action |
|---|---|---|
| 0.0 - 0.4 | Poor | Filter out or deprioritize |
| 0.4 - 0.7 | Acceptable | Process normally |
| 0.7 - 0.9 | Good | Priority processing |
| 0.9 - 1.0 | Excellent | Critical data, immediate action |
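The scoring and the action table can be combined in a short sketch. The function names, default ranges, and clamping to [0, 1] are illustrative assumptions rather than a defined API. The table's ranges share their endpoints, so this version assigns each boundary value to the higher band, which is also an assumption.

```python
def clamp(value, low=0.0, high=1.0):
    """Limit a component score to the [low, high] interval."""
    return max(low, min(high, value))


def quality_score(voltage, dbm, age_s, max_voltage=3.3,
                  dbm_min=-90.0, dbm_max=-60.0, max_age_s=3600.0,
                  weights=(0.33, 0.33, 0.34)):
    """Weighted quality score from battery, signal, and freshness."""
    battery = clamp(voltage / max_voltage)
    signal = clamp((dbm - dbm_min) / (dbm_max - dbm_min))
    freshness = clamp(1.0 - age_s / max_age_s)
    w_battery, w_signal, w_freshness = weights
    return w_battery * battery + w_signal * signal + w_freshness * freshness


def classify(score):
    """Map a score to the quality bands in the table above."""
    if score < 0.4:
        return "Poor"
    if score < 0.7:
        return "Acceptable"
    if score < 0.9:
        return "Good"
    return "Excellent"


score = quality_score(voltage=3.0, dbm=-75, age_s=1800)
print(f"score={score:.3f} quality={classify(score)}")
```

For the example reading the weighted score is about 0.635, which falls in the Acceptable band. A plain average of the three components gives 0.636, so the choice between weighting and averaging does not change the action here.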
Question: A data quality framework at Level 3 edge gateway computes quality scores based on battery voltage (33% weight), signal strength (33% weight), and data freshness (34% weight). A reading has: battery 3.0V (rated 2.0-3.3V), signal -75 dBm (range -90 to -60), age 1800 seconds (decay over 3600 seconds). What is the quality score?
Explanation: This quality scoring demonstrates Level 3 Edge Processing data assessment.
Component Score Calculations:
- Battery Score (33% weight):
  - Battery voltage: 3.0V
  - Formula: min(1.0, voltage / 3.3)
  - Score: 3.0 / 3.3 = 0.909
- Signal Strength Score (33% weight):
  - Signal: -75 dBm
  - Range: -90 dBm (weak) to -60 dBm (strong)
  - Formula: (dBm + 90) / 30
  - Score: (-75 + 90) / 30 = 0.500
- Freshness Score (34% weight):
  - Data age: 1800 seconds (30 minutes)
  - Decay period: 3600 seconds (60 minutes)
  - Formula: max(0.0, 1 - (age / decay))
  - Score: 1 - (1800 / 3600) = 0.500
Overall Quality Score:
quality_score = (0.909 + 0.500 + 0.500) / 3 = 0.636, approximately 0.64
Action: Process and include in aggregation (acceptable quality range 0.4-0.7)
1321.9 Chapter Summary
Compound reduction: the ratios of successive techniques (downsampling, aggregation, filtering) multiply, so modest per-stage ratios compound into reductions of two to four orders of magnitude in scenarios like those above.
Cost savings from edge processing can exceed $25,000/year for a single factory deployment with 500 high-frequency sensors.
Bundling transmissions at gateways reduces transmission count by 60x (minute-to-hourly), extending battery life proportionally and reducing network congestion.
Quality scoring enables intelligent filtering where poor-quality data is deprioritized while maintaining visibility into sensor health issues.
The formula Total Reduction = Downsample Ratio x Aggregation Ratio / (1 - Filter Fraction) guides data pipeline design.
1321.10 What’s Next
Continue to Edge Review: Gateway and Security to learn about protocol translation, the Non-IP Things problem, and fail-closed security models for industrial IoT.
Related chapters in this review series: