1321  Edge Review: Data Reduction Calculations

1321.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Calculate Data Reduction Ratios: Compute bandwidth savings from downsampling, aggregation, and filtering
  • Apply Multi-Stage Processing: Design data pipelines that combine multiple reduction techniques
  • Evaluate Cost Savings: Quantify cloud ingress savings from edge processing
  • Solve Real-World Scenarios: Apply mathematical models to factory, agriculture, and smart city deployments

1321.2 Prerequisites

Before studying this chapter, complete:

Imagine you’re writing a summary of a book:

  • Raw data: The entire 500-page book (huge)
  • Downsampling: Reading every 10th page (10x smaller)
  • Aggregation: Combining chapter summaries into one paragraph each (another 10x smaller)
  • Filtering: Removing chapters that don’t matter to your summary (removes noise)

Edge computing does the same thing with sensor data - it compresses terabytes into megabytes before sending to the cloud.

1321.3 Data Reduction Fundamentals

1321.3.1 Reduction Techniques at Level 3

Level 3 edge computing applies four key operations:

Operation      Description                    Typical Reduction
Evaluation     Filter out bad/invalid data    10-30%
Formatting     Standardize data structures    0% (no size change)
Distillation   Downsample frequency           10-100x
Aggregation    Combine multiple streams       10-100x

1321.3.2 Compound Reduction Formula

When multiple techniques are applied sequentially:

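Total Reduction = Downsample Ratio x Aggregation Ratio / (1 - Filter Fraction)

The downsampling and aggregation ratios multiply. Filtering removes a further fraction of the data, so the product is divided by the fraction that remains, which increases the total reduction.

Example: 100x downsampling x 100x aggregation / 0.8 (20% filtered) = 12,500x reduction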
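A compound figure can be sanity-checked by tracking the fraction of raw data that survives each stage and inverting it. The sketch below does this in Python; the function name and argument names are illustrative, not part of any library.

```python
def compound_reduction(downsample_ratio: float,
                       aggregation_ratio: float,
                       filter_fraction: float = 0.0) -> float:
    """Return the overall reduction factor (raw volume / transmitted volume)."""
    if downsample_ratio < 1 or aggregation_ratio < 1:
        raise ValueError("ratios must be >= 1")
    if not 0.0 <= filter_fraction < 1.0:
        raise ValueError("filter_fraction must be in [0, 1)")
    # Fraction of the raw volume left after every stage has been applied.
    surviving = ((1.0 / downsample_ratio)
                 * (1.0 / aggregation_ratio)
                 * (1.0 - filter_fraction))
    return 1.0 / surviving


# 100x downsampling, 100x aggregation, 20% of readings filtered out
print(f"{compound_reduction(100, 100, 0.20):,.0f}x")  # prints 12,500x
```

Because the stages are expressed as surviving fractions, their order does not change the result, although in practice filtering first reduces the work done by later stages.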

1321.4 Industrial Vibration Monitoring Example

1321.4.1 Scenario Parameters

Parameter                Value
Number of sensors        500
Sampling frequency       1 kHz (1000 Hz)
Reading size             16 bytes
Downsampling target      10 Hz
Aggregation group size   100 sensors
Summary record size      200 bytes

1321.4.2 Step-by-Step Calculation

Step 1: Raw Data Rate (Level 1)

Raw rate = sensors x frequency x bytes
         = 500 x 1000 x 16
         = 8,000,000 bytes/second
         = 8 MB/s

Per hour = 8 MB/s x 3600 s = 28,800 MB = 28.8 GB/hour

Step 2: After Downsampling (Level 3)

Downsample ratio = 1000 Hz / 10 Hz = 100x

Downsampled rate = 500 x 10 x 16 = 80,000 bytes/s = 80 KB/s

Per hour = 80 KB/s x 3600 = 288,000 KB = 288 MB/hour

Step 3: After Aggregation (Level 3)

Number of groups = 500 sensors / 100 per group = 5 groups

Aggregated rate = 5 groups x 10 Hz x 200 bytes
                = 10,000 bytes/s = 10 KB/s

Per hour = 10 KB/s x 3600 = 36,000 KB = 36 MB/hour

Total Reduction: 28,800 MB / 36 MB = 800x (100x from downsampling x 8x from aggregation)

1321.4.3 Cost Savings Analysis

Metric              Without Edge   With Edge   Savings
Hourly data         28.8 GB        36 MB       800x
Daily data          691 GB         864 MB      800x
Monthly data        20.7 TB        25.9 GB     800x
Annual cloud cost   $25,229        $31.54      $25,197/year

Assuming $0.10/GB cloud ingress, a 30-day month, and a 365-day year
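The stage-by-stage arithmetic for this scenario can be reproduced with a few lines of Python. This is a minimal sketch using the scenario parameters and decimal units (1 GB = 10^9 bytes); the helper names `stream_rate` and `annual_ingress_cost` are illustrative assumptions.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365 * 24 * 3600


def stream_rate(sources: int, hz: float, record_bytes: int) -> float:
    """Data rate in bytes/second for `sources` streams of fixed-size records."""
    return sources * hz * record_bytes


def annual_ingress_cost(rate_bytes_per_s: float, usd_per_gb: float = 0.10) -> float:
    """Yearly ingress cost in USD for a constant data rate."""
    gb_per_year = rate_bytes_per_s * SECONDS_PER_YEAR / 1e9
    return gb_per_year * usd_per_gb


raw = stream_rate(500, 1000, 16)               # Level 1: every sensor at 1 kHz
downsampled = stream_rate(500, 10, 16)         # Level 3: every sensor at 10 Hz
aggregated = stream_rate(500 // 100, 10, 200)  # Level 3: 5 group summaries at 10 Hz

for name, rate in [("raw", raw), ("downsampled", downsampled), ("aggregated", aggregated)]:
    mb_per_hour = rate * SECONDS_PER_HOUR / 1e6
    print(f"{name:12s} {rate:>12,.0f} B/s {mb_per_hour:>10,.0f} MB/hour")

print(f"total reduction: {raw / aggregated:,.0f}x")
print(f"annual cost without edge: ${annual_ingress_cost(raw):,.2f}")
print(f"annual cost with edge:    ${annual_ingress_cost(aggregated):,.2f}")
```

Working from rates rather than pre-rounded daily totals avoids compounding rounding error in the annual cost.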

1321.5 Knowledge Check: Data Reduction Calculations

Question: A smart agriculture system has 50 sensor stations, each with temperature (5 bytes), soil moisture (8 bytes), and metadata (20 bytes). Current design: each sensor transmits individually every minute. Proposed: bundle at gateway and transmit once per hour. Assuming LoRa transmission costs 1 mAh per 10 KB transmitted, what is the monthly power savings for the transmission subsystem?

Explanation: This bundling strategy demonstrates Level 3 Edge Computing data aggregation.

Current Design - Individual Transmissions:

  • 50 stations x 60 minutes/hour x 24 hours x 30 days = 2,160,000 transmissions/month
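  • Power per transmission: approximately 0.00367 mAh, treated as a fixed cost per transmission
  • Total: 2,160,000 x 0.00367 = approximately 7,920 mAh/month

Proposed Design - Hourly Bundling:

  • 50 stations x 24 hours x 30 days = 36,000 transmissions/month
  • Power: 36,000 x 0.00367 = 132 mAh/month, assuming a bundle costs about the same as a single transmission

Reduction: (7,920 - 132) / 7,920 = 98.3%

The 98.3% figure holds when the hourly bundle is a compact summary or when per-transmission overhead dominates the energy cost. Under a strictly per-byte model, a bundle carrying all 60 raw readings would be larger than a single reading and the saving would be smaller.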

Bundling benefits:

  1. Reduced transmission count: 60 transmissions/hour to 1 transmission/hour (60x reduction)
  2. Lower protocol overhead: One header for bundle vs. individual headers
  3. Power savings: Radio transmit is 120 mA vs. 0.01 mA sleep (12,000x difference)
  4. Network efficiency: Fewer packets = less network congestion

Battery life impact: 7,920 / 132 = 60x less transmission energy, so the battery budget of the transmission subsystem lasts up to 60x longer with bundling
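The comparison above can be sketched in Python as follows. The per-transmission figure of 0.00367 mAh is taken from the worked answer, and the code assumes, as that answer does, that a bundled transmission costs the same as an individual one. The function names are illustrative.

```python
STATIONS = 50
MAH_PER_TX = 0.00367  # per-transmission energy used in the worked answer


def monthly_transmissions(stations: int, tx_per_hour: int, days: int = 30) -> int:
    """Number of transmissions per month across all stations."""
    return stations * tx_per_hour * 24 * days


def monthly_energy_mah(transmissions: int, mah_per_tx: float = MAH_PER_TX) -> float:
    """Transmission energy per month, assuming a fixed cost per transmission."""
    return transmissions * mah_per_tx


current_tx = monthly_transmissions(STATIONS, tx_per_hour=60)  # one per minute
bundled_tx = monthly_transmissions(STATIONS, tx_per_hour=1)   # one per hour

current_mah = monthly_energy_mah(current_tx)
bundled_mah = monthly_energy_mah(bundled_tx)
saving = (current_mah - bundled_mah) / current_mah

print(f"current: {current_tx:,} tx, {current_mah:,.0f} mAh/month")
print(f"bundled: {bundled_tx:,} tx, {bundled_mah:,.0f} mAh/month")
print(f"saving:  {saving:.1%}")
```

With a fixed per-transmission cost, the saving depends only on the ratio of transmission counts (1 - 1/60), not on the exact mAh figure.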

1321.6 Sensor Aggregation Architecture

The following diagram shows how agricultural sensors feed into edge gateways for bundling:

%% fig-alt: "Sensor data aggregation architecture showing multiple sensor stations (measuring temperature, humidity, soil moisture, light, and wind) feeding data into an edge gateway that combines readings by geographic location and bundles them hourly with metadata before transmitting to cloud storage. The architecture demonstrates a 60x reduction in transmissions by aggregating 60 individual sensor readings per hour into a single bundled transmission."
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#ecf0f1'}}}%%
flowchart LR
    subgraph Sensors["Sensor Stations (100s)"]
        S1[Station 1<br/>Temp, Humidity]
        S2[Station 2<br/>Soil Moisture]
        S3[Station N<br/>Light, Wind]
    end

    subgraph Gateway["Edge Gateway"]
        Combine[Combine by<br/>Geographic<br/>Location]
        Bundle[Bundle Hourly<br/>with Metadata]
    end

    subgraph Cloud["Cloud"]
        Store[Aggregated<br/>Results<br/>1 transmission/hr]
    end

    S1 & S2 & S3 --> Combine
    Combine --> Bundle
    Bundle -->|60x reduction| Store

    style S1 fill:#7F8C8D,stroke:#2C3E50,color:#fff
    style S2 fill:#7F8C8D,stroke:#2C3E50,color:#fff
    style S3 fill:#7F8C8D,stroke:#2C3E50,color:#fff
    style Combine fill:#2C3E50,stroke:#16A085,color:#fff
    style Bundle fill:#16A085,stroke:#2C3E50,color:#fff
    style Store fill:#27AE60,stroke:#2C3E50,color:#fff

Figure 1321.1: Sensor data aggregation architecture showing multiple sensor stations feeding into a gateway that combines and aggregates data before bundling hourly transmissions to the cloud.

1321.7 Fog Node Downsampling Example

Another common scenario from the reference material:

Scenario: 100 fog nodes with 5 sensors each, downsampling from 10 readings/second to 1 reading/minute

Stage         Calculation                                        Data Volume
Raw           100 nodes x 5 sensors x 10 Hz x 100 bytes          500 KB/s = 43.2 GB/day
Downsampled   100 nodes x 5 sensors x 1/60 Hz x 100 bytes        833 B/s = 72 MB/day

Reduction: 43,200 MB / 72 MB = 600x
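As an illustration, the sketch below downsamples a stream by replacing each window of readings with its mean and then reproduces the daily volume figures. Averaging is only one possible strategy and is an assumption here (the scenario does not say how the one-per-minute reading is chosen); the function names are illustrative.

```python
def downsample_mean(samples: list[float], factor: int) -> list[float]:
    """Replace each full window of `factor` samples with its mean.

    A trailing partial window is dropped (an assumption made for simplicity).
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    full = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor for i in range(0, full, factor)]


def daily_volume_bytes(nodes: int, sensors_per_node: int,
                       hz: float, reading_bytes: int) -> float:
    """Bytes produced per day at a constant reading rate."""
    return nodes * sensors_per_node * hz * reading_bytes * 86_400


# 10 Hz -> 1 reading/minute means 600 raw readings per output reading.
two_minutes = [float(i) for i in range(1200)]
print(downsample_mean(two_minutes, 600))  # [299.5, 899.5]

raw_day = daily_volume_bytes(100, 5, 10, 100)
down_day = daily_volume_bytes(100, 5, 1 / 60, 100)
print(f"raw: {raw_day / 1e9:.1f} GB/day, downsampled: {down_day / 1e6:.0f} MB/day")
print(f"reduction: {raw_day / down_day:.0f}x")
```

Keeping a mean preserves the trend but hides short spikes; a deployment that cares about peaks might also keep the window minimum and maximum.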

1321.8 Quality-Aware Data Filtering

Not all data is equal. Level 3 edge gateways can apply quality scoring to make filtering decisions.

1321.8.1 Quality Score Components

Factor            Weight   Scoring Method
Battery Voltage   33%      voltage / max_voltage
Signal Strength   33%      (dBm + 90) / 30
Data Freshness    34%      1 - (age / max_age)

1321.8.2 Example Calculation

Reading parameters:

  • Battery: 3.0V (rated 2.0-3.3V)
  • Signal: -75 dBm (range -90 to -60)
  • Age: 1800 seconds (decay over 3600 seconds)

Scores:

  • Battery: 3.0 / 3.3 = 0.909
  • Signal: (-75 + 90) / 30 = 0.500
  • Freshness: 1 - (1800 / 3600) = 0.500

Overall Quality: (0.909 + 0.500 + 0.500) / 3 = 0.636, approximately 0.64 (applying the 33/33/34% weights exactly gives 0.635)

1321.8.3 Quality-Based Filtering Actions

Score Range   Quality      Action
0.0 - 0.4     Poor         Filter out or deprioritize
0.4 - 0.7     Acceptable   Process normally
0.7 - 0.9     Good         Priority processing
0.9 - 1.0     Excellent    Critical data, immediate action
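The table can be turned into a small lookup function. The ranges as written share their boundary values (0.4, 0.7, 0.9), so this sketch assumes each boundary belongs to the higher band; that choice, and the function name, are assumptions rather than part of the framework described here.

```python
def quality_action(score: float) -> tuple[str, str]:
    """Map a quality score in [0, 1] to a (quality label, action) pair.

    Assumption: a boundary value belongs to the higher band, e.g. 0.7 is "Good".
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if score < 0.4:
        return ("Poor", "Filter out or deprioritize")
    if score < 0.7:
        return ("Acceptable", "Process normally")
    if score < 0.9:
        return ("Good", "Priority processing")
    return ("Excellent", "Critical data, immediate action")


print(quality_action(0.64))  # ('Acceptable', 'Process normally')
```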

Question: A data quality framework at Level 3 edge gateway computes quality scores based on battery voltage (33% weight), signal strength (33% weight), and data freshness (34% weight). A reading has: battery 3.0V (rated 2.0-3.3V), signal -75 dBm (range -90 to -60), age 1800 seconds (decay over 3600 seconds). What is the quality score?

Explanation: This quality scoring demonstrates Level 3 Edge Processing data assessment.

Component Score Calculations:

  1. Battery Score (33% weight):
    • Battery voltage: 3.0V
    • Formula: min(1.0, voltage / 3.3)
    • Score: 3.0 / 3.3 = 0.909
  2. Signal Strength Score (33% weight):
    • Signal: -75 dBm
    • Range: -90 dBm (weak) to -60 dBm (strong)
    • Formula: (dBm + 90) / 30
    • Score: (-75 + 90) / 30 = 0.500
  3. Freshness Score (34% weight):
    • Data age: 1800 seconds (30 minutes)
    • Decay period: 3600 seconds (60 minutes)
    • Formula: max(0.0, 1 - (age / decay))
    • Score: 1 - (1800 / 3600) = 0.500

Overall Quality Score:

quality_score = 0.33 x 0.909 + 0.33 x 0.500 + 0.34 x 0.500 = 0.635, approximately 0.64

Action: Process and include in aggregation (acceptable quality range 0.4-0.7)
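The component formulas above combine into a single weighted score. The following is a minimal Python sketch; the function and parameter names are illustrative, and clamping the signal score to the 0-1 range is an added assumption (the formulas above clamp only the battery and freshness scores).

```python
def quality_score(voltage: float, dbm: float, age_s: float,
                  max_voltage: float = 3.3, decay_s: float = 3600.0,
                  weights: tuple[float, float, float] = (0.33, 0.33, 0.34)) -> float:
    """Weighted quality score from battery, signal strength, and freshness."""
    battery = min(1.0, voltage / max_voltage)
    # Assumption: readings outside -90..-60 dBm are clamped to 0 or 1.
    signal = max(0.0, min(1.0, (dbm + 90.0) / 30.0))
    freshness = max(0.0, 1.0 - age_s / decay_s)
    w_battery, w_signal, w_freshness = weights
    return w_battery * battery + w_signal * signal + w_freshness * freshness


score = quality_score(voltage=3.0, dbm=-75, age_s=1800)
print(f"{score:.3f}")  # 0.635
```

Because the weights are nearly equal, the weighted result (0.635) differs only slightly from the simple average of the three component scores (0.636).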

1321.9 Chapter Summary

  • Compound reduction from multiple techniques (downsampling, aggregation, filtering) multiplies together, enabling data reduction of two to four orders of magnitude in industrial scenarios.

  • Cost savings from edge processing can exceed $25,000/year for a single factory deployment with 500 high-frequency sensors.

  • Bundling transmissions at gateways reduces transmission count by 60x (minute-to-hourly), cutting transmission energy by up to the same factor and reducing network congestion.

  • Quality scoring enables intelligent filtering where poor-quality data is deprioritized while maintaining visibility into sensor health issues.

  • The formula Total Reduction = Downsample Ratio x Aggregation Ratio / (1 - Filter Fraction) guides data pipeline design: the ratios multiply, and filtering divides by the fraction of data that remains.

1321.10 What’s Next

Continue to Edge Review: Gateway and Security to learn about protocol translation, the Non-IP Things problem, and fail-closed security models for industrial IoT.

Related chapters in this review series: