65  Edge Quiz: Data Calculations

Quiz mastery targets are easiest to plan with threshold math:

\[ C_{\text{target}} = \left\lceil 0.8 \times N_{\text{questions}} \right\rceil \]

Worked example: For a 15-question quiz, target correct answers are \(\lceil 0.8 \times 15 \rceil = 12\). If a learner moves from 8/15 to 12/15, score rises from 53.3% to 80%, crossing mastery with four additional correct answers.
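The threshold arithmetic above can be sketched in a few lines. The function name `mastery_target` and the integer-percent parameter are illustrative choices, not part of the course material; integer ceiling division is used so floating-point rounding cannot shift the result.

```python
def mastery_target(n_questions: int, threshold_pct: int = 80) -> int:
    """Smallest number of correct answers that reaches the mastery threshold."""
    # -(-a // b) is integer ceiling division: ceil(threshold_pct * n / 100)
    return -(-threshold_pct * n_questions // 100)


target = mastery_target(15)
print(target)                              # 12
print(f"{8 / 15:.1%} -> {target / 15:.1%}")  # 53.3% -> 80.0%
```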

In 60 Seconds

Edge data calculations demonstrate that downsampling and aggregation at gateways can achieve 100x to 14,400x data volume reductions. FIFO buffer management gracefully handles overflow by prioritizing recent data, while gateway-level bundling provides 98%+ power savings by reducing transmission counts from thousands to dozens per day.

65.1 Learning Objectives

This chapter tests your understanding through questions and exercises. Think of it as a practice session that helps you identify which topics you know well and which ones need more review. Working through these problems builds the confidence you need for real-world IoT data challenges.

~30 min | Intermediate | P10.C09.U02

Key Concepts

  • Data volume calculation: Computing the total bytes generated by an IoT deployment: devices × sensors_per_device × sampling_rate_Hz × bytes_per_reading × seconds_per_day.
  • Compression ratio: The factor by which an algorithm reduces data size; a compression ratio of 10:1 means 100 bytes of raw data is compressed to 10 bytes.
  • Bandwidth requirement: The minimum network throughput needed to transmit IoT data: calculated as data_volume_per_second × (1 + overhead_factor) for protocol headers and retransmissions.
  • Battery life estimation: Computing device operational duration: battery_capacity_mAh / average_current_draw_mA, where average current is weighted across active, idle, and sleep modes by their duty cycles.
  • Edge reduction factor: The ratio of data volume before edge processing to data volume after: original_readings / transmitted_readings, expressing how much the edge tier reduces the cloud data ingestion load.
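As a rough sketch, the four formulas in the list above translate directly into small helper functions. All function names and the example deployment numbers are hypothetical, chosen only to make the arithmetic concrete. The bandwidth helper additionally multiplies by 8 to express the result in bits per second, which is an assumption about the desired unit.

```python
SECONDS_PER_DAY = 86_400


def daily_volume_bytes(devices, sensors_per_device, rate_hz, bytes_per_reading):
    """Total bytes generated per day by the whole deployment."""
    return devices * sensors_per_device * rate_hz * bytes_per_reading * SECONDS_PER_DAY


def bandwidth_bps(bytes_per_second, overhead_factor):
    """Required throughput in bits per second, including protocol overhead."""
    return bytes_per_second * 8 * (1 + overhead_factor)


def battery_life_hours(capacity_mah, modes):
    """modes: (current_mA, duty_fraction) pairs whose fractions sum to 1."""
    average_ma = sum(current * duty for current, duty in modes)
    return capacity_mah / average_ma


def edge_reduction_factor(original_readings, transmitted_readings):
    """How many raw readings are represented by each transmitted reading."""
    return original_readings / transmitted_readings


# Hypothetical deployment: 100 devices, 2 sensors each, 1 Hz, 4-byte readings.
volume = daily_volume_bytes(100, 2, 1, 4)
print(volume)  # 69120000 bytes per day

# Assumed 30% overhead on top of the 800 bytes/s raw rate.
print(f"{bandwidth_bps(volume / SECONDS_PER_DAY, 0.3):.0f} bps")

# Assumed duty cycle: 1% active at 100 mA, 4% idle at 5 mA, 95% sleep at 0.01 mA.
hours = battery_life_hours(2400, [(100, 0.01), (5, 0.04), (0.01, 0.95)])
print(f"{hours / 24:.1f} days")

print(edge_reduction_factor(1000, 10))  # 100.0
```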

By the end of this chapter, you will be able to:

  • Calculate Data Reduction: Compute downsampling and aggregation effects on data volumes
  • Analyze Buffer Management: Distinguish FIFO queue behavior and overflow handling strategies
  • Evaluate Bundling Benefits: Quantify power and bandwidth savings from data bundling
  • Design Gateway Architecture: Select appropriate solutions for non-IP device integration

65.2 Quiz: Data Reduction and Transmission

A smart agriculture deployment uses LoRaWAN (maximum 51 bytes per uplink on SF12). Each soil sensor transmits temperature (2 bytes), moisture (2 bytes), battery voltage (2 bytes), sensor ID (8 bytes), timestamp (4 bytes), and metadata (6 bytes) = 24 bytes per reading. With 3 sensors per message, the payload exceeds LoRaWAN limits. Here is the optimization using base values.

Current payload (per sensor):

Sensor 1: ID="SENSOR001" (8B) + timestamp=1736424000 (4B) + temp=22.5 (2B)
           + moisture=45 (2B) + voltage=3.3 (2B) + meta (6B) = 24 bytes
Sensor 2: ID="SENSOR002" (8B) + timestamp=1736424000 (4B) + temp=23.1 (2B)
           + moisture=43 (2B) + voltage=3.2 (2B) + meta (6B) = 24 bytes
Sensor 3: ID="SENSOR003" (8B) + timestamp=1736424000 (4B) + temp=22.8 (2B)
           + moisture=44 (2B) + voltage=3.3 (2B) + meta (6B) = 24 bytes
Total: 3 x 24 = 72 bytes (EXCEEDS 51-byte LoRaWAN limit)

Optimized with base values:

Base values (sent once):
  Base ID prefix="SENSOR00" (8B) + base timestamp=1736424000 (4B)
  + metadata (6B) = 18 bytes

Sensor 1: ID suffix=1 (1B) + temp=225 (1B, in 0.1C units)
           + moisture=45 (1B) + voltage=33 (1B, in 0.1V units) = 4 bytes
Sensor 2: ID suffix=2 (1B) + temp=231 (1B)
           + moisture=43 (1B) + voltage=32 (1B) = 4 bytes
Sensor 3: ID suffix=3 (1B) + temp=228 (1B)
           + moisture=44 (1B) + voltage=33 (1B) = 4 bytes

Total: 18 + (3 x 4) = 30 bytes (FITS in 51-byte limit)
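One way to check the 30-byte total is to pack the payload with Python's standard `struct` module. This is a minimal sketch: the field order, big-endian layout, and the all-zero metadata bytes are assumptions made for illustration, not a defined wire format.

```python
import struct

BASE_FMT = ">8sI6s"   # 8-byte ID prefix, 4-byte timestamp, 6-byte metadata = 18 bytes
SENSOR_FMT = ">BBBB"  # ID suffix, temp (0.1 C), moisture (%), voltage (0.1 V) = 4 bytes


def encode_payload(prefix, timestamp, metadata, readings):
    """Pack shared base values once, then 4 bytes per sensor."""
    payload = struct.pack(BASE_FMT, prefix, timestamp, metadata)
    for suffix, temp_c, moisture_pct, volts in readings:
        payload += struct.pack(
            SENSOR_FMT,
            suffix,
            round(temp_c * 10),   # quantize to 0.1 C units
            moisture_pct,
            round(volts * 10),    # quantize to 0.1 V units
        )
    return payload


readings = [(1, 22.5, 45, 3.3), (2, 23.1, 43, 3.2), (3, 22.8, 44, 3.3)]
payload = encode_payload(b"SENSOR00", 1736424000, bytes(6), readings)
print(len(payload))  # 30
```

Note that a single unsigned byte in 0.1 C units only covers 0.0 to 25.5 C, so a real deployment would likely need an offset or a wider field for temperature.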

Further optimization using delta encoding:

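Store the first sensor’s full values; subsequent sensors encode deltas from the first. In this example every field already fits in one byte, so the payload stays at 30 bytes; delta encoding pays off when full values need 2 bytes but the deltas still fit in a signed int8: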

  • Sensor 1: ID=1 (1B) + temp=225, moisture=45, voltage=33 (3 bytes) = 4 bytes
  • Sensor 2: ID=2 (1B) + temp_delta=+6, moisture_delta=-2, voltage_delta=-1 (3 bytes as signed int8) = 4 bytes
  • Sensor 3: ID=3 (1B) + temp_delta=+3, moisture_delta=-1, voltage_delta=0 (3 bytes as signed int8) = 4 bytes
  • Total: 18 base + 3 x 4 sensor data = 30 bytes (58% reduction from 72 bytes)

This compression fits all 3 sensors into one LoRaWAN message instead of two, halving the transmission count (a 50% reduction) and cutting radio energy roughly in proportion.
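The delta scheme described above can be sketched as an encode/decode pair. The function names and the 4-bytes-per-sensor layout are illustrative assumptions; the values are the already-quantized integers from the example.

```python
import struct


def delta_encode(readings):
    """readings: (id, temp, moisture, voltage) tuples of quantized integers."""
    first = readings[0]
    blob = struct.pack(">BBBB", *first)  # first sensor: full unsigned values
    for sensor_id, *values in readings[1:]:
        deltas = [value - ref for value, ref in zip(values, first[1:])]
        # 'b' is a signed int8; struct.error is raised if a delta is outside -128..127
        blob += struct.pack(">Bbbb", sensor_id, *deltas)
    return blob


def delta_decode(blob):
    """Rebuild the original readings from the first sensor plus deltas."""
    first = struct.unpack_from(">BBBB", blob, 0)
    readings = [first]
    for offset in range(4, len(blob), 4):
        sensor_id, *deltas = struct.unpack_from(">Bbbb", blob, offset)
        readings.append((sensor_id, *(ref + d for ref, d in zip(first[1:], deltas))))
    return readings


quantized = [(1, 225, 45, 33), (2, 231, 43, 32), (3, 228, 44, 33)]
encoded = delta_encode(quantized)
print(len(encoded))                        # 12
print(delta_decode(encoded) == quantized)  # True
```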

| Strategy | Best For | Typical Size Reduction | CPU Cost | Example Use Case |
|---|---|---|---|---|
| Base Values | Repetitive metadata (IDs, timestamps, units) | 30–70% | Very low (simple subtraction) | SenML for LoRaWAN sensor payloads |
| Delta Encoding | Time-series with gradual changes | 50–80% | Low (diff calculation) | Temperature sensors changing slowly |
| Run-Length Encoding | Binary sensors with long stable periods | 10–90% (highly variable) | Very low | Door sensors (open/closed), motion (long idle periods) |
| Quantization | Float values with acceptable precision loss | 50–75% | Low (rounding/scaling) | Temperature (0.1 C precision sufficient vs float32) |
| Huffman/LZ | General text/structured data | 40–60% | High (encoding/decoding) | JSON payloads with repeated keys |
| CBOR | JSON alternative with binary encoding | 30–50% vs JSON | Medium | MQTT payloads, CoAP with SenML |

Decision criteria:

  • Network constraint under 100 bytes: Use base values + quantization (e.g., LoRaWAN, Sigfox)
  • Battery critical: Prefer low CPU-cost methods (base values, delta encoding) over Huffman
  • High-frequency data (above 1 Hz): Delta encoding works well for gradual changes (temperature, pressure)
  • Binary state data: Run-length encoding for motion sensors, door switches
  • Standard interoperability needed: CBOR with SenML (RFC 8428) for cross-vendor compatibility
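
To illustrate why run-length encoding suits binary state data, here is a minimal sketch using `itertools.groupby` from the standard library. The door-sensor trace is invented for the example, and the (state, count) pair representation is one possible encoding among many.

```python
from itertools import groupby


def rle_encode(states):
    """Collapse consecutive identical states into (state, run_length) pairs."""
    return [(state, len(list(run))) for state, run in groupby(states)]


def rle_decode(pairs):
    """Expand (state, run_length) pairs back into the original sequence."""
    return [state for state, count in pairs for _ in range(count)]


# Hypothetical door sensor: closed for 50 samples, open for 3, closed for 47.
door = [0] * 50 + [1] * 3 + [0] * 47
encoded_door = rle_encode(door)
print(encoded_door)  # [(0, 50), (1, 3), (0, 47)]
```

Here 100 samples collapse to 3 pairs; a sensor that toggles on every sample would gain nothing, which is why the achievable reduction is so variable.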

Common Mistake: Downsampling Without Considering Nyquist Frequency

Engineers often downsample sensor data to reduce bandwidth without analyzing signal characteristics. A vibration sensor sampled at 1,000 Hz gets downsampled to 10 Hz to save bandwidth, inadvertently destroying the 50 Hz motor imbalance frequency that indicates bearing wear.

What goes wrong: A factory monitors motor vibration with accelerometers sampling at 1 kHz (1,000 samples/second). To reduce LoRaWAN transmission costs, the edge gateway downsamples to 10 Hz by keeping every 100th sample. A failing bearing generates a 50 Hz vibration spike, but at 10 Hz sampling nothing above 5 Hz can be represented. The 50 Hz component aliases down to a spurious low-frequency value (an exact 50 Hz tone lands on 0 Hz and looks like a constant offset), making the failure undetectable.

Why it fails: The Nyquist theorem requires a sampling rate of more than twice the highest frequency of interest. For 50 Hz motor vibration, that means sampling above 100 Hz. At 10 Hz sampling, the 50 Hz signal aliases to a lower frequency and becomes unrecognizable.
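
The frequency at which an undersampled tone reappears can be computed by folding it into the range from 0 to half the sampling rate. The helper below is a small sketch of that folding rule for an ideal sinusoid; `alias_frequency` is an illustrative name.

```python
def alias_frequency(f_signal_hz, f_sample_hz):
    """Apparent frequency of a pure tone after sampling at f_sample_hz."""
    folded = f_signal_hz % f_sample_hz           # position within one sampling band
    return min(folded, f_sample_hz - folded)     # reflect into 0 .. f_sample/2


print(alias_frequency(50, 10))   # 0  -> the 50 Hz tone shows up as a constant offset
print(alias_frequency(57, 10))   # 3  -> a 57 Hz tone masquerades as 3 Hz
print(alias_frequency(50, 200))  # 50 -> sampled fast enough, frequency preserved
```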

The correct approach:

  1. Identify frequencies of interest BEFORE downsampling:

    • Motor speed: 30 Hz (1,800 RPM)
    • Bearing failure signature: 50–100 Hz harmonics
    • Required sample rate: 2 x 100 Hz = 200 Hz minimum
  2. Use frequency-domain aggregation instead of time-domain downsampling:

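    import numpy as np

    fs = 1000  # sampling rate in Hz; raw_signal is a 1-D array of real samples

    # BAD: naive decimation (1000 Hz -> 10 Hz) aliases everything above 5 Hz
    downsampled = raw_signal[::100]  # keep every 100th sample, no filtering

    # GOOD: FFT-based aggregation keeps the strongest spectral components
    spectrum = np.fft.rfft(raw_signal)                  # one-sided FFT for real input
    freqs = np.fft.rfftfreq(len(raw_signal), d=1 / fs)  # bin center frequencies in Hz
    top = np.argsort(np.abs(spectrum))[-5:]             # indices of the 5 largest bins
    dominant_freqs, magnitudes = freqs[top], np.abs(spectrum[top])
    # Transmit: 5 frequencies + 5 magnitudes (10 values vs 1000 raw samples)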
  3. Apply anti-aliasing filter before downsampling:

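    • Before reducing 1,000 Hz to 100 Hz, apply a low-pass filter with a cutoff at or just below 50 Hz (the Nyquist frequency of the new rate)
    • This removes frequencies that would alias, preserving signal integrity
    • Then downsample by 10x safely
    • A 100 Hz output only retains content below 50 Hz; capturing the 50–100 Hz bearing harmonics from step 1 requires an output rate of at least 200 Hz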
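
The filter-then-decimate step can be sketched with only the standard library. This is an illustrative windowed-sinc FIR design, not a production filter: the 45 Hz cutoff (slightly below the 50 Hz Nyquist limit), the 101-tap length, and the Hamming window are all assumptions, and a signal-processing library would normally be used instead.

```python
import math


def lowpass_taps(cutoff_hz, fs_hz, num_taps=101):
    """Hamming-windowed sinc FIR low-pass filter, normalized to unity DC gain."""
    fc = cutoff_hz / fs_hz                 # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]


def filter_and_decimate(signal, taps, factor):
    """Filter, then keep every factor-th output (fully overlapped region only)."""
    n_taps = len(taps)
    return [
        sum(s * t for s, t in zip(signal[start:start + n_taps], taps))
        for start in range(0, len(signal) - n_taps + 1, factor)
    ]


fs = 1000
taps = lowpass_taps(cutoff_hz=45, fs_hz=fs)
tone_5hz = [math.sin(2 * math.pi * 5 * n / fs) for n in range(fs)]
tone_300hz = [math.sin(2 * math.pi * 300 * n / fs) for n in range(fs)]

kept = filter_and_decimate(tone_5hz, taps, 10)       # 1000 Hz -> 100 Hz
removed = filter_and_decimate(tone_300hz, taps, 10)
print(f"5 Hz peak after filtering:   {max(map(abs, kept)):.3f}")
print(f"300 Hz peak after filtering: {max(map(abs, removed)):.4f}")
```

The 5 Hz tone should pass almost unchanged while the 300 Hz tone, which would otherwise alias into the 100 Hz output, is strongly attenuated. A longer filter would give a sharper transition near the cutoff.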

Real consequence: A wind turbine farm monitored blade vibration at 500 Hz and downsampled to 1 Hz for cloud transmission to save LTE bandwidth costs. Critical flutter vibrations at 5–10 Hz were completely lost due to aliasing. A blade failure occurred without warning; post-incident FFT analysis of raw 500 Hz data (briefly stored locally) showed a clear 8 Hz resonance signature 2 weeks prior. The fix: run the 500 Hz data through an FFT at the edge and transmit the 10 dominant frequency peaks (50 bytes) instead of a 1 Hz downsampled value (2 bytes). Result: 96% bandwidth savings relative to streaming the raw 500 Hz samples, while preserving the failure signatures of interest. The lesson: understand your signal’s frequency content before choosing a downsampling rate.

65.3 Concept Relationships

Core Calculation Skills:

  • Data Reduction Ratio = Raw volume / Transmitted volume (100–14,400x achievable)
  • FIFO Buffer Management = Graceful degradation prioritizing recent data
  • Bundling Power Savings = Transmission count reduction (98%+ savings)
  • Gateway Architecture = Non-IP device integration (960 devices to 10–20 gateways)
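
A drop-oldest FIFO buffer of the kind described in this list can be sketched with `collections.deque`, whose `maxlen` argument discards the oldest item automatically when a new one arrives at capacity. The class name and the `dropped` counter are illustrative additions.

```python
from collections import deque


class DropOldestBuffer:
    """Bounded FIFO queue that sacrifices the oldest reading on overflow."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)
        self.dropped = 0

    def push(self, reading):
        if len(self._items) == self._items.maxlen:
            self.dropped += 1          # the append below will evict the oldest item
        self._items.append(reading)

    def pop(self):
        return self._items.popleft()   # oldest surviving reading leaves first

    def __len__(self):
        return len(self._items)


buffer = DropOldestBuffer(capacity=3)
for reading in [1, 2, 3, 4, 5]:
    buffer.push(reading)

print(len(buffer), buffer.dropped)  # 3 2
print(buffer.pop())                 # 3
```

Counting drops is a design choice worth keeping in practice: it lets the gateway report how much data was lost during an overload instead of degrading silently.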

Real-World Applications:

  • Danfoss Supermarkets – Three edge patterns for three sensor types
  • Agricultural LoRa – 98.3% power reduction through hourly bundling
  • Industrial Gateway – 10–20 gateways serve 960 non-IP devices ($30K vs $288K replacement)
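
The bundling and gateway figures above follow from simple ratios, sketched below. The sketch assumes radio energy is dominated by a fixed per-transmission cost, so that power savings track the reduction in transmission count; the per-minute baseline is taken from the chapter's hourly-bundling scenario, and the function name is illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def bundling_savings(interval_before_min, interval_after_min):
    """Return (transmissions before, transmissions after, fractional savings) per day."""
    tx_before = MINUTES_PER_DAY / interval_before_min
    tx_after = MINUTES_PER_DAY / interval_after_min
    return tx_before, tx_after, 1 - tx_after / tx_before


tx_before, tx_after, savings = bundling_savings(1, 60)
print(tx_before, tx_after)         # 1440.0 24.0
print(f"{savings:.1%}")            # 98.3%
print(tx_before / tx_after)        # 60.0

# Gateway integration versus device replacement, using the chapter's totals.
replacement_cost, gateway_cost = 288_000, 30_000
print(replacement_cost / gateway_cost)  # 9.6
```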

Common Pitfalls

Raw sensor data accounts for only 50–80% of transmitted bytes; the remaining 20–50% is protocol headers, checksums, and retransmissions. When calculating bandwidth requirements, divide the raw data volume by the payload fraction, which means budgeting 1.25x to 2x the raw volume.

Battery life uses average current, but power supply sizing uses peak current. A device averaging 2 mA but peaking at 200 mA during radio transmission needs a supply capable of sourcing 200 mA to avoid voltage collapse.

Network speeds are quoted in bits per second (Mbps, kbps) while data volumes are typically in bytes. A 1 Mbps link can transmit 125 KB per second, not 1 MB per second. Always convert consistently.
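
The unit and overhead pitfalls can be combined into one small calculation. This is a sketch with hypothetical function names; `payload_fraction` is the share of transmitted bytes that is actual sensor data (0.5 to 0.8 per the first pitfall).

```python
def link_capacity_bytes_per_s(link_bps):
    """Convert a link speed in bits per second to bytes per second."""
    return link_bps / 8


def transfer_time_s(payload_bytes, link_bps, payload_fraction):
    """Seconds needed to send payload_bytes once overhead is included."""
    wire_bytes = payload_bytes / payload_fraction  # payload plus headers, retries
    return wire_bytes * 8 / link_bps


print(link_capacity_bytes_per_s(1_000_000))         # 125000.0 (1 Mbps = 125 KB/s)
print(transfer_time_s(125_000, 1_000_000, 0.5))     # 2.0
print(f"{transfer_time_s(125_000, 1_000_000, 0.8):.2f}")  # 1.25
```

Sending 125 KB over a 1 Mbps link therefore takes noticeably longer than the naive 1 second once overhead is counted.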

65.5 Summary

  • Data reduction calculations demonstrate 100–14,400x volume reductions achievable through downsampling and aggregation at edge gateways
  • FIFO buffer management provides graceful degradation under high load by prioritizing recent data over older readings
  • Bundling strategies achieve 60x transmission reduction and 98%+ power savings for agricultural and industrial sensor networks
  • Edge gateways provide the most cost-effective solution for non-IP device integration, reducing management complexity from 960 devices to 10–20 gateways

Key Takeaway

Data calculations are the foundation of edge computing decisions. Downsampling (1 kHz to 10 Hz = 100x reduction) combined with aggregation (100 sensors into 1 summary = another 100x) multiplies to a 10,000x reduction, and the scenarios in this chapter reach up to 14,400x. FIFO buffer management ensures graceful degradation under load, bundling strategies reduce transmission power by 98% or more, and edge gateways provide the most cost-effective solution for integrating non-IP legacy devices at $30,000 versus $288,000 for device replacement.

“The Data Diet Challenge!”

“I am on a data diet!” announced Max the Microcontroller one morning at the factory.

“A data diet?” Sammy the Sensor looked confused. “But I give you 1,000 readings every single second!”

“Exactly!” Max explained. “That is like eating 1,000 meals a day – way too much! Here is my plan:”

“Step 1: I will only taste every 10th reading instead of all 1,000. That is called downsampling – like checking the pool temperature once instead of testing every drop of water.”

“Step 2: I group 100 of your sensor friends together and write one report card for all of them. Instead of 500 individual notes, I write just 5 summaries. That is called aggregation – like a teacher averaging the whole class instead of listing every single score.”

Bella the Battery was thrilled. “And when you send data less often, I save power too! Instead of transmitting every minute, we bundle an hour of readings into one package. My energy lasts 60 times longer – I could run for 30 years!”

“But what if the readings pile up faster than Max can process them?” Lila the LED worried.

“Easy!” Max said. “I have a buffer – like a queue at the lunch counter. If the line gets too long, the person who has been waiting the longest leaves to make room for the newest arrival. Newest data is always the freshest and most important!”

What did the Squad learn? Edge computing puts data on a diet through downsampling and aggregation, saving bandwidth and battery. When data arrives too fast, FIFO buffers keep things orderly by always keeping the newest information!

65.6 What’s Next

| Current | Next |
|---|---|
| Edge Quiz: Data Calculations | Edge Quiz: Power and Optimization |

Related topics:

| Chapter | Focus |
|---|---|
| Edge Quiz: Fundamentals | Business case and ROI foundation |
| Edge Quiz: Comprehensive Review | Integration scenarios |