Edge data acquisition is the process of collecting sensor data at the network periphery and deciding what to process locally versus what to send to the cloud. IoT devices fall into three categories – Big Things (servers), Small IP Things (smart cameras), and Non-IP Things (simple sensors needing gateways) – each requiring different acquisition strategies based on their connectivity and processing capabilities.
46.1 Learning Objectives
By the end of this chapter, you will be able to:
Classify IoT Data Sources: Distinguish between Big Things, Small IP Things, and Non-IP Things in edge architectures
Explain Device Connectivity Paths: Describe how different device types connect to cloud infrastructure through direct IP or gateways
Analyze Data Generation Rates: Calculate data volumes across device categories and assess their implications for edge processing
Design Data Acquisition Strategies: Select and justify appropriate transmission schedules based on device capabilities and constraints
46.2 Prerequisites
Before diving into this chapter, you should be familiar with:
Edge, Fog, and Cloud Overview: Understanding the three-tier IoT architecture provides context for where edge data acquisition fits
Sensor Fundamentals: Knowledge of sensor types and characteristics helps understand data acquisition requirements
For Beginners: What is Edge Data Acquisition?
Think of edge data acquisition like a local newspaper reporter versus a national news network.
A local reporter (edge device) collects news from the neighborhood and decides what’s important enough to send to the national headquarters (cloud). They don’t send everything - just the highlights. This saves time, money, and keeps headquarters from being overwhelmed.
The “Edge” is simply where your sensors live:
| Example | Location | Why “Edge”? |
|---|---|---|
| Your thermostat | Living room wall | At the edge of your network |
| Factory sensor | On a machine | Far from the central servers |
| Traffic camera | Roadside pole | Collecting data at the source |
Three types of “Things” at the edge:
Big Things - Computers, servers (they can talk to the internet directly)
Small IP Things - Smart bulbs, webcams (they have their own internet connection)
Non-IP Things - Simple sensors that need a “translator” (gateway) to reach the internet
Why process data at the edge instead of sending everything to the cloud?
| Challenge | Without Edge | With Edge |
|---|---|---|
| Speed | Wait for cloud response | Instant local decisions |
| Battery | Constant transmission drains battery | Send only summaries, save power |
| Bandwidth | Network gets clogged | Only important data travels far |
| Privacy | All your data goes to remote servers | Sensitive data stays local |
Real-world example: A security camera generates 1GB of video per hour. Instead of sending all that to the cloud, edge processing detects “motion” and only uploads the 5-second clips that matter.
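The motion-triggered upload idea can be sketched in a few lines of Python. This is a minimal illustration, not how any particular camera works: frames are assumed to be flat lists of grayscale pixel values, and the function names and the threshold of 10.0 are hypothetical choices.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def frames_to_upload(frames, threshold=10.0):
    """Return indices of frames that changed noticeably from the previous frame.

    threshold is an illustrative assumption; a real camera would tune it.
    """
    keep = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            keep.append(i)
    return keep


# Four tiny 4-pixel "frames": only the third one shows a large change.
frames = [
    [10, 10, 10, 10],
    [10, 11, 10, 10],    # sensor noise only
    [200, 200, 10, 10],  # something moved into view
    [200, 200, 10, 10],  # scene is static again
]
selected = frames_to_upload(frames)
print(selected)
```

Only the frame where the scene changes is selected; the static frames never leave the device.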
46.3 Introduction to Edge Data Acquisition
Time: ~5 min | Difficulty: Foundational | Reference: P10.C08.U01
Key Concepts
Edge acquisition architecture: The hardware and software design of a system that captures, validates, and pre-processes sensor data at or near the source before transmission to higher processing tiers.
Sensor interface bus: The low-level communication protocol connecting sensors to a microcontroller or gateway. Common IoT interfaces include I2C, SPI, and UART; analog sensors are read through an ADC (analog-to-digital converter) rather than a digital bus.
Data aggregation gateway: A device that collects raw readings from multiple nearby sensors, applies local processing (averaging, event detection), and forwards summarised data to the cloud.
Ring buffer: A circular fixed-size memory structure used in edge devices to store a rolling window of recent sensor readings without dynamic memory allocation.
Interrupt-driven sampling: A microcontroller technique where a hardware timer interrupt triggers sensor reads at precise intervals, ensuring consistent sample timing without busy-wait polling.
DMA (Direct Memory Access): A hardware mechanism allowing peripherals (ADC, sensor buses) to transfer data directly to memory without CPU intervention, freeing the processor for other tasks.
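The ring buffer concept above can be illustrated with a short sketch. Python is used here only for readability; on a microcontroller the same logic would typically sit on a statically allocated array. The class and method names are hypothetical.

```python
class RingBuffer:
    """Fixed-size circular buffer that overwrites the oldest reading when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._data = [0.0] * capacity  # storage allocated once, up front
        self._capacity = capacity
        self._head = 0                 # index where the next write goes
        self._count = 0                # number of valid readings stored

    def push(self, value):
        """Store a reading, overwriting the oldest one if the buffer is full."""
        self._data[self._head] = value
        self._head = (self._head + 1) % self._capacity
        if self._count < self._capacity:
            self._count += 1

    def snapshot(self):
        """Return stored readings ordered from oldest to newest."""
        start = (self._head - self._count) % self._capacity
        return [self._data[(start + i) % self._capacity]
                for i in range(self._count)]

    def mean(self):
        """Average of the stored readings (0.0 when the buffer is empty)."""
        window = self.snapshot()
        return sum(window) / len(window) if window else 0.0


buf = RingBuffer(capacity=4)
for reading in [20.0, 21.0, 22.0, 23.0, 24.0, 25.0]:
    buf.push(reading)

window = buf.snapshot()
print(window)      # the four most recent readings, oldest first
print(buf.mean())
```

Because the buffer never grows, memory use stays constant no matter how long the device runs.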
Edge data acquisition is the process of collecting, processing, and transmitting sensor data at the network periphery - where physical devices meet the digital infrastructure. This chapter explores the fundamental architecture and device categories that form the foundation of efficient data collection at the IoT edge.
Key Takeaway
In one sentence: Collect raw data at the edge, but transmit only what’s needed; by commonly cited estimates, around 90% of IoT data is never analyzed.
Remember this rule: If you can’t name who will use the data and how, don’t collect it.
Why Edge Matters
Traditional cloud-centric architectures require all sensor data to travel to remote servers for processing. Edge data acquisition shifts some processing closer to the source, which helps in four areas:
Latency: Critical for time-sensitive applications (autonomous vehicles, industrial safety)
Bandwidth: Raw sensor streams can overwhelm network capacity
Energy: Transmitting data over a radio link typically costs far more energy than processing it locally (often estimated at 10-100x)
Privacy: Sensitive data can be processed locally without cloud exposure
46.4 IoT Device Categories
Time: ~10 min | Difficulty: Intermediate | Reference: P10.C08.U02
The key sources of data in IoT are the ‘Things’: the physical devices and controllers located on Level 1 of the IoT Reference Model. Things can be accessed directly to send and receive data; however, to count as IoT ‘Things’, they must be connected to the Internet.
46.4.1 Three Categories of Things
Figure 46.1: IoT Device Categories and Gateway Connectivity Paths
Big Things might be computers and databases. Small IP-enabled Things could include webcams, lights, and smartphones. Non-IP Things may need a Gateway or other device to assist - examples include lights, temperature gauges, locks, and gates.
46.4.2 Device Characteristics Comparison
| Category | Examples | Connectivity | Data Rate | Processing Capability |
|---|---|---|---|---|
| Big Things | Servers, industrial PLCs | Full IP stack | GB/day | High (full OS) |
| Small IP Things | Smart cameras, lights | Wi-Fi, cellular | MB/day | Medium (embedded) |
| Non-IP Things | Temperature sensors, door locks | Zigbee, BLE, Modbus | KB/day | Low (microcontroller) |
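To see where the Data Rate column comes from, the sketch below estimates daily volume from message frequency and message size. The three devices and their message rates and sizes are illustrative assumptions chosen to land in each tier, not measurements.

```python
def daily_bytes(messages_per_day, bytes_per_message):
    """Total bytes a device produces in one day."""
    return messages_per_day * bytes_per_message


def tier(num_bytes):
    """Map a daily byte count onto the order-of-magnitude tiers in the table."""
    if num_bytes >= 1_000_000_000:
        return "GB/day"
    if num_bytes >= 1_000_000:
        return "MB/day"
    if num_bytes >= 1_000:
        return "KB/day"
    return "bytes/day"


# Hypothetical devices: (messages per day, bytes per message)
devices = {
    "Non-IP temperature sensor": (1_440, 4),     # one 4-byte reading per minute
    "Small IP smart camera": (86_400, 200),      # one 200-byte event record per second
    "Big Thing server": (864_000, 2_000),        # ten 2 KB log records per second
}

volumes = {name: daily_bytes(*spec) for name, spec in devices.items()}
for name, total in volumes.items():
    print(f"{name}: {total:,} bytes/day ({tier(total)})")
```

Changing the message rate or size moves a device between tiers, which is why the categories are best read as orders of magnitude, not fixed limits.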
46.5 Data Generation Patterns
Time: ~8 min | Difficulty: Intermediate | Reference: P10.C08.U02b
Understanding data generation patterns is essential for designing efficient edge acquisition systems. Different device types produce vastly different data volumes and require different handling strategies.
Figure 46.2: Data generation statistics for IoT devices
Figure 46.3: Data generation rates and volumes comparison table
Figure (alternative view): Data volume by device type, showing how data generation rates vary dramatically across device types and drive different edge processing strategies.
Different device types require different edge processing strategies based on their data generation rates and the value of raw versus processed data.
46.5.1 Inertial Measurement Example
High-frequency sensors like accelerometers and gyroscopes demonstrate why edge aggregation is critical:
Figure 46.4: Accelerometer and gyroscope example sensor data
At 100 Hz sampling across 6 axes (3 accelerometer + 3 gyroscope), an IMU generates 600 samples/second. Transmitting raw data as 16-bit integers would require ~1.2 KB/s – unsustainable for battery-powered devices on LPWAN networks. Edge aggregation reduces this to statistical summaries at 1 Hz.
Putting Numbers to It
How much bandwidth does edge aggregation save for IMU data?
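A rough calculation answers the question. The sketch below uses the figures from the IMU example (100 Hz, 6 axes, 16-bit samples) and assumes each 1-second summary carries nine 16-bit values (RMS and peak per accelerometer axis, RMS per gyroscope axis). The link budget of 30 bytes/s is a purely hypothetical stand-in for a constrained LPWAN link.

```python
def raw_rate(sample_rate_hz, axes, bytes_per_sample=2):
    """Bytes per second needed to send every raw sample."""
    return sample_rate_hz * axes * bytes_per_sample


def aggregated_rate(values_per_summary, window_sec, bytes_per_value=2):
    """Bytes per second needed to send one summary per window."""
    return values_per_summary * bytes_per_value / window_sec


ASSUMED_LINK_BYTES_PER_SEC = 30  # hypothetical budget, for illustration only

raw = raw_rate(sample_rate_hz=100, axes=6)                  # 100 * 6 * 2
agg = aggregated_rate(values_per_summary=9, window_sec=1)   # 9 * 2 / 1
reduction = raw / agg

print(f"Raw:        {raw} B/s")
print(f"Aggregated: {agg} B/s")
print(f"Reduction:  {reduction:.0f}x")
print(f"Raw fits link budget:        {raw <= ASSUMED_LINK_BYTES_PER_SEC}")
print(f"Aggregated fits link budget: {agg <= ASSUMED_LINK_BYTES_PER_SEC}")
```

Lengthening the aggregation window lowers the aggregated rate proportionally, while raising the sampling rate only increases the raw rate.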
Time: ~5 min | Difficulty: Intermediate | Reference: P10.C08.U02c
Device capabilities directly impact acquisition strategies. The key decision point is power source:
Mains-powered devices (factory equipment, building systems): Can sample continuously and transmit frequently – edge processing focuses on bandwidth reduction, not power savings
Battery-powered devices (field sensors, wearables): Must duty-cycle both sampling and transmission – edge processing is essential to extend battery life from days to years
Energy-harvesting devices (solar-powered nodes): Operate with variable power budgets – edge processing must adapt to available energy
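The "days to years" effect of duty cycling can be estimated with simple arithmetic. The battery capacity and current draws below are illustrative assumptions, and the model ignores self-discharge, wake-up overhead, and temperature effects.

```python
def average_current_ma(active_ma, sleep_ma, duty_cycle):
    """Time-weighted average current for a device active duty_cycle of the time."""
    return active_ma * duty_cycle + sleep_ma * (1 - duty_cycle)


def battery_life_days(capacity_mah, avg_ma):
    """Idealized battery life in days (no self-discharge or derating)."""
    return capacity_mah / avg_ma / 24


# Assumed values for illustration only
CAPACITY_MAH = 2000   # battery capacity
ACTIVE_MA = 50        # sampling and transmitting
SLEEP_MA = 0.01       # deep sleep

always_on_days = battery_life_days(
    CAPACITY_MAH, average_current_ma(ACTIVE_MA, SLEEP_MA, duty_cycle=1.0))
duty_cycled_days = battery_life_days(
    CAPACITY_MAH, average_current_ma(ACTIVE_MA, SLEEP_MA, duty_cycle=0.001))

print(f"Always on:       {always_on_days:.1f} days")
print(f"0.1% duty cycle: {duty_cycled_days:.0f} days "
      f"({duty_cycled_days / 365:.1f} years)")
```

Under these assumptions the same battery lasts under two days when always active and several years when the device is awake 0.1% of the time.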
Figure (alternative view): Power budget decision tree for selecting a duty cycle based on power constraints.
46.7 Code Example: IMU Edge Aggregation Pipeline
This Python example demonstrates edge aggregation for the inertial measurement use case discussed above. A 100 Hz IMU produces 600 samples/second across 6 axes. Transmitting raw data is unsustainable, so the edge pipeline computes 1 Hz statistical summaries (RMS and peak per accelerometer axis, RMS per gyroscope axis), reducing bandwidth by ~67x:
```python
import math
import random
import time


class IMUEdgeAggregator:
    """Aggregate high-frequency IMU data into 1 Hz statistical summaries.

    Reduces 600 samples/second (100 Hz x 6 axes) to 9 summary values
    per second, cutting transmission from 1200 bytes/s to 18 bytes/s
    (67x reduction).
    """

    def __init__(self, sample_rate_hz=100, window_sec=1):
        self.sample_rate = sample_rate_hz
        # int() keeps the window usable as a slice index for fractional windows
        self.window_size = int(sample_rate_hz * window_sec)
        self.buffer_accel = {"x": [], "y": [], "z": []}
        self.buffer_gyro = {"x": [], "y": [], "z": []}

    def add_sample(self, ax, ay, az, gx, gy, gz):
        """Add one raw IMU sample (called at 100 Hz)."""
        self.buffer_accel["x"].append(ax)
        self.buffer_accel["y"].append(ay)
        self.buffer_accel["z"].append(az)
        self.buffer_gyro["x"].append(gx)
        self.buffer_gyro["y"].append(gy)
        self.buffer_gyro["z"].append(gz)

    def _rms(self, values):
        """Root mean square -- captures vibration energy."""
        if not values:
            return 0.0
        return math.sqrt(sum(v * v for v in values) / len(values))

    def _peak(self, values):
        """Peak absolute value -- detects impacts."""
        if not values:
            return 0.0
        return max(abs(v) for v in values)

    def window_ready(self):
        """Check if enough samples collected for one summary."""
        return len(self.buffer_accel["x"]) >= self.window_size

    def compute_summary(self):
        """Compute 1 Hz summary from buffered samples.

        Returns dict with RMS and peak for each axis -- enough to
        detect vibration anomalies without raw data.
        """
        summary = {"timestamp": int(time.time()), "samples": self.window_size}
        for axis in ["x", "y", "z"]:
            accel = self.buffer_accel[axis][:self.window_size]
            gyro = self.buffer_gyro[axis][:self.window_size]
            summary[f"accel_{axis}_rms"] = round(self._rms(accel), 4)
            summary[f"accel_{axis}_peak"] = round(self._peak(accel), 4)
            summary[f"gyro_{axis}_rms"] = round(self._rms(gyro), 2)

        # Clear processed samples
        for axis in ["x", "y", "z"]:
            self.buffer_accel[axis] = self.buffer_accel[axis][self.window_size:]
            self.buffer_gyro[axis] = self.buffer_gyro[axis][self.window_size:]
        return summary

    def estimate_bandwidth(self):
        """Compare raw vs aggregated data rates."""
        raw_bytes_per_sec = self.sample_rate * 6 * 2  # 6 axes, 2 bytes each
        summary_bytes = 9 * 2                         # 9 summary values, 2 bytes each
        ratio = raw_bytes_per_sec / summary_bytes
        return {
            "raw_bytes_per_sec": raw_bytes_per_sec,
            "summary_bytes_per_sec": summary_bytes,
            "reduction_ratio": f"{ratio:.0f}x",
        }


# Simulate: factory motor vibration monitoring
agg = IMUEdgeAggregator(sample_rate_hz=100, window_sec=1)

# Feed 100 simulated samples (1 second of data)
for i in range(100):
    # Normal vibration: small accelerations around 0g with noise
    agg.add_sample(
        ax=random.gauss(0, 0.05),
        ay=random.gauss(0, 0.05),
        az=random.gauss(1.0, 0.03),  # 1g gravity on Z
        gx=random.gauss(0, 2),
        gy=random.gauss(0, 2),
        gz=random.gauss(0, 1),
    )

if agg.window_ready():
    summary = agg.compute_summary()
    print("1-second summary (transmitted via LoRa):")
    for key, val in summary.items():
        print(f"  {key}: {val}")

bw = agg.estimate_bandwidth()
print(f"\nBandwidth: {bw['raw_bytes_per_sec']} B/s raw -> "
      f"{bw['summary_bytes_per_sec']} B/s summary = "
      f"{bw['reduction_ratio']} reduction")

# Example output (timestamp and random values differ on every run):
# 1-second summary (transmitted via LoRa):
#   timestamp: 1738900000
#   samples: 100
#   accel_x_rms: 0.0498
#   accel_x_peak: 0.1523
#   accel_y_rms: 0.0512
#   accel_y_peak: 0.1389
#   accel_z_rms: 1.0004
#   accel_z_peak: 1.0891
#   gyro_x_rms: 1.98
#   gyro_y_rms: 2.05
#   gyro_z_rms: 0.99
#
# Bandwidth: 1200 B/s raw -> 18 B/s summary = 67x reduction
```
The edge device transmits only RMS (vibration energy) and peak (impact detection) values at 1 Hz instead of raw waveforms at 100 Hz. A sudden increase in accel_x_rms from 0.05g to 0.5g flags a developing bearing fault without requiring cloud-side waveform analysis.
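A receiver (or the gateway itself) could watch the summaries for such a jump. The sketch below is one simple way to do it, assuming a baseline averaged over the first few windows and an alarm ratio of 3x; both are illustrative assumptions, and `detect_rms_jump` is a hypothetical helper, not part of the aggregator above. Real bearing-fault detection usually needs more than a single threshold.

```python
def detect_rms_jump(summaries, key="accel_x_rms",
                    baseline_windows=5, ratio_threshold=3.0):
    """Return the index of the first summary whose RMS exceeds
    ratio_threshold times the baseline average, or None if none does.

    The baseline is the mean of the first baseline_windows summaries.
    """
    if len(summaries) <= baseline_windows:
        return None  # not enough history to compare against
    baseline = sum(s[key] for s in summaries[:baseline_windows]) / baseline_windows
    if baseline <= 0:
        return None
    for i in range(baseline_windows, len(summaries)):
        if summaries[i][key] > ratio_threshold * baseline:
            return i
    return None


# Simulated 1 Hz summaries: stable around 0.05 g, then a jump to 0.5 g
history = [{"accel_x_rms": v}
           for v in [0.05, 0.049, 0.051, 0.05, 0.052, 0.05, 0.06, 0.5, 0.48]]

alarm_index = detect_rms_jump(history)
print(f"First anomalous window: {alarm_index}")
```

The check needs only one number per second per axis, so it runs comfortably on the summaries without access to the raw waveform.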
46.8 Knowledge Check
For Kids: Meet the Sensor Squad!
The three types of “Things” at the edge!
The Sensor Squad was setting up a smart garden, and they discovered three very different kinds of helpers:
Sammy the Sensor was a tiny temperature sensor with no internet connection. “I am a Non-IP Thing,” he explained. “I can only whisper my readings to Max using a special short-range language called Zigbee. I need Max to translate for me!”
Lila the LED was a smart light with Wi-Fi built in. “I am a Small IP Thing! I can talk to the internet all by myself, but I am not as powerful as a big computer.”
Max the Microcontroller was connected to a powerful Raspberry Pi gateway. “I am like a Big Thing – I can run programs, store data, and talk directly to the cloud. My job is to listen to Sammy and help him get his messages to the internet.”
Bella the Battery reminded everyone: “Remember, sending data far away uses a LOT of my energy! So Max should only send the important stuff to the cloud – like when the temperature is too hot for the tomatoes. The normal readings can stay here.”
The lesson: Different devices have different abilities, and smart IoT systems match the right strategy to each type of device!
46.8.1 Worked Example: Fonterra Dairy Farm Edge Acquisition Design
Scenario: Fonterra, New Zealand’s largest dairy cooperative, deploys IoT sensors across 200 milking sheds to monitor milk quality and cow health in real time. Each shed has a mix of device categories requiring different acquisition strategies.
In a multi-shed IoT deployment, edge processing directly affects connectivity costs. The dominant cost driver is typically the highest-bandwidth device (cameras).
Result: A single Raspberry Pi 4 gateway (NZD 900 with edge ML accelerator) per shed handles all three device categories: protocol translation for Non-IP sensors, video analytics for IP cameras, and pass-through for the SCADA controller. The 99% data reduction makes rural 4G connectivity economically viable.
Key Insight: The three device categories in Fonterra’s deployment map directly to three gateway functions: Big Things need routing (IP to IP), Small IP Things need edge inference (reduce high-bandwidth streams), and Non-IP Things need protocol translation (Modbus/4-20 mA/RFID to MQTT). A single edge gateway serves all three roles, and the dominant cost driver is typically the highest-bandwidth device category (cameras, in this case).
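The category-to-function mapping can be written down as a small lookup. This is a sketch under simplifying assumptions: the two-question classification rule (does the device have an IP stack, does it run a full OS) is a rule of thumb derived from the comparison table, and the shed inventory is hypothetical.

```python
from enum import Enum


class Category(Enum):
    BIG_THING = "Big Thing"
    SMALL_IP_THING = "Small IP Thing"
    NON_IP_THING = "Non-IP Thing"


# Gateway function per device category, following the Key Insight.
GATEWAY_ROLE = {
    Category.BIG_THING: "routing (IP to IP)",
    Category.SMALL_IP_THING: "edge inference",
    Category.NON_IP_THING: "protocol translation",
}


def classify(has_ip_stack, runs_full_os):
    """Simplified rule of thumb (an assumption, not a formal definition)."""
    if not has_ip_stack:
        return Category.NON_IP_THING
    return Category.BIG_THING if runs_full_os else Category.SMALL_IP_THING


# Hypothetical shed inventory: (name, has_ip_stack, runs_full_os)
shed_devices = [
    ("SCADA controller", True, True),
    ("IP camera", True, False),
    ("Modbus milk-temperature sensor", False, False),
]

roles = {name: GATEWAY_ROLE[classify(ip, full_os)]
         for name, ip, full_os in shed_devices}
for name, role in roles.items():
    print(f"{name}: {role}")
```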
Common Pitfalls
1. Polling sensors in a busy loop instead of using interrupts
Busy-loop polling wastes CPU cycles and prevents the processor from entering low-power sleep states. Use hardware timer interrupts or DMA to trigger sensor reads, allowing the MCU to sleep between samples.
2. Designing acquisition architecture without defining the data budget first
The architecture must be designed backwards from the bandwidth constraint: start with the available link budget, determine how many bytes per second can be transmitted, then design sampling rates and pre-aggregation to fit within that budget.
3. Ignoring timestamp accuracy in multi-sensor architectures
When sensor readings from different buses are acquired at slightly different times due to software scheduling delays, fusing them without correcting for the time offset produces incorrect results. Use hardware timestamps from a shared timer source.
4. Not designing for sensor hot-swapping
In industrial deployments, sensors fail and are replaced while the system is running. Design the acquisition layer to detect new sensors at startup or during operation and handle their absence gracefully rather than crashing.
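For pitfall 3, one common correction is to resample one stream onto the other's timestamps before fusing. The sketch below uses linear interpolation and assumes both streams carry timestamps from the same hardware timer (here in microseconds). The function name and sample data are hypothetical, and linear interpolation is only one of several possible resampling methods.

```python
from bisect import bisect_right


def interpolate_at(timestamps, values, t):
    """Linearly interpolate a sensor stream at time t.

    timestamps must be ascending; times outside the recorded range
    are clamped to the first or last value.
    """
    if not timestamps or len(timestamps) != len(values):
        raise ValueError("timestamps and values must be non-empty and equal length")
    if t <= timestamps[0]:
        return values[0]
    if t >= timestamps[-1]:
        return values[-1]
    hi = bisect_right(timestamps, t)  # first index with timestamp > t
    lo = hi - 1
    frac = (t - timestamps[lo]) / (timestamps[hi] - timestamps[lo])
    return values[lo] + frac * (values[hi] - values[lo])


# Accelerometer sampled at 0, 10, 20, 30 ms; temperature read 3 ms later each time
accel_ts_us = [0, 10_000, 20_000, 30_000]
temp_ts_us = [3_000, 13_000, 23_000, 33_000]
temp_values = [20.0, 21.0, 22.0, 23.0]

# Temperature estimated at the accelerometer's sample times
aligned = [interpolate_at(temp_ts_us, temp_values, t) for t in accel_ts_us]
print(aligned)
```

After alignment, each accelerometer sample can be paired with a temperature estimate for the same instant instead of a reading taken a few milliseconds later.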
46.9 Summary
Edge data acquisition architecture is built on understanding three fundamental device categories:
Big Things: Full-capability computers with direct cloud connectivity - minimal edge processing needed
Small IP Things: Embedded devices with IP connectivity - benefit from edge compression and filtering
Non-IP Things: Simple sensors requiring gateways - need edge aggregation for efficient transmission
The acquisition strategy must match device capabilities: high-volume devices (cameras) need compression, low-volume devices (temperature sensors) need aggregation, and non-IP devices need protocol translation through gateways.
46.10 Concept Relationships
This chapter establishes the foundational architecture for edge data collection:
Core Classification (This chapter):
Three device categories (Big Things, Small IP Things, Non-IP Things) determine connectivity paths and acquisition strategies
Data generation patterns vary by many orders of magnitude across categories (door sensors: bytes/day vs cameras: gigabytes/day)