This section provides a stable anchor for cross-references to code offloading and heterogeneous computing across the curriculum.
12.2 Learning Objectives
By the end of this chapter, you will be able to:
Understand Code Offloading Decisions: Explain when to process locally versus offload to cloud based on energy profiles
Calculate Offloading Energy Costs: Compute transmission energy for Wi-Fi vs cellular networks
Apply MAUI Framework: Use the MAUI decision framework to make context-aware offloading decisions
Leverage Heterogeneous Cores: Match computational tasks to appropriate processors (CPU, GPU, DSP, NPU)
Design Energy-Preserving Sensing Plans: Find the cheapest sequence of operations to determine context
In 60 Seconds
Code offloading decides whether an IoT device should compute locally or send data to the cloud; the right choice depends on comparing radio transmission energy (0.1 mJ/KB over Wi-Fi, 1 mJ/KB over cellular) against local computation energy, using frameworks like MAUI to make this decision automatically at runtime.
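As a first approximation, the breakeven is a two-line comparison. The sketch below uses the chapter's ballpark per-kilobyte figures; the function name and constants are illustrative, and real radios also pay tail-energy and signal-strength penalties covered later in the chapter.

```python
WIFI_MJ_PER_KB = 0.1      # ballpark Wi-Fi transmission cost from the text
CELLULAR_MJ_PER_KB = 1.0  # ~10x Wi-Fi

def offload_saves_energy(data_kb, local_compute_mj, radio="wifi"):
    """True if transmitting the data costs less energy than computing locally."""
    per_kb = WIFI_MJ_PER_KB if radio == "wifi" else CELLULAR_MJ_PER_KB
    return data_kb * per_kb < local_compute_mj
```

For a 1,000 KB payload whose local processing costs 150 mJ, Wi-Fi offloading wins (100 mJ to send) while cellular offloading loses (1,000 mJ to send).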
Key Concepts
Code Offloading: Migrating a computation from the IoT device to a remote server (cloud or edge) to reduce local energy consumption
MAUI Framework: A system that automatically profiles computation and network state to decide whether local or remote execution is cheaper
Transmission Energy: The energy cost of sending data over a radio link; equals data_size × energy_per_byte, which varies by technology and signal strength
Heterogeneous Computing: Using multiple specialized processor types (CPU, GPU, DSP, NPU) on one chip, each optimized for different workload characteristics
NPU (Neural Processing Unit): A chip accelerator designed for neural network inference, achieving 10–100x better energy efficiency than GPU for AI workloads
Energy-Preserving Sensing Plan: A sequence of cheap sensor readings that can infer expensive context without directly measuring it
Local vs Cloud Breakeven: The computation complexity threshold at which local processing uses less energy than transmitting data to the cloud
For Beginners: Code Offloading & Computing
Energy and power management determines how long your IoT device can operate between battery changes or charges. Think of packing for a camping trip with limited battery packs – every bit of power must be used wisely. Since many IoT sensors need to run for months or years unattended, power management is often the single most important engineering decision.
Sensor Squad: Do It Here or Send It Away?
“Sometimes I have a really hard math problem to solve,” said Max the Microcontroller. “I COULD do it myself, but it would take forever and drain Bella’s battery. Or I could send the data to a powerful cloud server and let IT do the math. That is called code offloading.”
Sammy the Sensor asked, “But sending data uses energy too, right?” Max nodded, “Exactly! That is the trade-off. Sending data over Wi-Fi costs about 0.1 millijoules per kilobyte, but over cellular it costs 10 times more. So sometimes it is cheaper to compute locally, and sometimes it is cheaper to offload. The MAUI framework helps you decide.”
Bella the Battery broke it down simply: “If the computation is small, do it locally. If the computation is huge and you have Wi-Fi, send it to the cloud. If you are on cellular with bad signal, definitely do it locally – transmitting over a weak signal wastes tons of my energy!” Lila the LED added, “Modern chips also have specialized processors – a GPU for graphics, a DSP for audio, an NPU for AI. Using the right processor for each job saves energy too!”
12.3 Prerequisites
Before diving into this chapter, you should be familiar with:
12.4 Energy-Preserving Sensing Plans
The Sensing Planner finds the best sequence of proxy attributes to sense, considering:
- Direct sensing cost
- Inference possibilities from cached attributes
- Confidence of inference rules
- Overall energy minimization
Example: To determine “InOffice”, the options are:
1. Sense directly (80 mW)
2. If “Running=True” is cached, infer “InOffice=False” (0 mW)
3. If “AtHome=True” is cached, infer “InOffice=False” (0 mW)
Choose the cheapest option!
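The "choose the cheapest option" step can be sketched as a minimum over available plans. The cache structure and option names below are assumptions; the 80 mW direct-sensing cost and the two free inference rules come from the example above.

```python
def cheapest_plan(cache):
    """Pick the lowest-cost way to determine InOffice from a cache of attributes."""
    options = [("sense_directly", 80.0)]                 # direct sensing cost (mW)
    if cache.get("Running") is True:
        options.append(("infer_from_Running", 0.0))      # Running=True -> InOffice=False
    if cache.get("AtHome") is True:
        options.append(("infer_from_AtHome", 0.0))       # AtHome=True -> InOffice=False
    return min(options, key=lambda o: o[1])
```

With "Running=True" cached the inference is free; with an empty cache the planner falls back to direct sensing.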
12.5 Code Offloading Decisions
Figure 12.2: MAUI Code Offloading Decision: Local vs Remote Execution Energy Analysis
12.5.1 Interactive Offloading Energy Calculator
Show code
viewof network_type = Inputs.select(["Wi-Fi","LTE"], {label:"Network type", value:"Wi-Fi"})
viewof local_power = Inputs.range([10,1000], {value:50, step:10, label:"Local CPU power (mW)"})
viewof compute_time = Inputs.range([0.1,60], {value:2, step:0.1, label:"Computation time (seconds)"})
viewof data_size = Inputs.range([1,5000], {value:1000, step:10, label:"Data size to transmit (KB)"})
MAUI (Mobile Assistance Using Infrastructure): Framework that profiles code components in terms of energy to decide whether to run locally or remotely.
Considerations:
Costs related to transfer of code/data
Dynamic decisions based on network constraints
Latency requirements
Local vs remote execution energy
Example: With 3G, offloading may cost more energy due to high network transmission costs. With Wi-Fi, offloading can save significant energy.
Common Misconception: “Cloud Processing Is Always More Energy Efficient”
The Misconception: “The cloud has powerful servers, so offloading computation always saves energy on my IoT device.”
The Reality: Network transmission energy often exceeds local computation energy, especially on cellular networks. The decision depends on network type, data size, and computation complexity.
Quantified Energy Comparison:
Task: Process 1 MB of sensor data with an ML model (2 seconds of computation on the local device)
Key insight: LTE tail energy (5-10 sec radio-on after transmission) dominates. For short tasks under 30 seconds, local execution wins. For longer tasks (60+ seconds), offloading can save energy despite the tail penalty.
When Cloud Wins:
Task: Complex ML inference (60 seconds on the local device) – here the computation energy dwarfs the tail penalty, so offloading pays off.
Decision rules of thumb:
- Wi-Fi available + heavy computation → Offload (2-20× savings)
- LTE only + light computation → Local (avoid a 10-15× penalty)
- Battery <20% → Always local (conserve energy)
- Latency critical → Offload if Wi-Fi, local if LTE
Key Insight: The 5-10 second LTE “tail energy” (radio staying on after transmission) often consumes more energy than the entire local computation. Context-aware offloading decisions must consider network type, not just raw transmission costs.
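The tail-energy effect can be made concrete with a small sketch. The 1,850 mW LTE transmit power and 740 mW tail power come from the worksheet later in this chapter; the 1-second transmission and 7.5-second tail duration are illustrative assumptions within the 5-10 s range quoted above.

```python
def offload_energy_mj(tx_mw, tx_s, tail_mw=0.0, tail_s=0.0):
    """Radio energy = transmit energy plus post-transmission tail energy (mJ = mW * s)."""
    return tx_mw * tx_s + tail_mw * tail_s

lte_mj   = offload_energy_mj(1850, 1.0, tail_mw=740, tail_s=7.5)  # 1850 + 5550 = 7400 mJ
wifi_mj  = offload_energy_mj(925, 1.0)                            # 925 mJ, negligible tail
local_mj = 740 * 2.0                                              # 1480 mJ: 2 s at 740 mW
# The LTE tail alone (5550 mJ) exceeds the entire local computation (1480 mJ)
```

For this short task, Wi-Fi offloading beats local execution, but LTE offloading loses badly to both, exactly the pattern the key insight describes.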
12.6 Local Computation: Heterogeneous Cores
Figure 12.3: Heterogeneous Mobile SoC Architecture: CPU, DSP, GPU, and NPU Task Scheduling
Modern mobile SoCs include heterogeneous cores:
- CPU: General purpose, control flow
- GPU: Massively parallel, graphics and compute
- DSP: Low-power signal processing, audio/sensor data
- NPU: Neural network acceleration, ML inference
Benefits:
Increase performance and power efficiency
Selected tasks shift to more efficient cores
Dynamic voltage/frequency scaling per core
Example - Keyword Spotting:
Optimized GPU is >6x faster than cloud
Optimized GPU is >21x faster than sequential CPU
Optimized GPU with batching outperforms cloud energy-wise
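Task-to-core matching reduces to comparing energy per task across cores. The power and latency figures below are hypothetical placeholders, not measurements; only the roughly 21× GPU-over-CPU speedup ratio mirrors the keyword-spotting result above.

```python
# Hypothetical per-task profiles for a keyword-spotting workload.
profiles = {
    "cpu": {"power_mw": 800,  "time_ms": 210},  # sequential baseline
    "gpu": {"power_mw": 2400, "time_ms": 10},   # ~21x faster, ~3x power
    "dsp": {"power_mw": 50,   "time_ms": 120},  # slower, but very low power
}

def best_core(profiles):
    """Energy (mJ) = mW * ms / 1000; return the lowest-energy core and the table."""
    energy = {c: p["power_mw"] * p["time_ms"] / 1000 for c, p in profiles.items()}
    return min(energy, key=energy.get), energy
```

With these placeholder numbers the DSP wins on energy (6 mJ) even though the GPU wins on latency (24 mJ at 10 ms), which is the kind of task-to-core matching this section describes.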
12.7 Knowledge Check: Heterogeneous Computing
Quiz: Local Computation and Offloading
12.8 Code Offloading Energy Analysis Worksheet
Work Through: Code Offloading Energy Analysis
Scenario: Image processing on wearable device - local vs cloud decision
12.8.1 Step 1: Local Processing Energy
| Component | Power | Duration | Energy |
|---|---|---|---|
| Image Capture | 80 mA @ 3.7 V = 296 mW | 100 ms | 29.6 mJ |
| CPU Processing | 200 mA @ 3.7 V = 740 mW | 3000 ms | 2,220 mJ |
| Total Local | - | - | 2,249.6 mJ |
12.8.2 Step 2: Cloud Offloading Energy (Wi-Fi)
| Component | Power | Duration | Energy |
|---|---|---|---|
| Image Capture | 80 mA @ 3.7 V = 296 mW | 100 ms | 29.6 mJ |
| Wi-Fi TX (upload 50 KB) | 250 mA @ 3.7 V = 925 mW | 400 ms | 370 mJ |
| Wi-Fi RX (download 5 KB) | 150 mA @ 3.7 V = 555 mW | 50 ms | 27.75 mJ |
| Idle Wait (remote processing) | 15 mA @ 3.7 V = 55.5 mW | 500 ms | 27.75 mJ |
| Total Cloud (Wi-Fi) | - | - | 455.1 mJ |
Wi-Fi Decision: Offload (saves 1,794 mJ = 80% energy reduction)
12.8.3 Step 3: Cloud Offloading Energy (LTE)
| Component | Power | Duration | Energy |
|---|---|---|---|
| Image Capture | 80 mA @ 3.7 V = 296 mW | 100 ms | 29.6 mJ |
| LTE TX (upload 50 KB) | 500 mA @ 3.7 V = 1,850 mW | 800 ms | 1,480 mJ |
| LTE RX (download 5 KB) | 300 mA @ 3.7 V = 1,110 mW | 100 ms | 111 mJ |
| RRC State Overhead | 200 mA @ 3.7 V = 740 mW | 2000 ms | 1,480 mJ |
| Total Cloud (LTE) | - | - | 3,100.6 mJ |
LTE Decision: Process locally (saves 851 mJ vs LTE offloading)
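The three worksheet totals can be checked with a one-line energy sum (mJ = mW × ms / 1000); every row value below comes straight from the tables above.

```python
def total_mj(rows):
    """Sum energy in mJ over (power_mw, duration_ms) rows."""
    return sum(p * t / 1000 for p, t in rows)

local_mj = total_mj([(296, 100), (740, 3000)])                          # 2,249.6 mJ
wifi_mj  = total_mj([(296, 100), (925, 400), (555, 50), (55.5, 500)])   # 455.1 mJ
lte_mj   = total_mj([(296, 100), (1850, 800), (1110, 100), (740, 2000)])  # 3,100.6 mJ

options = {"local": local_mj, "offload_wifi": wifi_mj, "offload_lte": lte_mj}
best = min(options, key=options.get)  # Wi-Fi offload is cheapest here
```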
12.8.4 Step 4: MAUI Decision Framework
def maui_decision(wifi_available, energy_local, energy_cloud_wifi,
                  energy_cloud_cellular, battery_pct, latency_critical):
    if wifi_available and energy_cloud_wifi < energy_local:
        return "OFFLOAD_WIFI"
    if energy_local < energy_cloud_cellular:
        return "PROCESS_LOCAL"
    if battery_pct > 50 and latency_critical:
        return "OFFLOAD_CELLULAR"
    return "PROCESS_LOCAL"
12.8.5 Step 5: Context-Aware Adaptation
| Context | Network | Battery | Decision | Energy | Rationale |
|---|---|---|---|---|---|
| At Home | Wi-Fi | 80% | Offload | 455 mJ | Wi-Fi cheap, fast |
| Outdoors | LTE | 80% | Local | 2,250 mJ | LTE expensive |
| Outdoors | LTE | 15% | Local | 2,250 mJ | Battery critical |
| At Office | Wi-Fi | 15% | Offload | 455 mJ | Save battery with Wi-Fi |
Your Turn: Calculate offloading decisions for your application!
12.9 Sensor Fusion Energy Optimization Worksheet
12.9.1 Interactive GPS vs Inference Energy Calculator
Show code
viewof gps_power = Inputs.range([20,100], {value:45, step:5, label:"GPS power (mA)"})
viewof gps_duration = Inputs.range([5,60], {value:30, step:5, label:"GPS duration per measurement (sec)"})
viewof accel_power = Inputs.range([0.1,5], {value:0.5, step:0.1, label:"Accelerometer power (mA)"})
viewof inference_rate = Inputs.range([50,95], {value:85, step:5, label:"Inference success rate (%)"})
Scenario: Location tracking using GPS vs Wi-Fi/accelerometer inference
12.9.2 Step 1: Direct GPS Sensing
| State | Current | Duration | Charge per Cycle |
|---|---|---|---|
| GPS Active | 45 mA | 30 sec | 0.375 mAh |
| Processing | 20 mA | 2 sec | 0.011 mAh |
| BLE TX | 15 mA | 1 sec | 0.004 mAh |
| Sleep | 10 µA | 27 sec | 0.000075 mAh |
Per measurement (60 s cycle): 0.390 mAh
Per hour (60 measurements): 23.4 mAh
200 mAh battery life: 8.5 hours
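The duty-cycle table converts to battery charge with mAh = mA × s / 3600; this sketch simply re-runs that arithmetic with the table's values.

```python
def mah_per_cycle(states):
    """Charge per cycle: sum of current_mA * duration_s / 3600 over the states."""
    return sum(i * t / 3600 for i, t in states)

cycle_mah = mah_per_cycle([(45, 30), (20, 2), (15, 1), (0.010, 27)])  # ~0.390 mAh
per_hour_mah = cycle_mah * 60                                         # ~23.4 mAh
battery_life_h = 200 / per_hour_mah                                   # ~8.5 hours
```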
12.9.3 Step 2: ACE Inference Strategy
Use cached GPS + accelerometer for motion detection
| Scenario | Method | Current | Duration | Frequency |
|---|---|---|---|---|
| Stationary | Cached GPS | 10 µA | 60 sec | 59 min/hour |
| Moving (inferred) | Accel check | 0.5 mA | 0.5 sec | 59 times/hour |
| Verify Location | GPS | 45 mA | 30 sec | 1 time/hour |
Energy per hour:
E_stationary = 59 × (10µA × 60s) / 3600 = 0.0098 mAh
E_accel_check = 59 × (0.5mA × 0.5s) / 3600 = 0.0041 mAh
E_gps_verify = 1 × (45mA × 30s + 20mA × 2s) / 3600 = 0.386 mAh
E_total = 0.40 mAh per hour
200mAh battery life: 500 hours = 20.8 days
Energy savings: 58.5× improvement over continuous GPS!
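The hourly ACE budget can be re-derived with one helper; the event list mirrors the E_stationary, E_accel_check, and E_gps_verify lines above (the last one includes the 20 mA × 2 s processing term).

```python
def hourly_mah(events):
    """Charge per hour: sum of count * current_mA * duration_s / 3600."""
    return sum(n * i * t / 3600 for n, i, t in events)

e_total = hourly_mah([
    (59, 0.010, 60),   # stationary minutes on cached GPS (10 uA sleep current)
    (59, 0.5, 0.5),    # accelerometer motion checks
    (1, 45, 30),       # one GPS verification per hour
    (1, 20, 2),        # processing for that GPS fix
])
battery_life_h = 200 / e_total   # ~500 hours on a 200 mAh battery
```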
12.9.4 Step 3: Association Rules for Inference
ACE learns these rules from history:
| Rule | Support | Confidence | Inference |
|---|---|---|---|
| Accel_Still=True → AtHome=True | 25% | 85% | Skip GPS if still |
| Wi-Fi_SSID=Home → AtHome=True | 30% | 95% | Use Wi-Fi instead of GPS |
| Time=Night AND Still → Sleeping=True | 15% | 90% | 10× reduce all sampling |
Optimized energy with rules:
85% of requests served from cache/inference (0.01 mAh)
15% require GPS sensing (0.39 mAh)
Average: 0.85 × 0.01 + 0.15 × 0.39 ≈ 0.067 mAh per request
Battery life: 200 mAh ÷ 0.067 mAh ≈ 2,985 hours ≈ 124 days!
12.9.5 Step 4: Battery-Aware Adaptation
| Battery Level | Strategy | GPS Frequency | Avg Current |
|---|---|---|---|
| 100-50% | Normal | Every 5 min | 0.40 mA |
| 50-20% | Conservative | Every 15 min | 0.15 mA |
| 20-15% | Emergency | Every 30 min | 0.08 mA |
| <15% | Critical | Every 60 min | 0.04 mA |
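The adaptation table maps directly to a threshold function. Band boundaries are assumed inclusive at the top of each band; the function name is illustrative.

```python
def gps_interval_min(battery_pct):
    """Return the GPS sampling interval (minutes) for a battery level (%)."""
    if battery_pct >= 50:
        return 5       # Normal
    if battery_pct >= 20:
        return 15      # Conservative
    if battery_pct >= 15:
        return 30      # Emergency
    return 60          # Critical
```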
Your Turn: Design inference rules for your sensor fusion application!
12.10 Case Study: Google’s Adaptive Offloading in Pixel Phones
Google’s Pixel phones implement a real-world version of the MAUI framework for computational photography. The “Night Sight” feature requires processing 15-30 images through a multi-frame alignment and HDR+ pipeline – computationally equivalent to approximately 60 seconds of sustained CPU work.
The Offloading Decision in Practice
| Condition | Processing Location | Why |
|---|---|---|
| Wi-Fi connected, charging | Cloud (Google Photos) | Zero energy penalty; cloud produces higher quality result |
| Wi-Fi connected, battery >50% | Hybrid (edge denoise + cloud enhance) | Balances quality with battery preservation |
| Cellular only, any battery | Fully local (Tensor NPU) | LTE upload of 30 raw images (~150 MB) costs 2,775 mJ vs 1,200 mJ local NPU processing |
| Airplane mode | Fully local (Tensor NPU) | No choice; queue cloud processing for later |
Measured Energy Comparison
Night Sight processing (15 images, 12 MP each):
- Local CPU (Cortex-A76): 740 mW × 8.2 s = 6,068 mJ
- Local NPU (Tensor G3): 280 mW × 3.1 s = 868 mJ (7× more efficient)
- Wi-Fi offload: 925 mW × 2.0 s (upload) + 55 mW × 1.5 s (wait) + 555 mW × 0.3 s (download) = 2,099 mJ
- LTE offload: 1,850 mW × 3.5 s (upload) + 740 mW × 5.0 s (tail) + 1,110 mW × 0.5 s (download) = 10,730 mJ
Key insight: The NPU (868 mJ) beats even Wi-Fi offloading (2,099 mJ) for this workload because the data transfer overhead exceeds the computational savings. This contradicts the naive assumption that “cloud is always more energy efficient.” Specialized local hardware has fundamentally changed the offloading calculus – the MAUI framework must account for heterogeneous local processors, not just CPU vs cloud.
When Cloud Still Wins
For Google Photos’ “Magic Eraser” feature (removing objects from images), the ML model requires 3.2 GB of weights that cannot fit on device. Here, offloading is mandatory regardless of energy cost. The decision becomes: offload now (if on Wi-Fi) or defer until Wi-Fi is available (if on cellular).
12.11 Visual Reference Gallery
MAUI Offloading Framework
Intelligent offloading frameworks like MAUI reduce energy by delegating computation-heavy tasks to the cloud when network conditions are favorable.
LEO Low-Energy Offloading
LEO (Low Energy Offloading) extends MAUI with dynamic adaptation. This visualization shows how the energy profiler monitors real-time consumption, the decision engine evaluates offload candidates, and the adaptive partitioner splits computation to minimize total energy under latency constraints.
Local Computation Performance
Local GPU processing can outperform cloud offloading for many IoT workloads, achieving 21× speedup over sequential CPU while avoiding network transmission energy costs.
Matching Quiz: Match Offloading Strategies to Scenarios
Ordering Quiz: Order the Offloading Decision Process
12.12 Summary
Code offloading and heterogeneous computing are essential for energy-efficient IoT systems:
Energy-Preserving Sensing Plans: Always choose the cheapest method to obtain context - cache, inference, then direct sensing
MAUI Framework: Compare local execution energy against network transmission + idle wait + receive energy
Network-Aware Decisions: Wi-Fi offloading often saves energy; LTE offloading often wastes energy due to tail power
Heterogeneous Cores: Match tasks to appropriate processors - DSP for audio, GPU for parallel, NPU for ML
Context-Aware Adaptation: Adjust offloading decisions based on battery level, network type, and latency requirements
The key insight is that offloading decisions are highly context-dependent. Simple rules like “always offload” or “always local” are suboptimal - intelligent systems adapt to current conditions.
Worked Example: Calculating Offloading Energy for Image Classification
A smart camera needs to classify images (dog/cat/person). Compare local NPU vs Wi-Fi cloud offloading.
Local NPU processing (Google Edge TPU):
- Inference time: 15 ms
- Power during inference: 2.5 W
- Idle power: 0.1 W
- Energy per classification: 2.5 W × 0.015 s = 37.5 mJ
Wi-Fi cloud offloading:
Image size: 200 KB (JPEG compressed)
Result size: 1 KB (JSON classification)
Wi-Fi upload: 200 KB at 5 Mbps = 320 ms at 250 mW = 80 mJ
Wi-Fi download: 1 KB at 10 Mbps = 0.8 ms at 150 mW = 0.12 mJ
Idle wait (cloud processing): 50 ms at 20 mW = 1 mJ
Total cloud energy: 80 + 0.12 + 1 ≈ 81.1 mJ
Conclusion: Local NPU wins (37.5 mJ vs 81.1 mJ ≈ 54% energy savings). Wi-Fi transmission overhead exceeds local inference cost.
When cloud wins: If classification requires a 500 MB model (won't fit on device), offloading is mandatory. Or if the device uses an older CPU instead of an NPU:
- CPU inference: 800 mW × 2 s = 1,600 mJ
- Cloud offload: ≈81.1 mJ (about 20× more efficient!)
This demonstrates MAUI’s context-aware principle: offloading decision depends on available local hardware AND network conditions.
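The worked example's arithmetic can be re-run as a short sketch, using the link rates and power figures given above (200 KB at 5 Mbps, 1 KB at 10 Mbps, 50 ms cloud wait).

```python
npu_mj = 2.5 * 1000 * 0.015              # 2.5 W for 15 ms -> 37.5 mJ
upload_s = (200 * 8) / 5000              # 200 KB at 5 Mbps -> 0.32 s
download_s = (1 * 8) / 10000             # 1 KB at 10 Mbps -> 0.0008 s
cloud_mj = 250 * upload_s + 150 * download_s + 20 * 0.050  # TX + RX + idle wait
# cloud_mj is ~81.1 mJ, so the local NPU wins for this workload
```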
Decision Framework: Local vs Cloud Processing for Different IoT Workloads
| Task | Data Size | Computation | Network | Battery | Recommendation | Energy Savings |
|---|---|---|---|---|---|---|
| Face detection (embedded NPU) | 100 KB | 20 ms | Wi-Fi | Any | Local NPU | 3× vs cloud |
| Face detection (CPU only) | 100 KB | 3 sec | Wi-Fi | >50% | Cloud | 5× vs local CPU |
| Voice recognition (keyword spotting) | 5 KB | 10 ms | Any | Any | Local DSP | 10× vs cloud |
| Voice recognition (full transcription) | 500 KB | 5 sec | Wi-Fi | >30% | Cloud | 2× vs local |
| Sensor data ML (simple model) | 1 KB | 5 ms | LTE | Any | Local | 50× vs LTE tail |
| Video analytics (complex model) | 5 MB | 10 sec | Wi-Fi | >70% | Cloud | Only option (model size) |
Decision criteria (MAUI framework):
Model fits on device? → If NO, must offload
Network is Wi-Fi? → If YES, offload heavy computation; if LTE, process locally
Battery <20%? → Always process locally (conserve energy)
Specialized hardware available (NPU/DSP)? → Process locally (10-100× faster)
Latency critical (<100 ms)? → Offload if Wi-Fi, local if LTE (tail latency)
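The five criteria above can be sketched as an ordered rule chain. Parameter names are assumptions, and the latency criterion is folded into the network check for brevity; this is a sketch of the decision logic, not the MAUI implementation.

```python
def offload_decision(model_fits_on_device, network, battery_pct,
                     has_npu_or_dsp, heavy_computation):
    if not model_fits_on_device:
        return "offload"      # criterion 1: no local option, must offload
    if battery_pct < 20:
        return "local"        # criterion 3: conserve remaining energy
    if has_npu_or_dsp:
        return "local"        # criterion 4: specialized hardware is cheapest
    if network == "wifi" and heavy_computation:
        return "offload"      # criterion 2: Wi-Fi + heavy work favors cloud
    return "local"            # LTE or light computation: stay local
```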
Best For Your Project:
Real-time object detection on drone → Local (edge TPU, latency critical)
Batch image classification at home → Cloud (Wi-Fi available, no time pressure)
Wake word detection on wearable → Local (DSP ultra-low power)
Natural language queries → Cloud (models too large for edge)
Common Mistake: Forgetting GPU Power Consumption When Using Heterogeneous Cores
What they do wrong: Developers optimize a computer vision task to run on mobile GPU, achieving 5× speedup over CPU. They assume battery life improves proportionally: “5× faster means 5× less energy!”
Why it fails: GPUs consume 2-5× more power than CPUs even when delivering speedup. The energy equation is:
Energy = Power × Time
If GPU cuts time by 5× but uses 3× power, energy only improves 1.67× (not 5×).
Real calculation (mobile image processing):
- CPU: 800 mW × 1,000 ms = 800 mJ
- GPU: 2,400 mW × 200 ms (5× faster) = 480 mJ
- Savings: 40% (not 80% as naively expected)

When GPU hurts energy: If the task is small (CPU takes 50 ms), GPU overhead dominates:
- CPU: 800 mW × 50 ms = 40 mJ
- GPU: 2,400 mW × 20 ms (2.5× speedup) + 1,200 mW × 15 ms (init) = 48 + 18 = 66 mJ (worse!)

Correct approach: Profile actual power during execution, not just time. Use tools like:
- Android Battery Historian
- iOS Instruments (Energy Log)
- Embedded: INA219 power monitor on VDD rail
Real-world example: A fitness app offloaded step counting to mobile GPU, expecting 10× battery improvement from 10× speedup. Actual battery life: 20% worse! GPU consumed 4.2 W during active processing vs 1.8 W for CPU, and the 30 ms task ran every second — GPU initialization overhead (50 ms at 2 W) consumed more energy than the computation saved. Switching back to CPU with NEON SIMD instructions delivered 3× speedup at 1.2× power = 2.5× net energy savings. Lesson: Speedup ≠ energy savings. Always measure power, not just time.
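Both calculations above reduce to Energy = Power × Time; the sketch below reproduces them, including the GPU initialization term that flips the small-task result.

```python
def energy_mj(power_mw, time_ms):
    """Energy in mJ from power in mW and time in ms."""
    return power_mw * time_ms / 1000

# Large task: GPU wins despite ~3x power (figures from the example above)
cpu_big = energy_mj(800, 1000)                          # 800 mJ
gpu_big = energy_mj(2400, 200)                          # 480 mJ

# Small task: 15 ms of initialization overhead makes the GPU worse
cpu_small = energy_mj(800, 50)                          # 40 mJ
gpu_small = energy_mj(2400, 20) + energy_mj(1200, 15)   # 48 + 18 = 66 mJ
```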
Common Pitfalls
1. Assuming Cloud Offloading Always Saves Energy
Many engineers offload computation assuming the cloud is “free” energetically. But radio transmission — especially over cellular or at low signal strength — can consume 10–100× more energy than the computation itself. Always compute the breakeven point before deciding.
2. Ignoring GPU Initialization Overhead
Routing a small task to the GPU may actually increase energy consumption because GPU initialization (50–100 ms at 2–3 W) exceeds the energy savings from faster execution. Only use GPU/NPU acceleration for tasks that take more than a few hundred milliseconds on the CPU.
3. Using Transmission Energy at Full Signal Strength Only
Radio energy increases dramatically at low signal strength as the transmitter boosts power. A cellular link at -100 dBm can consume 10× more energy than at -80 dBm. Always measure transmission energy under real deployment signal conditions.
4. Forgetting Partial Offloading Options
Offloading is not binary (local vs full cloud). Partial offloading — preprocessing on device to reduce data size, then sending compressed results — often provides the best energy tradeoff. Consider edge nodes as intermediate offload targets when cloud latency or transmission cost is too high.
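The partial-offloading tradeoff can be compared with two small helpers. All figures here are illustrative assumptions: 1 MB of raw data at the chapter's ~1 mJ/KB cellular cost, versus a hypothetical 10× on-device compression that itself costs 150 mJ.

```python
def full_offload_mj(data_kb, tx_mj_per_kb):
    """Energy to transmit the raw payload."""
    return data_kb * tx_mj_per_kb

def partial_offload_mj(data_kb, compress_ratio, preprocess_mj, tx_mj_per_kb):
    """Energy to preprocess locally, then transmit the reduced payload."""
    return preprocess_mj + (data_kb / compress_ratio) * tx_mj_per_kb

full = full_offload_mj(1000, 1.0)                 # 1000 mJ: 1 MB raw over cellular
partial = partial_offload_mj(1000, 10, 150, 1.0)  # 150 + 100 = 250 mJ
```

Under these assumptions, spending 150 mJ on compression saves 750 mJ overall, which is why partial offloading often sits in the sweet spot between pure-local and pure-cloud.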