Quantify energy costs for digital operations (computation, memory access)
Compare energy requirements across different operations
Apply the “million-to-one” rule for IoT energy optimization
Make informed decisions about local processing vs. transmission
Explain the battery technology gap and its implications
Key Concepts
Energy Cost Hierarchy: Radio transmission (1–100 mJ per packet) > Flash write (1–10 µJ) > CPU instruction (~0.1–10 nJ) and RAM read/write (0.01–1 nJ), whose ranges overlap; by these figures a radio transmission costs at least 100,000× more than a single instruction
Battery Technology Gap: Battery energy density improves ~5–8% per year while computing capability grows ~40% per year; IoT devices cannot simply wait for better batteries
Million-to-One Rule: Transmitting one byte by radio costs approximately the same energy as executing 1 million CPU instructions, so computation is effectively free compared to transmission
Floating-Point Cost: Software floating-point emulation on MCUs without FPU takes 10–100× more cycles than integer arithmetic; always use integer or fixed-point math on such devices
Flash Write Energy: Writing to internal flash typically costs 1–10 µJ per byte, orders of magnitude more than reading; minimize write frequency and batch writes
Connection Overhead: Radio protocols with connection setup (BLE, Wi-Fi, MQTT over TCP) pay fixed overhead per session; amortize overhead by transmitting larger batches of data less frequently
Transmission Energy Formula: E_tx = data_size_bytes × energy_per_byte, where energy_per_byte ranges from 0.1 µJ (BLE) to 1 mJ (cellular at poor signal)
In 60 Seconds
Every IoT operation has an energy price tag: a radio transmission (1–100 mJ) costs at least 100,000× more than a single CPU instruction (~0.1–10 nJ), making communication the dominant energy consumer in most IoT devices and the first target for optimization.
For Beginners: Energy Cost of Common Operations
Energy and power management determines how long your IoT device can operate between battery changes or charges. Think of packing for a camping trip with limited battery packs – every bit of power must be used wisely. Since many IoT sensors need to run for months or years unattended, power management is often the single most important engineering decision.
Sensor Squad: The Million-to-One Rule!
“Here is a mind-blowing fact,” said Max the Microcontroller. “A single 32-bit addition takes about 1 picojoule of energy. Sending one byte over Bluetooth takes about 1 microjoule. That is a MILLION times more! The ‘million-to-one rule’ means it almost always costs less energy to compute locally than to transmit.”
Sammy the Sensor put it in perspective: “I can do a million math operations for the same energy cost as sending one short message. So if I can process my data locally and only send a tiny summary, I save enormous amounts of energy. Instead of sending 1,000 raw readings, Max calculates the average and sends one number.”
Bella the Battery ranked the operations from cheapest to most expensive: “Computation is almost free. Reading from flash memory costs 100 times more. Reading a sensor costs 1,000 times more. And wireless transmission costs a MILLION times more. This hierarchy tells you exactly where to focus your energy optimization!” Lila the LED concluded, “Think before you transmit. Every bit sent over the air is precious!”
5.2 Energy Cost of Common Operations
Understanding the energy cost of individual operations helps prioritize optimization efforts. The differences span multiple orders of magnitude.
5.2.1 Digital Operations Energy Hierarchy
Figure 5.1: Energy hierarchy showing orders of magnitude difference between operation types
5.2.2 Detailed Operation Costs
| Operation | Energy | Relative Cost |
|---|---|---|
| **Computation** | | |
| 32-bit integer add | 0.1 pJ | 1× |
| 32-bit integer multiply | 3 pJ | 30× |
| 32-bit floating point op | 10 pJ | 100× |
| AES-128 encryption (16 bytes) | 100 nJ | 1,000,000× |
| **Memory Access** | | |
| SRAM read (32-bit) | 10 pJ | 100× |
| Flash read (32-bit) | 100 pJ | 1,000× |
| DRAM access (32-bit) | 3 nJ | 30,000× |
| Flash write (32-bit) | 10 nJ | 100,000× |
| **I/O and Sensing** | | |
| GPIO toggle | 5 nJ | 50,000× |
| 12-bit ADC conversion | 500 nJ | 5,000,000× |
| Temperature sensor read | 10 µJ | 100,000,000× |
| **Wireless Communication** | | |
| BLE advertising packet | 30 µJ | 300,000,000× |
| LoRa packet (SF7) | 2 mJ | 20,000,000,000× |
| Wi-Fi packet (with connect) | 50 mJ | 500,000,000,000× |
5.2.3 The Million-to-One Rule
Key Insight: Transmitting 1 byte over wireless costs approximately the same energy as 1 million 32-bit CPU operations.
This fundamental ratio has profound implications:
Always process locally when possible - Filtering, aggregation, and compression before transmission saves tremendous energy
Minimize payload size - Every byte costs as much as millions of operations
Batch transmissions - Amortize connection overhead over multiple readings
Use delta encoding - Send only changes, not absolute values
Consider compression - Even expensive algorithms save energy if they reduce TX bytes significantly
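As an illustration of the delta-encoding item above, here is a minimal sketch in C. The function name `delta_encode` and the byte layout (first reading as two little-endian bytes, then one signed 8-bit delta per reading) are assumptions made for this example, not a format defined in this chapter.

```c
#include <stddef.h>
#include <stdint.h>

/* Delta-encode int16 readings into a byte buffer.
 * Layout (an assumption of this sketch): the first reading as 2 bytes,
 * little-endian, then one signed 8-bit delta per subsequent reading.
 * Returns the number of bytes written, or 0 if the input is empty, the
 * buffer is too small, or a delta does not fit in 8 bits. */
size_t delta_encode(const int16_t *readings, size_t n,
                    uint8_t *out, size_t out_cap)
{
    if (n == 0 || out_cap < n + 1) {
        return 0;
    }
    uint16_t first = (uint16_t)readings[0];
    out[0] = (uint8_t)(first & 0xFFu);
    out[1] = (uint8_t)(first >> 8);
    for (size_t i = 1; i < n; i++) {
        int32_t delta = (int32_t)readings[i] - (int32_t)readings[i - 1];
        if (delta < INT8_MIN || delta > INT8_MAX) {
            return 0;  /* caller should fall back to absolute values */
        }
        out[i + 1] = (uint8_t)(int8_t)delta;
    }
    return n + 1;
}
```

For slowly changing signals such as temperature, this roughly halves the payload (n + 1 bytes instead of 2n), at a cost of one subtraction and one range check per reading.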
5.2.4 Communication vs. Computation Trade-off
Figure 5.2: Comparison showing how local processing dramatically reduces total energy consumption
Example: Smart Agriculture Sensor
| Approach | Processing | Transmission | Total Energy |
|---|---|---|---|
| Raw data (60 readings/hour) | 0 | 60 × 50 µJ = 3 mJ | 3 mJ |
| Local average (1 value/hour) | 60 × 10 pJ = 0.6 nJ | 1 × 50 µJ = 50 µJ | 50 µJ |
| Energy saved | | | 60× |
Example: Motion Detection Camera
| Approach | Processing | Transmission | Average Power |
|---|---|---|---|
| Stream all frames (10 fps) | 0 | 10 × 100 KB × 8 = 8 Mbps | 800 mW |
| Edge detection + TX changes | 10 ms × 50 mW = 0.5 mJ/frame | 1 KB when motion | 5.5 mW |
| Energy saved | | | 145× |
5.3 The Battery Technology Gap in Numbers
Battery energy density improves slowly compared to computing:
| Year | Transistor Density | Battery Energy Density | Ratio |
|---|---|---|---|
| 1990 | 1× | 1× | 1:1 |
| 2000 | 100× | 1.5× | 67:1 |
| 2010 | 10,000× | 2× | 5,000:1 |
| 2020 | 1,000,000× | 3× | 333,333:1 |
Implication: Software cannot assume energy abundance. Every new feature must be evaluated against its energy cost.
Interactive Calculator: Local Processing vs Transmission
Adjust the parameters below to see the energy savings from local processing:
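    viewof num_readings = Inputs.range([10, 1000], {value: 100, step: 10, label: "Number of readings per hour"})

    viewof tx_energy_per_reading = Inputs.range([10, 200], {value: 50, step: 5, label: "TX energy per reading (µJ)"})

    viewof compute_energy_per_op = Inputs.range([1, 100], {value: 10, step: 1, label: "Computation energy per operation (pJ)"})

    // Reconstructed cell: the original definition of energy_calc is not shown here,
    // so these formulas are an assumption inferred from the fields the template reads.
    energy_calc = {
      const optionA_uJ = num_readings * tx_energy_per_reading;      // transmit every reading
      const computation_pJ = num_readings * compute_energy_per_op;  // one operation per reading
      const computation_uJ = computation_pJ / 1e6;
      const optionB_uJ = computation_uJ + tx_energy_per_reading;    // average locally, transmit once
      const savings_percent = (1 - optionB_uJ / optionA_uJ) * 100;
      const ratio = Math.round(tx_energy_per_reading * 1e6 / computation_pJ);
      return {optionA_uJ, computation_pJ, computation_uJ, optionB_uJ, savings_percent, ratio};
    }

    html`<div style="background: var(--bs-light, #f8f9fa); padding: 1rem; border-radius: 8px; border-left: 4px solid #3498DB; margin-top: 0.5rem;">
      <p><strong>Option A (transmit all):</strong> ${energy_calc.optionA_uJ.toFixed(0)} µJ per hour = ${(energy_calc.optionA_uJ/1000).toFixed(2)} mJ per hour</p>
      <p><strong>Option B (compute average locally):</strong></p>
      <ul style="margin-left: 1.5rem;">
        <li>Computation: ${energy_calc.computation_pJ.toFixed(0)} pJ = ${energy_calc.computation_uJ.toFixed(6)} µJ</li>
        <li>Transmission: ${tx_energy_per_reading} µJ</li>
        <li><strong>Total:</strong> ${energy_calc.optionB_uJ.toFixed(3)} µJ per hour</li>
      </ul>
      <p style="font-size: 1.1em; color: #16A085; font-weight: bold;">Energy Savings: ${energy_calc.savings_percent.toFixed(1)}%</p>
      <p style="font-size: 0.9em; color: #7F8C8D;">Computation is ${energy_calc.ratio.toLocaleString()}× cheaper than one transmission!</p>
    </div>`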
Key Insight: The million-to-one rule quantifies why local processing beats transmission. Computation energy is negligible compared to transmission costs. Always process locally before transmitting.
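The comparison behind the two options can also be sketched in C for use on a device or in a host-side budget tool. The struct and function names are hypothetical, and the model assumes one operation per reading for the local average, matching the smart agriculture example earlier in this section.

```c
#include <stdint.h>

/* Energy per hour for the two options, in picojoules so that
 * integer arithmetic is sufficient. */
typedef struct {
    uint64_t transmit_all_pj;   /* Option A: send every reading */
    uint64_t local_average_pj;  /* Option B: average locally, send one value */
} energy_comparison_t;

energy_comparison_t compare_options(uint32_t num_readings,
                                    uint32_t tx_uj_per_reading,
                                    uint32_t compute_pj_per_op)
{
    const uint64_t PJ_PER_UJ = 1000000ULL;
    energy_comparison_t r;
    /* Option A: one transmission per reading. */
    r.transmit_all_pj = (uint64_t)num_readings * tx_uj_per_reading * PJ_PER_UJ;
    /* Option B: one operation per reading (assumption), then one transmission. */
    r.local_average_pj = (uint64_t)num_readings * compute_pj_per_op
                       + (uint64_t)tx_uj_per_reading * PJ_PER_UJ;
    return r;
}
```

Calling `compare_options(60, 50, 10)` gives 3 mJ for Option A and just over 50 µJ for Option B; the computation term (600 pJ) is too small to affect the result.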
5.3.1 Why This Matters for IoT
The power gap between traditional computing and IoT is dramatic:
1990s laptop: 10W power budget — plenty for full functionality
2020s IoT sensor: 10µW average budget — every operation counts
Perspective: One hour of IoT sensor operation (10 µW × 3,600 s = 36 mJ) uses far less energy than one second of 1990s laptop operation (10 W × 1 s = 10 J).
5.4 Energy-Efficient Coding Practices
5.4.1 Compiler Optimization Flags
| Flag | Description | Energy Impact |
|---|---|---|
| -O0 | No optimization | Baseline (slowest, most energy) |
| -O1 | Basic optimization | 20-40% reduction |
| -O2 | Standard optimization | 40-60% reduction |
| -O3 | Aggressive optimization | 50-70% reduction (larger code) |
| -Os | Optimize for size | 40-60% reduction, better cache usage |
| -Ofast | Fastest execution | 60-80% reduction (may break IEEE floats) |
Recommendation for IoT: Use -Os when code size matters (flash-constrained), -O2 otherwise. Avoid -O3 on memory-limited devices as larger code may cause cache thrashing.
5.4.2 Algorithm Selection
    // Helper used by bubble_sort: swap two ints in place
    static void swap(int *a, int *b) {
        int tmp = *a;
        *a = *b;
        *b = tmp;
    }

    // BAD: O(n²) bubble sort - energy proportional to n²
    void bubble_sort(int arr[], int n) {
        for (int i = 0; i < n - 1; i++)
            for (int j = 0; j < n - i - 1; j++)
                if (arr[j] > arr[j + 1])
                    swap(&arr[j], &arr[j + 1]);
    }

    // BETTER: O(n log n) quicksort - energy proportional to n log n
    // For n=1000: bubble=1,000,000 ops, quicksort=10,000 ops = 100× savings

    // BEST for IoT: Don't sort at all - send summary statistics
    // Max, min, average require O(n) = 1,000 ops = 1000× savings vs bubble
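The "don't sort at all" option can be sketched as a single pass that collects the minimum, maximum, and mean. The `summary_t` type and `summarize` function are hypothetical names for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Summary statistics for a batch of readings. */
typedef struct {
    int16_t min;
    int16_t max;
    int16_t mean;
} summary_t;

/* Single O(n) pass: no sorting and no floating point.
 * Returns 0 on success, -1 if there are no readings. */
int summarize(const int16_t *readings, size_t n, summary_t *out)
{
    if (n == 0) {
        return -1;
    }
    int16_t lo = readings[0];
    int16_t hi = readings[0];
    int64_t sum = 0;  /* wide accumulator so large batches cannot overflow */
    for (size_t i = 0; i < n; i++) {
        if (readings[i] < lo) lo = readings[i];
        if (readings[i] > hi) hi = readings[i];
        sum += readings[i];
    }
    out->min = lo;
    out->max = hi;
    out->mean = (int16_t)(sum / (int64_t)n);  /* truncates toward zero */
    return 0;
}
```

The three resulting values fit in 6 bytes, regardless of how many readings were collected. On a small MCU with short batches, a 32-bit accumulator would likely be enough and cheaper than the 64-bit one used here.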
5.4.3 Data Type Selection
| Type | Size | Operations Energy |
|---|---|---|
| uint8_t | 1 byte | 1× (baseline) |
| uint16_t | 2 bytes | 1.2-1.5× |
| uint32_t | 4 bytes | 1.5-2× |
| float | 4 bytes | 10-20× |
| double | 8 bytes | 20-40× |
Recommendation: Use smallest sufficient data type. Fixed-point arithmetic instead of floating-point where precision permits.
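To make the fixed-point recommendation concrete, here is a small sketch of Q8.8 arithmetic (8 integer bits, 8 fractional bits) in C. The type and function names are assumptions for this example; a real project would pick the format to match its sensor range and precision.

```c
#include <stdint.h>

/* Q8.8 fixed point: the stored integer is the real value times 256. */
typedef int16_t q8_8_t;

#define Q8_8_ONE 256

/* Convert a whole number (-128..127) to Q8.8. */
q8_8_t q8_8_from_int(int16_t whole)
{
    return (q8_8_t)(whole * Q8_8_ONE);
}

/* Multiply two Q8.8 values. The product is widened to 32 bits so the
 * intermediate result cannot overflow, then rescaled. Division by a
 * constant power of two is used instead of >> so that negative values
 * behave portably (the result truncates toward zero). */
q8_8_t q8_8_mul(q8_8_t a, q8_8_t b)
{
    return (q8_8_t)(((int32_t)a * (int32_t)b) / Q8_8_ONE);
}
```

For example, `q8_8_mul(q8_8_from_int(3), 128)` multiplies 3.0 by 0.5 and returns 384, which is 1.5 in Q8.8, using only integer multiply and divide.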
5.4.4 Worked Example: Energy Cost of Different Averaging Methods
Scenario: Calculate average of 100 temperature readings
Method 1: Floating Point
    float sum = 0.0f;
    for (int i = 0; i < 100; i++) {
        sum += (float)readings[i];  // 100 float adds = 1000 pJ
    }
    float avg = sum / 100.0f;       // 1 float divide = 20 pJ
    // Total: ~1020 pJ
Method 2: Integer with Final Division
    int32_t sum = 0;
    for (int i = 0; i < 100; i++) {
        sum += readings[i];         // 100 int adds = 10 pJ
    }
    int16_t avg = sum / 100;        // 1 int divide = 10 pJ
    // Total: ~20 pJ (50× more efficient!)
Method 3: Bit-Shift Division (power of 2)
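    int32_t sum = 0;
    for (int i = 0; i < 128; i++) { // Use 128 samples
        sum += readings[i];         // 128 int adds = 13 pJ
    }
    // Shift = 1 pJ. Assumes a non-negative sum: right-shifting a negative
    // signed value is implementation-defined in C.
    int16_t avg = sum >> 7;
    // Total: ~14 pJ (70× more efficient than float!)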
5.4.5 Worked Example: Wireless Camera Power Analysis
Scenario: You are designing a wildlife camera that must operate for 6 months on 4× D batteries (8000 mAh total). The camera should capture and transmit images when motion is detected.
Given:
Camera module: OV2640 (50mA active, 20µA standby)
PIR motion sensor: 10µA continuous
MCU: ESP32 (80mA active with image processing, 10µA deep sleep)
Wi-Fi transmission: 200mA average during TX
Expected motion events: 20 per day average
Image size: 640×480 JPEG = 50KB compressed
Analysis:
Continuous loads (per day):
PIR sensor: 10µA × 24h = 0.24 mAh
ESP32 deep sleep: 10µA × 24h = 0.24 mAh
Camera standby: 20µA × 24h = 0.48 mAh
Daily base: 0.96 mAh
Per-event energy (20 events/day):
Wake from sleep: 80mA × 0.1s = 8 mAs = 0.0022 mAh
Camera capture: 50mA × 0.5s = 25 mAs = 0.0069 mAh
Image processing: 80mA × 0.3s = 24 mAs = 0.0067 mAh
Wi-Fi connect: 160mA × 3s = 480 mAs = 0.133 mAh
Wi-Fi transmit: 200mA × 4s (50KB @ 100kbps) = 800 mAs = 0.222 mAh
Per event: 1,337 mAs = 0.371 mAh
Daily events: 20 × 0.371 = 7.42 mAh
Total daily consumption:
Base load: 0.96 mAh
Motion events: 7.42 mAh
Total: 8.38 mAh/day
Battery life calculation:
Available: 8000 mAh × 0.7 (efficiency) = 5,600 mAh
Life: 5,600 / 8.38 = 668 days = 22 months
Result: Design exceeds the 6-month target by a factor of about 3.7 (22 months vs. 6).
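The budget above can be reproduced with a short C sketch, which makes it easy to rerun the analysis when a current or duration changes. The `phase_t` type and both function names are hypothetical helpers for this illustration.

```c
/* One phase of a wake-up event: current draw and how long it lasts. */
typedef struct {
    double current_ma;
    double duration_s;
} phase_t;

/* Charge used by one event in mAh: sum of mA × s, divided by 3600. */
double event_charge_mah(const phase_t *phases, int count)
{
    double mas = 0.0;
    for (int i = 0; i < count; i++) {
        mas += phases[i].current_ma * phases[i].duration_s;
    }
    return mas / 3600.0;
}

/* Battery life in days, given usable capacity and daily consumption. */
double battery_life_days(double capacity_mah, double usable_fraction,
                         double base_mah_per_day, double event_mah,
                         double events_per_day)
{
    double daily_mah = base_mah_per_day + event_mah * events_per_day;
    return capacity_mah * usable_fraction / daily_mah;
}
```

Feeding in the five phases listed above (wake, capture, processing, Wi-Fi connect, Wi-Fi transmit) with a 0.96 mAh/day base load, 20 events per day, and 8000 mAh at 70% usable capacity gives roughly 668 days, matching the hand calculation.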
Optimization opportunities if battery size needs reduction:
Use LoRa instead of Wi-Fi (10× less TX energy, but slower)
Reduce image resolution (320×240 = 4× smaller)
Transmit only when motion is “interesting” (AI classification)
Use edge computing to detect false positives (leaves, shadows)
Autonomy required: 7 days without sun (cloudy winter)
Your Task: Size the solar panel and battery
Step 1: Calculate daily energy consumption
MCU active (reading + TX): 50mA × 2s × 288 readings = 28,800 mAs = 8 mAh
MCU deep sleep: 10µA × 86,400s = 864 mAs = 0.24 mAh
LoRa TX: 40mA × 0.5s × 288 = 5,760 mAs = 1.6 mAh
Sensors: 5mA × 0.5s × 288 = 720 mAs = 0.2 mAh
Daily total: _______ mAh
Step 2: Calculate battery size for 7-day autonomy
7-day energy: _______ mAh
With 80% DoD: _______ mAh battery required
Step 3: Calculate solar panel size
Seattle winter: ~2 hours equivalent full sun per day
Daily harvest needed: _______ mAh × 1.5 (margin) = _______ mAh
Panel current required: _______ mAh / 2h = _______ mA
At 6V panel: _______ mA × 6V = _______ mW panel
5.5.2 Interactive Calculator: Multi-Protocol Power Comparison
Compare three connectivity options for an indoor asset tracker. Adjust transmissions per day to see battery life impact:
Key insight: For Wi-Fi, sending 1,000 bytes in one session is over 800× more efficient per byte than sending 1 byte (0.65 vs. 540 mAs/byte), because the fixed connection overhead is spread across the whole payload. Always batch!
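The batching effect follows from a simple model: each session pays a fixed connection overhead plus a per-byte cost. The sketch below uses that model; the function name is hypothetical, and the suggested inputs (540 mAs overhead, 0.11 mAs per byte) are assumptions chosen only because they reproduce the 540 and 0.65 mAs/byte figures quoted above.

```c
/* Charge per payload byte (mAs/byte) for one session that pays a fixed
 * connection overhead plus a per-byte transmission cost.
 * Returns 0 for an empty payload. */
double charge_per_byte(double overhead_mas, double per_byte_mas,
                       unsigned payload_bytes)
{
    if (payload_bytes == 0) {
        return 0.0;
    }
    return (overhead_mas + per_byte_mas * payload_bytes) / payload_bytes;
}
```

With those assumed inputs, `charge_per_byte(540.0, 0.11, 1)` is about 540 mAs/byte and `charge_per_byte(540.0, 0.11, 1000)` is 0.65 mAs/byte. The overhead term dominates until the payload reaches thousands of bytes.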
Worked Example: Local Processing vs Cloud Offloading Energy Cost
Scenario: Smart doorbell with motion detection. Two approaches: stream raw video to cloud OR process locally with edge AI.
Cloud approach:
Camera captures 10 fps @ 640×480
H.264 compression: 200 kbps stream
Wi-Fi transmission: 150 mA continuous
Daily motion events: 50 (5 minutes each)
Energy per event: 150 mA × 300s = 45,000 mAs = 12.5 mAh
Daily: 50 × 12.5 = 625 mAh
Battery life (2,000 mAh): 3.2 days
Edge AI approach:
Camera captures frames
Edge TPU processes (50 mW for 100ms)
Transmit only on person detected (5% of time)
Send 1 frame when motion detected
Motion detection: 50 mW × 0.1s × 50 events = 250 mJ, counted here as 250 mAs = 0.07 mAh (a conservative upper bound; at a 3.3 V supply, 250 mJ is about 76 mAs)
Transmission (when needed): 150 mA × 2s × 2.5 events/day = 750 mAs = 0.21 mAh
Sleep: 10 µA × 86,400s = 864 mAs = 0.24 mAh
Daily: 0.52 mAh
Battery life: 2,000 ÷ 0.52 = 3,846 days = 10.5 years!
Energy savings: 1,200× improvement
Result: Edge processing uses 1,200× less energy by transmitting only when needed, extending battery life from days to years.
Decision Framework: When to Process Locally vs Offload
| Factor | Process Locally | Offload to Cloud | Criteria |
|---|---|---|---|
| Payload size | >10 KB raw data | <1 KB processed | Can you compress >10×? |
| Frequency | >10/minute | <1/hour | High frequency → local |
| Latency | <100 ms required | >1 second acceptable | Real-time → local |
| Processing cost | <1000 ops per byte saved | Complex ML models | Local if compression pays off |
| Network type | Cellular (expensive) | Wi-Fi available | Cellular → minimize TX |
Decision: If (local_processing_energy + reduced_TX_energy) < full_TX_energy, process locally. Usually breaks even at 10:1 compression ratio.
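The decision rule can be written as a small predicate. This is a sketch under simple assumptions: transmission energy scales linearly with payload size, and connection overhead is ignored. The function name and parameters are hypothetical.

```c
#include <stdbool.h>

/* Returns true when processing locally and sending the reduced payload
 * costs less energy than sending the raw payload.
 * Assumes TX energy is proportional to payload size. */
bool should_process_locally(double raw_bytes, double compression_ratio,
                            double tx_uj_per_byte, double processing_uj)
{
    if (compression_ratio <= 0.0) {
        return false;  /* invalid ratio: keep the simple path */
    }
    double full_tx_uj = raw_bytes * tx_uj_per_byte;
    double reduced_tx_uj = full_tx_uj / compression_ratio;
    return processing_uj + reduced_tx_uj < full_tx_uj;
}
```

For 1,000 raw bytes at 1 µJ per byte, a 10:1 reduction costing 0.1 µJ of computation clearly pays off. The same computation with no reduction in size (ratio 1.0) does not.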
Common Mistake: Optimizing Algorithm Complexity While Ignoring Radio Energy
The Mistake: Spending weeks optimizing O(n²) → O(n log n) algorithm (saves 5 mJ), while transmitting full dataset wastes 50 mJ.
Impact: Algorithm optimization saves 5 mJ, but simple filtering before transmission saves 45 mJ (9× more).
Fix: Optimize in this order: 1) Reduce transmission (biggest win), 2) Minimize sleep current, 3) Optimize algorithms. Radio > Sleep > Compute in energy impact.
How It Works
The energy hierarchy reveals why radio dominates IoT power budgets:
Energy cost breakdown (typical ARM Cortex-M4 @ 80MHz, 3.3V):
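Why the gap exists: Radio physics sets a floor on transmit energy. The signal must arrive above the receiver's noise floor, and free-space path loss grows with distance², so roughly E_tx ∝ distance² × noise. (The Shannon limit separately bounds the minimum received energy per bit for a given noise level.) Meanwhile, the energy per transistor operation has shrunk with each process generation under Moore’s Law. Result: computation keeps getting cheaper, while radio energy stays constant or decreases only slowly.
Million-to-one rule origin: 1 byte over WiFi ≈ 500 µJ, while 1 million 32-bit adds ≈ 1,000,000 × 1 pJ = 1 µJ. One WiFi byte therefore costs 500–1,000× as much as a million adds, whereas a BLE byte (~1 µJ) costs about the same as a million adds. Rule of thumb: 1 byte TX costs at least 1 million operations. Implication: compressing 10:1 saves 90% of TX energy even if the compression costs 100,000 operations.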
Concept Relationships
Operation costs quantify the energy price of each design decision:
Justifies: The million-to-one rule explains why Low-Power Strategies prioritize reducing TX over optimizing computation
Measured By: Theoretical costs must be validated with Energy Measurement to account for real hardware inefficiencies
Informs: Protocol selection (LoRaWAN vs WiFi) based on energy per transmitted packet (roughly 2 mJ for LoRa vs 50 mJ for WiFi with connection setup, per the detailed operation costs table)
Enables: Edge processing decisions—if local compute costs <1% of TX savings, process locally
Hierarchy cascade: Radio (highest cost) → Flash writes → DRAM access → SRAM access → ALU operations (lowest cost). Optimize from top down: fix radio first (1000× impact), then memory (100× impact), finally computation (1× impact).
Option A: Transmit 400 bytes raw via LoRa SF7 (40mA for 200ms per 50-byte packet)
Option B: Run zlib compression (costs 50,000 CPU cycles), transmit compressed (achieves 4:1 ratio = 100 bytes)
Calculate:
Energy for Option A: 40mA × 3.3V × 1.6s (8 packets × 200ms) = ?
Energy for compression: 50,000 cycles ÷ 80MHz × 80mW = ?
Energy for Option B TX: 40mA × 3.3V × 0.4s (2 packets) = ?
Total B vs A savings?
Expected result: A = 211mJ. Compression = 0.05mJ. B TX = 53mJ. Total B = 53.05mJ. Savings: 75%. Compression overhead is <0.1% of TX savings.
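The expected result can be checked with two small helper functions. The names are hypothetical; the only physics used is that mA × V gives mW, and mW × s gives mJ.

```c
/* Energy in millijoules for a load drawing current_ma at voltage_v
 * for the given number of seconds (mA × V = mW; mW × s = mJ). */
double energy_mj(double current_ma, double voltage_v, double seconds)
{
    return current_ma * voltage_v * seconds;
}

/* Energy in millijoules to run `cycles` CPU cycles at clock_hz
 * while the MCU draws power_mw. */
double compute_energy_mj(double cycles, double clock_hz, double power_mw)
{
    return cycles / clock_hz * power_mw;
}
```

Option A is `energy_mj(40, 3.3, 1.6)`, about 211 mJ. Option B is `energy_mj(40, 3.3, 0.4)` plus `compute_energy_mj(50000, 80e6, 80)`, about 52.85 mJ in total, which gives the 75% saving stated above.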
5.10.2 Exercise 2: Protocol Energy Comparison
Given: Send 20 bytes of data every 5 minutes for 1 year.
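What to observe: BLE = 21.6mAs/day = 253 years. LoRa = 1152mAs/day = 4.7 years. WiFi = 140,000mAs/day = 0.04 years (14 days). For small payloads, BLE draws roughly 6,500× less charge per day than WiFi (140,000 ÷ 21.6) and about 50× less than LoRa. The 253-year figure is nominal: battery self-discharge would end the device's life long before then.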
5.11 Summary
Key takeaways from energy cost analysis:
The Million-to-One Rule: Wireless transmission costs ~1 million CPU operations per byte
Process Locally: Filtering, aggregation, and compression save significant energy
Battery Gap is Real: From 1990 to 2020, transistor density grew about 1,000,000× while battery energy density grew only about 3×
Batch Transmissions: Amortize connection overhead over multiple data points
Choose Algorithms Wisely: O(n²) vs O(n) makes enormous difference in energy consumption
Use Appropriate Data Types: Smaller types and fixed-point math save energy
Common Pitfalls
1. Transmitting Small Payloads Frequently Instead of Batching
Sending one byte per second instead of 60 bytes once per minute delivers the same 60 bytes but requires 60 radio transmissions instead of one, paying the connection overhead 60 times. Batch readings into larger payloads and transmit less frequently; for small payloads, transmission energy is almost always dominated by connection overhead.
2. Using Floating-Point Math on Cortex-M0/M3 Without FPU
Software floating-point emulation on MCUs without a hardware FPU takes 10–100 CPU cycles per operation. A sensor fusion algorithm with 1,000 float operations per second can consume up to 100,000 CPU cycles per second, which is 2.5–10% of the available cycles at 1–4 MHz. Use integer or fixed-point arithmetic instead.
3. Not Accounting for Retransmission Energy
Lossy wireless links (LoRa, BLE in interference) trigger automatic retransmissions. If each attempt fails independently with probability p, delivering one packet takes 1/(1 − p) attempts on average: a 10% loss rate adds about 11% extra transmission energy, and a 30% loss rate adds about 43%. Include an empirical retransmission overhead factor in energy budgets.
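A retransmission factor can be folded into an energy budget with one function. This sketch assumes independent losses and retries until delivery, so the expected number of attempts per delivered packet is 1/(1 − p); measured loss rates from the deployment should replace the assumed p. The function name is hypothetical.

```c
/* Expected energy to deliver one packet when each attempt is lost with
 * probability loss_rate and the sender retries until it succeeds.
 * Returns a negative value for loss rates outside [0, 1). */
double expected_tx_energy(double energy_per_attempt, double loss_rate)
{
    if (loss_rate < 0.0 || loss_rate >= 1.0) {
        return -1.0;
    }
    return energy_per_attempt / (1.0 - loss_rate);
}
```

For a 2 mJ packet, a 10% loss rate raises the expected cost to about 2.22 mJ, and a 30% loss rate raises it to about 2.86 mJ.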
4. Assuming Flash Writes Are Like RAM Writes
Internal flash must be erased (10–100 ms at high current, ~10 mA) before new data can be written over old data, and erasure works on whole pages: rewriting 1 byte in a 4 KB flash page erases the entire page first. Minimize flash writes by buffering in RAM and writing in full pages, or use external FRAM/EEPROM for frequent writes.