60  Edge Review: Storage and Economics

In 60 Seconds

Tiered storage (hot/warm/cold) combined with edge aggregation cuts IoT storage volume from 630 TB/year of raw data to just 18.2 GB/year, a 34,654x reduction. In this chapter's examples, edge computing deployments achieve a 130% five-year ROI with a 2.2-year payback at 100 sensors, and larger deployments reach 900% ROI.

60.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Design Tiered Storage Architectures: Apply hot/warm/cold storage strategies for IoT data
  • Calculate Storage Requirements: Estimate capacity needs across retention tiers
  • Perform ROI Analysis: Compute return on investment and payback period for edge deployments
  • Evaluate Total Cost of Ownership: Compare setup, operational, and replacement costs

Key Concepts

  • Edge storage cost model: The total cost of storing data at the edge, including hardware cost (flash memory per GB), power consumption of storage access, and maintenance cost of replacing failed storage media.
  • Cloud storage cost model: The total cost of storing data in the cloud, including per-GB storage fees, per-API-request fees, and data egress fees when downloading stored data for analysis.
  • Tiered storage strategy: Keeping recent high-resolution data on fast, expensive storage (edge flash, cloud SSD) and moving older data to slow, cheap storage (cloud object storage) based on access frequency and retention policy.
  • Store-and-forward buffer sizing: Calculating the required edge buffer size to survive the maximum expected cloud connectivity outage: buffer_size = data_rate × max_outage_duration × (1 + safety_margin).
  • Data retention policy: A defined schedule specifying how long data is retained at each tier, after which it is either deleted or moved to cheaper long-term archival storage.

60.2 Prerequisites

Before studying this chapter, complete:

Think of data storage like organizing photos:

  • Hot storage (Tier 1): Photos on your phone - instant access, limited space, expensive per photo
  • Warm storage (Tier 2): Photos on your computer - quick access, more space, moderate cost
  • Cold storage (Tier 3): Photos in cloud archive - slower access, unlimited space, cheapest per photo

IoT systems do the same thing:

  • Keep recent data (30 days) on fast, expensive storage for real-time queries
  • Keep trend data (1 year) on standard storage for analytics
  • Archive historical data forever on cheap cloud storage for compliance

60.3 Level 4 Data Accumulation

At Level 4, data in motion converts to data at rest. Key decisions include:

  • Does persistency require a file system, big data system, or relational database?
  • What data transformations are needed for the required storage system?
  • What retention policies balance cost, performance, and compliance?

60.3.1 Tiered Storage Architecture

| Tier | Retention | Data Type | Storage Type | Cost/GB/month |
| --- | --- | --- | --- | --- |
| Tier 1 - Hot | ~30 days | Raw edge records | Fast SSD, time-series DB | $0.20 |
| Tier 2 - Warm | ~1 year | Hourly aggregates | Standard disk | $0.05 |
| Tier 3 - Cold | Multi-year | Daily aggregates | Object storage (S3/Blob) | $0.01 |

60.4 Storage Requirement Calculations

Storage Comparison: Raw vs Tiered

If we stored raw sensor readings (with database indexing and metadata overhead):

1000 sensors x 100 Hz x 86,400 sec/day x 365 days x 200 bytes/reading
= 630,720,000,000,000 bytes = 630.72 TB/year

Each raw reading stored in a time-series database occupies approximately 200 bytes including the sensor value, timestamp, sensor ID, data quality flags, and database indexing overhead.

With tiered edge+cloud storage:

18.2 GB/year, or about 0.0029% of the raw volume

Reduction factor: approximately 34,654x

Tiered storage architecture achieves massive cost reduction through intelligent data retention policies:

\[\text{Storage Reduction} = \frac{\text{Raw Storage Volume}}{\text{Tiered Storage Volume}}\]

For a 1,000-sensor deployment with 100 Hz sampling:

Raw storage (no edge processing): \[1000 \times 100 \text{ Hz} \times 86{,}400 \text{ sec/day} \times 365 \text{ days} \times 200 \text{ bytes} = 630.72 \text{ TB/year}\]

Tiered storage (with edge aggregation):

  • Tier 1 (30 days hot): \(1000 \times 288 \text{ rec/day} \times 30 \times 200 \text{ B} = 1.73 \text{ GB}\)
  • Tier 2 (1 year warm): \(1000 \times 8{,}760 \text{ hr} \times 1{,}800 \text{ B} = 15.77 \text{ GB}\)
  • Tier 3 (1 year cold): \(1000 \times 365 \text{ days} \times 2{,}000 \text{ B} = 0.73 \text{ GB}\)
  • Total: 18.2 GB/year

\[\text{Reduction} = \frac{630{,}720 \text{ GB}}{18.2 \text{ GB}} = 34{,}654\times\]
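The arithmetic above can be reproduced with a short script. This is a minimal sketch that uses the record sizes and retention windows quoted in this section; the function and parameter names are illustrative only. Because it sums the tiers before rounding (18.226 GB rather than 18.2 GB), the ratio it prints is close to 34,600x instead of the 34,654x obtained from the rounded total.

```python
SECONDS_PER_DAY = 86_400


def raw_bytes_per_year(sensors, sample_hz, bytes_per_reading, days=365):
    """Bytes stored per year if every raw reading is kept."""
    return sensors * sample_hz * SECONDS_PER_DAY * days * bytes_per_reading


def tiered_bytes(sensors, edge_records_per_day=288, edge_record_bytes=200,
                 hot_days=30, hourly_bytes=1_800, daily_bytes=2_000, days=365):
    """Bytes held in each tier after one year, using the chapter's record sizes."""
    hot = sensors * edge_records_per_day * hot_days * edge_record_bytes
    warm = sensors * 24 * days * hourly_bytes   # one hourly aggregate per hour
    cold = sensors * days * daily_bytes         # one daily aggregate per day
    return {"hot": hot, "warm": warm, "cold": cold}


raw = raw_bytes_per_year(sensors=1000, sample_hz=100, bytes_per_reading=200)
tiers = tiered_bytes(sensors=1000)
total = sum(tiers.values())

print(f"raw:    {raw / 1e12:.2f} TB/year")
for name, size in tiers.items():
    print(f"{name:6s}  {size / 1e9:.2f} GB")
print(f"total:  {total / 1e9:.2f} GB, reduction {raw / total:,.0f}x")
```

Sizes use decimal units (1 GB = 10^9 bytes), matching the figures in the text.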

Storage cost at cloud pricing (prices are per GB per month, assuming a full year of data is retained):

  • Raw approach: \(630{,}720 \text{ GB} \times \$0.05/\text{GB/month} = \$31{,}536/\text{month} \approx \$378{,}432/\text{year}\)
  • Tiered approach: \((1.73 \times 0.20) + (15.77 \times 0.05) + (0.73 \times 0.01) = \$1.14/\text{month} \approx \$13.68/\text{year}\)
  • Savings: about $378,400/year (a reduction of more than 99.99%)

60.4.1 Monthly Storage Costs

| Tier | Capacity | Cost/GB/month | Monthly Cost |
| --- | --- | --- | --- |
| Tier 1 (SSD) | 1.73 GB | $0.20 | $0.35 |
| Tier 2 (HDD) | 15.77 GB | $0.05 | $0.79 |
| Tier 3 (S3) | 0.73 GB | $0.01 | $0.01 |
| Total | 18.23 GB | - | $1.15/month |

For 1000 sensors, storage costs approximately $14/year – negligible compared to data transfer costs.

60.4.2 Interactive: Tiered Storage Cost Explorer

Adjust the deployment parameters below to see how sensor count and sampling rate affect storage requirements and costs across tiers.
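The same exploration can be sketched in a few lines of Python. The function below is an illustrative stand-in, not the original tool: `explore` and its parameters are hypothetical names, the per-GB prices come from the tier table in section 60.3.1, and the record sizes come from section 60.4. It assumes the edge always emits one record every 5 minutes, so the sampling rate changes the raw volume but not the tiered volume.

```python
# Prices per GB per month, from the tier table in section 60.3.1.
PRICE_PER_GB_MONTH = {"hot": 0.20, "warm": 0.05, "cold": 0.01}


def explore(sensors, sample_hz, raw_bytes=200, edge_bytes=200,
            edge_records_per_day=288, hourly_bytes=1_800, daily_bytes=2_000):
    """Return (raw GB/year, per-tier GB, tiered monthly cost in dollars)."""
    raw_gb = sensors * sample_hz * 86_400 * 365 * raw_bytes / 1e9
    tiers_gb = {
        "hot": sensors * edge_records_per_day * 30 * edge_bytes / 1e9,
        "warm": sensors * 8_760 * hourly_bytes / 1e9,
        "cold": sensors * 365 * daily_bytes / 1e9,
    }
    monthly_cost = sum(tiers_gb[t] * PRICE_PER_GB_MONTH[t] for t in tiers_gb)
    return raw_gb, tiers_gb, monthly_cost


for sensors, hz in [(100, 10), (1000, 100), (5000, 100)]:
    raw_gb, tiers_gb, cost = explore(sensors, hz)
    print(f"{sensors:>5} sensors @ {hz:>3} Hz: raw {raw_gb / 1000:9.2f} TB/yr, "
          f"tiered {sum(tiers_gb.values()):6.2f} GB, ${cost:.2f}/month")
```

Tiered cost grows linearly with sensor count, while raw volume grows with both sensor count and sampling rate.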

60.5 ROI and Payback Analysis

60.5.1 Total Cost of Ownership Framework

ROI Drivers for Edge Computing

| Category | Without Edge | With Edge | Annual Savings |
| --- | --- | --- | --- |
| Bandwidth | $69/day x 365 | $0.005/day x 365 | $25,000 |
| Cloud Processing | Full cloud compute | 100x less | $15,000 |
| Latency Benefits | Slow response | Real-time | $20,000 |
| Maintenance | Manual inspections | Predictive alerts | $10,000 |
| Battery Replacement | Frequent changes | Deep sleep | $5,000 |
| Total | - | - | $75,000/year |

60.5.2 Scaling ROI with Deployment Size

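| Deployment Size | Setup Cost | Annual Savings | 5-Year ROI | Payback |
| --- | --- | --- | --- | --- |
| 100 sensors | $10,000 | $4,600 | 130% | 2.2 years |
| 500 sensors | $25,000 | $40,000 | 700% | 0.6 years |
| 1000 sensors | $40,000 | $80,000 | 900% | 0.5 years |

Here 5-year ROI = (5 x annual savings - setup cost) / setup cost, and payback = setup cost / annual savings.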

Larger deployments have better economics due to:

  • Amortized setup costs across more devices
  • Greater bandwidth savings
  • Centralized management efficiency

60.5.3 Interactive: Edge ROI Calculator

Adjust the cost and savings parameters to explore how different deployment scenarios affect ROI and payback period.
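A minimal sketch of such a calculator is shown below. It assumes the simple definitions that reproduce the 500- and 1000-sensor rows of the scaling table: ROI is net savings over the period divided by setup cost, and payback is setup cost divided by annual savings. Ongoing operating costs are not modelled, and the function names are illustrative.

```python
def roi_percent(setup_cost, annual_savings, years=5):
    """Simple ROI: net benefit over the period as a percentage of setup cost."""
    net_benefit = annual_savings * years - setup_cost
    return 100 * net_benefit / setup_cost


def payback_years(setup_cost, annual_savings):
    """Years until cumulative savings equal the setup cost."""
    return setup_cost / annual_savings


for label, setup, savings in [("500 sensors", 25_000, 40_000),
                              ("1000 sensors", 40_000, 80_000)]:
    print(f"{label}: ROI {roi_percent(setup, savings):.0f}%, "
          f"payback {payback_years(setup, savings):.1f} years")
```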

60.6 Retention Policy Implementation

60.6.1 Automatic Data Lifecycle

Tier 1 (Hot): Delete edge records older than 30 days
Tier 2 (Warm): Delete hourly aggregates older than 1 year
Tier 3 (Cold): Keep daily aggregates forever (archive to glacier after 1 year)
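
The three rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not a specific database's retention API: the tier names, the `lifecycle_action` function, and the string results are all hypothetical, and a real system would typically express the same policy through its storage engine's own lifecycle configuration.

```python
from datetime import datetime, timedelta, timezone

# Retention windows from the lifecycle rules above; None means "keep forever".
RETENTION = {
    "hot": timedelta(days=30),
    "warm": timedelta(days=365),
    "cold": None,
}
ARCHIVE_AFTER = timedelta(days=365)  # cold data moves to archival storage


def lifecycle_action(tier, record_time, now):
    """Return 'keep', 'delete', or 'archive' for one record."""
    age = now - record_time
    limit = RETENTION[tier]
    if limit is not None and age > limit:
        return "delete"
    if tier == "cold" and age > ARCHIVE_AFTER:
        return "archive"
    return "keep"


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(lifecycle_action("hot", now - timedelta(days=45), now))    # delete
print(lifecycle_action("warm", now - timedelta(days=200), now))  # keep
print(lifecycle_action("cold", now - timedelta(days=400), now))  # archive
```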

60.6.2 Data Aggregation Pipeline

| Stage | Input | Output | Reduction |
| --- | --- | --- | --- |
| Raw to Edge | 30,000 raw readings (5 minutes at 100 Hz) | 1 edge record (200 bytes) | 30,000:1 |
| Edge to Hourly | 12 edge records | 1 hourly aggregate (1.8 KB) | ~1.3:1 (richer stats) |
| Hourly to Daily | 24 hourly aggregates | 1 daily aggregate (2 KB) | ~21.6:1 |
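
Each stage in this pipeline collapses a window of inputs into one summary record. The sketch below shows the idea with a deliberately small set of statistics (count, min, max, mean); the chapter's hourly and daily aggregates carry richer statistics, so the field set here is an assumption made for brevity, and the function names are illustrative.

```python
from statistics import mean


def aggregate(values):
    """Collapse a window of numeric readings into one summary record."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


def merge(summaries):
    """Combine lower-level summaries into one higher-level summary,
    for example 12 edge records into an hourly aggregate."""
    total = sum(s["count"] for s in summaries)
    return {
        "count": total,
        "min": min(s["min"] for s in summaries),
        "max": max(s["max"] for s in summaries),
        # Weight each mean by its sample count so the result is exact.
        "mean": sum(s["mean"] * s["count"] for s in summaries) / total,
    }


first = aggregate([1, 2, 3])
second = aggregate([4, 5])
combined = merge([first, second])
print(combined)
```

Weighting the means by count keeps the merged mean identical to the mean of the underlying readings, which a plain average of averages would not guarantee.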

60.6.3 Query Performance by Tier

| Tier | Storage Type | Query Latency | Use Case |
| --- | --- | --- | --- |
| Tier 1 | SSD/Time-series DB | Sub-second | Real-time dashboards |
| Tier 2 | Standard disk | 1-5 seconds | Trend analysis |
| Tier 3 | Object storage | 10-60 seconds | Historical/compliance |

60.7 Example TCO Calculation

The reference material gives the following five-year costs for a typical 50-station agricultural IoT deployment:

| Cost Category | Amount |
| --- | --- |
| Setup Costs | |
| Hardware (gateways, sensors) | $5,000 |
| Installation labor | $1,500 |
| Software licensing | $1,250 |
| Subtotal Setup | $7,750 |
| Ongoing Costs (5 years) | |
| Cloud services | $12,500 |
| Maintenance | $15,000 |
| Battery replacement | $5,000 |
| Network connectivity | $4,950 |
| Subtotal Ongoing | $37,450 |
| Total TCO (5 years) | $45,200 |
| Annual Average | $9,040/year |

If this system saves $500/station/year in operational efficiency:

  • 5-year total savings: $125,000 (50 stations x $500 x 5 years)
  • Net benefit: $79,800 ($125,000 - $45,200)
  • ROI percentage: 176.5% ($79,800 / $45,200)
  • Payback period: 1.8 years ($45,200 / $25,000 annual savings)
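
The figures above can be checked with a few lines of Python. This sketch simply restates the table and the savings assumption; the variable names are illustrative. Note that, as in the text, payback is computed against the full 5-year TCO rather than the setup cost alone.

```python
setup = {"hardware": 5_000, "installation": 1_500, "software": 1_250}
ongoing_5yr = {"cloud": 12_500, "maintenance": 15_000,
               "batteries": 5_000, "connectivity": 4_950}

years = 5
stations = 50
savings_per_station_year = 500

tco = sum(setup.values()) + sum(ongoing_5yr.values())
annual_savings = stations * savings_per_station_year
total_savings = annual_savings * years
net_benefit = total_savings - tco
roi_pct = 100 * net_benefit / tco
payback = tco / annual_savings  # years, measured against total 5-year TCO

print(f"TCO: ${tco:,}  net benefit: ${net_benefit:,}")
print(f"ROI: {roi_pct:.1f}%  payback: {payback:.1f} years")
```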

60.8 Cross-Hub Connections

This comprehensive review connects to multiple learning resources:

Interactive Tools:

  • Simulations Hub - Edge vs Cloud Latency Explorer, Network Topology Visualizer
  • Practice edge data reduction calculations with interactive calculators

Video Learning:

  • Videos Hub - Edge, Fog, Cloud continuum explanations

60.9 Chapter Summary

  • Tiered storage (hot/warm/cold) balances query performance ($0.20/GB/month SSD), cost efficiency ($0.01/GB/month object storage), and compliance requirements (permanent retention).

  • Storage calculations for 1000-sensor deployments show 18.2 GB/year with tiered architecture vs 630.72 TB/year for raw storage – a 34,654x reduction.

  • ROI analysis demonstrates 130% return and 2.2-year payback for edge computing deployments, with larger deployments achieving even better economics.

  • TCO framework includes setup costs (hardware, installation, software), ongoing costs (cloud, maintenance, batteries, connectivity), and replacement costs.

  • Retention policies automatically manage data lifecycle: 30-day edge records (Tier 1), 1-year hourly aggregates (Tier 2), permanent daily summaries (Tier 3).

Scenario: A hospital deploys 5,000 patient-monitoring sensors (heart rate, SpO2, blood pressure) across 500 beds.

Data Generation:

  • 5,000 sensors x 1 Hz sampling x 16 bytes = 80 KB/s = 6.9 GB/day raw data
  • Regulatory requirement: 7-year retention for patient records

Naive Approach (No Tiering): Store all raw data in cloud:

  • 7 years x 365 days x 6.9 GB ≈ 17,618 GB ≈ 17.2 TB total
  • Cloud storage (simplified upper bound: the full 7-year volume is billed for all 84 months): 17.2 TB x $0.023/GB/month x 84 months ≈ $33,258 total storage cost

Optimized Tiered Storage:

Tier 1 - Hot (1-day retention, SSD):

  • Raw 1-second readings for real-time alerts
  • Storage: 6.9 GB
  • Cost: 6.9 GB x $0.20/GB/month = $1.38/month

Tier 2 - Warm (30-day retention, Standard Disk):

  • 1-minute aggregates (min/max/mean)
  • Size: 6.9 GB / 60 = 115 MB/day x 30 days = 3.45 GB
  • Cost: 3.45 GB x $0.05/GB/month = $0.17/month

Tier 3 - Cold (7-year retention, Glacier Deep Archive):

  • Hourly aggregates + critical events
  • Size: 115 MB / 60 = 1.92 MB/day
  • 7-year storage: 1.92 MB x 365 x 7 = 4.9 GB
  • Cost: 4.9 GB x $0.001/GB/month x 84 months = $0.41 total

Total Storage Cost (7 years):

  • Tier 1: $1.38/month x 84 months = $116
  • Tier 2: $0.17/month x 84 months = $14
  • Tier 3: $0.41 (cumulative) = $0.41
  • Total: $130 (vs $33,258 without tiering)
  • Savings: $33,128 (99.6% reduction)
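
The worked example can be re-run as a script. The sketch below follows the same simplifications as the text: aggregates are assumed to shrink the volume by exactly 60x per level, and both the naive store and the cold tier are billed at their final size for all 84 months. It uses decimal gigabytes and does not round intermediate values, so its totals (roughly $131 tiered and $34,100 naive) land within a few percent of the rounded figures above.

```python
SECONDS_PER_DAY = 86_400
MONTHS = 84  # 7 years

# 5,000 sensors x 1 Hz x 16 bytes per reading
raw_gb_per_day = 5_000 * 1 * 16 * SECONDS_PER_DAY / 1e9
minute_gb_per_day = raw_gb_per_day / 60      # 1-minute aggregates
hourly_gb_per_day = minute_gb_per_day / 60   # hourly aggregates

hot_gb = raw_gb_per_day * 1                  # 1-day retention
warm_gb = minute_gb_per_day * 30             # 30-day retention
cold_gb = hourly_gb_per_day * 365 * 7        # 7-year retention

# Prices per GB per month, as quoted in the example.
tiered_monthly = hot_gb * 0.20 + warm_gb * 0.05 + cold_gb * 0.001
tiered_cost = tiered_monthly * MONTHS

naive_gb = raw_gb_per_day * 365 * 7
naive_cost = naive_gb * 0.023 * MONTHS

print(f"tiered: ${tiered_cost:,.2f} over 7 years")
print(f"naive:  ${naive_cost:,.2f} over 7 years")
print(f"reduction: {100 * (1 - tiered_cost / naive_cost):.1f}%")
```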

Compliance Satisfied:

  • Original waveforms: 1 day (meets emergency care needs)
  • Minute-level summaries: 30 days (clinical review)
  • Hourly aggregates: 7 years (meets regulatory requirements)
  • Critical events (arrhythmias, alerts): Permanent (forensic analysis)

| Data Type | Tier 1 (Hot) | Tier 2 (Warm) | Tier 3 (Cold) | Rationale |
| --- | --- | --- | --- | --- |
| Waveforms (ECG, vibration) | 1-7 days | 30-90 days | Never | High volume, diagnostic value decays quickly |
| Vital signs (HR, temp) | 7 days | 1 year | 7 years | Regulatory requirement, trend analysis |
| Events (alarms, anomalies) | 30 days | Forever | N/A | Low volume, critical for forensics |
| Video surveillance | 7 days | 30 days | 90 days | Storage intensive, legal hold periods |
| Environmental (temp, humidity) | 30 days | 1 year | Forever | Compliance, small volume |
| Transactional (RFID scans) | 7 days | 1 year | 7+ years | Audit trails, regulatory |

Common Mistake: Not Accounting for Data Growth in ROI Calculations

The Mistake: Students calculate edge ROI based on current deployment size, forgetting that IoT deployments typically grow 3-5x within 2 years.

Example:

  • Initial deployment: 100 sensors, edge gateway saves $2,000/year
  • Gateway cost: $1,500
  • Student ROI: 33% first year (($2,000 - $1,500) / $1,500)

What Actually Happens:

  • Year 1: 100 sensors
  • Year 2: 280 sensors (2.8x growth)
  • Year 3: 560 sensors (5.6x growth)
  • Year 4: 750 sensors (leveling off)
  • Year 5: 900 sensors (near capacity)

Realistic ROI:

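| Year | Sensors | Cloud Cost (no edge) | Edge Gateway Cost | Savings |
| --- | --- | --- | --- | --- |
| 0 | - | - | $1,500 | -$1,500 |
| 1 | 100 | $2,000 | $200 | $1,800 |
| 2 | 280 | $5,600 | $200 | $5,400 |
| 3 | 560 | $11,200 | $200 | $11,000 |
| 4 | 750 | $15,000 | $200 | $14,800 |
| 5 | 900 | $18,000 | $200 | $17,800 |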

5-year net cash flow: -$1,500 + $1,800 + $5,400 + $11,000 + $14,800 + $17,800 = $49,300

True ROI: $49,300 / $1,500 = 3,287% (vs 33% from a naive year-1 calculation)

The Lesson: Edge infrastructure scales better than cloud costs. Include growth projections in ROI calculations for accurate business cases.
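
A growth-aware ROI calculation like the one above can be sketched as follows. The per-sensor cloud cost of $20/year is inferred from the table ($2,000 for 100 sensors), and the $200/year gateway operating cost is read from the "Edge Gateway Cost" column; the function name and signature are illustrative.

```python
def yearly_cash_flows(gateway_cost, sensors_by_year,
                      cloud_cost_per_sensor, edge_opex_per_year):
    """Year-0 purchase followed by one net-savings figure per year."""
    flows = [-gateway_cost]
    for sensors in sensors_by_year:
        avoided_cloud_cost = sensors * cloud_cost_per_sensor
        flows.append(avoided_cloud_cost - edge_opex_per_year)
    return flows


gateway_cost = 1_500
flows = yearly_cash_flows(gateway_cost, [100, 280, 560, 750, 900],
                          cloud_cost_per_sensor=20, edge_opex_per_year=200)
net = sum(flows)
roi_pct = 100 * net / gateway_cost

print(flows)
print(f"5-year net cash flow: ${net:,}  ROI: {roi_pct:,.0f}%")
```

Replacing the sensor list with a flat `[100] * 5` shows how much of the return comes from growth rather than from the initial deployment.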

60.10 Concept Relationships

Storage economics builds on:

  • Edge Architecture - Level 4 (data accumulation) defines tiered storage strategy
  • Data Reduction - Edge aggregation transforms storage economics from 630 TB/year to 18.2 GB/year

Storage economics enables:

  • Edge Deployments - ROI analysis (130% return, 2.2-year payback) justifies gateway investments
  • Data in Cloud - Tiered storage strategy optimizes cloud costs ($0.01/GB cold vs $0.20/GB hot)

Parallel concepts:

  • Tiered storage (hot/warm/cold) maps to the Edge-Fog-Cloud continuum: both use hierarchical placement based on access frequency
  • ROI analysis complements power optimization business case: both demonstrate payback periods under 3 years

60.11 See Also

Related review chapters:

Related chapters in other modules:

60.12 What’s Next

| Direction | Chapter | Link |
| --- | --- | --- |
| Next | Data in the Cloud | data-in-the-cloud.html |
| Previous | Edge Review: Power Optimization | edge-review-power-optimization.html |
| Related | Edge Comprehensive Review | edge-comprehensive-review.html |
| Related | Big Data Overview | big-data-overview.html |

Key Takeaway

Tiered storage architecture (hot SSD for 30-day data at $0.20/GB/month, warm disk for 1-year aggregates at $0.05/GB/month, cold object storage for permanent archives at $0.01/GB/month) transforms the economics of IoT data management. For 1000 sensors, tiered storage holds about 18.2 GB/year and costs roughly $14/year, compared with 630.72 TB/year for raw storage. In this chapter's examples, edge computing ROI starts at 130% with a 2.2-year payback for 100 sensors, and larger deployments achieve even better returns.

“The Three Storage Shelves!”

Sammy the Sensor had a problem. “I’ve been collecting data for a whole year. Where do we put it ALL?”

Max the Microcontroller showed Sammy three shelves. “Think of it like organizing your toys!”

“Shelf 1 is the HOT shelf – it’s right next to your bed. That’s where you keep the toys you play with every day. For us, that means the last 30 days of data, stored on super-fast drives. It’s expensive, but lightning quick!”

“Shelf 2 is the WARM shelf – it’s in your closet. Toys you play with sometimes. For us, that’s hourly summaries from the past year. Not as fast, but way cheaper.”

“Shelf 3 is the COLD shelf – it’s in the attic. Toys you might want someday but rarely touch. For us, that’s daily summaries kept forever. Super cheap storage!”

Lila the LED did some quick math. “If we kept ALL the raw data, we’d need 630 TERABYTES per year. That’s like filling up about 134,000 DVDs!”

“But with our three shelves and edge processing?” Max grinned. “Just 18 gigabytes. That’s ONE tiny USB stick!”

Bella the Battery jumped in. “And here’s the best part – setting up edge processing costs about $10,000, but it saves thousands every single year. In about 2 years, it pays for itself!”

“So it’s like buying a rechargeable battery instead of disposable ones,” Sammy said. “It costs more up front, but saves money in the long run!”

The Sensor Squad learned: Smart storage is like organizing your room – keep the important stuff close, archive the old stuff cheap, and never keep what you don’t need!

60.13 Videos

Edge - Fog - Cloud Overview
From slides - Continuum placement and gateway roles relevant to edge data processing.

Common Pitfalls

If connectivity outages last 2 hours on average but 12 hours in the worst case, sizing the buffer for 2 hours means data loss every worst-case outage. Size for the 99th percentile outage duration, not the mean.
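
As a rough illustration, the store-and-forward formula from Key Concepts (buffer_size = data_rate × max_outage_duration × (1 + safety_margin)) can be combined with a percentile of observed outages. Everything numeric below is hypothetical: the outage log, the 1,000 bytes/s data rate, and the 20% safety margin are placeholders, and the nearest-rank percentile is just one simple choice of estimator.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def buffer_bytes(data_rate_bytes_per_s, outage_hours, safety_margin=0.2):
    """buffer_size = data_rate x outage_duration x (1 + safety_margin)."""
    return data_rate_bytes_per_s * outage_hours * 3600 * (1 + safety_margin)


# Hypothetical outage durations in hours, taken from a gateway's logs.
outages_h = sorted([0.5, 1, 1, 2, 2, 2, 3, 4, 6, 12])
mean_h = sum(outages_h) / len(outages_h)
p99_h = percentile(outages_h, 99)

rate = 1_000  # bytes per second of buffered edge records (assumed)
print(f"mean outage {mean_h:.2f} h -> {buffer_bytes(rate, mean_h) / 1e6:.1f} MB")
print(f"p99 outage  {p99_h:.2f} h -> {buffer_bytes(rate, p99_h) / 1e6:.1f} MB")
```

With only a handful of samples the 99th percentile is effectively the longest outage seen so far, which is another reason to keep a safety margin.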

Consumer-grade microSD cards rated for 10,000 write cycles will fail in months under continuous high-frequency logging. Calculate the expected write endurance before selecting storage media and use industrial-grade flash for continuous logging applications.
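
A first-order endurance estimate can be written in a few lines. This is a simplified model under explicit assumptions: perfect wear leveling across the whole card, a single rated cycle count, and a write-amplification factor that stands in for the extra physical writes caused by small, frequently flushed log records. The card size, daily volume, and amplification values below are illustrative, not measurements.

```python
def flash_lifetime_days(capacity_gb, rated_cycles, logical_gb_per_day,
                        write_amplification=1.0):
    """Days until the rated write cycles are used up (idealised model)."""
    total_writable_gb = capacity_gb * rated_cycles
    physical_gb_per_day = logical_gb_per_day * write_amplification
    return total_writable_gb / physical_gb_per_day


# Hypothetical 16 GB card rated for 10,000 cycles, logging 2 GB/day.
for wa in (1, 50, 500):
    days = flash_lifetime_days(16, 10_000, 2.0, write_amplification=wa)
    print(f"write amplification {wa:>3}: about {days:,.0f} days")
```

The estimate is dominated by the write-amplification factor, so batching records into larger sequential writes is usually the first thing to try before buying more durable media.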

Cloud storage costs are often presented as storage-only (per GB/month), hiding the significant per-request and egress costs when data is retrieved for analytics. Model the full storage + retrieval cost at your expected query volume.
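
One way to model this is to price storage, requests, and egress separately and add them up. The sketch below uses placeholder prices that are not any provider's actual rates, and the function and parameter names are illustrative; substitute the figures from your own provider's price list.

```python
def monthly_cloud_cost(stored_gb, requests, egress_gb,
                       price_per_gb_month, price_per_1k_requests,
                       price_per_gb_egress):
    """Break a monthly bill into storage, request, and egress components."""
    storage = stored_gb * price_per_gb_month
    request = requests / 1_000 * price_per_1k_requests
    egress = egress_gb * price_per_gb_egress
    return {"storage": storage, "requests": request, "egress": egress,
            "total": storage + request + egress}


# Placeholder workload and prices, for illustration only.
bill = monthly_cloud_cost(stored_gb=18.2, requests=2_000_000, egress_gb=50,
                          price_per_gb_month=0.05,
                          price_per_1k_requests=0.005,
                          price_per_gb_egress=0.09)
for item, dollars in bill.items():
    print(f"{item:9s} ${dollars:.2f}")
```

In this made-up workload the request and egress charges are many times larger than the storage charge, which is the pattern the pitfall warns about.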