15  Time-Series Practice and Labs

15.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Apply time-series concepts through hands-on ESP32 lab exercises
  • Analyze real-world scale challenges from the Tesla fleet telemetry case study
  • Design complete time-series storage strategies using worked examples
  • Implement circular buffers and downsampling on resource-constrained devices
  • Calculate NTP synchronization intervals for distributed IoT deployments
In 60 Seconds

Practical time-series implementation requires connecting ingestion pipelines, schema design, and query optimization into a working system. The key exercises – inserting 10,000 hourly readings, writing efficient time-window queries, configuring compression policies, and building continuous aggregates – reveal the difference between understanding time-series theory and making it work reliably under real IoT load. Focus on measuring actual compression ratios and query latencies in your environment, as defaults rarely match documentation benchmarks.

15.2 Key Concepts

  • Bulk Insert: Writing multiple time-series records in a single database transaction or HTTP request, dramatically reducing per-record overhead compared to individual inserts – essential for achieving target ingest rates
  • Time-Window Aggregation: A query pattern grouping data into fixed time buckets (1 minute, 1 hour) and computing statistics within each bucket – the foundational operation for IoT dashboards and anomaly detection
  • Compression Policy: A TimescaleDB background job that converts row-oriented time chunks to columnar format after a configured age, typically achieving 10-20x size reduction on floating-point sensor data
  • time_bucket(): A TimescaleDB SQL function that rounds timestamps to a specified interval (e.g., 1 hour), used in GROUP BY clauses to create time-window aggregations without manual date arithmetic
  • Continuous Aggregate Refresh: The automatic or scheduled process that updates materialized time-series summaries as new raw data arrives, balancing freshness against the computational cost of re-aggregation
  • EXPLAIN ANALYZE: The PostgreSQL/TimescaleDB query planning tool that shows actual execution time, chunk exclusion effectiveness, and index usage – essential for diagnosing slow time-series queries
  • Hypertable Chunk: A physical storage unit within a TimescaleDB hypertable covering a fixed time interval (e.g., 7 days), independently compressible and queryable, with the planner excluding irrelevant chunks based on time predicates
  • Gap Fill: A time-series query technique that inserts null or interpolated values for missing data points within a time window, enabling continuous chart rendering even when sensor readings are absent

15.3 MVU: Minimum Viable Understanding

Core Concept: Real-world time-series systems combine edge processing, adaptive sampling, and multi-tier retention to handle extreme scale while maintaining analytical capability. Why It Matters: Tesla’s 300 million points/second ingest rate demonstrates that naive approaches fail catastrophically – only through edge aggregation, adaptive sampling, and tiered retention can such systems become practical. Key Takeaway: Start with standard tools (InfluxDB, TimescaleDB), implement retention from day one, and add edge processing when cloud ingestion becomes a bottleneck.

This hands-on chapter lets you work with time-series data directly. Think of it as the lab portion of a science class – reading about experiments is helpful, but actually running them is how you truly learn. You will practice storing, querying, and managing the kind of timestamped sensor data that real IoT systems produce.

15.4 Real-World Case Study: Tesla Fleet Telemetry

Tesla operates one of the world’s largest IoT time-series systems, collecting data from over 1.5 million vehicles worldwide.

15.4.1 Scale Challenge

Data Generation:

  • Vehicles: 1.5 million active vehicles
  • Sensors per vehicle: ~200 sensors (battery, motors, HVAC, GPS, cameras, etc.)
  • Sampling rate: 1 Hz average (some sensors faster, some slower)
  • Data points per second: 1.5M vehicles x 200 sensors x 1 Hz = 300 million points/second
  • Daily data: 300M x 86,400 seconds = 25.9 trillion points/day

Storage Requirements (without optimization):

  • Raw: 25.9T points x 32 bytes = 829 TB/day
  • Annual: 829 TB x 365 = 302 PB/year
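The volume arithmetic above can be reproduced in a few lines. This helper is purely illustrative (decimal units, 32 bytes/point as stated in the text):

```cpp
#include <cstdint>

// Raw telemetry volume for a fleet, using the figures from the text.
struct FleetVolume {
  double points_per_sec;  // total data points generated per second
  double tb_per_day;      // raw bytes per day, expressed in decimal TB
};

FleetVolume rawFleetVolume(double vehicles, double sensors_per_vehicle,
                           double hz, double bytes_per_point) {
  FleetVolume v;
  v.points_per_sec = vehicles * sensors_per_vehicle * hz;
  v.tb_per_day = v.points_per_sec * 86400.0 * bytes_per_point / 1e12;
  return v;
}
```

Calling `rawFleetVolume(1.5e6, 200, 1, 32)` yields 300 million points/second and roughly 829 TB/day, matching the figures above.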

This is physically impossible to store economically. Tesla’s approach:

15.4.2 Tesla’s Time-Series Strategy

Figure 15.1: Tesla Vehicle Time-Series Data Flow with Multi-Tier Storage – edge sensors feed a pipeline that fans out into hot, warm, cold, and archive tiers.

Key Optimizations:

  1. Edge Aggregation: Vehicles pre-process 90% of data locally
    • Only upload anomalies and aggregates
    • Reduces cloud ingestion to ~30M points/second (10x reduction)
  2. Adaptive Sampling: Sample rates adjust based on context
    • Parked: Sample every 5 minutes
    • Driving: Sample every second
    • Hard braking: Sample at 100 Hz for 10 seconds
  3. Multi-Tier Retention:
    • Hot (7 days): Full resolution for recent analysis
    • Warm (30 days): Downsampled for trend analysis
    • Cold (1 year): Aggregates for long-term patterns
    • Archive: Compliance and model training
  4. Custom Time-Series Engine:
    • Tesla built custom infrastructure (not off-the-shelf)
    • Columnar storage with extreme compression (50:1 ratios)
    • Distributed across data centers globally
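The adaptive sampling tiers above can be sketched as a simple state-to-interval mapping. This is an illustrative sketch, not Tesla’s actual firmware; the state names and values just mirror the tiers listed in the text:

```cpp
// Context-driven sample-interval selection (illustrative only).
enum class VehicleState { Parked, Driving, HardBraking };

// Sampling interval in milliseconds for the given state.
long sampleIntervalMs(VehicleState s) {
  switch (s) {
    case VehicleState::Parked:      return 5L * 60L * 1000L;  // every 5 minutes
    case VehicleState::Driving:     return 1000L;             // 1 Hz
    case VehicleState::HardBraking: return 10L;               // 100 Hz burst
  }
  return 1000L;  // default to the driving rate
}
```

In a real system the state machine would also govern how long the 100 Hz burst lasts (10 seconds in the text) before dropping back to the driving rate.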

Results:

  • Actual storage: ~15 PB/year (vs. 302 PB raw)
  • Query latency: <100ms for recent data analysis
  • Powers Autopilot improvements, range predictions, battery health monitoring

How does Tesla achieve ~95% reduction on 300M points/sec? Breaking down the data pipeline:

\[ \begin{aligned} \text{Raw generation:} &\quad 1.5M\text{ vehicles} \times 200\text{ sensors} \times 1\text{ Hz} = 300M\text{ pts/sec} \\ \text{Edge aggregation:} &\quad 90\%\text{ filtered locally} \to 30M\text{ pts/sec to cloud} \\ \text{Adaptive sampling:} &\quad \text{Parked (5 min)} + \text{driving (1 Hz)} + \text{events (100 Hz)} \\ &\quad \text{Effective rate:} \approx 0.02\text{ Hz per sensor} \to 6M\text{ pts/sec} \\ \text{Compression (50:1):} &\quad 6M \times 32\text{ bytes} = 192\text{ MB/sec raw} \\ &\quad \div 50 = 3.8\text{ MB/sec compressed} \end{aligned} \]

Daily storage: \(3.8\text{ MB/sec} \times 86{,}400 = 328\text{ GB/day} \approx 120\text{ TB/year}\)

This 120 TB/year represents the theoretical minimum if all optimizations were applied uniformly. Tesla’s actual ~15 PB/year is ~125x higher because they: (1) retain multiple copies across data centers, (2) keep higher resolution for safety-critical events (crashes, Autopilot incidents), (3) store model training datasets at full resolution, and (4) maintain compliance archives. The combination of edge processing (10x), adaptive sampling (5x), and compression (50x) yields a 2,500x theoretical maximum reduction from 302 PB/year, with the actual ~95% reduction (302 PB to 15 PB) reflecting real-world retention needs.
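The reduction chain can be checked step by step in code. The 10x edge, 5x sampling, and 50x compression factors come from the text; the function itself is just an arithmetic sketch:

```cpp
#include <cmath>

// Step-by-step check of the reduction chain above (decimal units).
double compressedMBPerSec(double raw_pts_per_sec, double edge_keep_fraction,
                          double sampling_reduction, double bytes_per_point,
                          double compression_ratio) {
  double cloud_pts = raw_pts_per_sec * edge_keep_fraction;  // edge aggregation
  double sampled_pts = cloud_pts / sampling_reduction;      // adaptive sampling
  double raw_bytes = sampled_pts * bytes_per_point;         // pre-compression
  return raw_bytes / compression_ratio / 1e6;               // MB/sec on disk
}
```

`compressedMBPerSec(300e6, 0.10, 5, 32, 50)` gives about 3.8 MB/sec, consistent with the ~120 TB/year theoretical minimum derived above.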

Lessons for IoT Architects:

  • Edge processing is essential at scale: Don’t send everything to the cloud
  • Adaptive strategies: Sample rates and retention policies should match data value
  • Domain-specific compression: Tesla’s battery telemetry compresses 100:1 because voltage changes slowly
  • Start with standard tools: Use InfluxDB or TimescaleDB initially, only build custom if you reach their limits

15.5 Interactive Calculators

15.5.1 IoT Data Volume and Retention Calculator

Use this calculator to estimate storage requirements for your own IoT deployment. Adjust the parameters to see how sensor count, sample rate, and retention policies affect total storage.
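If the interactive calculator is unavailable, the underlying arithmetic is straightforward. This is a minimal sketch with illustrative function and parameter names:

```cpp
#include <cmath>

// Data points generated per day by a group of identical sensors.
double pointsPerDay(double sensors, double report_interval_sec) {
  return sensors * (86400.0 / report_interval_sec);
}

// Raw storage in decimal MB for a retention window, before compression.
double storageMB(double points_per_day, double retention_days,
                 double bytes_per_point) {
  return points_per_day * retention_days * bytes_per_point / 1e6;
}
```

For example, 500 sensors reporting every 10 seconds generate `pointsPerDay(500, 10)` = 4,320,000 points/day – the same figure used in the understanding check below.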

15.5.2 NTP Drift Budget Calculator

Calculate the maximum NTP synchronization interval for your IoT devices based on oscillator quality and accuracy requirements.
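The drift-budget math behind this calculator is a one-line division. The sketch below assumes a simple linear drift model (drift in ppm, a fixed NTP sync error, and a total accuracy target):

```cpp
#include <cmath>

// Longest safe free-run interval (seconds) between NTP syncs:
// whatever accuracy budget remains after NTP sync error is consumed
// by oscillator drift at drift_ppm.
double maxSyncIntervalSec(double drift_ppm, double ntp_error_ms,
                          double target_accuracy_ms) {
  double drift_ms_per_sec = drift_ppm / 1000.0;  // 100 ppm -> 0.1 ms/s
  double drift_budget_ms = target_accuracy_ms - ntp_error_ms;
  return drift_budget_ms / drift_ms_per_sec;
}
```

`maxSyncIntervalSec(100, 100, 500)` yields 4,000 seconds (~66 minutes), the same figure derived in the edge-gateway worked example later in this chapter.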

15.6 Understanding Check

You’re deploying an IoT system for a 20-story office building with the following sensor network:

  • Temperature sensors: 500 sensors (1 per room), report every 10 seconds
  • Occupancy sensors: 500 sensors (motion detection), report on change (avg 1/minute)
  • Energy meters: 50 meters (per floor + equipment), report every 30 seconds
  • Air quality sensors: 100 sensors (CO2, VOC), report every 60 seconds

Each reading is 32 bytes (timestamp + sensor_id + value + metadata).

15.6.1 Questions:

  1. How many data points per day does this system generate?
  2. What is the raw storage requirement per month (without compression)?
  3. If you implement this retention policy, what’s the storage after 1 year?
    • Tier 1: Raw data, 7 days retention
    • Tier 2: 1-minute averages, 30 days retention
    • Tier 3: 1-hour averages, 1 year retention
    • Tier 4: Daily aggregates, forever
  4. Which database would you recommend and why?

15.6.2 Solutions:

1. Data points per day:

  • Temperature: 500 sensors x (86,400 sec/day / 10 sec) = 4,320,000 points/day
  • Occupancy: 500 sensors x (1,440 min/day / 1 min) = 720,000 points/day
  • Energy: 50 meters x (86,400 sec/day / 30 sec) = 144,000 points/day
  • Air quality: 100 sensors x (86,400 sec/day / 60 sec) = 144,000 points/day

Total: 5,328,000 points/day

2. Raw storage per month:

  • Daily: 5,328,000 points x 32 bytes = 170.5 MB/day
  • Monthly: 170.5 MB x 30 = 5.1 GB/month raw

3. Storage after 1 year with retention policy:

Tier 1 (Raw, 7 days):

  • 5,328,000 points/day x 7 days x 32 bytes = 1.19 GB
  • With 10:1 compression: 119 MB

Tier 2 (1-minute averages, 30 days):

  • 1,150 sensors x 1,440 minutes/day = 1,656,000 points/day (one averaged value per sensor per minute)
  • 1,656,000 x 30 days x 32 bytes = 1.59 GB
  • With 10:1 compression: 159 MB

Tier 3 (1-hour averages, 1 year):

  • 1,150 sensors x 24 hours/day = 27,600 points/day (one averaged value per sensor per hour)
  • 27,600 x 365 days x 32 bytes = 322 MB
  • With 10:1 compression: 32 MB

Tier 4 (Daily aggregates, forever):

  • 1,150 sensors x 1 daily aggregate = 1,150 points/day
  • 1,150 x 365 days x 32 bytes = 13.4 MB/year
  • With 10:1 compression: 1.3 MB/year

Total storage after 1 year: ~311 MB (119 + 159 + 32 + 1.3)

vs. no retention policy: 5.1 GB/month x 12 = 61.2 GB raw (6.1 GB compressed)

Savings: 95% reduction vs. compressed raw, 99.5% vs. uncompressed raw
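The tier totals above can be re-derived with one helper (decimal MB, the 10:1 compression ratio from the solution):

```cpp
#include <cmath>

// Compressed storage for one retention tier, in decimal MB.
double tierMB(double points_per_day, double retention_days,
              double bytes_per_point, double compression_ratio) {
  return points_per_day * retention_days * bytes_per_point
         / compression_ratio / 1e6;
}
```

`tierMB(5328000, 7, 32, 10)` gives about 119 MB for Tier 1; summing all four tiers (with the per-tier point rates above) reproduces the ~311 MB total.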

4. Database recommendation:

Recommended: TimescaleDB

Reasoning:

  • Write throughput is manageable: 5.3M points/day / 86,400 seconds = ~62 writes/second (well within TimescaleDB capacity)
  • Need for correlations: Building management systems need to join sensor data with:
    • Room assignments (which tenant, department)
    • Energy billing data
    • Maintenance schedules
    • Occupancy reservations
  • SQL compatibility: Facilities team likely familiar with SQL, easier integration with existing building management software
  • PostgreSQL ecosystem: Rich tooling for dashboards (Grafana), reporting, and analytics

Alternative: InfluxDB would work if:

  • Write rates increased 10x (more sensors added)
  • There were no need to correlate with relational business data
  • The team were willing to learn the Flux query language

Not recommended: Prometheus – designed for short-term infrastructure monitoring, not multi-year IoT data retention.

15.7 Worked Example: Time-Series Query Optimization for Smart Grid

Scenario: A utility company operates a smart grid with 10,000 smart meters, each reporting energy consumption every 15 minutes. The operations team needs dashboards showing:

  • Real-time consumption (last hour, per-meter detail)
  • Daily peak demand (last 30 days, aggregated by region)
  • Monthly trends (last 12 months, company-wide)

Goal: Design a query and storage strategy that keeps dashboard latency under 2 seconds while minimizing storage costs.

What we do: Estimate raw data ingestion and storage requirements.

Calculations:

Meters: 10,000
Readings per meter per day: 24 hours x 4 readings/hour = 96
Total readings per day: 10,000 x 96 = 960,000
Bytes per reading: ~50 bytes (timestamp + meter_id + kWh + voltage + metadata)
Daily raw data: 960,000 x 50 = 48 MB/day
Annual raw data: 48 MB x 365 = 17.5 GB/year (uncompressed)

Why: Understanding data volume determines partition strategy, retention policies, and hardware requirements. At 960K writes/day (about 11 writes/second on average), even a basic TSDB handles ingestion easily – but dashboards querying across 350M+ annual rows need optimization.
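The sizing above can be sanity-checked with a short calculation (decimal units; the function name is illustrative):

```cpp
#include <cmath>

// Annual uncompressed volume in decimal GB for a metering fleet.
double annualRawGB(double meters, double readings_per_meter_per_day,
                   double bytes_per_reading) {
  return meters * readings_per_meter_per_day * bytes_per_reading
         * 365.0 / 1e9;
}
```

`annualRawGB(10000, 96, 50)` gives about 17.5 GB/year uncompressed, matching the calculation above.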

What we do: Configure time-based partitioning to isolate recent vs historical data.

TimescaleDB Configuration:

-- Create hypertable with 1-day chunks
CREATE TABLE meter_readings (
    time        TIMESTAMPTZ NOT NULL,
    meter_id    INTEGER NOT NULL,
    region_id   INTEGER NOT NULL,
    kwh         DOUBLE PRECISION,
    voltage     DOUBLE PRECISION
);

SELECT create_hypertable('meter_readings', 'time',
    chunk_time_interval => INTERVAL '1 day');

-- Add indexes for common query patterns
CREATE INDEX idx_meter_time ON meter_readings (meter_id, time DESC);
CREATE INDEX idx_region_time ON meter_readings (region_id, time DESC);

Why: Daily chunks mean a “last hour” query touches only the current day’s chunk instead of the entire table, and the time index narrows the scan to roughly 40K matching rows. The meter_id and region_id indexes accelerate filtered queries without excessive write overhead.

What we do: Pre-compute hourly and daily aggregates to accelerate dashboard queries.

Continuous Aggregate Setup:

-- Hourly aggregates (for daily peak analysis)
CREATE MATERIALIZED VIEW meter_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS hour,
    region_id,
    COUNT(*) as reading_count,
    AVG(kwh) as avg_kwh,
    MAX(kwh) as peak_kwh,
    SUM(kwh) as total_kwh
FROM meter_readings
GROUP BY hour, region_id
WITH NO DATA;

-- Refresh policy: update every 15 minutes, cover last 2 hours
SELECT add_continuous_aggregate_policy('meter_hourly',
    start_offset => INTERVAL '2 hours',
    end_offset => INTERVAL '15 minutes',
    schedule_interval => INTERVAL '15 minutes');

-- Daily aggregates (for monthly trend analysis)
CREATE MATERIALIZED VIEW meter_daily
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', time) AS day,
    region_id,
    AVG(kwh) as avg_kwh,
    MAX(kwh) as daily_peak_kwh,
    SUM(kwh) as total_kwh
FROM meter_readings
GROUP BY day, region_id
WITH NO DATA;

SELECT add_continuous_aggregate_policy('meter_daily',
    start_offset => INTERVAL '3 days',
    end_offset => INTERVAL '1 day',
    schedule_interval => INTERVAL '1 day');

Why: Continuous aggregates pre-compute results incrementally. A 30-day peak demand query now scans 720 hourly rows per region (17,280 total across 24 regions) instead of 28.8M raw readings – a reduction of more than 1,000x in rows scanned.

What we do: Implement tiered retention to balance detail vs storage cost.

Retention Policy:

-- Compress data older than 7 days (10:1 compression typical)
ALTER TABLE meter_readings SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'meter_id',
    timescaledb.compress_orderby = 'time DESC'
);

SELECT add_compression_policy('meter_readings', INTERVAL '7 days');

-- Drop raw data older than 90 days (aggregates retained)
SELECT add_retention_policy('meter_readings', INTERVAL '90 days');

-- Keep hourly aggregates for 2 years
SELECT add_retention_policy('meter_hourly', INTERVAL '2 years');

-- Keep daily aggregates for 10 years
SELECT add_retention_policy('meter_daily', INTERVAL '10 years');

Why: This tiered approach provides:

  • Last 7 days: Full resolution, uncompressed (fast per-meter queries)
  • 7-90 days: Full resolution, compressed (10x storage reduction)
  • 90+ days: Only aggregates retained (99% storage reduction)

What we do: Write queries that leverage partitions and aggregates.

Optimized Queries:

-- Dashboard 1: Real-time consumption (last hour)
-- Scans: ~40K rows from current chunk
SELECT meter_id, time, kwh
FROM meter_readings
WHERE time > NOW() - INTERVAL '1 hour'
  AND region_id = 5
ORDER BY time DESC;
-- Latency: ~50ms

-- Dashboard 2: Daily peak demand (last 30 days)
-- Scans: ~17,280 rows from hourly aggregate (720 hours x 24 regions)
SELECT date_trunc('day', hour) AS day,
       MAX(peak_kwh) as daily_peak
FROM meter_hourly
WHERE hour > NOW() - INTERVAL '30 days'
GROUP BY date_trunc('day', hour)
ORDER BY day;
-- Latency: ~20ms

-- Dashboard 3: Monthly trends (last 12 months)
-- Scans: ~8,760 rows from daily aggregate (365 days x 24 regions)
SELECT date_trunc('month', day) as month,
       SUM(total_kwh) as monthly_consumption
FROM meter_daily
WHERE day > NOW() - INTERVAL '12 months'
GROUP BY month
ORDER BY month;
-- Latency: ~15ms

Why: Each query hits the appropriate data tier – raw data for recent detail, hourly aggregates for medium-term analysis, daily aggregates for long-term trends.

Outcome: All three dashboard queries complete in under 100ms (well under the 2-second requirement).

Storage Summary:

| Data Tier | Retention | Size (Year 1) | Size (Year 5) |
|--------------------|-----------|---------------|---------------|
| Raw (uncompressed) | 7 days | 336 MB | 336 MB |
| Raw (compressed) | 90 days | 430 MB | 430 MB |
| Hourly aggregates | 2 years | 15 MB | 30 MB |
| Daily aggregates | 10 years | 0.5 MB | 2.5 MB |
| Total | – | ~780 MB | ~800 MB |

Comparison without optimization: 87.5 GB after 5 years (109x more storage).

Key Decisions Made:

  1. Daily chunks: Isolates recent data for fast queries
  2. Continuous aggregates: Pre-computes common dashboard queries
  3. Tiered retention: Keeps detail where needed, aggregates everywhere else
  4. Compression after 7 days: Balances query speed vs storage
  5. Index strategy: region_id + time indexes match query patterns

15.9 Worked Examples: Time Synchronization

Accurate timestamps are fundamental to time-series data. These two worked examples address the clock synchronization challenges that arise in distributed IoT deployments.

Worked Example: Calculating Sync Interval to Prevent Data Gaps

Scenario: A fleet management system tracks 5,000 vehicles using GPS sensors. Each vehicle reports position data to a central time-series database (InfluxDB). Due to cellular network variability, GPS timestamps from vehicles can drift from server time. The operations team needs to ensure data can be correctly ordered despite clock discrepancies.

Given:

  • Number of vehicles: 5,000
  • Reporting interval: 10 seconds per vehicle
  • Vehicle GPS clock accuracy: +/-2 seconds (cellular NTP sync)
  • Server clock accuracy: +/-50 ms (GPS-disciplined NTP)
  • Data retention window for real-time view: 1 hour
  • Out-of-order arrival tolerance: Up to 30 seconds late

Steps:

  1. Calculate timestamp uncertainty between any two vehicles:
    • Vehicle A clock: +/-2 seconds
    • Vehicle B clock: +/-2 seconds
    • Combined uncertainty = +/-4 seconds (worst case: A is +2s, B is -2s)
  2. Determine minimum sampling interval for unambiguous ordering:
    • Events must be >4 seconds apart to guarantee correct ordering
    • Current interval (10 seconds) > 4 seconds: OK for ordering
  3. Configure InfluxDB write buffer for late arrivals:
    • Maximum expected lateness: 30 seconds
    • Set cache-max-memory-size to handle 30 seconds of pending writes
    • Buffer size = 5,000 vehicles x 3 readings (30s / 10s) x 100 bytes = 1.5 MB
  4. Design timestamp handling strategy:
    • Primary timestamp: Vehicle GPS timestamp (event time)
    • Secondary timestamp: Server receipt time (processing time)
    • Store both: time (GPS) and received_at (server)
    • Query by GPS time for correct route reconstruction
  5. Configure retention policy for clock drift tolerance:
    • Hot tier: 1 hour at full resolution (for real-time dashboards)
    • Use 5-second GROUP BY time() to absorb +/-2 second drift variations

Result: Configure InfluxDB with 1.5 MB write cache, dual timestamps, and 5-second aggregation buckets. This absorbs +/-4 second clock drift while correctly ordering 98.5% of position updates.

Key Insight: In distributed IoT systems, clock drift is inevitable. Design your data model to store both event time (when it happened) and ingestion time (when you received it). Query by event time for analytics but use ingestion time for troubleshooting data pipeline issues.
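The dual-timestamp data model recommended above can be sketched as a small record type (field names here are illustrative, not a prescribed schema):

```cpp
#include <cstdint>

// Dual-timestamp position record: event time from the vehicle GPS,
// ingestion time stamped by the server on receipt.
struct PositionRecord {
  uint64_t event_time_ms;   // when the fix was taken (vehicle clock)
  uint64_t received_at_ms;  // when the server ingested it
  double lat, lon;          // position
};

// Pipeline lateness: how long after the event the record actually arrived.
// Query event_time_ms for analytics; monitor lateness for troubleshooting.
int64_t latenessMs(const PositionRecord& r) {
  return static_cast<int64_t>(r.received_at_ms) -
         static_cast<int64_t>(r.event_time_ms);
}
```

A record with event time 1,000 ms and receipt time 31,000 ms has a lateness of 30,000 ms – exactly the out-of-order tolerance configured in this example.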

Worked Example: NTP Sync Interval for Edge Gateway Fleet

Scenario: A smart agriculture company deploys 200 edge gateways across remote farms. Each gateway aggregates data from 50 soil sensors and forwards to the cloud every 5 minutes. The gateways have low-cost oscillators and intermittent cellular connectivity. The team must design NTP synchronization to ensure timestamp accuracy for cross-farm analysis.

Given:

  • Edge gateways: 200 devices
  • Gateway oscillator drift: +/-100 ppm (low-cost crystal)
  • Cellular connectivity: Available 80% of time (intermittent)
  • NTP round-trip time: 200-500 ms (cellular network)
  • Required timestamp accuracy: +/-500 ms for cross-farm comparison
  • Data upload interval: 5 minutes (300 seconds)

Steps:

  1. Calculate maximum drift between NTP syncs:
    • Drift rate = 100 ppm = 100 microseconds per second = 0.1 ms/s
    • Over 5-minute upload interval: 0.1 ms/s x 300s = 30 ms drift
    • Over 1 hour without sync: 0.1 ms/s x 3600s = 360 ms drift
  2. Calculate NTP sync accuracy limits:
    • Network RTT asymmetry: Assume 10% asymmetry
    • Asymmetry error = RTT x asymmetry = 350 ms x 10% = 35 ms
    • Best achievable NTP accuracy: ~50-100 ms over cellular
  3. Determine maximum sync interval:
    • Error budget: 500 ms target accuracy
    • NTP sync error: 100 ms (typical)
    • Available for drift: 500 - 100 = 400 ms
    • Max sync interval = 400 ms / 0.1 ms/s = 4,000 seconds = 66 minutes
  4. Account for connectivity gaps:
    • 20% downtime means potential 20% longer gaps between syncs
    • Apply safety factor: 66 min x 0.8 = 53 minutes recommended
    • Adopt 30 minutes in practice for additional margin and simple scheduling
  5. Configure NTP client for intermittent connectivity:
    • Primary: Cloud NTP server (time.google.com)
    • Secondary: GPS time from connected sensors (if available)
    • Retry interval on failure: 5 minutes
    • Maximum poll interval: 30 minutes
    • Store last-known offset for gap periods

Result: Configure edge gateways to sync NTP every 30 minutes. During connectivity gaps up to 66 minutes, clocks remain within 400 ms accuracy. Combined with NTP sync error (~100 ms), total accuracy stays within 500 ms target.

Key Insight: Low-cost IoT devices with 100 ppm oscillators need hourly NTP syncs for sub-second accuracy. For tighter requirements (<100 ms), either upgrade to TCXO oscillators (+/-2 ppm) or implement GPS-disciplined timing at edge gateways.

15.10 Hands-On Lab: Time-Series Data Logger

Lab Overview

In this hands-on lab, you will build a time-series data logger using an ESP32 microcontroller. You will learn how to collect timestamped sensor data, implement efficient storage using circular buffers, apply downsampling techniques, and query historical data through serial commands.

15.10.1 Learning Objectives

By completing this lab, you will be able to:

  1. Collect timestamped sensor data from multiple sensors on an ESP32
  2. Implement a circular buffer for efficient memory management on resource-constrained devices
  3. Apply downsampling techniques to reduce storage requirements while preserving data trends
  4. Query historical data using custom serial commands
  5. Evaluate the trade-offs between data resolution and storage capacity

15.10.2 Prerequisites

  • Basic C/C++ programming knowledge
  • Familiarity with Arduino IDE concepts
  • Understanding of time-series data concepts (covered earlier in this chapter)

15.10.3 Wokwi Simulator

Use the embedded simulator below to complete this lab. The ESP32 environment comes pre-configured with essential libraries.

Simulator Tips
  • Click inside the simulator and press Ctrl+Shift+M (or Cmd+Shift+M on Mac) to open the Serial Monitor
  • Use the temperature sensor on the virtual breadboard or simulate readings with the built-in random values
  • You can save your project to Wokwi by creating a free account

15.10.4 Step-by-Step Instructions

15.10.4.1 Step 1: Set Up the Basic Data Structure

First, define the data structures for storing timestamped sensor readings. Copy this code into the simulator:

#include <Arduino.h>
#include <time.h>

// Configuration constants
#define BUFFER_SIZE 100        // Number of readings to store
#define SAMPLE_INTERVAL 1000   // Sample every 1 second (ms)
#define DOWNSAMPLE_FACTOR 5    // Average every 5 readings for storage

// Data point structure - 12 bytes per reading
struct DataPoint {
  uint32_t timestamp;    // Unix timestamp (seconds since epoch)
  float temperature;     // Temperature in Celsius
  float humidity;        // Relative humidity percentage
};

// Circular buffer for raw data (high resolution)
DataPoint rawBuffer[BUFFER_SIZE];
int rawHead = 0;
int rawCount = 0;

// Circular buffer for downsampled data (long-term storage)
DataPoint downsampledBuffer[BUFFER_SIZE];
int dsHead = 0;
int dsCount = 0;

// Accumulator for downsampling
float tempAccum = 0;
float humidAccum = 0;
int accumCount = 0;

// Timing
unsigned long lastSampleTime = 0;
uint32_t startTime = 0;

15.10.4.2 Step 2: Implement the Circular Buffer Operations

Add these functions to manage the circular buffer efficiently:

// Add a data point to a circular buffer
void addToBuffer(DataPoint* buffer, int& head, int& count,
                 DataPoint point, int maxSize) {
  buffer[head] = point;
  head = (head + 1) % maxSize;
  if (count < maxSize) {
    count++;
  }
}

// Get the index of a point at a given offset from newest
int getIndex(int head, int count, int offset, int maxSize) {
  if (offset >= count) return -1;
  return (head - 1 - offset + maxSize) % maxSize;
}

// Calculate storage usage
void printStorageStats() {
  int rawBytes = rawCount * sizeof(DataPoint);
  int dsBytes = dsCount * sizeof(DataPoint);

  Serial.println("\n=== Storage Statistics ===");
  Serial.printf("Raw buffer: %d/%d points (%d bytes)\n",
                rawCount, BUFFER_SIZE, rawBytes);
  Serial.printf("Downsampled: %d/%d points (%d bytes)\n",
                dsCount, BUFFER_SIZE, dsBytes);
  Serial.printf("Total memory: %d bytes\n", rawBytes + dsBytes);
  Serial.printf("Compression ratio: %.1fx\n",
                rawCount > 0 ? (float)(rawCount * sizeof(DataPoint)) /
                              (dsCount * sizeof(DataPoint) + 1) : 1.0);
}

15.10.4.3 Step 3: Implement Sensor Reading and Downsampling

Add the sensor collection and downsampling logic:

// Simulate sensor readings (replace with real sensors)
DataPoint readSensors() {
  DataPoint dp;
  dp.timestamp = startTime + (millis() / 1000);
  dp.temperature = 25.0 + sin(millis() / 10000.0) * 5.0
                   + random(-10, 10) / 10.0;
  dp.humidity = 50.0 + cos(millis() / 15000.0) * 10.0
                + random(-5, 5) / 10.0;
  return dp;
}

// Collect a reading and accumulate for downsampling
void processSensorReading() {
  DataPoint reading = readSensors();
  addToBuffer(rawBuffer, rawHead, rawCount, reading, BUFFER_SIZE);

  tempAccum += reading.temperature;
  humidAccum += reading.humidity;
  accumCount++;

  if (accumCount >= DOWNSAMPLE_FACTOR) {
    DataPoint ds = { reading.timestamp,
                     tempAccum / accumCount,
                     humidAccum / accumCount };
    addToBuffer(downsampledBuffer, dsHead, dsCount, ds, BUFFER_SIZE);
    tempAccum = humidAccum = accumCount = 0;
  }
}

15.10.4.4 Step 4: Implement Query Commands

Add serial command processing for querying historical data:

// Query last N readings from a ring buffer
void queryLastN(DataPoint* buffer, int head, int count,
                int n, const char* name) {
  Serial.printf("\n=== Last %d from %s ===\n", n, name);
  int toShow = min(n, count);
  for (int i = 0; i < toShow; i++) {
    int idx = getIndex(head, count, i, BUFFER_SIZE);
    if (idx >= 0)
      Serial.printf("%lu\t%.2fC\t%.2f%%\n",
          buffer[idx].timestamp, buffer[idx].temperature,
          buffer[idx].humidity);
  }
}

// Min/max/avg statistics over a buffer
void calculateStats(DataPoint* buffer, int head, int count) {
  if (!count) { Serial.println("No data."); return; }
  float minT=999, maxT=-999, sumT=0, minH=999, maxH=-999, sumH=0;
  for (int i = 0; i < count; i++) {
    int idx = getIndex(head, count, i, BUFFER_SIZE);
    if (idx < 0) continue;
    float t = buffer[idx].temperature, h = buffer[idx].humidity;
    sumT += t; sumH += h;
    if (t < minT) minT = t; if (t > maxT) maxT = t;
    if (h < minH) minH = h; if (h > maxH) maxH = h;
  }
  Serial.printf("Temp  - Min:%.1f Max:%.1f Avg:%.1f\n", minT, maxT, sumT/count);
  Serial.printf("Humid - Min:%.1f Max:%.1f Avg:%.1f\n", minH, maxH, sumH/count);
}

// Print the available commands
void printHelp() {
  Serial.println("\nCommands:");
  Serial.println("  RAW N    - show last N raw readings");
  Serial.println("  DS N     - show last N downsampled readings");
  Serial.println("  STATS    - min/max/avg for both buffers");
  Serial.println("  STORAGE  - memory usage and compression ratio");
  Serial.println("  CLEAR    - reset all buffers");
  Serial.println("  HELP     - show this message");
}

// Serial command dispatcher (HELP, RAW N, DS N, STATS, STORAGE, CLEAR)
void processCommand(String cmd) {
  cmd.trim(); cmd.toUpperCase();
  if (cmd == "HELP")           printHelp();
  else if (cmd.startsWith("RAW "))
    queryLastN(rawBuffer, rawHead, rawCount, cmd.substring(4).toInt(), "Raw");
  else if (cmd.startsWith("DS "))
    queryLastN(downsampledBuffer, dsHead, dsCount, cmd.substring(3).toInt(), "DS");
  else if (cmd == "STATS")     { calculateStats(rawBuffer, rawHead, rawCount);
                                  calculateStats(downsampledBuffer, dsHead, dsCount); }
  else if (cmd == "STORAGE")   printStorageStats();
  else if (cmd == "CLEAR")     { rawHead = rawCount = dsHead = dsCount = 0;
                                  tempAccum = humidAccum = 0; accumCount = 0; }
}

15.10.4.5 Step 5: Setup and Main Loop

Complete the program with the setup and loop functions:

void setup() {
  Serial.begin(115200);
  delay(1000);

  // Initialize pseudo-random time (in real deployment, use NTP)
  startTime = 1704067200; // Jan 1, 2024 00:00:00 UTC

  Serial.println("\n====================================");
  Serial.println("  Time-Series Data Logger v1.0");
  Serial.println("====================================");
  Serial.println("Collecting sensor data...");
  Serial.printf("Sample interval: %d ms\n", SAMPLE_INTERVAL);
  Serial.printf("Downsample factor: %d\n", DOWNSAMPLE_FACTOR);
  Serial.printf("Buffer size: %d readings\n", BUFFER_SIZE);
  Serial.println("\nType HELP for available commands.\n");
}

void loop() {
  // Collect sensor data at specified interval
  unsigned long currentTime = millis();
  if (currentTime - lastSampleTime >= SAMPLE_INTERVAL) {
    lastSampleTime = currentTime;
    processSensorReading();
  }

  // Process serial commands
  if (Serial.available()) {
    String command = Serial.readStringUntil('\n');
    processCommand(command);
  }
}

15.10.5 Testing Your Implementation

  1. Start the simulator and open the Serial Monitor (Ctrl+Shift+M)
  2. Wait 10-15 seconds to collect some data
  3. Try these commands:
    • HELP - View all available commands
    • RAW 10 - Show the last 10 raw readings
    • DS 5 - Show the last 5 downsampled readings
    • STATS - View min/max/average statistics
    • STORAGE - See memory usage and compression ratio

15.10.6 Challenge Exercises

Challenge 1: Add a Third Sensor

Modify the code to include a third sensor (e.g., pressure or light level). Update the DataPoint structure and all related functions.

Hint: You will need to update the struct, the readSensors() function, and all print statements.

Challenge 2: Implement Time-Range Queries

Add a new command RANGE start end that queries data between two timestamps. For example, RANGE 1704067210 1704067220 should show all readings in that 10-second window.

Hint: Iterate through the buffer using getIndex() and compare each point’s timestamp against the start and end parameters. Parse the timestamps from the serial command using strtoul().

Challenge 3: Multi-Resolution Downsampling

Implement a three-tier storage system:

  • Raw data: 1-second resolution (last 60 readings)
  • Medium: 10-second averages (last 100 readings)
  • Long-term: 1-minute averages (last 100 readings)

This mirrors how production time-series databases like InfluxDB implement retention policies.

Hint: Create three separate circular buffers and three downsampling accumulators.
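
One possible shape for the cascade, using dynamic vectors for brevity; the class name and API are illustrative, not part of the lab code:

```cpp
#include <cstddef>
#include <vector>

// Three-tier store from the challenge: 60 raw 1-second readings,
// 100 ten-second averages, 100 one-minute averages.
class MultiResolutionStore {
 public:
  void addRaw(float v) {
    pushCapped(raw_, v, 60);
    midSum_ += v;
    if (++midCount_ == 10) {            // 10 raw -> one 10 s average
      float avg = midSum_ / 10.0f;
      midSum_ = 0; midCount_ = 0;
      pushCapped(medium_, avg, 100);
      longSum_ += avg;
      if (++longCount_ == 6) {          // 6 x 10 s -> one 1 min average
        pushCapped(longTerm_, longSum_ / 6.0f, 100);
        longSum_ = 0; longCount_ = 0;
      }
    }
  }
  std::size_t rawSize() const { return raw_.size(); }
  std::size_t mediumSize() const { return medium_.size(); }
  std::size_t longSize() const { return longTerm_.size(); }

 private:
  static void pushCapped(std::vector<float>& buf, float v, std::size_t cap) {
    buf.push_back(v);
    if (buf.size() > cap) buf.erase(buf.begin());  // drop the oldest
  }
  std::vector<float> raw_, medium_, longTerm_;
  float midSum_ = 0, longSum_ = 0;
  int midCount_ = 0, longCount_ = 0;
};
```

On the ESP32 you would replace the vectors with fixed arrays plus head/count indices, as in the lab's circular buffer, to avoid heap allocation.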

Challenge 4: Anomaly Detection

Add automatic anomaly detection that prints a warning when temperature or humidity readings deviate more than 2 standard deviations from the running average.

Hint: Maintain running sum and sum-of-squares to calculate variance efficiently without storing all historical values.
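
A sketch of the hint: running mean and variance from a sum and sum-of-squares, with no stored history (names are illustrative):

```cpp
#include <cmath>

// Running statistics from sum and sum-of-squares accumulators.
class RunningStats {
 public:
  void add(float v) { n_++; sum_ += v; sumSq_ += (double)v * v; }
  double mean() const { return n_ > 0 ? sum_ / n_ : 0.0; }
  double stddev() const {
    if (n_ < 2) return 0.0;
    // Sample variance via the computational formula.
    double var = (sumSq_ - sum_ * sum_ / n_) / (n_ - 1);
    return var > 0 ? std::sqrt(var) : 0.0;
  }
  // True when a reading deviates more than 2 sigma from the mean.
  bool isAnomaly(float v) const {
    double sd = stddev();
    return sd > 0 && std::fabs(v - mean()) > 2.0 * sd;
  }
 private:
  long n_ = 0;
  double sum_ = 0.0, sumSq_ = 0.0;
};
```

Note that the sum-of-squares formula can lose precision over long runs of nearly constant values; Welford's online algorithm is the numerically robust alternative if you ever see negative variance.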

15.10.7 What You Learned

In this lab, you implemented core time-series database concepts on a microcontroller:

| Concept | Implementation |
|---------|----------------|
| Timestamped data | Each reading includes a Unix timestamp |
| Circular buffer | Fixed-size buffer that overwrites oldest data |
| Downsampling | Averaging multiple readings to reduce storage |
| Query interface | Serial commands for data retrieval |
| Storage efficiency | Compression ratio tracking |

These same principles apply to production time-series databases like InfluxDB and TimescaleDB, which use similar techniques at much larger scale with additional optimizations like columnar compression and time-based partitioning.

15.11 Retention Strategy Design

This section synthesizes the patterns from the case study and worked examples into actionable guidance for designing your own retention strategies.

Scenario: A manufacturing facility deploys 250 vibration sensors on production machinery, each reporting at 10 Hz. Raw data volume threatens to overwhelm storage capacity and increase cloud costs by 300%. Design a retention strategy that balances operational needs with cost constraints.

Given:

  • Sensors: 250 vibration sensors
  • Sample rate: 10 Hz (10 readings/second per sensor)
  • Data size: 16 bytes per reading (timestamp + sensor_id + vibration_amplitude)
  • Current cloud storage cost: $0.023 per GB-month
  • Operational requirements: Real-time anomaly detection (last 24 hours), weekly trend analysis, annual compliance reporting

Step 1: Calculate baseline storage requirements

Daily raw data = 250 sensors × 10 readings/sec × 86,400 sec/day × 16 bytes
             = 250 × 864,000 × 16 = 3,456,000,000 bytes/day
             = 3.22 GB/day raw

Annual raw storage = 3.22 GB/day × 365 days = 1,175 GB/year (1.15 TB)
Monthly cost after one year (no retention policy) = 1,175 GB × $0.023 = $27.03/month, and growing indefinitely

Step 2: Design multi-tier retention policy

Tier 1 (Hot): Raw 10 Hz data, 24-hour retention
  Storage = 3.22 GB/day × 1 day = 3.22 GB
  Use case: Real-time anomaly detection (FFT analysis requires high-resolution)

Tier 2 (Warm): 1-second averages (10→1 Hz downsampling), 7-day retention
  Reduction = 10:1 temporal × 2:1 compression = 20:1 total
  Storage = 3.22 GB/day ÷ 10 × 7 days × 0.5 (compression) = 1.13 GB
  Use case: Recent operational diagnostics

Tier 3 (Cool): 1-minute averages, 90-day retention
  Downsampling = 600:1 (from raw 10 Hz)
  Storage = 3.22 GB/day ÷ 600 × 90 days × 0.5 = 0.24 GB
  Use case: Weekly trend analysis

Tier 4 (Archive): Daily aggregates (min/max/avg/p95), 7-year retention
  Downsampling = 864,000:1
  Storage = 3.22 GB/day ÷ 864,000 × 2,555 days × 0.5 = 0.005 GB (5 MB)
  Use case: Annual compliance reports

Step 3: Calculate total storage and cost savings

Total storage with retention policy:
  Tier 1: 3.22 GB
  Tier 2: 1.13 GB
  Tier 3: 0.24 GB
  Tier 4: 0.005 GB (negligible)
  Total: 4.59 GB

Monthly cost = 4.59 GB × $0.023 = $0.11/month
Savings vs. no policy = ($27.03 - $0.11) / $27.03 = 99.6% cost reduction

Storage efficiency: 1,175 GB (annual raw) vs. 4.59 GB (steady state) = 256:1 reduction
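
The tier arithmetic above can be double-checked with a few lines of code ("GB" here means GiB, which matches the 3.22 GB/day figure in the text):

```cpp
#include <cmath>

const double GiB = 1024.0 * 1024.0 * 1024.0;

// 250 sensors x 10 Hz x 86,400 s/day x 16 bytes per reading.
const double dailyRawGB = 250.0 * 10 * 86400 * 16 / GiB;

// Steady-state size of one tier: daily volume reduced by the
// downsampling ratio, held for `days`, scaled by compression.
double tierGB(double downsample, double days, double compression) {
  return dailyRawGB / downsample * days * compression;
}

double totalRetainedGB() {
  return tierGB(1, 1, 1.0)            // hot: raw, 24 h
       + tierGB(10, 7, 0.5)           // warm: 1 s averages, 7 days
       + tierGB(600, 90, 0.5)         // cool: 1 min averages, 90 days
       + tierGB(864000, 2555, 0.5);   // archive: daily, 7 years
}
```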

Step 4: Validate against operational requirements

  • ✓ Real-time anomaly detection: 24-hour hot tier at full 10 Hz resolution
  • ✓ Weekly trend analysis: 90-day cool tier with 1-minute granularity
  • ✓ Annual compliance: 7-year archive with daily summaries

Result: The factory achieves a 99.6% storage cost reduction while meeting all operational and compliance requirements. The multi-tier retention strategy delivers a 256:1 reduction in stored data without losing analytical capability.

Key Insight: Retention policies must align with data access patterns. High-resolution data is only valuable for recent analysis—historical trends require far less granularity. Always implement retention policies from day one; retrofitting them on existing petabyte-scale deployments is prohibitively expensive.

When to use aggressive downsampling (>100:1 ratio):

  • Data access frequency drops exponentially with age (most queries target last 24 hours)
  • Historical analysis requires only trends, not per-second precision
  • Storage costs are primary constraint
  • Example: Temperature monitoring in non-critical applications (office HVAC)

When to keep full-resolution longer (minimal downsampling):

  • Forensic analysis requires exact replay of events (security incidents, equipment failures)
  • Regulatory compliance mandates audit trails at original resolution
  • Anomaly detection algorithms need high-frequency patterns years later
  • Example: Industrial safety systems, medical device monitoring, financial trading

Signal characteristics matter:

| Signal Type | Downsampling Strategy | Reasoning |
|-------------|----------------------|-----------|
| Slowly varying (temperature, humidity) | Aggressive 60:1 after 7 days | Values change gradually; 1-minute averages preserve trends |
| Periodic/cyclical (vibration, AC voltage) | Moderate 10:1, preserve frequency components | FFT analysis needs sufficient sample rate (Nyquist theorem) |
| Event-driven (door sensors, motion) | Store events only, not timestamps with no events | 1000:1+ by eliminating null readings |
| Transient/spike-heavy (pressure surges) | Keep raw data longer (30 days), max/min aggregates | Downsampling can miss critical spikes |

Cost-benefit threshold calculation:

Retention extension justified if:
  (Cost of storage per month) < (Probability of needing data × Value of insight)

Example: Keeping 1 TB extra data costs $23/month.
If there's a 5% chance per year of needing that data to diagnose a $50K equipment failure:
  $23 × 12 months = $276/year < (0.05 × $50,000) = $2,500
  → KEEP the data
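
The rule of thumb above can be expressed as a small helper: extending retention is justified when annual storage cost stays below the expected value of the insight the retained data could provide (function and parameter names are illustrative):

```cpp
// Cost-benefit test for extending a retention window.
bool retentionJustified(double monthlyStorageCost,
                        double probabilityPerYear,
                        double insightValue) {
  double annualCost = monthlyStorageCost * 12.0;
  double expectedValue = probabilityPerYear * insightValue;
  return annualCost < expectedValue;
}
```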

Action checklist before implementing retention: the five-step fix below doubles as a pre-implementation checklist.

Common Mistake: Implementing Retention Policies Without Validating Query Patterns

The Mistake: A team implements aggressive 30-day retention and 100:1 downsampling based on “typical” IoT recommendations, without auditing their actual query patterns. Six months later, the compliance team discovers annual safety audits require per-second vibration data going back 3 years—data that was irreversibly deleted.

Why It Happens:

  • Teams copy retention policies from blog posts or vendor examples without customization
  • “Typical” IoT use cases are too generic (industrial safety ≠ smart home)
  • Compliance requirements are documented in legal PDFs, not engineering wikis
  • Analytics teams don’t communicate data needs to DevOps until too late

The Consequences:

  • Compliance audit failure requiring expensive manual workarounds ($150K incident at one manufacturer)
  • Inability to diagnose recurring equipment failures (root cause analysis requires historical patterns)
  • Costly data recovery attempts from backups (if backups even exist)
  • Reputational damage from admitting data loss to regulators

The Fix:

  1. Audit query patterns FIRST: Instrument your analytics queries for 30 days

    -- Example audit query for TimescaleDB
    SELECT
      DATE_TRUNC('day', query_start) as day,
      MIN(data_age_hours) as oldest_data_accessed,
      AVG(data_age_hours) as avg_data_age,
      COUNT(*) as query_count
    FROM query_log
    WHERE query_text LIKE '%sensor_readings%'
    GROUP BY DATE_TRUNC('day', query_start)
    ORDER BY day DESC;
  2. Interview stakeholders: Meet with compliance, operations, analytics teams

    • “What’s the oldest data you’ve queried in the last year?”
    • “What would happen if we deleted data older than X days?”
    • “Are there regulatory retention requirements we don’t know about?”
  3. Test retention on staging data: Implement policy on 6-month subset, validate impact

    • Run historical queries against downsampled data—do results change?
    • Check for edge cases (anomaly detection at window boundaries)
  4. Implement with safety margins: Add 2x buffer to calculated requirements

    • If compliance needs 1 year, keep 2 years
    • If 90% of queries access last 30 days, keep 60 days at full resolution
  5. Version your retention policies: Treat like database schemas

    # retention-policy-v2.yaml
    version: 2
    effective_date: 2026-03-01
    policies:
      hot_tier:
        resolution: raw
        duration: 48h  # Was 24h in v1 (compliance requirement discovered)
      warm_tier:
        resolution: 1min_avg
        duration: 90d  # Was 30d in v1

How One Team Caught This: A smart factory scheduled a “retention policy dry run” where they logged what WOULD be deleted without actually deleting it. The log showed 147 queries in one month accessing data older than the proposed 30-day retention window—including critical safety audits. Policy was adjusted to 2-year retention before going live.

Key Prevention: Never implement retention based on assumptions. Instrument actual data access patterns for at least 30 days, interview all stakeholders, and add safety margins. Data you delete is gone forever—storage costs $0.02/GB-month, but data loss costs are measured in compliance fines and lost diagnostics.

Tesla has 1.5 million cars sending data – the Sensor Squad learns how they handle it!

Sammy the Sensor is amazed: “Every Tesla car has 200 sensors! That is 300 MILLION measurements per second from all the cars combined!”

“That is like every person on Earth sending a text message every 26 seconds!” gasps Lila the LED.

But here is the mind-blowing part: If Tesla saved EVERY single reading, they would need 829 TERABYTES of storage PER DAY. That is like filling 200,000 smartphones with data every single day!

“That is impossible!” says Bella the Battery. “How do they do it?”

Max the Microcontroller explains Tesla’s three clever tricks:

Trick 1: Edge Processing (Think at the Car!) Instead of sending every single reading, the car itself does some thinking first. “I calculate the average battery temperature every minute and only send THAT. Instead of 60 readings per minute, I send just 1!”

Trick 2: Adaptive Sampling (Pay Attention When It Matters!) “When the car is parked, I check every 60 seconds – boring! But when the driver slams the brakes, I check 1,000 times per second! That is like taking a photo every minute during a boring class, but taking a video during the exciting school play!”

Trick 3: Tiered Storage (The Closet System!)

  • Today’s data: Keep everything! (like clothes you wear this week)
  • Last month: Keep hourly summaries (like clothes in your closet)
  • Last year: Keep daily summaries (like clothes in the attic)
  • Really old: Keep weekly summaries (like clothes at grandma’s house)

With all three tricks, Tesla’s 302 PB/year shrinks to about 15 PB/year – that is a 95% reduction! Instead of needing 829 terabytes EVERY DAY, they need about 41 terabytes – still a lot, but way more manageable!

“And they can STILL figure out exactly what happened during any crash,” adds Sammy, “because the car keeps detailed data locally for emergencies. Smart!”

15.11.1 Try This at Home!

Pretend you are a Tesla sensor! Write down the temperature every minute for 10 minutes (your “raw data”). Now make a summary: what was the average? The highest? The lowest? Your summary is just 3 numbers instead of 10 – that is a 70% reduction! In a real car, this happens with 200 sensors at once, saving massive amounts of storage.

Common Pitfalls

A dataset of 1,000 rows will not reveal the performance differences between good and bad schema designs. Practice with at least 1 million rows (roughly one day of 10-sensor, 1-second data, or 864,000 rows) to observe the actual impact of chunk exclusion, compression, and index choice. TimescaleDB’s sample datasets provide realistic scale for practice.
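
One way to produce a realistically sized practice table is to generate deterministic synthetic rows and bulk-load the resulting CSV. Everything below (the column order and the sinusoidal temperature model) is an illustrative assumption:

```cpp
#include <cstdint>
#include <cstdio>
#include <cmath>
#include <string>

const double kPi = 3.14159265358979;

// One CSV row per sensor per second: timestamp,sensor_id,temperature.
// A daily sinusoid plus deterministic jitter keeps the values
// realistically compressible.
std::string makeRow(int sensorId, uint32_t ts) {
  double jitter = 0.3 * std::sin(ts * 0.7 + sensorId);
  double temp = 20.0
              + 5.0 * std::sin(2 * kPi * (ts % 86400) / 86400.0)
              + jitter;
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%u,%d,%.2f", ts, sensorId, temp);
  return std::string(buf);
}

// Rows produced by a run: one per sensor per simulated second.
long rowCount(int sensors, long seconds) {
  return (long)sensors * seconds;
}
```

Looping makeRow over 10 sensors and 86,400 seconds yields 864,000 rows, roughly the million-row scale suggested above; write the output to a file and load it with your TSDB's bulk-import tool.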

Adding a compression policy schedules compression but does not run it immediately. After creating a policy, explicitly call SELECT compress_chunk(c) FROM show_chunks('readings') c and verify with SELECT * FROM chunk_compression_stats('readings'). Assuming compression is active without verification is a common cause of unexpectedly high storage consumption in practice.

Slow practice queries are valuable learning opportunities, not just obstacles. Run EXPLAIN ANALYZE on any query taking more than 100ms to understand whether chunk exclusion is working, whether the correct index is being used, and whether a continuous aggregate would eliminate the bottleneck. Developing this diagnostic habit in practice prevents production performance surprises.

15.12 Summary

This chapter applied time-series concepts through practical examples and hands-on exercises spanning Tesla-scale fleet telemetry, smart building deployments, smart grid query optimization, time synchronization for distributed systems, and embedded data logging on ESP32.

Key Takeaways:

  1. Edge processing is essential at scale: Tesla’s ~95% data reduction (302 PB to ~15 PB/year) comes primarily from on-vehicle aggregation and adaptive sampling, not just cloud compression.

  2. Adaptive strategies match data value: Sample faster during interesting events (hard braking at 100 Hz), slower during routine operation (parked every 5 minutes).

  3. Standard tools handle most workloads: Start with InfluxDB or TimescaleDB – only build custom when you exceed their limits. A 20-story building’s 5.3M points/day (62 writes/second) is well within standard TSDB capacity.

  4. Embedded systems can implement TSDB concepts: Circular buffers, downsampling, and time-based queries work on microcontrollers with as little as 2.4 KB of RAM (two 100-element buffers at 12 bytes each).

  5. Time synchronization requires planning: Low-cost oscillators (100 ppm) drift 360 ms/hour – calculate drift budgets, NTP sync intervals, and design for connectivity gaps.

  6. Retention policies must be validated against actual query patterns: Never implement retention based on assumptions. Audit data access patterns for at least 30 days and interview all stakeholders before deleting data.
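
Takeaway 5's drift budget can be turned into a quick calculation; the 100 ppm figure comes from the summary, while the error budget below is an example value, not a recommendation:

```cpp
#include <cmath>

// A crystal rated at `ppm` parts per million gains or loses up to
// ppm microseconds per second of elapsed time.
double driftPerHourMs(double ppm) {
  return ppm * 1e-6 * 3600.0 * 1000.0;   // worst-case ms per hour
}

// Longest interval between NTP syncs (seconds) that keeps
// accumulated drift under a given error budget.
double maxSyncIntervalSec(double ppm, double errorBudgetMs) {
  return (errorBudgetMs / 1000.0) / (ppm * 1e-6);
}
```

With a 100 ppm oscillator and a 50 ms timestamp-accuracy budget, this works out to a sync roughly every 500 seconds, or about every 8 minutes.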

15.13 Concept Relationships

Prerequisites - Complete these first:

  • Time-Series Fundamentals - TSDB architecture understanding
  • Time-Series Platforms - InfluxDB/TimescaleDB features
  • Time-Series Queries - Query optimization techniques

Hands-On Labs:

  • This chapter’s ESP32 lab - Circular buffers and downsampling on microcontrollers
  • IoT Security Labs - Secure data transmission

15.14 What’s Next

You have now practiced time-series concepts through hands-on labs, worked examples, and real-world case studies. Choose your next topic based on what you want to explore:

| If you want to… | Read this next |
|-----------------|----------------|
| Process IoT data in real-time before it reaches the database | Stream Processing |
| Detect anomalies and unusual patterns in sensor data | Anomaly Detection |
| Explore edge computing patterns like Tesla’s on-vehicle aggregation | Edge Compute Patterns |
| Review additional storage design worked examples | Data Storage Worked Examples |
| Revisit time-series query optimization techniques | Query Optimization for IoT |