1290  Time-Series Database Platforms

1290.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Compare InfluxDB, TimescaleDB, and Prometheus architectures and use cases
  • Evaluate platform trade-offs for specific IoT deployment requirements
  • Select the right time-series database for your application domain
  • Write basic queries in Flux (InfluxDB), SQL (TimescaleDB), and PromQL (Prometheus)
  • Understand the licensing, clustering, and ecosystem differences between platforms
TipMVU: Minimum Viable Understanding

Core concept: Choose InfluxDB for maximum write throughput (500K+ pts/sec), TimescaleDB when you need SQL compatibility and joins with business data, or Prometheus for Kubernetes-native infrastructure monitoring. Why it matters: The wrong platform choice locks you into limitations–InfluxDB cannot join sensor data with customer records, Prometheus only retains 15-30 days, and TimescaleDB cannot match InfluxDB’s raw write speed. Key takeaway: Start with the question “Do I need to join time-series data with relational business data?” If yes, use TimescaleDB; if no and write volume exceeds 300K pts/sec, use InfluxDB; for infrastructure monitoring with alerting, use Prometheus.

1290.2 Comparing Major Platforms

Estimated time: ~20 min | Difficulty: Advanced | Unit: P10.C15.U03

Three dominant platforms serve different IoT use cases: InfluxDB (pure time-series), TimescaleDB (PostgreSQL compatibility), and Prometheus (monitoring and alerting).

1290.2.1 InfluxDB: Pure Time-Series Platform

Architecture:

  • Purpose-built time-series database from the ground up
  • Custom query language (Flux) optimized for time-series operations
  • Built-in retention policies and continuous queries (downsampling)
  • Schema-less tags and fields model

Data Model:

measurement: temperature
tags: {sensor_id=temp_001, location=building_a, floor=3}
fields: {value=23.5, battery=85}
timestamp: 2025-12-15T10:00:00Z
  • Measurement: Like a table (e.g., “temperature”, “humidity”)
  • Tags: Indexed metadata (dimensions for grouping)
  • Fields: Actual values (not indexed)
  • Timestamp: Automatic time index

Strengths:

  • Extreme write performance: 500,000+ points/second single node
  • Excellent compression: 20:1 to 30:1 typical
  • Built-in downsampling: Automatic aggregation to lower resolutions
  • Time-aware functions: derivative(), moving_average(), rate()
  • Retention policies: Automatic data expiration

Trade-offs:

  • Custom query language (learning curve vs. SQL)
  • No joins with external data (pure time-series only)
  • Enterprise features require paid license (clustering, high availability)

Ideal Use Case:

High-velocity IoT metrics collection where all data is time-series and you need maximum write performance.

Example Flux Query:

from(bucket: "iot_sensors")
  |> range(start: -1h)
  |> filter(fn: (r) => r["_measurement"] == "temperature")
  |> filter(fn: (r) => r["location"] == "building_a")
  |> aggregateWindow(every: 5m, fn: mean)
  |> yield(name: "mean")

1290.2.2 TimescaleDB: PostgreSQL-Compatible Time-Series

Architecture:

  • Extension to PostgreSQL (inherits full SQL support)
  • Automatic time-based chunking (hypertables)
  • Continuous aggregates for downsampling
  • Full PostgreSQL ecosystem compatibility

Data Model:

CREATE TABLE sensor_data (
  time        TIMESTAMPTZ NOT NULL,
  sensor_id   TEXT NOT NULL,
  location    TEXT,
  temperature DOUBLE PRECISION,
  humidity    DOUBLE PRECISION
);

SELECT create_hypertable('sensor_data', 'time');
  • Hypertable: Abstraction that automatically partitions by time
  • Chunks: Underlying tables (e.g., data_2025_12_week1, data_2025_12_week2)
  • Standard SQL: Use normal PostgreSQL queries

Strengths:

  • SQL compatibility: Existing PostgreSQL knowledge applies
  • Joins with relational data: Combine time-series with user accounts, device metadata
  • Rich ecosystem: Existing PostgreSQL tools (pgAdmin, Grafana, etc.)
  • Full ACID compliance: Transactions when needed
  • Good compression: 10:1 to 15:1 typical

Trade-offs:

  • Lower write throughput than pure time-series DBs (200,000 points/second)
  • More complex setup (PostgreSQL tuning required)
  • Heavier resource usage (PostgreSQL overhead)

Ideal Use Case:

IoT systems that need to correlate time-series data with business data (customer accounts, product catalogs, billing information).

Example SQL Query:

SELECT
  time_bucket('5 minutes', time) AS bucket,
  sensor_id,
  AVG(temperature) as avg_temp,
  MAX(temperature) as max_temp
FROM sensor_data
WHERE time > NOW() - INTERVAL '1 hour'
  AND location = 'building_a'
GROUP BY bucket, sensor_id
ORDER BY bucket DESC;

1290.2.3 Prometheus: Monitoring and Alerting Platform

Architecture:

  • Pull-based metric collection (scrapes exporters)
  • Local time-series database (not designed for long-term storage)
  • Built-in alerting engine (Alertmanager)
  • Kubernetes-native integration

Data Model:

http_requests_total{method="GET", endpoint="/api/sensors", status="200"} 1547 @ 1639564800
  • Metric name: http_requests_total
  • Labels: {method="GET", endpoint="/api/sensors", status="200"}
  • Value: 1547
  • Timestamp: 1639564800 (Unix timestamp)

Strengths:

  • Pull-based: Resilient to client failures (server controls collection)
  • Service discovery: Automatically discovers targets in Kubernetes
  • Built-in alerting: Define rules, route to on-call engineers
  • Powerful query language: PromQL optimized for monitoring
  • Kubernetes standard: De facto observability for containerized systems

Trade-offs:

  • Short-term storage: Typically 15-30 days (not for long-term IoT data)
  • No built-in downsampling: Use external systems (Thanos, Cortex)
  • Pull model limitation: Requires network access to scrape targets
  • Limited compression: Optimized for recency, not long-term storage

Ideal Use Case:

Infrastructure monitoring, Kubernetes observability, short-term alerting on IoT gateway health.

Example PromQL Query:

rate(sensor_readings_total{location="building_a"}[5m])

1290.2.4 Platform Comparison Table

Feature InfluxDB TimescaleDB Prometheus
Write Throughput 500K+ pts/sec 200-300K pts/sec 1M+ pts/sec
Query Language Flux (custom) SQL (PostgreSQL) PromQL (custom)
Compression 20:1 to 30:1 10:1 to 15:1 5:1 to 10:1
Built-in Downsampling Yes Yes (continuous aggregates) No (external)
Long-term Storage Excellent Excellent Limited (15-30 days)
Joins with Relational Data No Yes (full PostgreSQL) No
Alerting Via Kapacitor Via external tools Built-in (Alertmanager)
Best For Pure IoT time-series IoT + business data Infrastructure monitoring
Open Source License MIT (Community) Apache 2.0 Apache 2.0
Clustering/HA Enterprise only PostgreSQL replication External (Thanos/Cortex)
WarningTradeoff: InfluxDB vs TimescaleDB for IoT Deployments

Option A: InfluxDB - Purpose-built time-series database - Write throughput: 500K+ points/second (single node) - Compression ratio: 20:1 to 30:1 (excellent) - Query language: Flux (custom, learning curve) - SQL joins: Not supported (pure time-series only) - Licensing: MIT (Community), Enterprise paid for clustering - Best for: High-velocity pure metrics without relational data needs

Option B: TimescaleDB - PostgreSQL with time-series extension - Write throughput: 200K-300K points/second (single node) - Compression ratio: 10:1 to 15:1 (good) - Query language: Standard SQL (familiar to most teams) - SQL joins: Full PostgreSQL compatibility - Licensing: Apache 2.0 (open source), replication included - Best for: IoT analytics that correlate sensor data with business data

Decision Factors: - Choose InfluxDB when: Write volume exceeds 300K pts/sec, all data is time-series (no joins needed), team willing to learn Flux, maximum compression is critical (storage-constrained) - Choose TimescaleDB when: Need to join sensor data with user accounts/billing/metadata, team has strong SQL skills, using PostgreSQL ecosystem tools (pgAdmin, Grafana PostgreSQL datasource), compliance requires familiar auditable database - Cost comparison (5,000 sensors, 1 reading/sec, 1 year): InfluxDB Cloud ~$200/month; TimescaleDB Cloud ~$300/month; Self-hosted either ~$150/month (3-node cluster)

CautionPitfall: Tag Cardinality Explosion in InfluxDB

The Mistake: Using high-cardinality values as tags (device UUIDs, user IDs, session tokens) instead of fields, causing memory exhaustion and query performance collapse.

Why It Happens: Tags are indexed for fast filtering, so developers instinctively make every queryable attribute a tag. They don’t realize InfluxDB stores one time-series per unique tag combination in memory. With 100,000 devices as tags, you create 100,000 in-memory series–even if each device has only one reading.

The Fix:

# WRONG: device_id as tag creates 100K series
measurement: temperature
tags: {device_id=uuid-12345-...}  # High cardinality!
fields: {value=23.5}

# CORRECT: device_id as field, use coarse tags
measurement: temperature
tags: {region=us-west, building=hq}  # Low cardinality
fields: {value=23.5, device_id="uuid-12345-..."}

Cardinality limits to remember: - InfluxDB OSS: ~1 million series before severe degradation - InfluxDB Cloud: Charged per series (100K series = ~$50/month overhead) - Rule of thumb: Tags should have <10,000 unique values - Query by high-cardinality field: Use |> filter(fn: (r) => r.device_id == "uuid-...")

1290.2.5 Selection Decision Tree

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor':'#E8F4F8','primaryTextColor':'#2C3E50','primaryBorderColor':'#16A085','lineColor':'#16A085','secondaryColor':'#FEF5E7','tertiaryColor':'#F4ECF7','edgeLabelBackground':'#ffffff','textColor':'#2C3E50','fontSize':'14px'}}}%%
flowchart TD
    A[IoT Database Selection] --> B{Need to join with<br/>relational data?}

    B -->|Yes| C[TimescaleDB<br/>SQL compatibility essential]
    B -->|No| D{Primary use case?}

    D -->|Long-term IoT storage| E{Write volume?}
    D -->|Infrastructure monitoring| F[Prometheus<br/>Kubernetes native]

    E -->|Very high<br/>>300K pts/sec| G[InfluxDB<br/>Maximum throughput]
    E -->|Moderate<br/><300K pts/sec| H{Team skills?}

    H -->|Strong SQL| C
    H -->|Willing to learn Flux| G

    style A fill:#E8F4F8
    style C fill:#16A085,color:#fff
    style F fill:#E67E22,color:#fff
    style G fill:#2C3E50,color:#fff

Figure 1290.1: Time Series Database Selection Decision Tree

%% fig-alt: "Side-by-side platform comparison showing InfluxDB, TimescaleDB, and Prometheus as three parallel tracks. Each track shows the platform's strengths: InfluxDB excels at write throughput (500K+ pts/sec), 20-30x compression, and pure time-series workloads. TimescaleDB provides SQL compatibility, PostgreSQL joins, and 10-15x compression for mixed workloads. Prometheus offers Kubernetes-native monitoring, built-in alerting, but limited to 15-30 day retention. Shows when to choose each platform based on use case alignment."
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor':'#E8F4F8','primaryTextColor':'#2C3E50','primaryBorderColor':'#16A085','lineColor':'#16A085','secondaryColor':'#FEF5E7','tertiaryColor':'#F4ECF7','edgeLabelBackground':'#ffffff','textColor':'#2C3E50','fontSize':'14px'}}}%%
flowchart TB
    subgraph Influx["InfluxDB"]
        I1[500K+ writes/sec]
        I2[20-30x compression]
        I3[Flux query language]
        I4[Pure time-series]
        I5[No SQL joins]
        I1 --> I2 --> I3 --> I4 --> I5
    end

    subgraph Timescale["TimescaleDB"]
        T1[200-300K writes/sec]
        T2[10-15x compression]
        T3[Full SQL support]
        T4[Joins with PostgreSQL]
        T5[Familiar tooling]
        T1 --> T2 --> T3 --> T4 --> T5
    end

    subgraph Prometheus["Prometheus"]
        P1[1M+ samples/sec]
        P2[5-10x compression]
        P3[PromQL language]
        P4[Built-in alerting]
        P5[15-30 day retention]
        P1 --> P2 --> P3 --> P4 --> P5
    end

    UseCase{Your Use Case}
    UseCase -->|High-volume sensors<br/>Pure metrics| Influx
    UseCase -->|IoT + Business data<br/>SQL preferred| Timescale
    UseCase -->|K8s infrastructure<br/>Short-term alerts| Prometheus

    style Influx fill:#2C3E50,stroke:#16A085,color:#fff
    style Timescale fill:#16A085,stroke:#2C3E50,color:#fff
    style Prometheus fill:#E67E22,stroke:#2C3E50,color:#fff
    style UseCase fill:#7F8C8D,stroke:#2C3E50,color:#fff

Figure 1290.2: Platform Comparison: InfluxDB vs TimescaleDB vs Prometheus strengths and trade-offs

This view maps IoT application domains to optimal database choices:

%% fig-alt: "Decision matrix showing IoT application domains mapped to time-series database recommendations. Industrial IoT with high-velocity sensors and predictive maintenance maps to InfluxDB for maximum throughput. Smart Buildings with energy management and occupancy analytics maps to TimescaleDB for SQL joins with tenant/billing data. DevOps and Infrastructure with container monitoring and Kubernetes observability maps to Prometheus for native alerting. Connected Vehicles with fleet tracking and telematics maps to hybrid InfluxDB plus TimescaleDB approach. Healthcare IoT with patient monitoring and medical devices maps to TimescaleDB for compliance and audit requirements."
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor':'#E8F4F8','primaryTextColor':'#2C3E50','primaryBorderColor':'#16A085','lineColor':'#16A085','secondaryColor':'#FEF5E7','tertiaryColor':'#F4ECF7','edgeLabelBackground':'#ffffff','textColor':'#2C3E50','fontSize':'14px'}}}%%
flowchart LR
    subgraph Domains["IoT Application Domains"]
        IIoT["Industrial IoT<br/>High-velocity sensors<br/>Predictive maintenance"]
        Smart["Smart Buildings<br/>Energy management<br/>Occupancy analytics"]
        DevOps["DevOps/Infrastructure<br/>Container monitoring<br/>K8s observability"]
        Fleet["Connected Vehicles<br/>Fleet tracking<br/>Telematics"]
        Health["Healthcare IoT<br/>Patient monitoring<br/>Medical devices"]
    end

    subgraph Databases["Recommended Database"]
        Influx["InfluxDB<br/>Max throughput"]
        Timescale["TimescaleDB<br/>SQL + Joins"]
        Prom["Prometheus<br/>Native alerting"]
        Hybrid["Hybrid<br/>InfluxDB + TimescaleDB"]
    end

    IIoT --> Influx
    Smart --> Timescale
    DevOps --> Prom
    Fleet --> Hybrid
    Health --> Timescale

    style IIoT fill:#2C3E50,stroke:#16A085,color:#fff
    style Smart fill:#16A085,stroke:#2C3E50,color:#fff
    style DevOps fill:#E67E22,stroke:#2C3E50,color:#fff
    style Fleet fill:#7F8C8D,stroke:#2C3E50,color:#fff
    style Health fill:#16A085,stroke:#2C3E50,color:#fff
    style Influx fill:#2C3E50,stroke:#16A085,color:#fff
    style Timescale fill:#16A085,stroke:#2C3E50,color:#fff
    style Prom fill:#E67E22,stroke:#2C3E50,color:#fff
    style Hybrid fill:#7F8C8D,stroke:#2C3E50,color:#fff

Different IoT domains have distinct requirements that align with specific database strengths.

%% fig-alt: "Layered architecture diagram of time-series database internals showing four tiers from top to bottom: Query Layer (Flux/SQL/PromQL parser, optimizer, executor), Index Layer (time-based partitioning, inverted indexes for tags, bloom filters for fast lookups), Storage Layer (LSM tree with memtable, WAL, and SST files at multiple compaction levels), and Compression Layer (delta encoding for timestamps, dictionary encoding for tags, run-length for repeated values). Arrows show write path flowing down and query path flowing up through all layers."
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor':'#E8F4F8','primaryTextColor':'#2C3E50','primaryBorderColor':'#16A085','lineColor':'#16A085','secondaryColor':'#FEF5E7','tertiaryColor':'#F4ECF7','edgeLabelBackground':'#ffffff','textColor':'#2C3E50','fontSize':'14px'}}}%%
flowchart TB
    subgraph Query["Query Layer"]
        Q1[Query Parser<br/>Flux/SQL/PromQL]
        Q2[Optimizer<br/>Time-range pruning]
        Q3[Executor<br/>Parallel scan]
    end

    subgraph Index["Index Layer"]
        I1[Time Partitions<br/>Daily/Weekly chunks]
        I2[Tag Indexes<br/>Inverted index]
        I3[Bloom Filters<br/>Fast existence check]
    end

    subgraph Storage["Storage Layer (LSM Tree)"]
        S1[Memtable<br/>In-memory buffer]
        S2[WAL<br/>Write-ahead log]
        S3[SST Files<br/>Sorted string tables]
    end

    subgraph Compress["Compression Layer"]
        C1[Delta Encoding<br/>Timestamps]
        C2[Dictionary<br/>Tag values]
        C3[Run-Length<br/>Repeated values]
    end

    Write[Sensor Data<br/>Writes] --> S1
    S1 --> S2
    S2 --> S3
    S3 --> Compress

    Read[Dashboard<br/>Queries] --> Q1
    Q1 --> Q2
    Q2 --> Q3
    Q3 --> Index
    Index --> Storage

    style Query fill:#27AE60,stroke:#2C3E50,color:#fff
    style Index fill:#16A085,stroke:#2C3E50,color:#fff
    style Storage fill:#2C3E50,stroke:#16A085,color:#fff
    style Compress fill:#E67E22,stroke:#2C3E50,color:#fff
    style Write fill:#E74C3C,stroke:#2C3E50,color:#fff
    style Read fill:#3498DB,stroke:#2C3E50,color:#fff

Figure 1290.3: Time-Series Database Internal Architecture: From query parsing through indexing to LSM storage and compression

1290.3 Summary

Platform selection is one of the most important decisions in time-series database architecture.

Key Takeaways:

  1. InfluxDB: Maximum write throughput (500K+ pts/sec), excellent compression (20-30x), purpose-built for pure time-series. Choose when write volume is extreme and no relational joins are needed.

  2. TimescaleDB: PostgreSQL compatibility with 200-300K pts/sec write capacity. Choose when you need SQL joins with business data or your team has strong SQL skills.

  3. Prometheus: Kubernetes-native monitoring with built-in alerting. Choose for infrastructure monitoring with 15-30 day retention needs.

  4. Avoid cardinality explosion: In InfluxDB, keep tag cardinality under 10,000 unique values. Use fields for high-cardinality identifiers.

  5. Domain alignment matters: Industrial IoT favors InfluxDB, smart buildings favor TimescaleDB, DevOps favors Prometheus. Match platform strengths to your use case.

1290.4 What’s Next

In the next chapter on Data Retention and Downsampling, we’ll explore how to implement multi-tier retention policies that reduce storage by 95-98% while preserving analytical capabilities, and address critical time synchronization challenges in distributed IoT systems.