Data Storage

Storage Roles, Time-Series Workloads, Lifecycle, and Release Evidence

Time-series databases, CAP theorem, sharding, cloud storage, serialization, and data persistence

Your guide: Data Dora

“Every reading has a time and a cost — decide retention before you decide the database.”

Data storage turns device messages into durable evidence. A storage design has to answer more than “which database should we use?” It must explain what each record represents, who owns it, how it is queried, how long it stays, how quality problems are handled, and what proof a release must produce before it changes the pipeline.

This book contains fourteen content chapters plus this orientation page. The sidebar lists every chapter in reading order. Use this page when you need a faster path to the right topic.

Start With the Storage Job

Picture the first release review for a new IoT product. The dashboard looks useful, but nobody can yet explain which store owns the device registry, which one owns raw history, which cache can be rebuilt, or how an archived day would come back during an incident. This module is the map for that conversation: name the job of each data shape before choosing a database.

The first decision is the job of the data, not the product name. A device registry, telemetry store, event log, latest-value cache, object store, and archive can all appear in the same IoT platform, but they have different query patterns and failure modes.
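One way to make the roles concrete during a review is to write them down as data. The sketch below is illustrative only: the role names come from the paragraph above, while the `RoleProfile` fields, the example values, and the `unowned_roles` helper are assumptions made for this example, not part of any specific platform.

```python
from dataclasses import dataclass
from enum import Enum


class StorageRole(Enum):
    REGISTRY = "device registry"
    TELEMETRY = "telemetry store"
    EVENT_LOG = "event log"
    LATEST_VALUE = "latest-value cache"
    OBJECT_STORE = "object store"
    ARCHIVE = "archive"


@dataclass(frozen=True)
class RoleProfile:
    typical_query: str         # the read path this role has to serve
    rebuildable: bool          # can it be regenerated from another store?
    failure_to_plan_for: str   # the failure mode a review should ask about


# Hypothetical example values; a real platform fills these in from its own review.
ROLE_PROFILES = {
    StorageRole.REGISTRY: RoleProfile("lookup by device id", False, "conflicting device metadata"),
    StorageRole.TELEMETRY: RoleProfile("time-window scan per device", False, "hot partitions under write load"),
    StorageRole.EVENT_LOG: RoleProfile("ordered replay", False, "gaps or reordering"),
    StorageRole.LATEST_VALUE: RoleProfile("current value per device", True, "stale values after restart"),
    StorageRole.OBJECT_STORE: RoleProfile("fetch artifact by key", False, "orphaned or missing objects"),
    StorageRole.ARCHIVE: RoleProfile("restore a day of history", False, "restore that was never tested"),
}


def unowned_roles(owners: dict[StorageRole, str]) -> list[StorageRole]:
    """Return the roles that no store or team has claimed yet."""
    return [role for role in StorageRole if role not in owners]
```

Calling `unowned_roles({StorageRole.REGISTRY: "platform team"})` returns the five roles that still need an owner, which is the gap the release review in the scenario above could not explain.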

Figure: Two-stage route map for the Data Storage module: storage fundamentals, then time-series databases, each labeled with its chapter count.

Roles: What kind of record is this?

Start with Data Storage Overview and Data Storage and Databases when the platform needs a clear separation between registry data, telemetry, events, files, caches, and archives.

Choice: Which database behavior matters?

Use Database Selection Framework and CAP Theorem and Database Categories to connect data shape, consistency, partition behavior, and operations.

Scale: Will the design survive growth?

Read Sharding Strategies and Data Quality Monitoring when hot partitions, schema drift, duplicate readings, bad timestamps, or quarantine paths matter.

Time series: How should measurements age?

Use the time-series chapters when readings arrive continuously and need time-window queries, rollups, retention, downsampling, platform selection, query optimization, or practice labs.
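To make "time-window queries" and "rollups" concrete, here is a minimal sketch of a fixed-window rollup in plain Python. It assumes readings arrive as `(epoch_seconds, value)` pairs with integer timestamps; the function name `rollup` and the summary fields are choices made for this example. A time-series database would normally do this work itself.

```python
from collections import defaultdict
from statistics import mean


def rollup(readings, window_s):
    """Group (epoch_seconds, value) pairs into fixed windows and summarize each.

    Returns a dict keyed by window start time, in ascending order.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        window_start = ts - (ts % window_s)  # floor to the window boundary
        buckets[window_start].append(value)
    return {
        start: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for start, values in sorted(buckets.items())
    }
```

Keeping `count`, `min`, and `max` next to the mean is a deliberate choice in this sketch: once raw readings are downsampled and deleted, the summary is the only evidence left of spikes and gaps.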

Chapter Chooser

Use the shortest path that matches your current design question. Read in order when you are new to storage; jump directly when you are reviewing an existing platform.

Question | Read first | Then check
What data do we have? | Data Storage Overview | Data Storage and Databases
Which database family fits? | Database Selection Framework | CAP Theorem and Database Categories
Can reads and writes scale? | Sharding Strategies | Data Quality Monitoring
Is this mainly telemetry? | Time-Series Databases | Time-Series Database Fundamentals
Which time-series platform? | Time-Series Databases for IoT | Time-Series Database Platforms
How do we query and retain data? | Time-Series Query Optimization | Data Retention and Downsampling
How do we practice design review? | Data Storage Worked Examples | Time-Series Practice and Labs

Evidence Loop

Every chapter in this book should lead back to evidence. A storage decision is weak if it cannot show what was accepted, rejected, transformed, stored, summarized, retained, restored, and monitored.

A storage design is ready for release only when the data path and its evidence loop are both visible.

Ingest: Identify the device, tenant, protocol, receive time, payload version, and authentication result.

Validate: Check schema, units, bounds, timestamps, duplicates, and quarantine reasons before trusted storage.

Store: Match durable records to roles: registry, telemetry, events, artifacts, latest values, and archive.

Serve: Prove the indexes, partitions, caches, and query paths support the required operational questions.

Retain: Define hot, warm, cold, and delete behavior with restore tests rather than retention promises alone.

Release: Record migration plans, rollback limits, quality dashboards, replay results, and ownership decisions.
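The Ingest and Validate steps above can be sketched as an envelope plus a function that returns quarantine reasons. Everything specific here is an assumption for illustration: the `Envelope` fields, the expected unit, the value bounds, the clock-skew limit, and the supported payload versions are hypothetical and would come from the device contract in a real pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    device_id: str
    tenant_id: str
    protocol: str
    received_at: float     # platform receive time, epoch seconds
    payload_version: int
    authenticated: bool
    device_time: float     # timestamp claimed by the device, epoch seconds
    unit: str
    value: float


# Hypothetical limits for a single temperature channel.
EXPECTED_UNIT = "celsius"
VALUE_BOUNDS = (-40.0, 125.0)
MAX_FUTURE_SKEW_S = 300.0
SUPPORTED_VERSIONS = {1, 2}


def quarantine_reasons(env: Envelope, seen_keys: set) -> list[str]:
    """Return every reason this record should not reach trusted storage.

    An empty list means the record passed. `seen_keys` tracks accepted
    records so that exact repeats are flagged as duplicates.
    """
    reasons = []
    if not env.authenticated:
        reasons.append("unauthenticated")
    if env.payload_version not in SUPPORTED_VERSIONS:
        reasons.append("unsupported_payload_version")
    if env.unit != EXPECTED_UNIT:
        reasons.append("unexpected_unit")
    if not (VALUE_BOUNDS[0] <= env.value <= VALUE_BOUNDS[1]):
        reasons.append("out_of_bounds")
    if env.device_time - env.received_at > MAX_FUTURE_SKEW_S:
        reasons.append("timestamp_in_future")

    key = (env.tenant_id, env.device_id, env.device_time)
    if key in seen_keys:
        reasons.append("duplicate")
    if not reasons:
        seen_keys.add(key)  # only accepted records count toward duplicate detection
    return reasons
```

Returning a list of reasons instead of a single pass/fail flag keeps the quarantine path auditable, so a quality dashboard can count rejections by cause.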

Quality Bar for This Module

Avoid: One table for every IoT fact

Different data shapes need different constraints, indexes, retention rules, and ownership boundaries.

Avoid: Database choice before query evidence

Product names are not design evidence. Start with read paths, write paths, consistency needs, failure behavior, and lifecycle.

Avoid: Retention without restore testing

A retention policy is incomplete until someone can restore or replay the data needed for incident review, audit, or model training.
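A retention policy and its restore evidence can be checked together. The following is a minimal sketch under stated assumptions: the tier names follow the hot, warm, cold, and delete wording used in this module, but the age limits and the 90-day restore-test interval are hypothetical numbers chosen for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    max_age_days: int  # data older than this leaves the tier


# Hypothetical limits; real values come from cost, audit, and incident needs.
TIERS = [Tier("hot", 7), Tier("warm", 90), Tier("cold", 730)]


def tier_for(age_days: int) -> str:
    """Return the tier a record of this age belongs to, or 'delete'."""
    for tier in TIERS:
        if age_days <= tier.max_age_days:
            return tier.name
    return "delete"


def restore_evidence_is_fresh(
    last_restore_test: Optional[date], today: date, max_gap_days: int = 90
) -> bool:
    """A policy with no restore test, or only an old one, is a promise rather than evidence."""
    if last_restore_test is None:
        return False
    return (today - last_restore_test) <= timedelta(days=max_gap_days)
```

A release checklist could call `restore_evidence_is_fresh` for each tier that holds data needed for incident review and block the release when it returns `False`.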

Avoid: Untested timestamp assumptions

IoT storage should distinguish device time, gateway receive time, platform ingest time, and processing time when those differences matter.
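As a rough illustration of keeping the four timestamps apart, the sketch below stores each one and reports the lag per hop. The class and field names are assumptions for this example. A negative lag suggests clock skew somewhere along the path, which is the kind of assumption worth testing before it shapes queries or retention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadingTimes:
    device_time: float           # when the device says it measured
    gateway_received_at: float   # when the gateway received the message
    platform_ingested_at: float  # when the platform accepted it
    processed_at: float          # when downstream processing handled it

    def lags(self) -> dict[str, float]:
        """Seconds spent on each hop; negative values point at clock skew."""
        return {
            "device_to_gateway": self.gateway_received_at - self.device_time,
            "gateway_to_platform": self.platform_ingested_at - self.gateway_received_at,
            "platform_to_processing": self.processed_at - self.platform_ingested_at,
        }

    def suspicious_hops(self) -> list[str]:
        """Hops where the later clock reads earlier than the previous one."""
        return [name for name, lag in self.lags().items() if lag < 0]
```

For example, `ReadingTimes(100.0, 98.0, 99.0, 105.0).suspicious_hops()` flags `device_to_gateway`, because the gateway clock reads earlier than the device clock for the same message.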

Suggested Paths

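New to storage: read Data Storage Overview, Data Storage and Databases, Database Selection Framework, CAP Theorem and Database Categories, Sharding Strategies, Data Quality Monitoring, and Data Storage Worked Examples before moving into the time-series chapters.
Building a telemetry pipeline: read Time-Series Databases, Time-Series Database Fundamentals, Time-Series Databases for IoT, Time-Series Database Platforms, Time-Series Query Optimization, Data Retention and Downsampling, and Time-Series Practice and Labs.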
Reviewing an existing platform: scan the chapter chooser, compare the design to the evidence loop, then use the worked examples to test whether each storage decision has measurable proof.

The sidebar remains the complete navigation for this book. Use it to continue with Data Storage Overview or jump to the chapter that matches your current storage question.