6  Sharding Strategies

In 60 Seconds

Sharding distributes IoT data across multiple database nodes when a single server cannot handle the scale. The three main strategies are time-based sharding (simple but creates hot spots), device-based sharding (even writes but complicates aggregations), and hybrid sharding (best balance for most IoT workloads). For systems with over 100K devices, hybrid sharding with device-based distribution and time-based chunking is the recommended approach.

Learning Objectives

After completing this chapter, you will be able to:

  • Select between time-based, device-based, and hybrid sharding strategies based on query patterns
  • Implement distributed hypertables in TimescaleDB
  • Design Cassandra partition keys for IoT workloads
  • Identify hot spot causes and apply compound sharding keys to resolve them
  • Evaluate rebalancing strategies when adding or removing nodes

Sharding is splitting a massive database across multiple servers when one machine cannot handle all the data. Think of a library that has grown so large it needs multiple buildings – you might put fiction in one building and non-fiction in another. For IoT, you might split data by time period or by device group, so searches stay fast even as data grows enormous.

6.1 Introduction

When an IoT deployment grows beyond what a single database server can handle – whether due to write throughput, storage capacity, or query latency – the solution is sharding: distributing data across multiple nodes. Unlike simple replication (which copies all data to every node for read scalability), sharding splits the dataset so each node holds a distinct portion. The key design decision is how to split: the sharding key determines which data goes where, and a poor choice can create bottlenecks worse than no sharding at all.

6.2 The Sharding Key Decision

The sharding key determines how data is distributed across nodes. For IoT, three primary strategies exist:

6.2.1 Time-Based Sharding (Range Partitioning)

Shard 1: January 2024 data
Shard 2: February 2024 data
Shard 3: March 2024 data
...

Pros: Simple retention (drop entire shard), time-range queries hit single shard

Cons: Hot shard problem – all current writes go to the newest shard, creating a write bottleneck

When to use: Read-heavy workloads with time-bounded queries; archival systems
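The range-partitioning rule above can be sketched in a few lines. This is an illustrative Python sketch (the shard-naming scheme is an assumption, not from any particular database):

```python
from datetime import datetime

def shard_for_month(ts: datetime) -> str:
    # Range partitioning by month: the shard is a pure function of the
    # timestamp, so every current write lands on the newest shard --
    # exactly the hot-shard problem noted above.
    return f"shard-{ts.year}-{ts.month:02d}"

shard_for_month(datetime(2024, 1, 15))  # 'shard-2024-01'
```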

6.2.2 Device-Based Sharding (Hash Partitioning)

hash(device_id) mod N → shard assignment
Shard 0: devices where hash(device_id) mod 3 = 0
Shard 1: devices where hash(device_id) mod 3 = 1
Shard 2: devices where hash(device_id) mod 3 = 2

Pros: Even write distribution, device-specific queries hit single shard

Cons: Cross-device aggregations require scatter-gather across all shards; retention requires per-shard cleanup

When to use: Write-heavy workloads; queries primarily filter by device
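Hash routing is a one-line function, with one subtlety worth calling out: the hash must be stable across processes. A minimal sketch (function name is illustrative):

```python
import hashlib

def shard_for_device(device_id: str, num_shards: int = 3) -> int:
    # Use a stable digest rather than Python's built-in hash(), which is
    # salted per process and would route the same device to different
    # shards after a restart.
    digest = hashlib.md5(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Every writer and reader computes the same shard for a given device, with no coordination and no lookup table.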

6.2.3 Hybrid Sharding (Two-Level)

Two-level sharding architecture showing device IDs hashed to distribute across shards, with each shard further partitioned into time-based chunks for data retention
Figure 6.1: Hybrid Sharding: Hash by device_id for write distribution, then partition by time within each shard for efficient retention.

Pros: Balanced writes (hash) + efficient retention (time chunks) + device queries stay local

Cons: Implementation complexity; cross-device queries still scatter

When to use: Large-scale IoT with mixed query patterns (the common case)
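The two levels compose cleanly: the hash picks the shard, the timestamp picks the chunk within it. A minimal routing sketch (Python; the function and chunk-naming scheme are hypothetical, for illustration only):

```python
import hashlib
from datetime import datetime

def route(device_id: str, ts: datetime, num_shards: int = 8) -> tuple:
    # Level 1: hash the device id for even write distribution.
    shard = int.from_bytes(
        hashlib.md5(device_id.encode()).digest()[:8], "big") % num_shards
    # Level 2: time-chunk within the shard, so retention becomes
    # "drop old chunks" instead of a full-table delete.
    chunk = ts.strftime("%Y-%m-%d")
    return shard, chunk
```

Because the shard depends only on device_id, a device's entire history lives on one node; because the chunk depends only on time, expiring a day of data is a cheap per-shard chunk drop.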

6.3 Implementation: TimescaleDB Distributed Hypertables

With the strategy decision made, the next question is how to implement it. Two popular databases for IoT time-series workloads – TimescaleDB and Cassandra – each provide built-in mechanisms for hybrid sharding.

TimescaleDB 2.0+ supports distributed hypertables that implement hybrid sharding automatically. (Note: Timescale deprecated self-hosted multi-node in favor of their cloud-native architecture in later versions, but the concepts and API patterns remain relevant for understanding hybrid sharding.)

-- Create distributed hypertable across 3 data nodes
CREATE TABLE sensor_data (
    time        TIMESTAMPTZ NOT NULL,
    device_id   VARCHAR(50) NOT NULL,
    value       DOUBLE PRECISION,
    quality     SMALLINT
);

SELECT create_distributed_hypertable(
    'sensor_data',
    'time',                          -- Time dimension (chunking)
    'device_id',                     -- Space dimension (sharding)
    number_partitions => 16,         -- Hash partitions per node
    chunk_time_interval => INTERVAL '1 day'
);

-- Query behavior:
-- Device-specific: routes to single node
SELECT * FROM sensor_data
WHERE device_id = 'sensor-42' AND time > NOW() - INTERVAL '1 hour';

-- Time-range aggregate: parallel across all nodes, merge results
SELECT date_trunc('hour', time), avg(value)
FROM sensor_data
WHERE time > NOW() - INTERVAL '7 days'
GROUP BY 1;

6.4 Implementation: Cassandra Partition Design

Apache Cassandra uses partition keys for distribution and clustering columns for ordering within partitions:

-- Optimal IoT schema: partition by device + time bucket
CREATE TABLE sensor_readings (
    device_id    text,
    time_bucket  timestamp,      -- Truncated to hour/day
    reading_time timestamp,
    value        double,
    PRIMARY KEY ((device_id, time_bucket), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);

-- Why time_bucket in partition key?
-- Without: device_id alone creates unbounded partition growth
-- With: bounded partitions (typically <100MB each), predictable performance
-- Example: 1 reading/sec x 200 bytes x 86,400 sec/day = ~17MB per device-day

-- Query patterns enabled:
-- 1. Single device, recent data (hits 1 partition):
SELECT * FROM sensor_readings
WHERE device_id = 'sensor-42'
AND time_bucket = '2024-01-15 00:00:00';

-- 2. Single device, date range (hits multiple partitions):
SELECT * FROM sensor_readings
WHERE device_id = 'sensor-42'
AND time_bucket IN ('2024-01-14 00:00:00', '2024-01-15 00:00:00');
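On the client side, the time bucket is just the reading's timestamp truncated to the bucket boundary. A sketch of the write-path helper, assuming day-sized buckets (the helper name is illustrative):

```python
from datetime import datetime

def day_bucket(ts: datetime) -> datetime:
    # Truncate to midnight so every reading a device emits in one day
    # shares a partition key, keeping partitions bounded.
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

# Sizing check matching the schema comment:
# 1 reading/sec x 200 bytes x 86,400 sec/day
bytes_per_device_day = 1 * 200 * 86_400   # 17,280,000 bytes, ~17 MB
```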

6.5 Shard Rebalancing and Hot Spots

The Hot Device Problem: Some IoT devices generate 100x more data than others (high-frequency industrial sensors vs. daily weather stations). Hash-based sharding can still create hot spots.

Solution: Compound Sharding Key

import random

# Add randomization to spread a hot device across multiple shards.
# Each write picks a random bucket, so over time the device's data is
# distributed uniformly across num_buckets shards even when hashed
# device_ids cluster.  The trade-off is that reads must scatter-gather
# across all of a device's buckets.
def generate_shard_key(device_id: str, num_buckets: int = 4) -> str:
    """
    Spread a single hot device across multiple shards.
    Writes for device-001 are distributed across the keys:
    device-001:0, device-001:1, device-001:2, device-001:3
    """
    bucket = random.randint(0, num_buckets - 1)
    return f"{device_id}:{bucket}"

# Queries must now scatter across buckets
# SELECT * FROM sensors WHERE device_id IN ('device-001:0', 'device-001:1', ...)
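The read side must enumerate every bucket the writer may have chosen. A companion helper (illustrative) that generates the keys for the IN (...) scatter query:

```python
def bucket_keys(device_id: str, num_buckets: int = 4) -> list:
    # Enumerate all compound keys for one device so a read can
    # scatter-gather with a single IN (...) clause.
    return [f"{device_id}:{b}" for b in range(num_buckets)]

bucket_keys("device-001")
# ['device-001:0', 'device-001:1', 'device-001:2', 'device-001:3']
```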

Rebalancing Strategies:

| Strategy               | Downtime | Data Movement   | Complexity |
|------------------------|----------|-----------------|------------|
| Add shard, rehash all  | Hours    | 100%            | Low        |
| Consistent hashing     | Minutes  | ~1 node's share | Medium     |
| Virtual nodes (vnodes) | Seconds  | Configurable    | High       |
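Consistent hashing's "~1 node's share" property is easy to demonstrate: adding a node only claims keys for itself, it never reshuffles keys between existing nodes. A minimal ring sketch (Python; the class, its parameters, and the vnodes count are illustrative, not from any particular database):

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring, for illustration only."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []            # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _pos(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def add_node(self, node: str) -> None:
        # Each node owns many small arcs of the ring (virtual nodes),
        # which evens out each physical node's share of the key space.
        for i in range(self.vnodes):
            self._ring.append((self._pos(f"{node}#{i}"), node))
        self._ring.sort()

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first vnode at or after the key's position.
        idx = bisect.bisect(self._ring, (self._pos(key),)) % len(self._ring)
        return self._ring[idx][1]
```

When a node is added, every key either stays where it was or moves to the new node, so data movement is bounded by the new node's share of the ring.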

6.6 Performance Characteristics by Strategy

| Metric               | Time-Based           | Device-Based          | Hybrid             |
|----------------------|----------------------|-----------------------|--------------------|
| Write throughput     | Limited by hot shard | Linear with nodes     | Linear with nodes  |
| Device query latency | Scatter-gather       | Single node           | Single node        |
| Time-range query     | Single node          | Scatter-gather        | Scatter-gather     |
| Retention cleanup    | Drop shard (fast)    | Per-shard scan (slow) | Drop chunks (fast) |
| Implementation       | Simple               | Simple                | Complex            |

6.7 Worked Example: Fleet Tracking Platform Sharding

Scenario: A logistics company tracks 250,000 delivery vehicles. Each vehicle sends GPS coordinates every 5 seconds plus engine diagnostics every 30 seconds.

Data volume calculation:

GPS: 250,000 vehicles x 1 msg/5s = 50,000 writes/sec
Diagnostics: 250,000 vehicles x 1 msg/30s = 8,333 writes/sec
Total: ~58,000 writes/sec

Record size: GPS = 80 bytes, Diagnostics = 200 bytes
Daily volume: (50,000 x 80 + 8,333 x 200) x 86,400
            = 346 GB/day GPS + 144 GB/day diagnostics = 490 GB/day

90-day retention = 44 TB total storage

Why a single server fails: PostgreSQL on a 64-core server handles ~15,000 writes/sec for indexed inserts. At 58,000 writes/sec, you need at least 4 nodes just for write throughput – but you want headroom for traffic spikes (Monday morning fleet dispatch doubles GPS frequency).

How much headroom do you actually need? With 58,000 writes/sec baseline and Monday spikes doubling GPS frequency (GPS = 50,000 of those 58K writes):

\[ \text{Peak load} = 8,333\text{ diagnostics} + (50,000 \times 2)\text{ GPS} = 108,333 \text{ writes/sec} \]

With single-node capacity of 15,000 writes/sec, required nodes = \(\lceil 108,333 / 15,000 \rceil = 8\) nodes minimum. The 8-node hybrid design covers the peak with modest headroom (120,000 writes/sec of aggregate capacity against a 108,333 writes/sec peak) while keeping costs 34% lower than a single massive instance.
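The capacity arithmetic above as a back-of-envelope script (the 15,000 writes/sec single-node figure is the assumption stated earlier):

```python
import math

gps = 250_000 // 5          # 50,000 GPS writes/sec baseline
diag = 250_000 // 30        # ~8,333 diagnostics writes/sec
peak = diag + gps * 2       # Monday dispatch doubles GPS frequency
nodes = math.ceil(peak / 15_000)
# peak = 108_333 writes/sec, nodes = 8
```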

Hybrid sharding design:

-- TimescaleDB distributed across 8 data nodes
-- Step 1: Create table
CREATE TABLE vehicle_telemetry (
    time         TIMESTAMPTZ NOT NULL,
    vehicle_id   VARCHAR(20) NOT NULL,
    msg_type     VARCHAR(10) NOT NULL,  -- 'gps' or 'diag'
    latitude     DOUBLE PRECISION,
    longitude    DOUBLE PRECISION,
    speed_kmh    SMALLINT,
    engine_rpm   SMALLINT,
    fuel_pct     SMALLINT,
    diagnostics  JSONB
);

-- Step 2: Distribute by vehicle_id, chunk by 6-hour intervals
SELECT create_distributed_hypertable(
    'vehicle_telemetry',
    'time',
    'vehicle_id',
    number_partitions => 32,              -- 32 hash partitions across 8 nodes
    chunk_time_interval => INTERVAL '6 hours'  -- 4 chunks per day per partition
);

-- Step 3: Continuous aggregate for fleet dashboard
CREATE MATERIALIZED VIEW fleet_summary_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS hour,
    count(*) AS total_readings,
    count(DISTINCT vehicle_id) AS active_vehicles,
    avg(speed_kmh) FILTER (WHERE msg_type = 'gps') AS avg_speed,
    -- Note: percentile_cont in continuous aggregates requires TimescaleDB 2.7+
    percentile_cont(0.95) WITHIN GROUP (ORDER BY speed_kmh)
        FILTER (WHERE msg_type = 'gps') AS p95_speed
FROM vehicle_telemetry
GROUP BY hour;

Query performance results:

| Query Type                     | Shards Hit     | Latency | Example                                |
|--------------------------------|----------------|---------|----------------------------------------|
| Single vehicle, last hour      | 1 of 8         | 12ms    | “Where is truck VH-4521?”              |
| Single vehicle, 30-day history | 1 of 8         | 180ms   | “Route history for VH-4521”            |
| Fleet-wide, last hour          | All 8          | 85ms    | “How many vehicles are moving?”        |
| Fleet-wide, 30-day trend       | Pre-aggregated | 25ms    | “Average fleet utilization this month” |

Cost comparison (AWS, 90-day retention):

| Approach                            | Nodes | Monthly Cost | Writes/sec Headroom     |
|-------------------------------------|-------|--------------|-------------------------|
| Single RDS r6g.16xlarge             | 1     | $8,200       | Fails at 15K (need 58K) |
| 4x RDS r6g.4xlarge (device shard)   | 4     | $6,600       | 60K (tight)             |
| 8x TimescaleDB r6g.2xlarge (hybrid) | 8     | $5,400       | 120K (2x headroom)      |

The 8-node hybrid approach costs less than the single massive node because smaller instances have better price/performance, and 6-hour chunks enable efficient compression (typically 10-15x on GPS data).

6.8 Decision Framework

Flowchart decision tree starting with primary query pattern, branching to time-based sharding for time-range-dominant workloads, device-based for device-dominant workloads, and hybrid sharding for mixed patterns, with secondary partitioning recommendations at each leaf
Figure 6.2: Sharding Strategy Decision: Choose based on primary query pattern, then add secondary partitioning for retention.

Rule of Thumb: For most IoT systems with >100K devices and >10K writes/sec, hybrid sharding with device-based distribution and time-based chunking provides the best balance of write performance, query flexibility, and operational simplicity.

6.9 Edge Cases and Gotchas

1. Shard Key Cardinality: Too few unique device_ids (fewer than ~100) lead to uneven distribution. Solution: a compound key with sub-partitions.

2. Query Routing Overhead: Every cross-shard query adds coordinator overhead. Design queries to minimize scatter when possible.

3. Transaction Boundaries: Distributed transactions across shards are expensive or unsupported. Design schemas so single-device operations stay within one shard.

4. Schema Changes: Adding columns to distributed tables requires coordination. Plan schema evolution carefully.

5. Backup Complexity: Point-in-time recovery across shards requires consistent snapshots. Use database-native distributed backup tools.

6.10 Knowledge Check

Scenario: Your IoT platform has 1 million devices generating 10 billion records/month. Query patterns: 80% filter by device_id, 20% aggregate across all devices by time range. Single PostgreSQL server maxes out at 5,000 writes/sec.

Think about:

  1. What is the required write throughput? (10B records/month = ? writes/sec)
  2. How should you shard the data – by device_id, by timestamp, or hybrid?
  3. What happens to cross-device aggregate queries with device_id sharding?
  4. How much storage do you need per shard?

Key Insight: First, calculate throughput: 10B records / (30 days x 86,400 sec/day) = ~3,858 writes/sec. A single PostgreSQL at 5,000 writes/sec could barely handle the average – but not peak loads (typically 2-3x baseline). So you need multiple nodes.

Hybrid sharding: Shard by device_id for write distribution (1M devices across 10 shards = 100K devices/shard), then partition each shard by time for efficient retention. Trade-off: Device-specific queries hit 1 shard (fast), but cross-device aggregates must query all 10 shards and merge results (slower but acceptable for 20% of queries). Storage: 10B records x 100 bytes = 1 TB/month. With 3-month retention, you need 3 TB distributed across 10 shards = 300 GB/shard.
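The same arithmetic in script form (the 100-byte record size is taken from the scenario):

```python
writes_per_sec = 10_000_000_000 / (30 * 86_400)   # ~3,858 writes/sec average
monthly_bytes = 10_000_000_000 * 100               # 1 TB/month
retained_tb = monthly_bytes * 3 / 1e12             # 3 TB over 3-month retention
per_shard_gb = retained_tb * 1_000 / 10            # 300 GB per shard (10 shards)
```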

Sharding is like splitting a giant library into smaller branch libraries so everyone can find books faster!

6.10.1 The Sensor Squad Adventure: The Overflowing Data Library

The Sensor Squad’s data library was in trouble. They had SO many sensor readings that the single librarian (the database server) couldn’t keep up!

“I’m receiving 10,000 readings every second!” cried the librarian. “I can’t file them all fast enough!”

Max the Microcontroller had an idea: “What if we split the library into SMALLER libraries?”

Plan A: Split by Time (The Calendar Method) “January readings go to Library 1, February to Library 2!”

“But wait,” said Sammy the Sensor, “ALL of today’s new readings still go to the SAME library – the newest one. That librarian is still overwhelmed!”

Plan B: Split by Sensor (The Name Method) “Sensors with names A-M go to Library 1, N-Z to Library 2!”

“Now the work is shared equally!” said Lila the LED. “But if I want ALL readings from last Tuesday across ALL sensors, I have to visit EVERY library. That’s slow!”

Plan C: Split by BOTH! (The Hybrid Method) “Split by sensor name for FILING, then organize by date WITHIN each library!”

Bella the Battery smiled. “Now filing is fast AND we can still find things by date!”

The librarians were happy because: - Each library only handles HALF the incoming readings (fast filing!) - Looking up one sensor’s data? Visit just ONE library! - Need to clean up old data? Just remove old date sections from each library!

“It’s like having the best of both worlds!” said Max.

6.10.2 Key Words for Kids

| Word     | What It Means                                                                             |
|----------|-------------------------------------------------------------------------------------------|
| Sharding | Splitting a big database into smaller pieces – like dividing a huge puzzle among friends   |
| Hot Spot | When one piece gets way more work than others – like one checkout lane being super busy    |
| Hybrid   | Combining two approaches to get the best of both – like a bicycle that’s also electric     |


6.11 Summary

  • Time-based sharding is simple but creates hot spots for current writes
  • Device-based sharding distributes writes evenly but complicates time-range aggregations
  • Hybrid sharding (device hash + time partition) provides the best balance for most IoT workloads
  • TimescaleDB distributed hypertables and Cassandra compound partition keys implement hybrid sharding natively
  • Hot device problem requires compound sharding keys to spread high-frequency devices across shards
  • Rebalancing strategies trade off downtime, data movement, and implementation complexity

Prerequisites – Read these first:

  • Time-Series Databases – Understanding distributed hypertables in TimescaleDB
  • Database Selection Framework – When to choose distributed vs single-node

Related Concepts – Explore these next:

  • CAP Theorem – Trade-offs between consistency and availability in sharded systems
  • Data Quality Monitoring – Quality checks in distributed architectures

6.12 What’s Next

| If you want to…                                                    | Read this next               |
|--------------------------------------------------------------------|------------------------------|
| See sharding applied to real fleet and smart city deployments      | Worked Examples              |
| Understand CAP trade-offs in sharded systems                       | CAP Theorem                  |
| Set up data retention policies that align with shard lifecycle     | Data Retention               |
| Choose the right database before designing your sharding strategy  | Database Selection Framework |
| Ensure data quality checks work across distributed nodes           | Data Quality Monitoring      |
Ensure data quality checks work across distributed nodes Data Quality Monitoring