5  CAP Theorem and Database Categories

In 60 Seconds

The CAP theorem states that distributed databases can only guarantee two of three properties: Consistency, Availability, and Partition tolerance. Since IoT systems span edge, fog, and cloud with unreliable networks, partition tolerance is mandatory–forcing a choice between consistency and availability for each data type.

Learning Objectives

After completing this chapter, you will be able to:

  • Explain the CAP theorem and its three properties in the context of distributed IoT systems
  • Classify IoT data types by their consistency and availability requirements using a CP/AP decision framework
  • Evaluate synchronous versus asynchronous write-ahead log strategies for different IoT workloads
  • Compare horizontal sharding, replication, and storage tiering trade-offs for scaling IoT databases
  • Design polyglot persistence architectures that apply different CAP trade-offs to different data types within the same system

MVU: CAP Theorem for IoT

Mental Model: Think of CAP as a dial, not a switch. For each data type in your IoT system, you tune the dial between “always correct” (CP) and “always responsive” (AP). The dial position depends on the business cost of being wrong versus the business cost of being silent.

Decision Rule: If inconsistency causes physical harm or financial loss (control commands, billing), choose CP. If unavailability causes operational blindness (sensor telemetry, monitoring), choose AP. Most IoT systems need both–this is called polyglot persistence.

When you store IoT data across multiple computers, you face a fundamental trade-off. Imagine a team of librarians maintaining copies of the same catalog – if the phone lines between libraries go down, they can either stop lending books until they sync up (consistency), or keep lending knowing catalogs might briefly differ (availability). The CAP theorem says you must choose, and the right choice depends on your IoT application.

5.1 Introduction

The CAP theorem is one of the most important theoretical foundations for designing distributed IoT systems. As IoT deployments grow from single-device prototypes to fleets of thousands of sensors spanning edge, fog, and cloud tiers, the question of how to store and access data reliably becomes critical. This chapter provides a deep dive into CAP trade-offs, building on the overview in Data Storage and Databases. You will apply CAP reasoning to real IoT scenarios, compare synchronous and asynchronous durability strategies, evaluate scaling approaches, and design storage tiering policies that reduce cost while meeting performance requirements.

5.2 The CAP Theorem

5.2.1 How It Works: CAP Trade-offs in Distributed Databases

The CAP theorem, first conjectured by Eric Brewer in 2000 and formally proven by Gilbert and Lynch in 2002, states that a distributed system can guarantee at most two of three properties simultaneously. This section explains each property and what the trade-off looks like in practice when network partitions occur:

Consistency (C): All nodes see the same data at the same time–when you read from any node, you get the most recent write

Availability (A): Every request gets a non-error response (though data may be stale)–the system always responds, even during network failures

Partition Tolerance (P): System continues operating despite network failures splitting nodes into isolated groups

The Trade-off Mechanism:

When a network partition occurs (e.g., edge gateway loses connection to cloud for 10 minutes):

CP System (MongoDB with majority write concern):

  • Minority partition: Rejects writes –> unavailable
  • Majority partition: Accepts writes –> consistent
  • After partition heals: No conflicts, all nodes identical
  • Behavior: Short unavailability ensures long-term consistency

AP System (Cassandra with consistency level ONE):

  • Both partitions: Accept writes –> available
  • Writes may differ across partitions –> temporarily inconsistent
  • After partition heals: Conflict resolution (last-write-wins) –> eventual consistency
  • Behavior: Always available but may serve stale data briefly

IoT Application: For sensor telemetry (temperature readings), AP is acceptable–slightly stale data is better than no data. For control commands (turn off motor), CP is required–conflicting commands could damage equipment.

Triangle diagram showing the three CAP theorem properties (Consistency, Availability, Partition Tolerance) with arrows indicating that only two can be guaranteed simultaneously, and labels showing CP and AP trade-off regions for IoT use cases
Figure 5.1: CAP Theorem Trade-offs: Distributed systems can guarantee only two of Consistency, Availability, and Partition Tolerance–IoT sensor data often accepts AP (eventual consistency), while control commands require CP (strong consistency).

5.2.2 CAP Trade-offs Over Time

The following timeline shows what happens before, during, and after a network partition for both CP and AP systems. Notice how the CP system sacrifices availability during the partition to maintain consistency, while the AP system remains available but must reconcile conflicting data after the partition heals.

Sequence diagram showing a timeline of events during a network partition, illustrating how CP systems reject writes on the minority side while AP systems accept writes on both sides, followed by conflict resolution after partition heals
Figure 5.2: CAP Theorem Timeline: How consistency and availability trade-offs manifest during network partitions.

5.2.3 Mapping IoT Data Types to CAP Trade-offs

Not all IoT data has the same consistency requirements. The key insight is that different data types within the same system should use different CAP strategies based on the business cost of inconsistency versus unavailability.

  • Sensor telemetry (temperature, humidity, voltage): AP preferred – a stale reading from 2 seconds ago is far more useful than no reading at all. Eventual consistency resolves differences within seconds after partition heals.
  • Control commands (actuator on/off, valve open/close): CP required – conflicting commands during a partition could damage physical equipment or create safety hazards.
  • Edge deployments: Must assume partitions will occur (wireless links, cellular gaps, maintenance windows), making partition tolerance non-negotiable.
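
This classification can be captured as a small lookup; a minimal sketch, where the data-type names and the CP default for unknown types are illustrative choices, not from any particular library:

```python
# Illustrative mapping of IoT data types to CAP preference, driven by the
# business cost of inconsistency vs. unavailability (names are hypothetical).
CAP_PREFERENCES = {
    "sensor_telemetry": "AP",   # a stale reading beats no reading
    "control_command": "CP",    # conflicting commands can damage equipment
    "billing_record": "CP",     # inconsistency means financial loss
    "monitoring_metric": "AP",  # operational blindness is the bigger risk
}

def choose_cap(data_type: str) -> str:
    """Return 'CP' or 'AP'; default to 'CP' for unknown types (the safe side)."""
    return CAP_PREFERENCES.get(data_type, "CP")
```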

5.3 Tradeoff: Synchronous vs Asynchronous Write-Ahead Log

Tradeoff: Synchronous vs Asynchronous WAL

Option A: Synchronous WAL (fsync on every write)

  • Write latency: 5-20ms per write (disk sync overhead)
  • Durability guarantee: Zero data loss on crash (every write persisted)
  • Write throughput: 5,000-20,000 writes/sec (limited by disk IOPS)
  • Hardware requirements: High-performance SSD (100K+ IOPS) recommended
  • Use cases: Financial transactions, billing records, device commands, audit logs

Option B: Asynchronous WAL (batch fsync every 100ms-1s)

  • Write latency: 0.5-2ms per write (memory buffer only)
  • Durability guarantee: Up to 1 second of data loss on crash
  • Write throughput: 100,000-500,000 writes/sec (memory-bound)
  • Hardware requirements: Standard SSD sufficient
  • Use cases: High-velocity sensor telemetry, metrics, logs where occasional loss is acceptable

Decision Factors:

  • Choose Synchronous WAL when: Every record is critical (billing, safety alerts), regulatory compliance requires provable data integrity, recovery point objective (RPO) is zero, write volume is under 20K/sec
  • Choose Asynchronous WAL when: Write volume exceeds 50K/sec, losing 1 second of sensor data is acceptable, latency-sensitive applications cannot wait for disk sync, cost optimization requires commodity hardware
  • Hybrid approach: Synchronous for commands and alerts (low volume, high value); asynchronous for telemetry (high volume, replaceable) - PostgreSQL supports per-transaction synchronous_commit settings
  • Real-world numbers: 10K sensors at 1Hz with sync WAL requires 10K IOPS (one fsync per write); same load with async WAL batching every 100ms requires only ~10 batch fsyncs/sec (1,000x reduction in disk operations)
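
The real-world numbers above check out with a few lines of arithmetic (10K sensors at 1 Hz and a 100ms batch interval, per the figures in the text):

```python
# Verify the fsync arithmetic: 10K sensors reporting at 1 Hz.
sensors = 10_000
writes_per_sec = sensors * 1          # 1 Hz per sensor

# Synchronous WAL: one fsync per write
sync_fsyncs_per_sec = writes_per_sec  # 10,000 IOPS

# Asynchronous WAL: one batch fsync per 100ms interval
batch_interval_ms = 100
async_fsyncs_per_sec = 1000 // batch_interval_ms  # ~10 fsyncs/sec

reduction = sync_fsyncs_per_sec / async_fsyncs_per_sec  # 1,000x fewer disk syncs
```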

5.4 Database Categories Comparison

For a comprehensive comparison of database categories and selection criteria, see Database Selection Framework. The diagram below summarizes the key trade-offs between write speed, query complexity, and scalability for IoT workloads.

Matrix chart comparing six database categories (Relational, Document, Key-Value, Time-Series, Wide-Column, Graph) across dimensions of write speed, query complexity, scalability model, and typical IoT use cases
Figure 5.3: Database Comparison Matrix: Trade-offs between write speed, query complexity, and scalability for IoT workloads.

5.5 Scaling Trade-offs

The CAP trade-offs discussed above become more pronounced as IoT systems scale. When a single database node can no longer handle the write volume or storage requirements, you must choose between sharding (distributing data) and replication (copying data)–each with distinct implications for consistency and availability.

5.5.1 Tradeoff: Horizontal Sharding vs Replication

Tradeoff: Horizontal Sharding vs Replication for IoT Scale

Option A: Horizontal Sharding (distribute data across nodes by device_id hash or time range)

  • Write throughput: 500K-2M writes/sec (scales linearly with nodes)
  • Storage per node: 1/N of total data (e.g., 100GB on each of 10 nodes = 1TB total)
  • Query latency: 5-50ms for single-shard queries; 50-500ms for cross-shard aggregations
  • Operational cost: $2,000-5,000/month for 10-node cluster (cloud managed)

Option B: Read Replicas (single primary with multiple read replicas)

  • Write throughput: 50K-100K writes/sec (limited by single primary)
  • Storage per node: full dataset on each replica (e.g., 1TB on each of 5 nodes)
  • Query latency: 2-10ms (local reads); 10-50ms replication lag
  • Operational cost: $1,500-3,000/month for primary + 4 replicas (cloud managed)

Decision Factors:

  • Choose Sharding when: Write volume exceeds 100K/sec, single-node storage limit exceeded (>10TB), queries are device-specific (90%+ hit single shard)
  • Choose Replication when: Read volume is 10x+ write volume, writes under 100K/sec, need geographic read distribution, strong consistency required for all reads
  • Hybrid approach: Shard by device_id + replicate each shard for high availability (most production IoT systems use this)
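
"Shard by device_id" typically means hash-based routing; a sketch under the assumption of a simple modulo scheme (the MD5 choice is illustrative — production systems often use consistent hashing so that adding nodes does not remap every device):

```python
import hashlib

def shard_for(device_id: str, num_shards: int) -> int:
    """Route a device's data to a shard by hashing its ID.

    Every reading from one device lands on the same shard, so the
    device-specific queries that dominate IoT workloads hit a single node.
    """
    digest = hashlib.md5(device_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards
```

With this scheme a cross-device aggregation still has to fan out to every shard, which is where the 50-500ms cross-shard latency quoted above comes from.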

5.5.2 Tradeoff: Hot vs Cold Storage

Tradeoff: Hot Storage (SSD/Memory) vs Cold Storage (Object Storage/HDD)

Option A: Hot Storage (NVMe SSD, In-Memory Cache)

  • Query latency: 1-10ms for any time range
  • Storage cost: $0.10-0.25/GB/month (cloud SSD), $500/TB (on-prem NVMe)
  • Throughput: 500K-2M IOPS (random access optimized)
  • Durability: Requires replication for redundancy
  • Best for: Real-time dashboards, alerting, last 7-30 days of data

Option B: Cold Storage (S3/GCS Object Storage, HDD)

  • Query latency: 100-500ms first byte, 1-10 seconds for large scans
  • Storage cost: $0.01-0.03/GB/month (cloud object), $20/TB (on-prem HDD)
  • Throughput: 100-1000 MB/s sequential (poor random access)
  • Durability: 99.999999999% (11 nines) built-in redundancy
  • Best for: Compliance archives, ML training data, data older than 30 days

Decision Factors:

  • Choose Hot Storage when: Query latency SLA under 50ms, frequent random access to any data point, alerting requires instant historical context, dashboards refresh every few seconds
  • Choose Cold Storage when: Data accessed less than once per month, batch analytics can tolerate multi-second latency, storage budget is primary constraint, regulatory retention requires 7+ years
  • Tiered approach: Hot (last 7 days on SSD), Warm (7-90 days on cheaper SSD), Cold (90+ days on object storage) - implements automatic data lifecycle management
  • Cost example: 10TB of sensor data costs $1,000/month on SSD vs $100/month on S3 - 10x savings for 90% of data that’s rarely accessed
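
The cost example works out as follows; the 10%/90% hot/cold split is an assumption for illustration:

```python
# 10 TB of sensor data, priced in cents/GB/month to keep the arithmetic exact.
gb = 10 * 1000                      # 10 TB in GB
hot_cents_per_gb = 10               # $0.10/GB/month (cloud SSD)
cold_cents_per_gb = 1               # $0.01/GB/month (object storage)

all_hot = gb * hot_cents_per_gb / 100    # $1,000/month
all_cold = gb * cold_cents_per_gb / 100  # $100/month

# Tiered: assume 10% of the data stays hot, 90% moves to cold storage
hot_gb = gb // 10
cold_gb = gb - hot_gb
tiered = (hot_gb * hot_cents_per_gb + cold_gb * cold_cents_per_gb) / 100  # $190/month
savings_pct = 100 * (1 - tiered / all_hot)  # ~81% savings
```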

5.5.3 Tradeoff: Normalized vs Denormalized Schema

Tradeoff: Normalized Schema vs Denormalized Schema for IoT

Option A: Normalized Schema (3NF, Foreign Keys)

  • Storage efficiency: High - no duplicate data (device metadata stored once)
  • Write performance: Fast - update metadata in single location
  • Query performance: Slower - requires JOINs across 3-5 tables
  • Schema changes: Easy - add columns to dimension tables
  • Query latency: 10-100ms for typical dashboard queries (JOIN overhead)
  • Best for: Device registry, user accounts, billing records

Option B: Denormalized Schema (Flat tables, Embedded data)

  • Storage efficiency: Lower - device name/location repeated in every reading
  • Write performance: Slower - must include all metadata with each insert
  • Query performance: Faster - single table scan, no JOINs required
  • Schema changes: Hard - must backfill millions of rows for new attributes
  • Query latency: 2-20ms for typical dashboard queries (no JOIN overhead)
  • Best for: High-velocity sensor telemetry, time-series analytics

Decision Factors:

  • Choose Normalized when: Device metadata changes frequently (firmware versions, calibration), storage cost is critical, data integrity is paramount, analytics require flexible ad-hoc joins
  • Choose Denormalized when: Query latency is critical (<10ms SLA), metadata is stable (location rarely changes), write volume exceeds 100K/sec, primary access is time-range scans
  • Hybrid approach: Normalize device registry (PostgreSQL), denormalize telemetry with embedded tags (InfluxDB/TimescaleDB) - join at query time when needed, pre-join for dashboards
  • Storage impact: 100 bytes/reading with embedded metadata vs 50 bytes without = 2x storage, but avoids 5-10ms JOIN latency on every query
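
The storage impact is easy to quantify; the sketch below assumes the 10K-sensors-at-1 Hz fleet used earlier in the chapter:

```python
# One day of readings from 10K sensors at 1 Hz (assumed fleet size).
readings = 10_000 * 86_400           # 864M readings/day

denormalized_bytes = readings * 100  # metadata embedded in every row
normalized_bytes = readings * 50     # metadata joined at query time

ratio = denormalized_bytes / normalized_bytes       # 2.0x storage
denormalized_gb_per_day = denormalized_bytes / 1e9  # ~86.4 GB/day
```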

5.5.4 The Sensor Squad Adventure: Three Promises, Pick Two

The CAP Theorem is like a rule about making promises when your friends live far away!

Max the Microcontroller was building a super cool message network between three Sensor Squad clubhouses in different neighborhoods.

“I want our network to do THREE things!” Max announced:

  1. Always agree – Every clubhouse has the exact same information (Consistency)
  2. Always answer – If someone asks a question, they always get a response (Availability)
  3. Keep working even if a road is blocked – If one path between clubhouses is broken, the system still works (Partition Tolerance)

Sammy the Sensor scratched his head. “That sounds easy! Let’s do all three!”

But Bella the Battery discovered the problem. “What happens when the road between Clubhouse A and Clubhouse B gets blocked by a fallen tree?”

“If we want everyone to AGREE,” said Lila the LED, “then Clubhouse B has to STOP answering questions until the road is fixed. Otherwise it might give OLD information!”

“But if we want to ALWAYS ANSWER,” said Max, “then Clubhouse B has to answer even if its information might be a little outdated!”

The Squad realized: When roads get blocked (and they always do!), you can either always agree OR always answer – but not both!

So they made a smart plan:

  • For important commands like “Turn off the oven!” they chose Always Agree (even if it takes longer)
  • For temperature readings, they chose Always Answer (a slightly old reading is better than no reading!)

Sammy smiled. “So the trick is picking the RIGHT two promises for each type of message!”

5.5.5 Key Words for Kids

| Word | What It Means |
|------|---------------|
| Consistency | Everyone has the same information – like all your friends knowing the same score in a game |
| Availability | Always getting an answer when you ask – like a store that never closes |
| Partition | When a connection breaks – like when the phone line goes down between two houses |

Scenario: A smart grid monitors 100,000 meters with distributed databases across 3 data centers. Network partitions occur monthly during maintenance. Choose a consistency model for each of two data types: billing records and voltage telemetry.

Data Type 1: Billing Records (CP - Consistency Required)

Why CP:

  • Financial data must be consistent
  • Incorrect billing costs money and reputation
  • Can tolerate temporary unavailability during partition
  • Reconciliation after partition is complex and error-prone

Implementation (MongoDB with majority write concern):

db.billing.insertOne(
  { meter_id: "MTR-001", kwh: 245.6, cost: 32.45 },
  { writeConcern: { w: "majority", wtimeout: 5000 } }
)

Behavior during partition:

  • Minority partition: Writes rejected (unavailable)
  • Majority partition: Writes succeed (consistent)
  • After partition heals: No conflicts, all nodes consistent

Trade-off: During 15-minute partition, minority site cannot record billing = short unavailability for long-term consistency.

Data Type 2: Voltage Telemetry (AP - Availability Required)

Why AP:

  • Grid operators need continuous monitoring
  • Slightly stale voltage readings (1-2 seconds old) are acceptable
  • Cannot afford monitoring blackout during partition
  • Historical analysis tolerates eventual consistency

Implementation (Cassandra with eventual consistency):

INSERT INTO voltage_readings (meter_id, timestamp, voltage)
VALUES ('MTR-001', toTimestamp(now()), 240.2)
USING TTL 2592000;  -- 30 day retention

Behavior during partition:

  • All partitions: Accept writes (available)
  • Readings may differ briefly across sites
  • After partition heals: Latest timestamp wins (LWW conflict resolution)

Trade-off: Operators see real-time data even during partition, accepting that different sites may have slightly different views for 1-5 minutes.

Cost Analysis:

  • CP approach downtime: 15 min/month partition x 12 = 180 min/year = 99.966% uptime
  • AP approach consistency lag: ~30 seconds average, 2 minutes max
  • Business impact: Billing errors cost $50K+, monitoring gap costs $200K+ (blackouts)
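
The uptime figure in the cost analysis can be verified directly:

```python
# CP downtime: one 15-minute partition per month.
downtime_min_per_year = 15 * 12              # 180 minutes
minutes_per_year = 365 * 24 * 60             # 525,600 minutes
uptime_pct = 100 * (1 - downtime_min_per_year / minutes_per_year)  # ~99.966%
```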

Decision: Use CP for billing (financial accuracy), AP for telemetry (continuous monitoring).

Key Insight: Different data types in the SAME system require different CAP trade-offs based on business consequences of failure.

| Data Type | Consistency Critical? | Availability Critical? | Partition Common? | Recommendation | Example Database |
|-----------|----------------------|------------------------|-------------------|----------------|------------------|
| Financial transactions | Yes (errors costly) | No (can retry) | No (LAN mostly) | CP | PostgreSQL, MongoDB (majority) |
| Device firmware versions | Yes (conflicts break devices) | No (can defer update) | Yes (edge/cloud) | CP | etcd, Consul |
| Sensor telemetry | No (stale OK) | Yes (continuous monitoring) | Yes (edge/cloud) | AP | Cassandra, DynamoDB |
| User session data | No (re-login OK) | Yes (UX critical) | Yes (global users) | AP | Redis Cluster, DynamoDB |
| Audit logs | Yes (compliance) | No (batch OK) | No (single DC) | CP | PostgreSQL, MongoDB |
| Real-time alerts | Partial (some duplicates OK) | Yes (cannot miss) | Yes (edge/cloud) | AP with idempotency | Kafka, RabbitMQ |
| Inventory management | Yes (oversell costly) | No (can backorder) | No (warehouse LAN) | CP | PostgreSQL, MySQL |
| Video metadata | No (thumbnails eventual) | Yes (playback continuous) | Yes (CDN) | AP | Cassandra, MongoDB |

How to choose:

Step 1: Identify partition likelihood

  • Single data center on LAN: Partitions rare –> Can choose CA (though not advised)
  • Edge/fog/cloud architecture: Partitions common –> Must choose CP or AP
  • Global distribution: Partitions inevitable –> Must choose AP for availability

Step 2: Assess consistency requirement

  • Ask: “What happens if two users see different data for 5 minutes?”
    • Financial loss or compliance violation –> CP
    • User confusion but no data corruption –> AP
    • Operational monitoring gap –> AP with monitoring on both sides

Step 3: Assess availability requirement

  • Ask: “What happens if writes are rejected for 5 minutes?”
    • Lost sensor data that cannot be recovered –> AP
    • Transaction user can retry later –> CP
    • Critical control command that needs immediate ACK –> CP (or pre-buffer locally)

Step 4: Consider conflict resolution complexity

  • Last-Write-Wins (simple) –> AP acceptable
  • Complex merge logic –> CP preferred
  • No conflicts possible (append-only) –> AP safe
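
The four steps above can be distilled into a rough rule of thumb. This helper is a simplification for illustration, not a substitute for working through the full checklist:

```python
def recommend_cap(partitions_common: bool,
                  inconsistency_causes_loss: bool,
                  writes_unrecoverable: bool) -> str:
    """Collapse the four-step checklist into a first-pass CP/AP recommendation."""
    if not partitions_common:
        return "CP"   # partitions rare: strong consistency is cheap to keep
    if inconsistency_causes_loss:
        return "CP"   # financial loss / broken devices outweigh brief unavailability
    if writes_unrecoverable:
        return "AP"   # rejected sensor writes are lost forever; stay available
    return "AP"       # default to availability when inconsistency is harmless
```

Run against firmware version tracking (partitions common, inconsistency breaks devices), it returns CP; run against sensor telemetry (partitions common, stale data harmless, rejected writes lost), it returns AP.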

Example: Smart thermostat firmware version tracking:

  • Partition common? Yes (cloud/edge split)
  • Consistency critical? Yes (v1.0 and v2.0 conflict breaks device)
  • Availability critical? No (can defer firmware update by 5 minutes)
  • Decision: CP (use etcd or Consul with Raft consensus)

Common Mistake: Treating All Data Types Identically

The Error: Using Cassandra (AP) for ALL data including billing records because “we need high availability.”

What Goes Wrong:

Example Scenario:

Network partition splits 3-node Cassandra cluster: [DC1: Node A] | [DC2: Nodes B, C]

User 1 in DC1 records payment: $100 credit
User 2 in DC2 records usage: $120 charge
After partition heals: Last-write-wins conflict resolution picks one, discards the other
Result: Either lost payment ($100 credit vanishes) or lost charge ($120 usage vanishes)
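
The loss mechanism takes only a few lines to demonstrate. This sketch models the failure mode described above: each site overwrites the full balance, so last-write-wins keeps only one side's update (the field names are illustrative):

```python
# Each data center applies its operation as a full-value overwrite.
balance_before = 0
dc1_write = {"ts": 1001, "balance": balance_before + 100}  # User 1: $100 credit
dc2_write = {"ts": 1002, "balance": balance_before - 120}  # User 2: $120 charge

# Partition heals: last-write-wins keeps only the newest timestamp.
merged = max(dc1_write, dc2_write, key=lambda w: w["ts"])
# merged["balance"] is -120: the $100 credit has silently vanished.
# The correct balance, with both operations applied, would be -20.
```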

Why This Costs Money:

  • Lost credits: Angry customers, refunds, legal issues
  • Lost charges: Revenue loss, billing disputes
  • Reconciliation: Manual audit of all transactions during partition (hours of work)

The Fix: Polyglot CAP - different databases for different consistency needs:

# Financial data: Use PostgreSQL (CP)
def record_payment(meter_id, amount):
    with pg_transaction():  # ACID transaction, blocks during partition
        billing.insert(meter_id, amount, "credit")
        ledger.update_balance(meter_id, +amount)
    # Either both succeed or both fail (consistency)

# Telemetry data: Use Cassandra (AP)
def record_voltage(meter_id, voltage):
    cassandra.insert(meter_id, timestamp=now(), voltage=voltage)
    # Succeeds even during partition (availability)
    # Eventual consistency is acceptable for monitoring

Cost-Benefit Analysis:

| Approach | Complexity | Partition Behavior | Business Risk | Annual Cost |
|----------|------------|--------------------|---------------|-------------|
| All CP (PostgreSQL) | Low | Billing unavailable 180 min/year | Low (consistent) | $50K (infra + downtime) |
| All AP (Cassandra) | Low | Always available | High (conflicts) | $20K (infra) + $100K (dispute resolution) |
| Polyglot (PostgreSQL + Cassandra) | Medium | Billing CP, telemetry AP | Low (targeted) | $35K (both DBs) |

Key Insight: Trying to save operational complexity by using one database for everything either costs availability (all CP) or costs correctness (all AP). The right complexity is polyglot persistence.

5.6 See Also

Related Chapters:

External Resources:

5.7 Try It Yourself

Hands-On Challenge: Simulate CAP trade-offs with a simple distributed key-value store

Task: Implement a 3-node distributed system and observe consistency vs availability:

  1. Setup (Python Simulation):

    • Create 3 “nodes” (dictionaries) representing database replicas
    • Node1, Node2, Node3 initially synchronized with same data
  2. CP Mode Implementation:

    def write_cp(key, value, nodes):
        # Phase 1: Check reachability (prepare phase)
        reachable = [n for n in nodes if n.is_reachable()]
        if len(reachable) < 2:  # Majority not available
            return False  # Reject write - CP preserves consistency
        # Phase 2: Write to majority (commit phase)
        for node in reachable:
            node.write(key, value)
        return True
  3. AP Mode Implementation:

    def write_ap(key, value, nodes):
        # Write to ALL reachable nodes (availability-first)
        written = False
        for node in nodes:
            if node.is_reachable():
                node.write(key, value)
                written = True
        return written  # Succeeds if ANY node accepted the write
  4. Test Partition Scenario:

    • Simulate network partition: Node1 isolated, Node2+Node3 together
    • CP: Writes to Node1 fail (minority partition)
    • AP: Writes to Node1 succeed but data diverges
  5. Observe Results:

    • CP: Consistency maintained but Node1 unavailable
    • AP: All nodes available but temporarily inconsistent

What to Observe:

  • CP rejects writes during partition –> guarantees consistency
  • AP accepts conflicting writes –> requires conflict resolution
  • Neither is “better”–choice depends on application requirements

Common Pitfalls

CA (Consistency + Availability) assumes no network partitions exist. Edge-to-cloud IoT systems always experience partitions – wireless gaps, maintenance windows, hardware failures. Designing for CA leads to systems that fail catastrophically when the inevitable partition occurs. Always design for partition tolerance and consciously choose CP or AP per data type.

Making every data type use CP (strong consistency) causes unnecessary availability loss. Temperature telemetry delayed or dropped during a 10-minute network partition has zero business value lost – the sensor will send fresh data when reconnected. Reserve CP for control commands and billing data where inconsistency has real consequences.

AP systems do not require minutes or hours to converge – most resolve conflicts within milliseconds to seconds after partition healing. The confusion between ‘eventual’ (bounded, fast convergence) and ‘indefinitely inconsistent’ leads teams to over-engineer CP solutions where AP would have been simpler and more resilient.

CAP only addresses behavior during partition. PACELC adds: during normal operation (E), choose between Latency (L) and Consistency (C). Many IoT systems optimize for low latency in normal operation, accepting slightly stale reads – this tradeoff is invisible to pure CAP analysis.

Key Concepts
  • CAP Theorem: A distributed systems theorem stating that only two of Consistency, Availability, and Partition tolerance can be guaranteed simultaneously – partition tolerance is mandatory in IoT, forcing a CP vs AP choice
  • CP System: A Consistency + Partition-tolerant system that rejects writes on the minority partition during a network split, ensuring all nodes see identical data at the cost of temporary unavailability
  • AP System: An Availability + Partition-tolerant system that continues accepting writes on all partitions during a network split, allowing brief inconsistency that is resolved via conflict resolution after healing
  • Eventual Consistency: A consistency model where all replicas converge to the same value given sufficient time without new writes – acceptable for IoT telemetry but not for safety-critical control commands
  • Network Partition: A network failure that splits distributed nodes into isolated groups unable to communicate, making partition tolerance non-negotiable in edge-to-cloud IoT deployments
  • Synchronous WAL: A durability strategy that waits for the write-ahead log to be confirmed on disk before acknowledging a write, providing strong durability at the cost of higher latency
  • Storage Tiering: A cost-reduction architecture that routes data to SSD (hot), HDD (warm), or object storage (cold) based on age and query frequency, typically achieving 70-90% cost savings
  • Conflict Resolution: The mechanism by which AP systems reconcile divergent writes after a partition heals – common strategies include last-write-wins (timestamp), application-level merge, and vector clocks
5.8 Summary

  • CAP theorem constrains distributed systems to two of three guarantees: Consistency, Availability, Partition tolerance
  • IoT sensor data often accepts eventual consistency (AP) for higher availability
  • Control commands require strong consistency (CP) to prevent conflicting actions
  • Synchronous WAL guarantees durability at cost of write latency; async WAL trades durability for throughput
  • Sharding enables horizontal write scaling; replication enables read scaling and high availability
  • Storage tiering (hot/warm/cold) reduces costs by 70-80%+ while meeting latency SLAs
  • Schema normalization trade-offs depend on query patterns, update frequency, and latency requirements

CAP theorem constraints manifest as quantifiable availability vs consistency trade-offs. The quorum consensus formula \(W + R > N\) ensures strong consistency, where \(W\) = write replicas acknowledged, \(R\) = read replicas queried, \(N\) = total nodes in the cluster.

Use the calculator below to explore how different quorum configurations affect consistency guarantees and availability in your IoT cluster.
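
The quorum rule itself is a one-line check; a sketch of what such a calculator computes:

```python
def strongly_consistent(n: int, w: int, r: int) -> bool:
    """W + R > N guarantees every read quorum overlaps every write quorum."""
    return w + r > n

# N=3 with majority quorums on both paths: reads always see the latest write.
majority = strongly_consistent(n=3, w=2, r=2)   # True
# N=3 tuned for latency/availability (W=1, R=1): stale reads become possible.
fast = strongly_consistent(n=3, w=1, r=1)       # False
```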

5.8.1 Quorum Consistency Calculator

5.8.2 Storage Tiering Cost Calculator

Use this calculator to estimate the cost savings from implementing hot/warm/cold storage tiering for your IoT sensor data.

5.9 What’s Next

With a solid grasp of how CAP trade-offs shape database design for distributed IoT systems, you can now explore how specific database technologies implement these trade-offs in practice.

| Topic | Chapter | What You’ll Learn |
|-------|---------|-------------------|
| Time-Series Storage | Time-Series Databases | How TimescaleDB and InfluxDB implement AP trade-offs with write-ahead logs, compaction, and retention policies for high-velocity sensor telemetry |
| Database Selection | Database Selection Framework | Criteria for choosing between relational, document, key-value, and time-series databases – extending CAP reasoning into a practical decision framework |
| Data Partitioning | Sharding Strategies | How to partition data across nodes using hash, range, and geographic sharding – applying the horizontal scaling trade-offs from this chapter |
| Data Quality | Data Quality Monitoring | How eventual consistency in AP systems affects data quality metrics, and strategies for detecting and resolving inconsistencies |
| Multi-Tier Architecture | Edge/Fog/Cloud Architecture | How edge-fog-cloud deployments determine partition likelihood and shape the CAP decisions for each tier |