1286 Database Selection Tool

Choose the Right Database for IoT

1286.1 IoT Database Selection Advisor

Choosing the right database for an IoT application is a critical architectural decision. This interactive tool helps you evaluate and compare database options based on your specific workload requirements, data patterns, and scalability needs.

Tool Overview

This advisor evaluates five popular database options for IoT:

InfluxDB: Purpose-built time-series database with high write performance
TimescaleDB: PostgreSQL extension combining SQL with time-series optimization
MongoDB: Flexible document database for varied data structures
Cassandra: Distributed database for massive scale and high availability
PostgreSQL: Proven relational database with JSONB support and TimescaleDB compatibility

How to Use This Tool

Adjust the workload parameters using the sliders and selectors
View the recommendation, which updates in real time
Examine the radar chart comparing all databases across key metrics
Review the scoring breakdown to understand the reasoning
Experiment with different configurations to see how requirements affect choices


{
  // Load D3.js v7
  const d3 = await require("d3@7");

  // IEEE Color palette
  const colors = {
    navy: "#2C3E50",
    teal: "#16A085",
    orange: "#E67E22",
    gray: "#7F8C8D",
    lightGray: "#ECF0F1",
    white: "#FFFFFF",
    red: "#E74C3C",
    green: "#27AE60",
    purple: "#9B59B6",
    yellow: "#F1C40F",
    darkBlue: "#1a252f"
  };

  // Database configurations
  const databases = {
    influxdb: {
      name: "InfluxDB",
      color: "#22ADF6",
      description: "Purpose-built time-series database",
      strengths: ["High write throughput", "Built-in retention policies", "Flux query language", "Native time-series aggregations"],
      weaknesses: ["Limited join support", "Eventual consistency", "Cardinality limits"]
    },
    timescaledb: {
      name: "TimescaleDB",
      color: "#FDB515",
      description: "PostgreSQL extension for time-series",
      strengths: ["Full SQL support", "Compression", "Continuous aggregates", "Mature ecosystem"],
      weaknesses: ["Single-node scaling", "Requires PostgreSQL expertise"]
    },
    mongodb: {
      name: "MongoDB",
      color: "#00ED64",
      description: "Flexible document database",
      strengths: ["Schema flexibility", "Horizontal scaling", "Rich queries", "Time-series collections"],
      weaknesses: ["Higher storage overhead", "Memory usage", "Limited time-series optimization"]
    },
    cassandra: {
      name: "Cassandra",
      color: "#1287B1",
      description: "Distributed wide-column store",
      strengths: ["Linear scalability", "High availability", "Write optimization", "Multi-datacenter"],
      weaknesses: ["Complex operations", "Limited query flexibility", "Eventual consistency"]
    },
    postgresql: {
      name: "PostgreSQL",
      color: "#336791",
      description: "Proven relational database",
      strengths: ["ACID compliance", "Rich ecosystem", "JSONB support", "Extensibility"],
      weaknesses: ["Time-series not native", "Scaling complexity", "Storage overhead"]
    }
  };

  // Layout dimensions
  const width = 850;
  const height = 700;

  // State variables
  let ingestionRate = 10000; // messages per second
  let queryPattern = "time-range";
  let consistency = "eventual";
  let deviceCount = 10000;
  let retentionDays = 30;
  let workloadType = "timeseries";

  // Create container
  const container = d3.create("div")
    .style("font-family", "system-ui, -apple-system, sans-serif");

  // Main layout
  const mainLayout = container.append("div")
    .style("display", "flex")
    .style("flex-wrap", "wrap")
    .style("gap", "20px");

  // Left panel - Controls
  const leftPanel = mainLayout.append("div")
    .style("flex", "1")
    .style("min-width", "320px")
    .style("max-width", "380px");

  // Controls header
  leftPanel.append("h3")
    .style("margin", "0 0 15px 0")
    .style("color", colors.navy)
    .style("font-size", "16px")
    .style("border-bottom", `3px solid ${colors.teal}`)
    .style("padding-bottom", "8px")
    .text("Workload Configuration");

  // Control form
  const form = leftPanel.append("div")
    .style("background", colors.white)
    .style("padding", "20px")
    .style("border-radius", "8px")
    .style("border", `2px solid ${colors.lightGray}`)
    .style("box-shadow", "0 2px 8px rgba(0,0,0,0.08)");

  // Helper to create slider control
  function createSlider(parent, label, id, min, max, step, value, unit, formatter) {
    const group = parent.append("div")
      .style("margin-bottom", "18px");

    const labelRow = group.append("div")
      .style("display", "flex")
      .style("justify-content", "space-between")
      .style("margin-bottom", "6px");

    labelRow.append("label")
      .attr("for", id)
      .style("font-size", "13px")
      .style("font-weight", "bold")
      .style("color", colors.navy)
      .text(label);

    const valueSpan = labelRow.append("span")
      .attr("id", `${id}-value`)
      .style("font-size", "13px")
      .style("color", colors.teal)
      .style("font-weight", "bold")
      .text(formatter ? formatter(value) : `${value}${unit}`);

    const slider = group.append("input")
      .attr("type", "range")
      .attr("id", id)
      .attr("min", min)
      .attr("max", max)
      .attr("step", step)
      .attr("value", value)
      .style("width", "100%")
      .style("cursor", "pointer");

    return { slider, valueSpan };
  }

  // Helper to create select control
  function createSelect(parent, label, id, options) {
    const group = parent.append("div")
      .style("margin-bottom", "18px");

    group.append("label")
      .attr("for", id)
      .style("display", "block")
      .style("font-size", "13px")
      .style("font-weight", "bold")
      .style("color", colors.navy)
      .style("margin-bottom", "6px")
      .text(label);

    const select = group.append("select")
      .attr("id", id)
      .style("width", "100%")
      .style("padding", "10px 12px")
      .style("border-radius", "6px")
      .style("border", `1px solid ${colors.gray}`)
      .style("font-size", "13px")
      .style("cursor", "pointer")
      .style("background", colors.white);

    options.forEach(opt => {
      select.append("option")
        .attr("value", opt.value)
        .attr("selected", opt.selected ? true : null)
        .text(opt.label);
    });

    return select;
  }

  // Ingestion rate slider
  const ingestionControl = createSlider(form, "Data Ingestion Rate", "ingestion", 100, 100000, 100, ingestionRate, " msg/s", v => {
    if (v >= 1000) return `${(v/1000).toFixed(1)}K msg/s`;
    return `${v} msg/s`;
  });

  // Device count slider
  const deviceControl = createSlider(form, "Device Count", "devices", 100, 1000000, 100, deviceCount, "", v => {
    if (v >= 1000000) return `${(v/1000000).toFixed(1)}M devices`;
    if (v >= 1000) return `${(v/1000).toFixed(0)}K devices`;
    return `${v} devices`;
  });

  // Retention period slider
  const retentionControl = createSlider(form, "Data Retention Period", "retention", 1, 365, 1, retentionDays, " days", v => {
    if (v >= 365) return "1 year";
    if (v >= 30) return `${Math.floor(v/30)} month${v >= 60 ? 's' : ''}`;
    return `${v} days`;
  });

  // Query pattern select
  const querySelect = createSelect(form, "Primary Query Pattern", "query", [
    { value: "time-range", label: "Time-Range Queries", selected: true },
    { value: "aggregation", label: "Aggregation & Analytics" },
    { value: "realtime", label: "Real-Time Streaming" },
    { value: "adhoc", label: "Ad-Hoc Exploration" }
  ]);

  // Workload type select
  const workloadSelect = createSelect(form, "Primary Workload Type", "workload", [
    { value: "timeseries", label: "Time-Series Metrics", selected: true },
    { value: "keyvalue", label: "Key-Value Lookups" },
    { value: "document", label: "Document Storage" },
    { value: "relational", label: "Relational Data" }
  ]);

  // Consistency select
  const consistencySelect = createSelect(form, "Consistency Requirement", "consistency", [
    { value: "eventual", label: "Eventual Consistency", selected: true },
    { value: "strong", label: "Strong Consistency" }
  ]);

  // Right panel - Results
  const rightPanel = mainLayout.append("div")
    .style("flex", "1.5")
    .style("min-width", "450px");

  // Recommendation box
  const recommendBox = rightPanel.append("div")
    .style("background", "linear-gradient(135deg, #1a252f 0%, #2C3E50 100%)")
    .style("padding", "20px")
    .style("border-radius", "8px")
    .style("margin-bottom", "20px")
    .style("box-shadow", "0 4px 12px rgba(0,0,0,0.15)");

  recommendBox.append("div")
    .style("font-size", "12px")
    .style("color", colors.gray)
    .style("text-transform", "uppercase")
    .style("letter-spacing", "1px")
    .style("margin-bottom", "8px")
    .text("Recommended Database");

  const recommendName = recommendBox.append("div")
    .attr("id", "recommend-name")
    .style("font-size", "28px")
    .style("font-weight", "bold")
    .style("color", colors.white)
    .style("margin-bottom", "8px");

  const recommendDesc = recommendBox.append("div")
    .attr("id", "recommend-desc")
    .style("font-size", "14px")
    .style("color", colors.lightGray)
    .style("margin-bottom", "12px");

  const recommendReason = recommendBox.append("div")
    .attr("id", "recommend-reason")
    .style("font-size", "12px")
    .style("color", colors.teal)
    .style("padding", "10px")
    .style("background", "rgba(22, 160, 133, 0.15)")
    .style("border-radius", "4px")
    .style("border-left", `3px solid ${colors.teal}`);

  // Chart container
  const chartContainer = rightPanel.append("div")
    .style("display", "flex")
    .style("gap", "20px")
    .style("flex-wrap", "wrap");

  // Radar chart
  const radarDiv = chartContainer.append("div")
    .style("flex", "1")
    .style("min-width", "300px")
    .style("background", colors.white)
    .style("padding", "15px")
    .style("border-radius", "8px")
    .style("border", `2px solid ${colors.lightGray}`);

  radarDiv.append("h4")
    .style("margin", "0 0 10px 0")
    .style("color", colors.navy)
    .style("font-size", "14px")
    .text("Database Comparison");

  const radarSvg = radarDiv.append("svg")
    .attr("viewBox", "0 0 350 320")
    .attr("width", "100%")
    .style("max-height", "320px");

  // Scores breakdown
  const scoresDiv = chartContainer.append("div")
    .style("flex", "1")
    .style("min-width", "200px")
    .style("background", colors.white)
    .style("padding", "15px")
    .style("border-radius", "8px")
    .style("border", `2px solid ${colors.lightGray}`);

  scoresDiv.append("h4")
    .style("margin", "0 0 10px 0")
    .style("color", colors.navy)
    .style("font-size", "14px")
    .text("Scoring Breakdown");

  const scoresContent = scoresDiv.append("div")
    .attr("id", "scores-content");

  // Radar chart dimensions
  const radarWidth = 350;
  const radarHeight = 320;
  const radarRadius = 110;
  const radarCenterX = radarWidth / 2;
  const radarCenterY = radarHeight / 2 + 10;

  // Radar axes
  const axes = [
    { key: "writePerf", label: "Write Performance" },
    { key: "queryFlex", label: "Query Flexibility" },
    { key: "scalability", label: "Scalability" },
    { key: "consistency", label: "Consistency" },
    { key: "timeseries", label: "Time-Series" },
    { key: "ecosystem", label: "Ecosystem" }
  ];

  const angleSlice = (Math.PI * 2) / axes.length;

  // Draw radar grid
  const radarGrid = radarSvg.append("g")
    .attr("transform", `translate(${radarCenterX}, ${radarCenterY})`);

  // Concentric circles
  [0.2, 0.4, 0.6, 0.8, 1.0].forEach((level, i) => {
    radarGrid.append("circle")
      .attr("r", radarRadius * level)
      .attr("fill", "none")
      .attr("stroke", colors.lightGray)
      .attr("stroke-dasharray", i < 4 ? "2,2" : "none")
      .attr("stroke-width", i === 4 ? 2 : 1);

    if (i === 4) {
      radarGrid.append("text")
        .attr("x", 5)
        .attr("y", -radarRadius * level + 4)
        .attr("font-size", "9px")
        .attr("fill", colors.gray)
        .text(`${level * 100}%`);
    }
  });

  // Axis lines and labels
  axes.forEach((axis, i) => {
    const angle = i * angleSlice - Math.PI / 2;
    const x = Math.cos(angle) * radarRadius;
    const y = Math.sin(angle) * radarRadius;

    radarGrid.append("line")
      .attr("x1", 0)
      .attr("y1", 0)
      .attr("x2", x)
      .attr("y2", y)
      .attr("stroke", colors.lightGray)
      .attr("stroke-width", 1);

    const labelX = Math.cos(angle) * (radarRadius + 20);
    const labelY = Math.sin(angle) * (radarRadius + 20);

    radarGrid.append("text")
      .attr("x", labelX)
      .attr("y", labelY)
      .attr("text-anchor", "middle")
      .attr("dominant-baseline", "middle")
      .attr("font-size", "10px")
      .attr("fill", colors.navy)
      .attr("font-weight", "bold")
      .text(axis.label);
  });

  // Radar paths group
  const radarPaths = radarGrid.append("g").attr("class", "radar-paths");

  // Legend
  const legend = radarSvg.append("g")
    .attr("transform", "translate(10, 10)");

  Object.entries(databases).forEach(([key, db], i) => {
    const g = legend.append("g")
      .attr("transform", `translate(${i * 70}, 0)`);

    g.append("rect")
      .attr("width", 12)
      .attr("height", 12)
      .attr("rx", 2)
      .attr("fill", db.color);

    g.append("text")
      .attr("x", 16)
      .attr("y", 10)
      .attr("font-size", "9px")
      .attr("fill", colors.navy)
      .text(db.name.substring(0, 8));
  });

  // Calculate database scores based on workload
  function calculateScores() {
    const scores = {};

    Object.keys(databases).forEach(dbKey => {
      scores[dbKey] = {
        writePerf: 0.5,
        queryFlex: 0.5,
        scalability: 0.5,
        consistency: 0.5,
        timeseries: 0.5,
        ecosystem: 0.5,
        total: 0
      };
    });

    // InfluxDB scores
    scores.influxdb.writePerf = Math.min(1, 0.5 + (ingestionRate / 100000) * 0.5);
    scores.influxdb.queryFlex = queryPattern === "time-range" ? 0.9 : queryPattern === "aggregation" ? 0.85 : 0.5;
    scores.influxdb.scalability = deviceCount > 100000 ? 0.7 : 0.85;
    scores.influxdb.consistency = consistency === "eventual" ? 0.9 : 0.4;
    scores.influxdb.timeseries = workloadType === "timeseries" ? 0.95 : 0.4;
    scores.influxdb.ecosystem = 0.75;

    // TimescaleDB scores
    scores.timescaledb.writePerf = Math.min(0.85, 0.4 + (ingestionRate / 100000) * 0.45);
    scores.timescaledb.queryFlex = queryPattern === "adhoc" ? 0.95 : 0.85;
    scores.timescaledb.scalability = deviceCount > 500000 ? 0.5 : 0.75;
    scores.timescaledb.consistency = consistency === "strong" ? 0.95 : 0.9;
    scores.timescaledb.timeseries = workloadType === "timeseries" ? 0.9 : 0.7;
    scores.timescaledb.ecosystem = 0.9;

    // MongoDB scores
    scores.mongodb.writePerf = Math.min(0.8, 0.4 + (ingestionRate / 100000) * 0.4);
    scores.mongodb.queryFlex = queryPattern === "adhoc" ? 0.85 : 0.75;
    scores.mongodb.scalability = 0.85;
    scores.mongodb.consistency = consistency === "eventual" ? 0.8 : 0.75;
    scores.mongodb.timeseries = workloadType === "document" ? 0.85 : workloadType === "timeseries" ? 0.7 : 0.6;
    scores.mongodb.ecosystem = 0.85;

    // Cassandra scores
    scores.cassandra.writePerf = Math.min(1, 0.6 + (ingestionRate / 100000) * 0.4);
    scores.cassandra.queryFlex = queryPattern === "time-range" ? 0.7 : 0.4;
    scores.cassandra.scalability = 0.95;
    scores.cassandra.consistency = consistency === "eventual" ? 0.9 : 0.6;
    scores.cassandra.timeseries = workloadType === "timeseries" ? 0.75 : 0.5;
    scores.cassandra.ecosystem = 0.7;

    // PostgreSQL scores
    scores.postgresql.writePerf = Math.min(0.6, 0.3 + (ingestionRate / 100000) * 0.3);
    scores.postgresql.queryFlex = 0.95;
    scores.postgresql.scalability = deviceCount > 100000 ? 0.4 : 0.65;
    scores.postgresql.consistency = consistency === "strong" ? 0.98 : 0.95;
    scores.postgresql.timeseries = workloadType === "relational" ? 0.9 : 0.5;
    scores.postgresql.ecosystem = 0.95;

    // Adjust for workload type
    if (workloadType === "keyvalue") {
      scores.mongodb.timeseries += 0.1;
      scores.cassandra.timeseries += 0.15;
    }
    if (workloadType === "document") {
      scores.mongodb.timeseries += 0.15;
    }
    if (workloadType === "relational") {
      scores.postgresql.timeseries += 0.2;
      scores.timescaledb.timeseries += 0.1;
    }

    // Adjust for retention
    if (retentionDays > 90) {
      scores.influxdb.writePerf -= 0.05;
      scores.cassandra.scalability += 0.05;
    }

    // Clamp all values
    Object.keys(scores).forEach(dbKey => {
      Object.keys(scores[dbKey]).forEach(metric => {
        scores[dbKey][metric] = Math.max(0.1, Math.min(1, scores[dbKey][metric]));
      });
    });

    // Calculate totals with weights
    const weights = {
      writePerf: ingestionRate > 50000 ? 0.25 : 0.15,
      queryFlex: queryPattern === "adhoc" ? 0.2 : 0.1,
      scalability: deviceCount > 100000 ? 0.25 : 0.15,
      consistency: consistency === "strong" ? 0.2 : 0.1,
      timeseries: workloadType === "timeseries" ? 0.25 : 0.1,
      ecosystem: 0.15
    };

    // Normalize weights
    const totalWeight = Object.values(weights).reduce((a, b) => a + b, 0);
    Object.keys(weights).forEach(k => weights[k] /= totalWeight);

    Object.keys(scores).forEach(dbKey => {
      scores[dbKey].total = axes.reduce((sum, axis) => {
        return sum + scores[dbKey][axis.key] * weights[axis.key];
      }, 0);
    });

    return { scores, weights };
  }

  // Generate recommendation reason
  function generateReason(topDb, scores) {
    const reasons = [];

    if (ingestionRate > 50000) {
      if (topDb === "influxdb" || topDb === "cassandra") {
        reasons.push("high write throughput requirement");
      }
    }

    if (queryPattern === "adhoc") {
      if (topDb === "timescaledb" || topDb === "postgresql") {
        reasons.push("full SQL support for ad-hoc queries");
      }
    }

    if (workloadType === "timeseries") {
      if (topDb === "influxdb") {
        reasons.push("native time-series optimization");
      } else if (topDb === "timescaledb") {
        reasons.push("time-series with SQL compatibility");
      }
    }

    if (consistency === "strong") {
      if (topDb === "postgresql" || topDb === "timescaledb") {
        reasons.push("ACID compliance requirement");
      }
    }

    if (deviceCount > 500000) {
      if (topDb === "cassandra") {
        reasons.push("linear horizontal scalability");
      }
    }

    if (reasons.length === 0) {
      reasons.push("best overall fit for your requirements");
    }

    return "Selected for: " + reasons.join(", ");
  }

  // Update visualization
  function updateVisualization() {
    const { scores, weights } = calculateScores();

    // Find top database
    let topDb = null;
    let topScore = 0;
    Object.entries(scores).forEach(([key, data]) => {
      if (data.total > topScore) {
        topScore = data.total;
        topDb = key;
      }
    });

    // Update recommendation
    recommendName.text(databases[topDb].name)
      .style("color", databases[topDb].color);
    recommendDesc.text(databases[topDb].description);
    recommendReason.text(generateReason(topDb, scores));

    // Update radar chart
    radarPaths.selectAll("*").remove();

    Object.entries(databases).forEach(([key, db]) => {
      // One score per axis, in axis order
      const dataPoints = axes.map(axis => scores[key][axis.key]);

      // d3.lineRadial measures angles clockwise from 12 o'clock, which matches
      // the axis lines drawn above at (i * angleSlice - Math.PI / 2)
      const lineGenerator = d3.lineRadial()
        .radius(d => d * radarRadius)
        .angle((d, i) => i * angleSlice)
        .curve(d3.curveLinearClosed);

      radarPaths.append("path")
        .datum(dataPoints)
        .attr("d", lineGenerator)
        .attr("fill", db.color)
        .attr("fill-opacity", key === topDb ? 0.35 : 0.1)
        .attr("stroke", db.color)
        .attr("stroke-width", key === topDb ? 3 : 1.5)
        .attr("stroke-opacity", key === topDb ? 1 : 0.6);
    });

    // Update scores breakdown
    scoresContent.html("");

    const sortedDbs = Object.entries(scores)
      .sort((a, b) => b[1].total - a[1].total);

    sortedDbs.forEach(([key, data], index) => {
      const row = scoresContent.append("div")
        .style("display", "flex")
        .style("align-items", "center")
        .style("padding", "8px 0")
        .style("border-bottom", index < sortedDbs.length - 1 ? `1px solid ${colors.lightGray}` : "none");

      row.append("div")
        .style("width", "12px")
        .style("height", "12px")
        .style("border-radius", "2px")
        .style("background", databases[key].color)
        .style("margin-right", "8px");

      row.append("div")
        .style("flex", "1")
        .style("font-size", "12px")
        .style("color", colors.navy)
        .style("font-weight", index === 0 ? "bold" : "normal")
        .text(databases[key].name);

      // Score bar
      const barContainer = row.append("div")
        .style("width", "80px")
        .style("height", "8px")
        .style("background", colors.lightGray)
        .style("border-radius", "4px")
        .style("overflow", "hidden")
        .style("margin-right", "8px");

      barContainer.append("div")
        .style("width", `${data.total * 100}%`)
        .style("height", "100%")
        .style("background", databases[key].color)
        .style("border-radius", "4px");

      row.append("div")
        .style("font-size", "12px")
        .style("font-weight", "bold")
        .style("color", index === 0 ? databases[key].color : colors.gray)
        .style("min-width", "35px")
        .style("text-align", "right")
        .text(`${(data.total * 100).toFixed(0)}%`);
    });

    // Add weighted criteria
    scoresContent.append("div")
      .style("margin-top", "15px")
      .style("padding-top", "10px")
      .style("border-top", `2px solid ${colors.teal}`)
      .style("font-size", "11px")
      .style("color", colors.gray)
      .html(`<strong style="color: ${colors.navy}">Active Weights:</strong><br>` +
        Object.entries(weights)
          .sort((a, b) => b[1] - a[1])
          .slice(0, 3)
          .map(([k, v]) => `${axes.find(a => a.key === k)?.label || k}: ${(v * 100).toFixed(0)}%`)
          .join(", "));
  }

  // Event listeners
  ingestionControl.slider.on("input", function() {
    ingestionRate = parseInt(this.value);
    const text = ingestionRate >= 1000 ? `${(ingestionRate/1000).toFixed(1)}K msg/s` : `${ingestionRate} msg/s`;
    ingestionControl.valueSpan.text(text);
    updateVisualization();
  });

  deviceControl.slider.on("input", function() {
    deviceCount = parseInt(this.value);
    let text;
    if (deviceCount >= 1000000) text = `${(deviceCount/1000000).toFixed(1)}M devices`;
    else if (deviceCount >= 1000) text = `${(deviceCount/1000).toFixed(0)}K devices`;
    else text = `${deviceCount} devices`;
    deviceControl.valueSpan.text(text);
    updateVisualization();
  });

  retentionControl.slider.on("input", function() {
    retentionDays = parseInt(this.value);
    let text;
    if (retentionDays >= 365) text = "1 year";
    else if (retentionDays >= 30) text = `${Math.floor(retentionDays/30)} month${retentionDays >= 60 ? 's' : ''}`;
    else text = `${retentionDays} days`;
    retentionControl.valueSpan.text(text);
    updateVisualization();
  });

  querySelect.on("change", function() {
    queryPattern = this.value;
    updateVisualization();
  });

  workloadSelect.on("change", function() {
    workloadType = this.value;
    updateVisualization();
  });

  consistencySelect.on("change", function() {
    consistency = this.value;
    updateVisualization();
  });

  // Initialize
  updateVisualization();

  return container.node();
}
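
The ranking step inside the tool reduces to a normalized weighted sum over the six radar axes. The sketch below isolates that step so it can be run outside the page. The helper name `weightedTotal` is hypothetical; the sample weights and InfluxDB metric values are the ones the tool's formulas produce for its default state (10K msg/s, 10K devices, 30-day retention, eventual consistency, time-series workload, time-range queries).

```javascript
// Standalone sketch of the ranking step: normalize the raw weights so they
// sum to 1, then take the weighted sum of the per-axis scores.
// `weightedTotal` is a hypothetical helper, not part of the tool.
function weightedTotal(metrics, rawWeights) {
  const totalWeight = Object.values(rawWeights).reduce((a, b) => a + b, 0);
  return Object.entries(rawWeights).reduce(
    (sum, [axis, w]) => sum + metrics[axis] * (w / totalWeight),
    0
  );
}

// Raw (pre-normalization) weights for the tool's default state.
const defaultWeights = {
  writePerf: 0.15, queryFlex: 0.1, scalability: 0.15,
  consistency: 0.1, timeseries: 0.25, ecosystem: 0.15
};

// InfluxDB per-axis scores for the same default state.
const influxDefaults = {
  writePerf: 0.55, queryFlex: 0.9, scalability: 0.85,
  consistency: 0.9, timeseries: 0.95, ecosystem: 0.75
};

console.log(weightedTotal(influxDefaults, defaultWeights).toFixed(3)); // 0.822
```

Because the weights are normalized before use, only their ratios matter: doubling every raw weight leaves the ranking unchanged.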

1286.2 Understanding Database Selection for IoT

1286.2.1 Key Decision Factors

When selecting a database for IoT applications, consider these critical factors:

Factor	Impact	Considerations
Ingestion Rate	Write performance, buffering	High rates favor write-optimized DBs
Query Pattern	Index design, response time	Ad-hoc needs SQL; time-range needs ordering
Consistency	Data accuracy, latency	Financial/safety data needs strong consistency
Scale	Architecture, cost	Horizontal scaling for millions of devices
Retention	Storage, compression	Long retention needs efficient compression
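
To make the Retention row concrete, stored volume grows linearly with both ingestion rate and retention period. The following is a rough sizing sketch, not a capacity plan: the 100-byte reading size and the 10:1 compression ratio are placeholder assumptions, and the function names are hypothetical.

```javascript
// Rough storage estimate:
//   readings/sec × seconds/day × retention days × bytes/reading ÷ compression
// The reading size and compression ratio used below are illustrative
// assumptions; measure real payloads before sizing hardware.
const SECONDS_PER_DAY = 86400;

function estimateStorageBytes(msgPerSec, retentionDays, bytesPerReading, compressionRatio = 1) {
  const rawBytes = msgPerSec * SECONDS_PER_DAY * retentionDays * bytesPerReading;
  return rawBytes / compressionRatio;
}

function formatTB(bytes) {
  return `${(bytes / 1e12).toFixed(2)} TB`;
}

// Tool defaults: 10,000 msg/s retained for 30 days.
const uncompressed = estimateStorageBytes(10000, 30, 100);
const compressed = estimateStorageBytes(10000, 30, 100, 10);
console.log(formatTB(uncompressed)); // 2.59 TB
console.log(formatTB(compressed));   // 0.26 TB
```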

1286.2.2 Database Profiles

InfluxDB

Best for: Pure time-series metrics with high write volume

Native time-series database with retention policies
Flux query language optimized for time-series operations
Excellent for metrics dashboards (Grafana integration)
Cardinality limits may affect high-dimensional data

TimescaleDB

Best for: Time-series data requiring SQL and joins

PostgreSQL extension with automatic partitioning
Full SQL support including JOINs and complex queries
Continuous aggregates for efficient downsampling
Excellent compression for time-series data

MongoDB

Best for: Varied data structures and document flexibility

Schema-less design adapts to changing requirements
Time-series collections (5.0+) with automatic bucketing
Horizontal scaling with sharding
Rich aggregation pipeline for analytics

Cassandra

Best for: Massive scale with high availability needs

Linear scalability across datacenters
Tunable consistency per operation
Excellent write performance
Best for known query patterns (no ad-hoc)

PostgreSQL

Best for: Complex relational queries and ACID requirements

Most mature and proven technology of the five options
Excellent for structured data with relationships
Can be enhanced with TimescaleDB or Citus
Rich ecosystem and tooling

1286.2.3 Decision Tree Summary

                    ┌─────────────────────────┐
                    │  What's your primary    │
                    │  workload type?         │
                    └───────────┬─────────────┘
                                │
        ┌───────────┬───────────┼───────────┬───────────┐
        ▼           ▼           ▼           ▼           ▼
   Time-Series  Key-Value   Document   Relational   Mixed
        │           │           │           │           │
        ▼           ▼           ▼           ▼           ▼
   InfluxDB/    Cassandra   MongoDB   PostgreSQL  TimescaleDB
   TimescaleDB
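
The decision tree above is a single-level lookup, so it can be written as a small table. This is a minimal sketch; `suggestByWorkload` and the lowercase key spellings are hypothetical, and the result is only a first-pass shortlist.

```javascript
// The decision tree as a lookup table. Keys mirror the five branches in the
// diagram; `suggestByWorkload` is a hypothetical helper name.
const WORKLOAD_TO_DATABASES = new Map([
  ["time-series", ["InfluxDB", "TimescaleDB"]],
  ["key-value", ["Cassandra"]],
  ["document", ["MongoDB"]],
  ["relational", ["PostgreSQL"]],
  ["mixed", ["TimescaleDB"]]
]);

function suggestByWorkload(workload) {
  const key = String(workload).trim().toLowerCase();
  const match = WORKLOAD_TO_DATABASES.get(key);
  if (!match) {
    throw new Error(`Unknown workload type: ${workload}`);
  }
  return match;
}

console.log(suggestByWorkload("Time-Series").join(" or ")); // InfluxDB or TimescaleDB
console.log(suggestByWorkload("mixed").join(" or "));       // TimescaleDB
```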

Tradeoff: Schema-on-Write vs Schema-on-Read for IoT Data

Option A: Schema-on-Write (Relational, Pre-defined Structure)
- Data validation: Enforced at write time (reject malformed data immediately)
- Query performance: Fast - optimizer knows exact column types and indexes
- Storage efficiency: High - fixed-width columns, no schema overhead per record
- Flexibility: Low - schema changes require migrations, downtime possible
- Query latency: 2-10ms (optimized execution plans)
- Best for: Device registry, billing, compliance data where structure is stable

Option B: Schema-on-Read (Document/Key-Value, Flexible Structure)
- Data validation: Deferred to query time (accept anything, validate later)
- Query performance: Variable - must parse documents, no guaranteed indexes
- Storage efficiency: Lower - schema metadata embedded in each document
- Flexibility: High - add fields anytime, no migrations needed
- Query latency: 10-50ms (runtime parsing and type coercion)
- Best for: Device telemetry with varying sensor types, event logs, config data

Decision Factors:
- Choose Schema-on-Write when: Data structure is well-defined and stable, query performance is critical (<10ms SLA), data quality must be enforced at source, compliance requires data validation audit trails
- Choose Schema-on-Read when: Sensor types vary across devices (heterogeneous fleet), schema evolves frequently (new firmware adds fields), rapid prototyping valued over data integrity, analytics team needs raw data flexibility
- Hybrid approach: Schema-on-write for core metrics (temperature, humidity with strict types), schema-on-read for metadata (device config, custom attributes stored as JSONB)
- Migration example: Adding “battery_level” field - Schema-on-write requires ALTER TABLE + backfill; Schema-on-read accepts new field immediately
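
The hybrid approach can be sketched at the ingestion layer: core metrics are type-checked before they are written, and every other field is passed through untouched as a free-form attributes object (the part that would land in a JSONB column). The field names, `CORE_SCHEMA`, and `splitReading` below are illustrative assumptions, not a prescribed schema.

```javascript
// Hybrid validation sketch: core metrics follow schema-on-write (checked
// here, rejected if malformed); all remaining fields follow schema-on-read
// (stored as-is). Names and fields are hypothetical.
const CORE_SCHEMA = {
  deviceId: "string",
  timestamp: "number",
  temperature: "number",
  humidity: "number"
};

const isCoreField = (field) =>
  Object.prototype.hasOwnProperty.call(CORE_SCHEMA, field);

function splitReading(reading) {
  const core = {};
  const attributes = {};
  for (const [field, expectedType] of Object.entries(CORE_SCHEMA)) {
    const value = reading[field];
    if (typeof value !== expectedType || Number.isNaN(value)) {
      throw new TypeError(`${field} must be a ${expectedType}`);
    }
    core[field] = value;
  }
  for (const [field, value] of Object.entries(reading)) {
    if (!isCoreField(field)) attributes[field] = value;
  }
  return { core, attributes };
}

// New firmware starts sending battery_level: it is accepted immediately as
// an attribute, with no change to CORE_SCHEMA.
const row = splitReading({
  deviceId: "sensor-17",
  timestamp: 1700000000,
  temperature: 21.5,
  humidity: 40,
  battery_level: 0.82
});
console.log(row.core.temperature);         // 21.5
console.log(row.attributes.battery_level); // 0.82
```

Promoting `battery_level` to a core metric later would mean adding it to `CORE_SCHEMA`, which corresponds to the ALTER TABLE plus backfill step on the database side.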

Tradeoff: Single-Node vs Distributed Database for IoT

Option A: Single-Node Database (PostgreSQL, SQLite)
- Operational complexity: Low - one server to manage, backup, monitor
- Write throughput: 10K-100K writes/sec (limited by single CPU/disk)
- Query latency: 1-10ms (no network hops, local execution)
- Consistency: Strong ACID guarantees (single source of truth)
- Cost: $200-500/month for powerful single instance (32GB RAM, NVMe)
- Best for: < 50K sensors, < 100K writes/sec, team without distributed systems expertise

Option B: Distributed Database (Cassandra, CockroachDB, TimescaleDB Multi-Node)
- Operational complexity: High - coordination, rebalancing, partition management
- Write throughput: 500K-5M writes/sec (scales linearly with nodes)
- Query latency: 5-50ms (network hops, consensus overhead)
- Consistency: Configurable (strong or eventual, per operation)
- Cost: $2,000-10,000/month for 5-10 node cluster with replication
- Best for: > 100K sensors, > 500K writes/sec, HA requirements, multi-region deployment

Decision Factors:
- Choose Single-Node when: Write volume under 100K/sec, total data under 5TB, team is small (1-2 engineers), latency requirements are strict (<10ms p99), budget is limited
- Choose Distributed when: Write volume exceeds 200K/sec, need 99.99%+ availability (no single point of failure), data exceeds single-node capacity (10TB+), geographic distribution required (multi-datacenter)
- Scaling path: Start single-node (PostgreSQL/TimescaleDB), add read replicas when reads bottleneck, shard when writes bottleneck or storage exceeds 5TB
- Cost reality: 10-node cluster costs 10-20x more than single node but provides <10x performance gain for most IoT workloads - right-size your infrastructure
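
The numeric thresholds in the Decision Factors can be encoded as a first-pass check. This is a sketch under the rule-of-thumb figures quoted above; `chooseTopology`, the shape of the requirements object, and the "borderline" label for the gap between the two sets of thresholds are hypothetical.

```javascript
// First-pass topology check using the rule-of-thumb thresholds from the
// Decision Factors. `chooseTopology` and its input shape are hypothetical.
function chooseTopology(req) {
  const {
    writesPerSec,
    dataTB,
    needsMultiRegion = false,
    needsFourNines = false // 99.99%+ availability
  } = req;

  // Any one of these pushes toward a distributed database.
  if (needsMultiRegion || needsFourNines || writesPerSec > 200000 || dataTB >= 10) {
    return "distributed";
  }
  // Comfortably inside single-node territory.
  if (writesPerSec < 100000 && dataTB < 5) {
    return "single-node";
  }
  // 100K-200K writes/sec or 5-10 TB: the text gives no firm rule here.
  return "borderline";
}

console.log(chooseTopology({ writesPerSec: 1000, dataTB: 0.5 }));   // single-node
console.log(chooseTopology({ writesPerSec: 500000, dataTB: 2 }));   // distributed
console.log(chooseTopology({ writesPerSec: 150000, dataTB: 6 }));   // borderline
```

Team size, latency targets, and budget also appear in the Decision Factors but are left out here because they do not reduce to a single threshold.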

1286.3 Quick Start: Common Scenarios

Use these preset configurations to explore typical IoT database requirements:

Scenario	Ingestion	Query Pattern	Scale	Best Choice
Smart Home Hub	100 msg/s	Real-time dashboard	50 devices	InfluxDB or PostgreSQL
Industrial Monitoring	10K msg/s	Time-range + alerts	1,000 sensors	TimescaleDB
Fleet Tracking	50K msg/s	Geospatial + history	10K vehicles	MongoDB with geo indexes
Smart City Platform	500K msg/s	Analytics + dashboards	100K sensors	Cassandra + InfluxDB hybrid
Healthcare Wearables	5K msg/s	Ad-hoc + compliance	5K patients	PostgreSQL (HIPAA)
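
When entering these scenarios into the tool, note that two rows fall outside its slider ranges: the Smart Home Hub has fewer devices than the minimum, and the Smart City Platform exceeds the maximum ingestion rate. The sketch below expresses the numeric columns as presets and clamps them to the ranges defined in the tool's code (ingestion 100 to 100,000 msg/s, devices 100 to 1,000,000). The preset keys and `toSliderValues` are hypothetical.

```javascript
// Scenario rows as tool inputs, clamped to the tool's slider ranges.
// Preset keys and the helper name are hypothetical.
const SLIDER_RANGES = {
  ingestionRate: { min: 100, max: 100000 },
  deviceCount: { min: 100, max: 1000000 }
};

const SCENARIOS = {
  smartHome: { ingestionRate: 100, deviceCount: 50 },
  industrial: { ingestionRate: 10000, deviceCount: 1000 },
  fleet: { ingestionRate: 50000, deviceCount: 10000 },
  smartCity: { ingestionRate: 500000, deviceCount: 100000 },
  wearables: { ingestionRate: 5000, deviceCount: 5000 }
};

function toSliderValues(scenario) {
  const clamped = {};
  for (const [key, { min, max }] of Object.entries(SLIDER_RANGES)) {
    clamped[key] = Math.min(max, Math.max(min, scenario[key]));
  }
  return clamped;
}

const city = toSliderValues(SCENARIOS.smartCity);
console.log(city.ingestionRate, city.deviceCount); // 100000 100000

const home = toSliderValues(SCENARIOS.smartHome);
console.log(home.ingestionRate, home.deviceCount); // 100 100
```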

1286.4 Knowledge Check

Knowledge Check: IoT Database Selection Quick Check

❓ Question 1: An IoT platform needs to store sensor data from 100,000 devices, each sending 1 reading/second. Which database characteristic is MOST important?

💡 Explanation: 100K devices × 1 msg/sec = 100,000 writes/second, which exceeds the capacity of a typical single-node PostgreSQL instance (~50K writes/sec). Write throughput is therefore the bottleneck. InfluxDB, TimescaleDB (multi-node), or Cassandra can handle this volume. JOINs and ACID guarantees are nice to have but are not the limiting factor.
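
The arithmetic in this explanation generalizes to any fleet. A minimal sketch follows; the function names are hypothetical, and the 50,000 writes/sec capacity is the rough single-node figure from the explanation, not a benchmark.

```javascript
// Aggregate write load = device count × per-device message rate.
function writesPerSecond(deviceCount, msgPerDevicePerSec) {
  return deviceCount * msgPerDevicePerSec;
}

// Fraction of an assumed capacity consumed by the load (1.0 = fully used).
function utilization(load, capacityPerSec) {
  return load / capacityPerSec;
}

const load = writesPerSecond(100000, 1);
console.log(load);                     // 100000
console.log(utilization(load, 50000)); // 2 (twice the assumed capacity)
```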

❓ Question 2: A medical device company needs to store patient vital signs with strict audit requirements. Which database is MOST appropriate?

💡 Explanation: Healthcare data requires HIPAA compliance with strict audit trails, data integrity, and often legal discovery requirements. PostgreSQL’s ACID guarantees, mature audit logging, and regulatory acceptance make it the standard for healthcare. Speed is secondary to compliance.

❓ Question 3: Your startup has 1,000 sensors and a 2-person engineering team. You expect 10x growth in 18 months. What’s the best initial database strategy?

💡 Explanation: For small teams, operational simplicity trumps theoretical scalability. 1,000 sensors at 1 msg/sec = 1,000 writes/sec, which a single node handles easily; even after 10x growth (10,000 writes/sec) the load stays within single-node range. TimescaleDB provides a growth path (add replicas, then multi-node) without immediate distributed-system complexity. Over-engineering early wastes resources.
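
To see how the load in Question 3 evolves, the 10x growth can be spread over the 18 months. The sketch assumes smooth compound (geometric) growth, which the question does not specify; the function names are hypothetical.

```javascript
// Compound growth sketch: a total growth factor spread evenly over a period.
// Assumes geometric growth; real adoption curves may be lumpier.
function monthlyGrowthRate(totalFactor, months) {
  return Math.pow(totalFactor, 1 / months) - 1;
}

function projectedLoad(initialLoad, totalFactor, months, atMonth) {
  return initialLoad * Math.pow(totalFactor, atMonth / months);
}

const rate = monthlyGrowthRate(10, 18);
console.log(`${(rate * 100).toFixed(1)}% per month`);      // 13.6% per month
console.log(Math.round(projectedLoad(1000, 10, 18, 9)));   // 3162 writes/sec at month 9
console.log(Math.round(projectedLoad(1000, 10, 18, 18)));  // 10000 writes/sec at month 18
```

Under this assumption the load stays well below the single-node throughput range quoted earlier for the whole period, which is why deferring a distributed deployment is reasonable here.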