4 API Contracts and Service Discovery

4.1 Start With the Client That Cannot Change Quickly

An IoT API is tested by the client that stays in the field after the backend changes. Firmware, gateways, dashboards, and partner integrations may keep calling the same contract long after the service team has reorganized internals.

The simplest reliable starting point is to design command, retry, error, and versioning behavior before implementation. A stable contract lets the platform move behind the scenes without teaching every deployed client a new backend shape.

In 60 Seconds

An IoT API is a contract between devices, applications, and services. Design it around stable resources, predictable HTTP methods, explicit schemas, clear error bodies, rate limits, and a version lifecycle that lets deployed devices keep working while the platform evolves. Service discovery then answers a different question: after the contract exists, how does a client or gateway find the healthy service instance that can honor it?

Minimum Viable Understanding

The API surface is a product boundary. A device firmware team, mobile team, analytics service, or external partner should not need to know the backend database layout.
Versioning is for breaking change, not every change. Additive fields and optional parameters usually fit inside the same version. Removing or renaming fields, changing types, or changing semantics usually requires a new version.
Discovery belongs behind a stable contract. Devices should call a stable host or gateway. Internal services can use DNS, Kubernetes Services, Consul, or a gateway to locate healthy instances.
Rate limiting and idempotency are API design features. They protect the platform when firmware retries, cellular networks flap, or a command is submitted twice.

Chapter Roadmap

First define the API as the long-lived wire contract.
Then model resources, commands, errors, and schemas.
Next separate compatible changes from breaking changes.
After that put discovery behind stable gateways, DNS, Kubernetes Services, or registries.
Finally add rate limits, retries, idempotency, and a review checklist.

Checkpoint callouts recap design tests; collapsed quizzes and “Try It” sections are optional.

4.2 The API Is the Long-Lived Wire Contract

An IoT API should hide internal service shape while making client behavior predictable. A thermostat, gateway, installer app, operations dashboard, partner integration, and support console may all use the same platform capability at different speeds and with different update cycles.

That is why the contract needs stable resource names, explicit status codes, clear schemas, version rules, and retry behavior. The backend can move from a monolith to services, from one region to many, or from one database table to another without forcing every deployed client to change. The visible contract is the part that field devices, SDKs, dashboards, and partner systems can safely depend on after they leave the development bench.

Consider a building gateway that manages CO2 sensors, thermostats, occupancy counters, and door controllers. The API should expose durable resources such as /v1/devices/{id}, /v1/zones/{id}/telemetry, /v1/devices/{id}/commands, and /v1/alerts/{id}. It should not expose table names, Pod addresses, queue names, or internal class names. If the platform later replaces PostgreSQL tables with TimescaleDB hypertables, moves command dispatch behind Kafka, or adds an Envoy gateway in front of services, the client should still see the same resource shape and error contract.

Resource paths name durable concepts such as devices, fleets, commands, telemetry, rules, alerts, and credentials.
Error bodies tell firmware, SDKs, dashboards, and support tools whether to retry, stop, ask the user, or escalate.
Discovery choices keep constrained clients on a stable DNS name or API gateway while internal services can move behind Kubernetes Services, EndpointSlices, or a registry.

A good API review therefore asks what each deployed client must know. Firmware usually needs a stable host, authentication method, resource path, payload schema, timeout, retry rule, and clock expectation. Operations tools need trace ids, audit events, rate-limit reasons, and command state. Analytics systems need pagination, time windows, units, quality flags, and schema evolution rules. Those responsibilities belong in the contract before scale turns an informal endpoint into a production dependency.

4.3 Design the Command API Before the Button

For a connected lock, valve, or HVAC controller, start by naming the logical command and its lifecycle before building the UI. A button press should create a command resource, return a command id, and let clients observe pending, accepted, applied, rejected, expired, or superseded states.

Request: POST /v1/devices/{deviceId}/commands with an Idempotency-Key, command type, target state, actor id, and optional reason.
Response: 202 Accepted with command_id, current state, server timestamp, trace id, and a link to GET /v1/devices/{deviceId}/commands/{commandId}.
Retry: a duplicate idempotency key returns the original command resource instead of dispatching a second unlock, relay change, or setpoint update.
Failure: 409 Conflict, 423 Locked, 429 Too Many Requests, or 503 Service Unavailable should carry application/problem+json fields and a Retry-After header when retry is appropriate.
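The command lifecycle named above can be written down as an explicit transition table. The following is a minimal sketch, assuming one plausible set of edges: the chapter names the states (pending, accepted, applied, rejected, expired, superseded) but not which transitions are legal, so the `ALLOWED` table and the helper names `is_final` and `advance` are illustrative assumptions.

```python
from enum import Enum


class CommandState(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    APPLIED = "applied"
    REJECTED = "rejected"
    EXPIRED = "expired"
    SUPERSEDED = "superseded"


# Hypothetical transition table: the states come from the text,
# the allowed edges are an assumption made for this sketch.
ALLOWED = {
    CommandState.PENDING: {
        CommandState.ACCEPTED, CommandState.REJECTED,
        CommandState.EXPIRED, CommandState.SUPERSEDED,
    },
    CommandState.ACCEPTED: {
        CommandState.APPLIED, CommandState.REJECTED,
        CommandState.EXPIRED, CommandState.SUPERSEDED,
    },
    CommandState.APPLIED: set(),
    CommandState.REJECTED: set(),
    CommandState.EXPIRED: set(),
    CommandState.SUPERSEDED: set(),
}


def is_final(state: CommandState) -> bool:
    """A state is final when no further transition is allowed."""
    return not ALLOWED[state]


def advance(current: CommandState, new: CommandState) -> CommandState:
    """Return the new state, or raise if the transition is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```

Keeping the table in one place gives clients a simple rule for when to stop polling: a command in a final state will not change again.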

Work the contract against a real field scenario. A maintenance technician presses “open valve” from a mobile app while the gateway is on weak LTE. The first request reaches the API gateway, but the response times out on the phone. The retry must use the same idempotency key, because the logical action is still one valve-open request. The server can return the existing command id and state instead of creating a second command that might run after the first already succeeded.

The same review should cover the negative paths. If the valve is locked out by a safety interlock, the API should return 423 Locked or 409 Conflict with a problem-details body naming the interlock state and trace id. If a firmware bug retries every second, a per-device or per-tenant limit should return 429 with Retry-After. If the command service is unavailable, the 503 response should signal whether retry is allowed, for example by including or omitting Retry-After. These choices turn network failure into observable behavior instead of duplicate side effects.
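On the client side, those negative paths reduce to a small decision function. The sketch below is one possible policy, not a prescribed one: the action names (`done`, `retry`, `stop`, `escalate`), the `DEFAULT_BACKOFF_S` fallback, and the rule that a 503 without Retry-After is escalated rather than retried are all assumptions. It also handles only the delta-seconds form of Retry-After, not the HTTP-date form.

```python
from typing import Optional

# Assumed fallback delay when a 429 arrives without Retry-After.
DEFAULT_BACKOFF_S = 60


def retry_decision(status: int, retry_after: Optional[str] = None) -> dict:
    """Map an HTTP status and Retry-After value to a client action."""
    delay = None
    if retry_after and retry_after.strip().isdigit():
        delay = int(retry_after)

    if 200 <= status < 300:
        return {"action": "done"}
    if status == 429:
        # Overload: always back off, using the server hint when present.
        wait = delay if delay is not None else DEFAULT_BACKOFF_S
        return {"action": "retry", "delay_s": wait}
    if status == 503:
        # Retry only when the server says when; otherwise hand off.
        if delay is not None:
            return {"action": "retry", "delay_s": delay}
        return {"action": "escalate"}
    if status in (409, 423):
        # Precondition or interlock problem: retrying will not help.
        return {"action": "stop"}
    return {"action": "escalate"}
```

A retry produced by this function would reuse the original idempotency key, because the logical action has not changed.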

Before implementation, write sample requests and responses for first submission, duplicate retry, rejected precondition, timeout recovery, rate limit, and final command status. Use OpenAPI examples and consumer tests so mobile, firmware, dashboard, and support teams all exercise the same behavior.

4.4 Compatibility Is Operational Machinery

Compatibility is enforced by tooling and runtime behavior, not by hope. An OpenAPI document, JSON Schema examples, consumer contract tests, gateway validation, and synthetic clients can catch breaking changes before old firmware or partner SDKs see them.

Schema evolution: additive optional fields are usually safe when clients ignore unknown values; renamed fields, new units, reordered coordinates, and changed enum meanings need a new version.
Concurrency: ETag and If-Match protect updates when a dashboard, automation rule, and support agent can edit the same device record.
Gateway controls: Envoy, Kong, NGINX Ingress, AWS API Gateway, or Azure API Management can enforce authentication, quotas, routing, and request size before traffic reaches services.
Service location: Kubernetes Services, CoreDNS, EndpointSlices, Consul, or Eureka belong behind stable entry points so firmware does not depend on Pod IPs or cluster topology.
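The concurrency item above can be illustrated with a small optimistic-locking sketch. It assumes an in-memory `store` dictionary and a content-hash ETag; `etag_for` and `update_device` are hypothetical helper names, and a real service might derive the ETag from a row version instead. The 412 Precondition Failed status is the standard HTTP response when an If-Match check fails.

```python
import hashlib
import json


def etag_for(record: dict) -> str:
    """Derive a quoted ETag from the record's canonical JSON form."""
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def update_device(store: dict, device_id: str, if_match: str, changes: dict):
    """Apply changes only if the caller saw the current version.

    Returns (status, record): 200 with the updated record, or
    412 with the current record when the ETag is stale.
    """
    current = store[device_id]
    if if_match != etag_for(current):
        return 412, current
    updated = {**current, **changes}
    store[device_id] = updated
    return 200, updated
```

If a dashboard and a support agent both read the same record and then write, the second writer receives 412 and must re-read before trying again, so neither edit is silently lost.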

The machinery has several layers. A CI gate can compare OpenAPI diffs and fail a pull request that removes a required field or changes a response type. Gateway policy can reject requests that exceed body size, omit authentication, use an unsupported media type, or violate a schema. Runtime metrics can separate 4xx client errors, 429 overload, 5xx service failure, idempotency-key reuse, old-version traffic, and command-state transitions. These signals tell operators whether clients are misusing the contract or the platform is failing to honor it.
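The CI gate described above can be approximated in a few lines. This is a simplified sketch that compares two JSON-Schema-like dictionaries rather than full OpenAPI documents; the function name `breaking_changes` and the three checks it performs (removed field, changed type, newly required field) are assumptions chosen for illustration, and a production gate would cover many more cases.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """List breaking differences between two simplified object schemas."""
    problems = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    for name, spec in old_props.items():
        if name not in new_props:
            problems.append(f"removed field: {name}")
        elif spec.get("type") != new_props[name].get("type"):
            problems.append(f"type changed: {name}")

    # A field that old clients never sent must not become mandatory.
    added_required = set(new.get("required", [])) - set(old.get("required", []))
    for name in sorted(added_required):
        problems.append(f"newly required: {name}")

    return problems
```

A pipeline step could fail the pull request whenever the returned list is non-empty, while purely additive optional fields pass through unflagged.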

flowchart TD
  A[Client chooses one logical command] --> B[POST command with Idempotency-Key]
  B --> C{Key already stored?}
  C -->|No| D[Create command resource]
  D --> E[Dispatch to device or gateway]
  E --> F[Persist command state and trace id]
  C -->|Yes| G[Return stored command result]
  F --> H[Client polls or subscribes to command state]
  G --> H
  H --> I{Final state?}
  I -->|Applied| J[Record success for operator and audit views]
  I -->|Rejected or expired| K[Return problem detail and retry boundary]

Idempotent command flow for retries, dispatch, and final command state.

Compatibility also depends on storage and messaging choices. An idempotency table or Redis entry needs a retention window that matches client retry behavior. A command queue such as Kafka, NATS JetStream, RabbitMQ, or Amazon SQS needs a deduplication or command-id rule so a retry does not become a second actuator request downstream. A telemetry query endpoint needs explicit timestamp, unit, quality-flag, and pagination semantics so data warehouses and dashboards do not infer meaning from table layout.

The design is mature when a client can survive retries, stale reads, deprecations, regional failover, and service movement without learning how the backend is deployed. At that point, the API contract is not just documentation; it is a set of tests, gateway policies, storage rules, and operational dashboards that keep deployed IoT clients useful while the service platform changes behind them.

Checkpoint: Contract Boundary

You now know:

The public API should expose durable resources such as /v1/devices/{id}, /v1/zones/{id}/telemetry, /v1/devices/{id}/commands, and /v1/alerts/{id} rather than tables, Pod addresses, or queue names.
A command request should return 202 Accepted, command id, state, timestamp, trace id, and a status link.
Compatibility comes from OpenAPI diffs, JSON Schema examples, consumer tests, gateway validation, and runtime metrics.

4.5 Learning Objectives

By the end of this chapter, you will be able to:

Model IoT device-management and telemetry APIs as stable resources instead of ad hoc remote procedure calls.
Select a versioning approach and classify API changes as compatible or breaking.
Use deprecation and sunset signals to communicate API lifecycle changes.
Compare client-side discovery, server-side discovery, Kubernetes Services, and API gateways.
Design rate-limit, retry, and idempotency behavior for constrained IoT clients.
Review an API contract for maintainability before teams depend on it.

Most Valuable Understanding

API design is the part of microservices that other teams actually experience. A clean service boundary still fails if the API is chatty, ambiguous, impossible to version, or unable to survive retries. Treat every endpoint, schema field, status code, and deprecation rule as a long-lived contract.

4.6 Prerequisites

SOA and Microservices Fundamentals: Understand modular monoliths, SOA, microservices, and capability decomposition.
MQTT Fundamentals: Understand why device telemetry often uses publish-subscribe while management APIs use request-response.
Cloud Computing for IoT: Know where gateways, services, and managed infrastructure run.

4.7 API Contract Map

An API contract is more than a URL list. It defines the resource model, allowed operations, schemas, errors, lifecycle policy, and discovery path.

Figure 4.1: REST API architecture pattern for constrained IoT clients. IoT devices and gateways call an API gateway that handles authentication, rate limiting, routing, load balancing, and TLS termination before requests reach backend authorization, REST services, storage, and analytics systems.

Use Figure 4.1 to keep the contract boundary explicit. Device clients see stable endpoints and response behavior; the gateway and backend services can change routing, authorization, storage, and analytics responsibilities without exposing implementation details to deployed firmware.

4.7.1 Resource Model

Names the stable things clients can address: devices, fleets, commands, telemetry streams, rules, and alerts.

4.7.2 Operation Semantics

Defines what GET, POST, PUT, PATCH, and DELETE mean for each resource.

4.7.3 Schema Contract

Defines fields, types, required values, defaults, enum behavior, and compatibility rules.

4.7.4 Error Contract

Defines status codes, retry hints, problem details, trace IDs, and user-actionable messages.

4.7.5 Lifecycle Policy

Defines versions, deprecation signals, sunset dates, migration guides, and monitoring.

4.7.6 Discovery Path

Defines how clients locate the API: public DNS, gateway, Kubernetes Service, service registry, or mesh.

4.8 Resource-Oriented API Design

Resource-oriented design starts from nouns and relationships. Methods then act on those resources. This keeps the API stable even when the backend implementation changes.

# Prefer resources
GET    /v1/devices
POST   /v1/devices
GET    /v1/devices/{deviceId}
PATCH  /v1/devices/{deviceId}
GET    /v1/devices/{deviceId}/telemetry
POST   /v1/devices/{deviceId}/commands
GET    /v1/fleets/{fleetId}/devices

# Avoid RPC-style endpoint sprawl
POST   /getDevices
POST   /createDevice
POST   /updateDeviceMetadata
POST   /sendDeviceCommand
POST   /findDevicesInFleet

4.8.1 Good Resource Signals

Nouns name business concepts, not tables.
Collection and item paths are consistent.
A response schema is stable across related operations.
The client can retrieve a resource after changing it.
Backend migrations do not leak into the public contract.

4.8.2 Design Smells

Verbs dominate the URL.
Every new feature adds a one-off endpoint.
Clients must know database keys or table structure.
A dashboard needs many sequential calls to draw one page.
The same field means different things in different endpoints.

4.9 IoT API Surface

IoT platforms usually need more than one API style. Device telemetry, command, management, analytics, and operator workflows have different latency, payload, and reliability needs.

4.9.1 Device Management API

Creates devices, rotates credentials, updates ownership, reads lifecycle state, and exposes inventory metadata.

4.9.2 Telemetry Query API

Retrieves historical readings, aggregates, windows, and quality flags. Use pagination, time filters, and field selection.
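A telemetry query with those three features might look like the sketch below. It runs over an in-memory list and uses a simple offset cursor; the function name `query_telemetry`, the half-open time window, the integer `ts` field, and the response keys `items` and `next_cursor` are all assumptions. A real service would more likely use an opaque cursor backed by the storage engine.

```python
def query_telemetry(readings, start, end, fields=None, limit=100, cursor=0):
    """Return one page of readings with start <= ts < end.

    readings: list of dicts sorted by 'ts' (assumed epoch seconds).
    fields:   optional list of keys to keep; 'ts' is always included.
    cursor:   offset into the filtered window (0 for the first page).
    """
    window = [r for r in readings if start <= r["ts"] < end]
    page = window[cursor:cursor + limit]

    if fields is not None:
        keep = ("ts", *fields)
        page = [{k: r[k] for k in keep if k in r} for r in page]

    more = cursor + limit < len(window)
    return {"items": page, "next_cursor": cursor + limit if more else None}
```

Clients keep requesting pages until `next_cursor` is `None`, which keeps each response bounded no matter how much history a device has accumulated.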

4.9.3 Command API

Creates command requests, returns command state, and supports idempotency keys so retries do not perform the command twice.

4.9.4 Rules and Alerts API

Manages thresholds, alert routes, suppression windows, and notification preferences.

4.9.5 Partner API

Exposes a stable, documented subset of platform capabilities with stronger compatibility guarantees.

4.9.6 Internal Service API

Can be more specialized, but still needs contracts, versioning, observability, and compatibility discipline.

Checkpoint: Resource Surface

You now know:

Resource-oriented APIs start with nouns such as devices, fleets, commands, telemetry, rules, and alerts.
Good resource signals include consistent paths, stable schemas, and no leaked database structure.
Most IoT platforms need several surfaces: device management, telemetry query, command, rules and alerts, partner, and internal service APIs.

Once the resource surface is clear, ask which changes fit the same contract and which need a parallel version.

4.10 Versioning Strategies

Figure 4.2: API versioning lifecycle choices for IoT service contracts. Three versioning choices are shown: URL path versioning as the usual IoT default, header or media versioning for capable SDKs and internal clients, and query versioning for deliberate compatibility bridges. A compatibility gate separates additive changes from breaking changes that need a parallel version, migration, deprecation, and sunset.

4.10.1 URL Path Version

Example: /v1/devices

Best when clients are constrained, documentation must be obvious, and gateway routing should be simple.

4.10.2 Header Version

Example: Accept: application/vnd.example.v1+json

Best when clients are capable, URLs should stay clean, and content negotiation is already part of the platform.

4.10.3 Query Version

Example: /devices?version=1

Useful for compatibility in some systems, but easy to misuse and less clear as a long-term public contract.

Practical IoT Default

For device-facing HTTP APIs, URL path versioning is usually the clearest operational choice. Many embedded clients, diagnostics tools, gateways, and support workflows are easier to reason about when the version is visible in the path. Internal service APIs may choose header or package-based versioning if the organization can enforce client behavior.
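To make the comparison concrete, here is a small sketch of how a gateway or router might resolve the requested version from either the path or the Accept header. The function name `resolve_version` and the rule that the path wins over the header are assumptions; the media type pattern follows the `application/vnd.example.v1+json` example used earlier in this section.

```python
import re
from typing import Optional

# /v1/devices -> 1
PATH_RE = re.compile(r"^/v(\d+)/")
# application/vnd.example.v1+json -> 1
MEDIA_RE = re.compile(r"application/vnd\.example\.v(\d+)\+json")


def resolve_version(path: str, accept: Optional[str] = None) -> Optional[int]:
    """Return the API version requested by path or Accept header.

    The path takes precedence (an assumption of this sketch).
    Returns None when neither source names a version.
    """
    match = PATH_RE.match(path)
    if match:
        return int(match.group(1))
    if accept:
        match = MEDIA_RE.search(accept)
        if match:
            return int(match.group(1))
    return None
```

The path form is visible in every log line and support ticket, which is part of why it suits device-facing APIs; the header form needs the Accept value to be captured as well.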

4.11 Compatibility Rules

Use compatibility rules before deciding that a new version is required.

4.11.1 Usually Compatible

Add an optional response field.
Add a new endpoint.
Add an optional query parameter with the same default behavior.
Add an enum value only when clients are documented to tolerate unknown values.
Add a link, metadata object, or trace identifier.
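These additive changes are only safe when clients read responses tolerantly. The sketch below shows one way a client might do that: it ignores fields it does not know and maps unrecognized enum values to an explicit fallback. The `Mode` enum, its `COMFORT` member, the `UNKNOWN` fallback, and the `parse_device` helper are hypothetical; only the `eco` mode value appears elsewhere in this chapter.

```python
from enum import Enum


class Mode(Enum):
    ECO = "eco"
    COMFORT = "comfort"   # hypothetical second value
    UNKNOWN = "unknown"   # fallback for values this client predates


def parse_device(payload: dict) -> dict:
    """Read only the fields this client understands.

    Extra fields in the payload are ignored, and an unrecognized
    or missing mode becomes Mode.UNKNOWN instead of an error.
    """
    try:
        mode = Mode(payload.get("mode"))
    except ValueError:
        mode = Mode.UNKNOWN
    return {"id": payload["id"], "mode": mode}
```

A client written this way keeps working when the server later adds a field or a new mode value, which is the behavior the enum rule above depends on.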

4.11.2 Usually Breaking

Remove or rename a field.
Change a field type or unit.
Change default behavior.
Make an optional request field required.
Change resource identity or URL structure.
Change visible semantics while keeping the same shape.

Compatibility Test

Ask this question before changing an API: “Would an old client continue to parse, interpret, and act on the new response correctly without a firmware or app update?” If the answer is not clearly yes, treat the change as breaking.

4.12 Compatibility Change Review

Before merging an API change, record the exact contract change, the oldest supported client that must still work, and the compatibility proof. Include schema examples, status-code behavior, retry/idempotency effects, and the metric that will show whether old firmware, apps, SDKs, or partner integrations are still calling the old shape.

4.13 Lifecycle Headers

API versions need an exit path. A deprecation policy tells clients that a resource should no longer be chosen for new work. A sunset policy tells clients when the resource is expected to become unavailable.

HTTP/1.1 200 OK
Deprecation: @1767225599
Sunset: Thu, 31 Dec 2026 23:59:59 GMT
Link: <https://example.com/>; rel="deprecation"; type="text/html"

1. Announce: Publish a migration guide before the old endpoint becomes the wrong default.

2. Mark: Return deprecation metadata on the old version while it still functions.

3. Measure: Track which devices, apps, tenants, and integrations still call the old version.

4. Migrate: Move firmware, apps, SDKs, and partner integrations to the replacement.

5. Remove: Return a clear permanent failure only after the policy and exception path are complete.
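A client or SDK can read the lifecycle headers shown above and surface them to operators. The following is a minimal sketch using only the Python standard library; the function name `parse_lifecycle` and the result keys are assumptions, and it handles only the `@<unix-seconds>` form of Deprecation and the HTTP-date form of Sunset used in the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_lifecycle(headers: dict) -> dict:
    """Extract deprecation and sunset times from response headers."""
    result = {"deprecated_at": None, "sunset_at": None}

    deprecation = headers.get("Deprecation")
    if deprecation and deprecation.startswith("@"):
        # "@1767225599" is a Unix timestamp in seconds.
        seconds = int(deprecation[1:])
        result["deprecated_at"] = datetime.fromtimestamp(seconds, tz=timezone.utc)

    sunset = headers.get("Sunset")
    if sunset:
        # "Thu, 31 Dec 2026 23:59:59 GMT" is an HTTP-date.
        result["sunset_at"] = parsedate_to_datetime(sunset)

    return result
```

An SDK could log a warning once `deprecated_at` is in the past and raise the severity as `sunset_at` approaches, which supports the measurement and migration steps without a firmware change.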

Versioning handles the shape of the contract. Discovery handles where requests go after a client has chosen the contract.

4.14 Service Discovery

In dynamic infrastructure, service instances move, scale, and fail. Service discovery keeps clients from hardcoding addresses.

Figure 4.3: Service discovery flow: registration, discovery, and invocation

4.14.1 Client-Side Discovery

The caller queries a registry, receives healthy instances, and chooses where to send the request.

Good fit for capable internal services that can cache, retry, and observe registry failures.
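The client-side pattern can be sketched with an in-memory registry. This is an illustration only: the `Registry` class, its methods, and `choose_instance` are hypothetical and do not model the API of Consul, Eureka, or any other real registry. Random selection stands in for whatever load-balancing policy the caller prefers.

```python
import random


class Registry:
    """Toy registry: service name -> {address: healthy flag}."""

    def __init__(self):
        self._instances = {}

    def register(self, service, address, healthy=True):
        self._instances.setdefault(service, {})[address] = healthy

    def set_health(self, service, address, healthy):
        self._instances[service][address] = healthy

    def healthy_instances(self, service):
        entries = self._instances.get(service, {})
        return sorted(addr for addr, ok in entries.items() if ok)


def choose_instance(registry, service, rng=random):
    """Pick one healthy instance, or raise if none is available."""
    candidates = registry.healthy_instances(service)
    if not candidates:
        raise LookupError(f"no healthy instance for {service}")
    return rng.choice(candidates)
```

The caller owns the selection and the failure handling here, which is why this pattern suits capable internal services rather than firmware.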

4.14.2 Server-Side Discovery

The caller sends traffic to a stable load balancer, gateway, or service name. Infrastructure chooses the backend instance.

Good fit for devices and simple clients that should not contain discovery logic.

Figure 4.4: Server-side discovery: Load balancer abstracts service location from clients. A client calls a stable load balancer that routes to healthy service instances.

4.14.3 Kubernetes Service

Provides a stable virtual endpoint for a set of Pods. Clients use the Service name while Kubernetes updates the backing endpoints.

4.14.4 API Gateway

Centralizes authentication, rate limits, routing, TLS termination, schema enforcement, and public endpoint stability.

4.14.5 Service Registry

Stores service instances and health status. Examples include Consul, Eureka, or registry systems built into orchestration platforms.

4.14.6 DNS and Global Routing

Useful for public entry points, regional routing, and failover, but it should be paired with health checks and operational runbooks.

Checkpoint: Versions and Discovery

You now know:

URL path versioning such as /v1/devices is usually the clearest device-facing default.
Additive optional fields often stay compatible; removed fields, unit changes, default changes, or semantic changes usually need a new version.
Constrained devices should call a stable API gateway or DNS name while Kubernetes Services, EndpointSlices, Consul, Eureka, or internal routing move backends.

4.15 Rate Limits and Retries

IoT APIs must expect retry storms. Devices reconnect after outages, cellular networks drop packets, and firmware bugs can repeat requests too aggressively.

4.15.1 Per-Device Limit

Limits a single credential or device identity so one faulty device cannot consume the whole API.

4.15.2 Per-Tenant Limit

Protects shared infrastructure when one customer fleet behaves badly.

4.15.3 Global Limit

Protects the platform from overload and preserves headroom for control-plane recovery.
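The three limits can be layered so that a request is admitted only when every layer has capacity. The sketch below uses token buckets, which is one common technique and an assumption here since the chapter does not name an algorithm. The `TokenBucket` class and `allow` function are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Token bucket refilled at rate_per_s up to capacity."""

    def __init__(self, rate_per_s, capacity, now=0.0):
        self.rate = rate_per_s
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = now

    def has_token(self, now):
        # Refill lazily based on elapsed time, then check availability.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        return self.tokens >= 1.0

    def take(self):
        self.tokens -= 1.0


def allow(buckets, now):
    """Admit a request only if every layer has a token.

    Tokens are charged only on admission, so a device that is
    over its own limit does not drain the tenant or global bucket.
    """
    if all(bucket.has_token(now) for bucket in buckets):
        for bucket in buckets:
            bucket.take()
        return True
    return False
```

A caller would pass the device, tenant, and global buckets for the request in that order and translate a `False` result into the overload response.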

When a client exceeds a limit, return a specific response and a retry hint:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
Content-Type: application/problem+json

{
  "type": "https://example.com/",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "Device exceeded its telemetry write limit.",
  "retry_after_seconds": 30,
  "trace_id": "req-7f2a"
}

Do Not Hide Overload

Returning 200 OK with an application-specific “try later” field teaches clients to ignore HTTP semantics. Use the status code, retry hint, and a structured error body so generic gateways, SDKs, and logs can understand the failure.

4.16 Idempotent Commands

Retries are normal in IoT. A command API should prevent duplicate execution when a device, app, or gateway retries after a timeout.

POST /v1/devices/d-17/commands
Idempotency-Key: 01J64P6RBXW3
Content-Type: application/json

{
  "type": "set_mode",
  "mode": "eco"
}

4.16.1 Server Responsibility

Store the idempotency key with the command result. If the same key is received again, return the original result instead of creating a second command.

4.16.2 Client Responsibility

Generate a stable key for one logical action. Reuse that key only for retries of the same action.
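The two responsibilities above fit together as in the following sketch. It assumes an in-memory store with a retention window and a list standing in for the dispatch queue; `IdempotencyStore`, `submit_command`, the generated `cmd-N` identifiers, and the result fields are hypothetical. A real service would also scope the key by device or tenant and persist it durably.

```python
class IdempotencyStore:
    """Remember results by key for a limited retention window."""

    def __init__(self, retention_s):
        self.retention_s = retention_s
        self._entries = {}  # key -> (stored_at, result)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if now - stored_at > self.retention_s:
            # Past the retry boundary: treat the key as new.
            del self._entries[key]
            return None
        return result

    def put(self, key, result, now):
        self._entries[key] = (now, result)


def submit_command(store, dispatched, key, command, now):
    """Dispatch a command once per idempotency key.

    A retry with a stored key returns the original result
    and does not append to the dispatch list again.
    """
    existing = store.get(key, now)
    if existing is not None:
        return existing
    result = {
        "status": 202,
        "command_id": f"cmd-{len(dispatched) + 1}",
        "state": "pending",
    }
    dispatched.append(command)
    store.put(key, result, now)
    return result
```

The retention window is the retry boundary: it should be at least as long as the longest client retry schedule, otherwise a late retry is treated as a new command.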

Try It: Retry Contract Card

Choose one command endpoint and write the retry contract before implementation:

Logical action: the exact command that one idempotency key represents.
First request result: status code, command identifier, and response body returned when the command is accepted.
Duplicate retry result: response returned when the same key arrives again after a timeout.
Retry boundary: how long the key is retained and when a client must create a new key.
Overload behavior: the 429, Retry-After, and problem-detail fields clients should obey.
Suppression signal: the metric or log entry that proves duplicate dispatch was suppressed.

Checkpoint: Retry Behavior

You now know:

Rate limits should exist at per-device, per-tenant, and global levels.
Overload should use 429 Too Many Requests, Retry-After, and application/problem+json instead of hiding failure inside 200 OK.
Idempotent command APIs store one Idempotency-Key per logical action so a retry returns the original result rather than creating a duplicate command.

The worked example reviews these pieces as one API surface.

4.17 Worked Example: Smart Building API Review

Scenario: A smart building platform manages sensors, HVAC controllers, occupancy zones, and alerts. The team wants one API surface for dashboards, automation rules, and firmware management.

1. Name Resources: Use /v1/buildings, /v1/buildings/{buildingId}/zones, /v1/devices, /v1/devices/{deviceId}/telemetry, and /v1/devices/{deviceId}/commands.

2. Separate Query From Write: Use telemetry query endpoints for history and a command endpoint for desired-state changes. Do not overload telemetry writes to also trigger control actions.

3. Add Compatibility Rules: Allow optional fields for new sensor capabilities. Create a new version before changing units, field names, or command semantics.

4. Protect the Platform: Rate limit by device identity and tenant. Return 429 with Retry-After for retryable overload.

5. Hide Internal Discovery: Dashboards and devices call the API gateway. Internal services use Kubernetes Services or a registry behind the gateway.

The review result is not a bigger API. It is a smaller, clearer contract: stable resource paths, explicit versioning, predictable error behavior, and discovery choices hidden from clients that do not need them.

4.18 Design Review Checklist

4.18.1 Resource Shape

Are paths nouns rather than commands?
Are collection and item paths consistent?
Does the API hide database structure?
Are aggregate endpoints available where dashboards would otherwise be chatty?

4.18.2 Compatibility

Are additive changes clearly separated from breaking changes?
Are enums documented for unknown values?
Are default values stable?
Is each version tied to a migration and deprecation policy?

4.18.3 Operational Behavior

Are rate limits applied by device, tenant, and platform?
Do retryable responses include Retry-After?
Are commands idempotent?
Do error responses include trace IDs?

4.18.4 Discovery

Do devices see a stable gateway or DNS name?
Are internal service names managed by the orchestration platform?
Are health checks part of routing?
Can a region or instance fail without changing firmware?

4.19 Common Pitfalls

4.19.1 One Version Forever

Keeping one version while changing behavior breaks deployed devices quietly. Give breaking changes a new version and publish a lifecycle policy.

4.19.2 Chatty Dashboards

A dashboard that calls every device one by one multiplies latency and load. Add list, filter, aggregate, and pagination patterns.

4.19.3 Registry Logic in Firmware

Firmware should not need to know Pod IPs, service registries, or cluster topology. Give devices stable public endpoints.

4.20 Key Concepts

REST API: A resource-oriented HTTP API using standard methods and status codes.
API contract: The stable combination of paths, methods, schemas, errors, lifecycle policy, and behavior.
Backward compatibility: The ability for old clients to keep working with a newer server.
Deprecation: A signal that a resource should no longer be chosen for new work.
Sunset: A signal that a resource is expected to become unavailable at a future time.
Service discovery: The mechanism that maps a stable service name or registry entry to healthy service instances.
API gateway: A stable entry point that can centralize routing, authentication, rate limits, and policy enforcement.
Idempotency key: A client-generated value that lets the server recognize retries of the same logical operation.

4.21 Summary

This chapter covered API design and service discovery for IoT service architectures:

Model APIs around stable resources and standard HTTP behavior.
Use new versions for breaking changes, not every additive change.
Communicate lifecycle with deprecation documentation and sunset dates.
Keep constrained devices behind stable gateways or DNS names.
Use service discovery inside the platform where clients can handle it.
Add rate limits, retry hints, and idempotency to survive real network behavior.

Key Takeaway

In one sentence: an IoT API should let services evolve without forcing every deployed device, dashboard, and partner integration to change at the same time.

4.22 Knowledge Check

Try It Yourself: Design an API Migration

Design a migration for an IoT building platform that must replace /v1/devices/{id}/telemetry with a new schema.

Context:

Existing firmware can update, but not all devices connect every day.
Dashboards and automation rules read the same telemetry API.
The new schema adds quality flags and changes the unit representation.
Operators need to know which clients still call the old version.

Tasks:

Choose a versioning strategy for the device-facing API.
Classify each schema change as compatible or breaking.
Define deprecation and sunset signals.
List the usage metrics you need before removal.
Decide whether devices should call a gateway, registry, or direct service endpoint.

Deliverable: A one-page migration plan with endpoint names, timeline, client communications, and rollback criteria.

4.24 What’s Next

4.24.1 Build Fault-Tolerant Services

SOA Resilience Patterns

Next, connect API behavior to circuit breakers, timeouts, bulkheads, retries, fallbacks, and failure isolation.

4.24.2 Deploy the Service Platform

SOA Container Orchestration

Use containers, Kubernetes, observability, and service routing to run the APIs behind the contracts.

4.24.3 Revisit Service Boundaries

SOA and Microservices Fundamentals

Use API evidence to decide whether a boundary should remain internal, become a service, or be split further.

4.24.4 Review Reference Models

IoT Reference Models and Patterns

Map the API layer to device, edge, platform, application, and operations responsibilities.
