47  CoAP Advanced Features

In 60 Seconds

CoAP’s advanced features extend it beyond simple request-response: Block-wise Transfer splits large payloads (firmware updates) into sequential blocks over UDP, Resource Discovery via .well-known/core lets clients automatically find available resources, and DTLS provides end-to-end encryption. These features make CoAP production-ready for OTA updates, auto-configuration, and secure deployments.

47.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Implement Block-wise Transfer: Apply RFC 7959 to handle large payloads (firmware updates) over CoAP’s UDP transport
  • Configure Resource Discovery: Construct .well-known/core endpoints with CoRE Link Format attributes
  • Evaluate CoAP Transport Options: Select between UDP, TCP, and WebSockets based on network constraints and device requirements
  • Analyze DTLS Security Overhead: Calculate handshake RTT costs and justify security mode selection for constrained deployments
  • Diagnose Block Transfer Failures: Distinguish between timeout cascade, packet loss, and NAT expiry failure modes

Key Concepts

  • CoAP: Constrained Application Protocol — REST-style request/response protocol using UDP instead of TCP
  • Confirmable Message (CON): Requires ACK from recipient — provides reliable delivery over UDP at the cost of one roundtrip
  • Non-confirmable Message (NON): Fire-and-forget UDP datagram — lowest latency, no delivery guarantee
  • Observe Option: CoAP extension enabling publish/subscribe: client registers to receive notifications on resource changes
  • Block-wise Transfer: Fragmentation mechanism for transferring payloads larger than a single CoAP datagram
  • Token: Client-generated value matching responses to requests — enables concurrent request/response pairing
  • DTLS: Datagram TLS — CoAP’s security layer providing encryption and authentication over UDP

47.2 For Beginners: CoAP Advanced Features

Beyond basic request-response messaging, CoAP offers features like resource observation (automatic notifications when data changes), block transfers (sending large data in chunks), and resource discovery (finding what services a device offers). These features make CoAP a powerful, lightweight communication tool for constrained IoT devices.

“I thought CoAP was just simple request-response,” said Sammy the Sensor. “But it can do way more!”

Max the Microcontroller grinned. “You discovered Observe! Instead of the dashboard asking you for temperature every 10 seconds – which wastes energy – you register once and then automatically send updates whenever the temperature changes. It’s like subscribing to a newsletter instead of checking the mailbox every day.”

“And what about when I need to send a big firmware update?” asked Lila the LED. “That’s Block Transfer!” said Max. “CoAP breaks the big file into small blocks, like cutting a pizza into slices. Each slice gets its own delivery confirmation, so if one slice gets lost, you only resend that one – not the whole pizza.”

Bella the Battery was most excited about Resource Discovery: “New devices can ask ‘hey, what services do you offer?’ and get back a list – like a phone directory. So when I join a network, I don’t need to be pre-configured. I just ask around and find out who does what. It’s plug-and-play for IoT!”

47.3 Prerequisites

Before diving into this chapter, you should be familiar with:

47.4 Block-wise Transfer (RFC 7959)

Minimum Viable Understanding: Block-wise Transfer

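Core Concept: A CoAP message must fit in a single UDP datagram, which in practice is bounded by the link MTU (1280 bytes minimum for IPv6, typically 1500 bytes on Ethernet, far less on constrained radios), but firmware updates or images require 100KB+ payloads. Block-wise transfer splits large payloads into sequential blocks that the receiver reassembles in order.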

Why It Matters: Without block transfer, CoAP couldn’t handle firmware OTA updates, large configuration files, or image uploads - essential for production IoT deployments.

Key Takeaway: Use Block2 for large response payloads (downloads), Block1 for large request payloads (uploads). Each block includes block number, more-flag, and size.

47.4.1 Block2: Large Response Payloads

Used when server sends large data to client (e.g., firmware download):

Figure 47.1: CoAP Block2 transfer sequence showing client requesting firmware blocks with Block2 option containing block number (NUM), more flag (M), and block size (SIZE), with server responding with payload chunks until final block with M=0

Block2 option format:

  • NUM: Block number (0-based sequence)
  • M: More flag (1 = more blocks follow, 0 = last block)
  • SIZE: Block size in bytes (16, 32, 64, 128, 256, 512, 1024)
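
The three fields above travel in one small unsigned integer. The sketch below follows the RFC 7959 layout (NUM in the upper bits, then the M bit, then a 3-bit size exponent SZX where SIZE = 2^(SZX + 4)). The function names are illustrative and not taken from any particular CoAP library.

```python
def encode_block_option(num: int, more: bool, size: int) -> bytes:
    """Pack NUM, M and SIZE into the variable-length Block option value."""
    szx = size.bit_length() - 5  # 16 -> 0, 32 -> 1, ..., 1024 -> 6
    if not 0 <= szx <= 6 or size != 1 << (szx + 4):
        raise ValueError("block size must be a power of two from 16 to 1024")
    if not 0 <= num < (1 << 20):
        raise ValueError("block number must fit in 20 bits")
    value = (num << 4) | (int(more) << 3) | szx
    length = (value.bit_length() + 7) // 8  # 0 to 3 bytes; zero is sent as empty
    return value.to_bytes(length, "big")


def decode_block_option(raw: bytes) -> tuple[int, bool, int]:
    """Unpack a Block option value into (NUM, M, SIZE in bytes)."""
    value = int.from_bytes(raw, "big")
    szx = value & 0x7
    if szx == 7:
        raise ValueError("SZX value 7 is reserved")
    return value >> 4, bool(value & 0x8), 1 << (szx + 4)
```

For example, the first request of a 1024-byte-block download that expects more data encodes to the single byte 0x0E, and decoding that byte returns the original three fields.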

47.4.2 Block1: Large Request Payloads

Used when client sends large data to server (e.g., uploading logs):

Client -> Server: PUT /logs/daily
                  Block1: NUM=0, M=1, SIZE=512
                  Payload: [first 512 bytes]

Server -> Client: 2.31 Continue
                  Block1: NUM=0, M=1, SIZE=512

Client -> Server: PUT /logs/daily
                  Block1: NUM=1, M=1, SIZE=512
                  Payload: [next 512 bytes]

... until final block with M=0 ...

Server -> Client: 2.04 Changed
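
The client side of the exchange above amounts to slicing the payload and setting M=1 on every block except the last. A minimal sketch, assuming a hypothetical helper name `block1_requests` and leaving the actual sending to whatever CoAP client is in use:

```python
def block1_requests(payload: bytes, block_size: int = 512):
    """Yield (num, more, chunk) for each Block1 request of an upload."""
    # Ceiling division; an empty body is still sent as a single block.
    total = max(1, -(-len(payload) // block_size))
    for num in range(total):
        chunk = payload[num * block_size:(num + 1) * block_size]
        more = num < total - 1  # M=1 on every block except the last
        yield num, more, chunk


# A 1300-byte log file becomes three PUT requests.
for num, more, chunk in block1_requests(bytes(1300)):
    print(f"Block1: NUM={num}, M={int(more)}, SIZE=512, payload={len(chunk)} bytes")
```

The final chunk may be shorter than the block size; only the M flag tells the server that the body is complete.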

47.4.3 Implementation with Reliability

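import asyncio

from aiocoap import CON, GET, Context, Message


class TransferFailed(Exception):
    """Raised when one block keeps timing out."""


async def download_firmware(uri, output_file):
    # Sketch assuming the aiocoap client library. handle_blockwise=False
    # disables its automatic block handling so each block is requested here.
    protocol = await Context.create_client_context()
    block_num = 0
    szx = 6  # Size exponent: block size = 2 ** (szx + 4) = 1024 bytes
    consecutive_failures = 0

    with open(output_file, 'wb') as f:
        while True:
            # Adaptive timeout with exponential backoff, capped at 30 s
            timeout = min(2.0 * (1.5 ** consecutive_failures), 30.0)

            try:
                request = Message(
                    code=GET,
                    uri=uri,
                    mtype=CON,  # Reliable for each block
                    block2=(block_num, False, szx),
                )
                response = await asyncio.wait_for(
                    protocol.request(request, handle_blockwise=False).response,
                    timeout=timeout,
                )

                # Write block to file
                f.write(response.payload)
                consecutive_failures = 0

                # Check if more blocks available
                if response.opt.block2 is not None and response.opt.block2.more:
                    block_num += 1
                else:
                    break  # Last block received

            except asyncio.TimeoutError:
                # Retry the same block with a longer timeout
                consecutive_failures += 1
                if consecutive_failures > 10:
                    raise TransferFailed(f"Failed at block {block_num}")

    await protocol.shutdown()
    print(f"Downloaded {block_num + 1} blocks")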

47.4.4 Performance Analysis

Scenario: 256 KB firmware update over LoRaWAN (10% packet loss)

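| Approach | Blocks | Retransmissions | Time | Success Rate |
|---|---|---|---|---|
| Single 256KB payload | 1 | Many | 146s | ~0% |
| 1KB blocks (NON) | 256 | 0 | 51s | ~0% |
| 1KB blocks + CON | 256 | Avg 0.11/block (1.11 transmissions) | 57s | ~99.99% |
| 512 byte blocks + CON | 512 | Avg 0.11/block (1.11 transmissions) | 114s | ~99.999% |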

The success rate without retransmissions follows from basic probability. With packet loss rate \(p = 0.10\), the probability that all \(n\) blocks arrive on the first attempt is:

\[P_{\text{success}} = (1 - p)^n\]

For 1KB blocks (256 blocks total) without confirmations (NON):

\[P_{1\text{KB NON}} = (1 - 0.10)^{256} = 0.9^{256} \approx 1.9 \times 10^{-12} \approx 0\%\]

For 1KB blocks with CON (each block confirmed), the expected number of transmissions per block is:

\[E[\text{transmissions}] = \frac{1}{1-p} = \frac{1}{0.9} \approx 1.11\]

Total time for 256 blocks at 200ms per confirmed block:

\[T_{\text{total}} = 256 \times 0.2 \times 1.11 \approx 56.8 \text{ seconds}\]

For 512-byte blocks (512 blocks total), the per-block delivery probability is unchanged under this fixed per-packet loss model (\(1 - p = 90\%\)), so each block needs the same expected number of attempts (\(1/0.9 \approx 1.11\)). The advantage of smaller blocks is that each lost block costs only 512 bytes of retransmitted data instead of 1024, and on radio links where loss grows with frame length, shorter frames are also lost less often. Total transfer time at 200ms RTT: \(512 \times 0.2 \times 1.11 \approx 113.7\) seconds.
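
The arithmetic above can be reproduced with a few lines of Python. This is a sketch of the same simplified model (independent loss per block exchange, unlimited retries, fixed time per attempt); the function name `transfer_stats` is illustrative.

```python
import math


def transfer_stats(firmware_bytes: int, block_bytes: int, loss: float, rtt_s: float):
    """Return block count, no-retry success probability, attempts per block, total seconds."""
    blocks = math.ceil(firmware_bytes / block_bytes)
    p_all_first_try = (1 - loss) ** blocks       # NON: every block must arrive first time
    attempts_per_block = 1 / (1 - loss)          # CON: mean of a geometric distribution
    total_seconds = blocks * rtt_s * attempts_per_block
    return blocks, p_all_first_try, attempts_per_block, total_seconds


for block_bytes in (1024, 512):
    blocks, p_non, attempts, seconds = transfer_stats(256 * 1024, block_bytes, 0.10, 0.2)
    print(f"{block_bytes} B blocks: n={blocks}, P(no loss)={p_non:.2e}, "
          f"attempts/block={attempts:.2f}, time={seconds:.1f} s")
```

Because the code uses the unrounded 1/0.9, its totals are about 0.1 s higher than the hand calculation, which rounds to 1.11 first.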

Key insight: Smaller blocks = higher success rate on lossy links, but more overhead.


Key Principle: Block size should balance MTU constraints (avoid IP fragmentation), packet loss recovery (smaller blocks waste less bandwidth on retry), and overhead (fewer blocks = less header overhead).

47.5 Resource Discovery (RFC 6690)

Minimum Viable Understanding: CoAP Resource Discovery

Core Concept: CoAP servers that support discovery expose a standardized /.well-known/core endpoint that returns a machine-readable list of all available resources with their URIs, types, interfaces, and capabilities in Link Format (RFC 6690).

Why It Matters: Resource discovery enables true plug-and-play IoT - new devices can be added to a network and automatically discovered by clients without manual configuration.

Key Takeaway: Always implement /.well-known/core with semantic attributes (rt= for resource type, if= for interface, obs for observability).

47.5.1 Discovery Request-Response

Client Request:
GET coap://sensor.local/.well-known/core

Server Response (Link Format, Content-Format: 40):
</sensors/temp>;rt="temperature";if="sensor";obs,
</sensors/humidity>;rt="humidity";if="sensor";obs,
</sensors/pressure>;rt="pressure";if="sensor",
</actuators/led>;rt="light";if="actuator",
</config/interval>;rt="config";if="parameter"
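
On the client side, a response like the one above is easy to turn into a list of dictionaries. The parser below is a deliberately simplified sketch: it assumes no commas or semicolons appear inside quoted attribute values, which holds for this example but not for every valid RFC 6690 document. The name `parse_link_format` is illustrative.

```python
def parse_link_format(text: str) -> list[dict]:
    """Parse a simple CoRE Link Format payload into one dict per resource."""
    links = []
    for entry in text.split(","):
        entry = entry.strip()
        if not entry:
            continue
        target, *params = entry.split(";")
        link = {"uri": target.strip().strip("<>")}
        for param in params:
            name, _, value = param.strip().partition("=")
            # Attributes without a value (such as obs) are stored as True.
            link[name] = value.strip('"') if value else True
        links.append(link)
    return links


response = """</sensors/temp>;rt="temperature";if="sensor";obs,
</sensors/humidity>;rt="humidity";if="sensor";obs,
</sensors/pressure>;rt="pressure";if="sensor",
</actuators/led>;rt="light";if="actuator",
</config/interval>;rt="config";if="parameter\""""

observable = [link["uri"] for link in parse_link_format(response) if link.get("obs")]
print(observable)
```

A client can use the resulting list to decide which resources to register Observe subscriptions on and which to poll.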

47.5.3 Filtered Discovery

Request only specific resource types:

# Find all temperature sensors
GET coap://sensor.local/.well-known/core?rt=temperature

Response:
</sensors/temp>;rt="temperature";if="sensor";obs,
</sensors/temp_outdoor>;rt="temperature";if="sensor";obs

# Find all observable resources
GET coap://sensor.local/.well-known/core?obs
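
On the server side, filtering comes down to matching the query against each resource's attributes before serializing. The sketch below handles only exact `name=value` matches and bare attribute names; the wildcard and multi-value matching described in RFC 6690 are left out. The resource table and function names are hypothetical.

```python
RESOURCES = [
    {"uri": "/sensors/temp", "rt": "temperature", "if": "sensor", "obs": True},
    {"uri": "/sensors/humidity", "rt": "humidity", "if": "sensor", "obs": True},
    {"uri": "/actuators/led", "rt": "light", "if": "actuator"},
]


def to_link(resource: dict) -> str:
    """Serialize one resource as a Link Format entry."""
    parts = [f"<{resource['uri']}>"]
    for name, value in resource.items():
        if name == "uri":
            continue
        parts.append(name if value is True else f'{name}="{value}"')
    return ";".join(parts)


def discover(resources: list[dict], query: str = "") -> str:
    """Build the /.well-known/core payload, optionally filtered by one query."""
    if query:
        name, _, wanted = query.partition("=")
        resources = [r for r in resources
                     if name in r and (not wanted or r[name] == wanted)]
    return ",".join(to_link(r) for r in resources)


print(discover(RESOURCES, "rt=temperature"))
print(discover(RESOURCES, "obs"))
```

Keeping the resource table as plain data means the same list can drive both the discovery payload and the request router.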

47.5.4 Multicast Discovery

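Discover all CoAP devices on the local link with a single multicast request to the All CoAP Nodes address (sent as a non-confirmable message, since multicast requests cannot be CON):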

GET coap://[FF02::FD]/.well-known/core

# All CoAP devices respond with their resource list
Device 1: </temp>;rt="temperature"
Device 2: </humidity>;rt="humidity"
Device 3: </pressure>;rt="pressure"

47.6 CoAP over TCP (RFC 8323)

47.6.1 Why CoAP-over-TCP?

Problem: UDP works great for constrained devices, but some networks:

  • Block UDP entirely (corporate firewalls)
  • Have asymmetric NAT breaking UDP return paths
  • Require guaranteed ordering (financial transactions)

Solution: RFC 8323 defines CoAP over reliable transports (TCP, TLS, WebSockets).

47.6.2 Protocol Differences

| Feature | CoAP/UDP | CoAP/TCP |
|---|---|---|
| Transport | Unreliable UDP | Reliable TCP |
| Message Types | CON, NON, ACK, RST | None (signaling messages such as CSM, Ping, and Pong instead) |
| Reliability | Application (CON/ACK) | Transport (TCP) |
| Message ID | Required | Not present |
| Connection | None | TCP 3-way handshake |
| Default Port | 5683 (coap://) | 5683 (coap+tcp://) |
| Secure Port | 5684 (coaps://) | 5684 (coaps+tcp://) |

47.6.3 Header Format Changes

CoAP/UDP header (4 bytes):

|Ver| T |  TKL  |      Code     |          Message ID           |

CoAP/TCP header (2-6 bytes before the token):

| Len | TKL | Extended Length (0-4 bytes) |      Code     |   Token ...   |

Key changes:

  • No Message ID (TCP sequence numbers handle ordering)
  • No Type field (TCP reliability eliminates CON/NON/ACK)
  • Length field (for framing over stream)
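
To make the difference concrete, the sketch below packs both headers with Python's `struct` module. It follows the field layouts in RFC 7252 (UDP) and RFC 8323 (TCP, where lengths of 13 or more spill into an Extended Length field); the function names are illustrative, and options and payload are not handled.

```python
import struct


def pack_udp_header(msg_type: int, code: int, message_id: int, token: bytes = b"") -> bytes:
    """Build the fixed 4-byte CoAP/UDP header followed by the token."""
    if len(token) > 8:
        raise ValueError("token is at most 8 bytes")
    first = (1 << 6) | (msg_type << 4) | len(token)  # Ver=1, T, TKL
    return struct.pack("!BBH", first, code, message_id) + token


def pack_tcp_header(code: int, token: bytes = b"", body_len: int = 0) -> bytes:
    """Build the CoAP/TCP header; body_len counts options plus payload bytes."""
    if len(token) > 8:
        raise ValueError("token is at most 8 bytes")
    if body_len < 13:
        nibble, ext = body_len, b""
    elif body_len < 269:
        nibble, ext = 13, struct.pack("!B", body_len - 13)
    elif body_len < 65805:
        nibble, ext = 14, struct.pack("!H", body_len - 269)
    else:
        nibble, ext = 15, struct.pack("!I", body_len - 65805)
    return bytes([(nibble << 4) | len(token)]) + ext + bytes([code]) + token


GET = 0x01  # code 0.01
CON = 0     # message type Confirmable
print(pack_udp_header(CON, GET, 0x1234, b"\xab").hex())
print(pack_tcp_header(GET, b"\xab").hex())
```

With a one-byte token and no options, the same GET needs five bytes over UDP and three over TCP, because the Type and Message ID fields disappear.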


47.6.4 When to Use

Use CoAP/TCP when:

  • Corporate/enterprise networks block UDP
  • Web dashboard integration (WebSockets)
  • Guaranteed ordering requirements
  • Long-lived bidirectional streams

Avoid CoAP/TCP when:

  • Battery-powered sensors (UDP more efficient)
  • Multicast scenarios (TCP is unicast only)
  • Intermittent communication (connection overhead wasteful)

47.7 DTLS Security

CoAP uses DTLS (Datagram TLS) for security over UDP:

47.7.1 Security Modes

| Mode | Description | Use Case |
|---|---|---|
| NoSec | No security | Development only |
| PreSharedKey | Symmetric keys pre-installed | Factory provisioning |
| RawPublicKey | Asymmetric without certificates | Lightweight devices |
| Certificate | Full X.509 certificates | Enterprise deployments |

47.7.2 DTLS Handshake

Figure 47.2: DTLS handshake sequence showing ClientHello with supported cipher suites, ServerHello selecting cipher suite, Certificate exchange for authentication, key exchange for deriving session keys, ChangeCipherSpec to activate encryption, and Finished messages to verify handshake integrity

Key Insight: A full DTLS 1.2 handshake with the cookie exchange takes 6 message flights (3 round-trips) before the first CoAP request can be sent, so the first response arrives after approximately 4 round-trips (vs 1-2 for session resumption). For battery-powered devices with frequent reconnections, session resumption can reduce security overhead by 50-75% while maintaining encryption.
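
The 50-75% figure follows directly from the round-trip counts. A small sketch makes the calculation explicit; the 0.5 s round-trip time is a hypothetical value chosen only for illustration, and the function names are not from any library.

```python
def handshake_seconds(round_trips: float, rtt_s: float) -> float:
    """Time spent before the first response arrives."""
    return round_trips * rtt_s


def resumption_saving(full_round_trips: float, resumed_round_trips: float) -> float:
    """Fraction of handshake round-trips avoided by session resumption."""
    return 1 - resumed_round_trips / full_round_trips


RTT_S = 0.5  # hypothetical link round-trip time
FULL = 4     # approximate round-trips for a full handshake, from the text above

for resumed in (2, 1):
    print(f"full: {handshake_seconds(FULL, RTT_S):.1f} s, "
          f"resumed: {handshake_seconds(resumed, RTT_S):.1f} s, "
          f"saving: {resumption_saving(FULL, resumed):.0%}")
```

On slow links the absolute saving matters more than the percentage: every avoided round-trip is radio-on time the battery does not have to pay for.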

47.8 Common Pitfalls

Pitfall: Block-Wise Transfer Timeout Cascade

The Mistake: Using default CoAP timeout values for block-wise transfers, causing transfer failures at 60-80% completion on lossy links.

Why It Happens: CoAP’s default 2-second ACK timeout works for single messages but is too rigid for multi-block transfers. If the client treats any block that times out as fatal, one bad block forces a restart from block 0.

The Fix: Implement adaptive timeouts and resumable transfers:

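class TransferSuspended(Exception):
    """Raised after a resume point is saved; carries the block to retry."""


def download_with_resume(uri, resume_from=0):
    # coap_get, save_progress, save_resume_point and assemble_firmware are
    # application-specific helpers that are not shown here.
    block_num = resume_from
    consecutive_failures = 0
    base_timeout = 2.0

    while True:
        # Exponential backoff on the timeout, capped at 30 seconds
        timeout = min(base_timeout * (1.5 ** consecutive_failures), 30.0)

        try:
            response = coap_get(uri, block2=(block_num, False, 1024), timeout=timeout)
            save_progress(block_num, response.payload)
            consecutive_failures = 0

            if response.block2.more:
                block_num += 1
            else:
                return assemble_firmware()

        except TimeoutError:
            consecutive_failures += 1
            if consecutive_failures > 10:
                # Persist the position so a later call can pass it as resume_from
                save_resume_point(block_num)
                raise TransferSuspended(block_num)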

Pitfall: Assuming CoAP Works Through NAT Like HTTP

The Mistake: Deploying CoAP devices behind NAT gateways without considering that UDP NAT mappings expire quickly (30-120 seconds).

The Fix:

  • Test UDP connectivity before design
  • Implement NAT keepalive (send messages every 25 seconds)
  • Consider CoAP over TCP for NAT-hostile networks
  • Use DTLS session resumption for faster reconnects
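
The keepalive item can be sketched as a small asyncio loop. The `send_ping` callable is a hypothetical stand-in for whatever the CoAP client uses to send a lightweight message (for example an empty CON), and the 25-second default comes from the recommendation above.

```python
import asyncio


async def nat_keepalive(send_ping, interval_s: float = 25.0, stop=None):
    """Call send_ping() every interval_s seconds until stop is set."""
    if stop is None:
        stop = asyncio.Event()
    while not stop.is_set():
        await send_ping()
        try:
            # Sleep for one interval, but wake immediately if stop is set.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            pass
```

Run it as a background task next to the normal request logic and set the event when the device goes to sleep or shuts down. Keepalives cost energy, so the interval should be as long as the NAT on the deployment network tolerates.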

47.9 Worked Example: OTA Firmware Update for 500 Street Lights

Scenario: A city council needs to push a 128 KB firmware update to 500 CoAP-enabled LED street lights over a LoRaWAN network. Each light has a Semtech SX1276 radio module with a 242-byte maximum application payload per LoRaWAN uplink/downlink. The network experiences 8% average packet loss. The council wants to complete the rollout within a single maintenance window (4 AM to 6 AM, 2 hours).

Step 1: Calculate block count and transfer time per device

Firmware size: 128 KB = 131,072 bytes
Block size: 128 bytes (leaving room for CoAP headers within 242-byte limit)
  CoAP header + Block2 option: ~12 bytes
  Total message size: 128 + 12 = 140 bytes (fits in 242-byte LoRaWAN frame)

Block count: 131,072 / 128 = 1,024 blocks per device

Each block requires a CON request (device requests block) and a 2.05 Content response (server sends block). On LoRaWAN Class C (continuous receive), the round-trip is approximately 1-2 seconds per block:

Optimistic transfer time: 1,024 blocks x 1.5 seconds = 1,536 seconds = 25.6 minutes
With 8% packet loss and 1 retry per loss:
  Expected retransmissions: 1,024 x 0.08 = ~82 extra blocks
  Adjusted time: (1,024 + 82) x 1.5 = 1,659 seconds = 27.7 minutes per device

Step 2: Assess parallelism constraints

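This example assumes each gateway can serve approximately 8 devices in parallel, one per channel (EU868 has 8 channels). That is optimistic: typical gateways transmit one downlink at a time and are bound by regional duty-cycle limits, so measure real downlink capacity before planning. Under this assumption, each device update occupies one channel for ~28 minutes: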

Devices per gateway: 8 parallel streams
Time per batch: 28 minutes
Batches needed: 500 / 8 = 62.5 -> 63 batches
Total time with 1 gateway: 63 x 28 = 1,764 minutes = 29.4 hours

This far exceeds the 2-hour window. The council needs more gateways:

Available time: 120 minutes
Batches possible per gateway: 120 / 28 = 4.3 -> 4 batches
Devices per gateway: 4 batches x 8 channels = 32 devices
Gateways needed: 500 / 32 = 15.6 -> 16 gateways
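
Steps 1 and 2 can be bundled into one function so that the inputs are easy to vary. The sketch below mirrors the rounding used in the hand calculation (retries rounded to whole blocks, per-device time rounded up to whole minutes); the function and parameter names are illustrative.

```python
import math


def plan_rollout(firmware_bytes=131_072, block_bytes=128, rtt_s=1.5, loss=0.08,
                 devices=500, channels=8, window_min=120):
    """Reproduce the block, time, batch and gateway arithmetic from Steps 1-2."""
    blocks = math.ceil(firmware_bytes / block_bytes)
    retries = round(blocks * loss)                    # one retry per lost block
    per_device_min = (blocks + retries) * rtt_s / 60
    batch_min = math.ceil(per_device_min)             # 27.7 min is planned as 28
    batches_per_window = window_min // batch_min
    devices_per_gateway = batches_per_window * channels
    return {
        "blocks": blocks,
        "retries": retries,
        "per_device_min": per_device_min,
        "single_gateway_hours": math.ceil(devices / channels) * batch_min / 60,
        "devices_per_gateway": devices_per_gateway,
        "gateways": math.ceil(devices / devices_per_gateway),
    }


for key, value in plan_rollout().items():
    print(f"{key}: {value}")
```

Changing `loss` or `rtt_s` shows how sensitive the gateway count is to link quality, which is worth checking before committing to hardware.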

Step 3: Design the update strategy with CoAP Block-wise features

| Design Decision | Choice | Rationale |
|---|---|---|
| Block size | 128 bytes | Fits LoRaWAN frame with CoAP headers |
| Message type | CON | Must guarantee delivery for firmware integrity |
| Resume support | Yes (save block_num to flash) | Device can resume after power cycle or temporary failure |
| Integrity check | SHA-256 hash verified after final block | Corrupted firmware bricks the light |
| Rollback plan | Keep previous firmware in secondary flash partition | Failed update reverts automatically |

Step 4: Cost analysis

Option A: Sequential update (1 gateway, no parallelism)
  Time: 500 devices x 28 min = 14,000 minutes = 9.7 days
  Gateway cost: 1 x $1,500 = $1,500
  Labor: 0 (unattended)
  Risk: Very slow, devices run outdated firmware for days

Option B: Parallel update (16 gateways, 2-hour window)
  Time: 2 hours (single maintenance window)
  Gateway cost: 16 x $1,500 = $24,000 (but gateways serve normal traffic too)
  Labor: 1 technician x 2 hours = $100
  Risk: Low, all devices updated simultaneously

Option C: Incremental rollout (4 gateways, 4 nights)
  Time: 4 maintenance windows x 2 hours = 8 hours total
  Gateway cost: 4 x $1,500 = $6,000
  Devices per night: 128
  Risk: Medium -- staged rollout catches firmware bugs before full deployment

Recommendation: Option C (incremental rollout). While slower overall, updating 128 devices on night 1 provides a live validation of the firmware. If the update causes issues (dimming failures, communication bugs), only 25% of lights are affected. The city saves $18,000 on gateways compared to Option B and gains the safety net of staged deployment. CoAP’s Block-wise resume capability means any interrupted transfers pick up where they left off the next night.
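
The comparison between Options B and C reduces to a few multiplications. A sketch using the figures from this example (32 devices per gateway per night, $1,500 per gateway); the function name `rollout` is illustrative.

```python
import math

DEVICES = 500
DEVICES_PER_GATEWAY_NIGHT = 32   # 4 batches x 8 channels, from Step 2
GATEWAY_COST = 1_500             # USD per gateway


def rollout(gateways: int) -> dict:
    """Summarize nights needed, hardware cost and first-night exposure."""
    per_night = gateways * DEVICES_PER_GATEWAY_NIGHT
    return {
        "devices_per_night": per_night,
        "nights": math.ceil(DEVICES / per_night),
        "gateway_cost": gateways * GATEWAY_COST,
        "first_night_share": min(per_night, DEVICES) / DEVICES,
    }


option_b, option_c = rollout(16), rollout(4)
print("Option B:", option_b)
print("Option C:", option_c)
print("Gateway savings with C:", option_b["gateway_cost"] - option_c["gateway_cost"])
```

The `first_night_share` value is the fraction of lights exposed to a bad firmware image before anyone has seen it run in the field, which is the main argument for the staged option.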

47.10 Summary

CoAP’s advanced features enable production IoT deployments:

  • Block-wise transfer: Split large payloads into reliable blocks
  • Resource discovery: Standardized .well-known/core with Link Format
  • CoAP/TCP: Firewall-friendly alternative when UDP is blocked
  • DTLS security: Encryption and authentication for constrained devices

Key decisions:

  • Use smaller blocks (256-512 bytes) on lossy networks
  • Always implement resource discovery with semantic attributes
  • Choose UDP for battery efficiency, TCP for NAT/firewall traversal
  • Use PSK for constrained devices, certificates for enterprise

47.11 Concept Relationships

Advanced CoAP features connect to:

Security:

  • DTLS for IoT - Securing CoAP communications
  • CoAP Security Modes - NoSec, PSK, certificates

47.13 What’s Next

| Chapter | Focus | Why Read It |
|---|---|---|
| CoAP API Design | RESTful resource modeling and URI design | Apply the resource discovery patterns from this chapter to design well-structured CoAP APIs |
| CoAP Decision Framework | When to use CoAP vs MQTT vs HTTP | Evaluate trade-offs using the transport and security knowledge built here |
| CoAP Fundamentals and Architecture | Core CoAP message model and architecture | Reinforce the foundation that all advanced features extend |
| OTA Firmware Updates | Production firmware update workflows | See Block-wise Transfer applied end-to-end in a real OTA update pipeline |
| DTLS and Security | DTLS handshake mechanics and cipher suites | Deepen understanding of the security modes introduced in this chapter |
| 6LoWPAN Overview | IPv6 adaptation for constrained networks | Understand the network layer that CoAP block transfer and discovery operate over |