291  SDN APIs and High Availability

291.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Design Controller APIs: Describe northbound (REST/gRPC) and southbound (OpenFlow) API interactions and message types
  • Implement High Availability: Apply controller clustering and failover strategies for production deployments
  • Configure State Synchronization: Understand Raft consensus, eventual consistency, and distributed transactions for controller clusters
  • Plan Switch-Controller Connections: Configure auxiliary connections and failover behavior for redundancy

291.2 Prerequisites

Before diving into this chapter, you should be familiar with the SDN controller fundamentals covered in the Deep Dives linked below.

Think of the SDN controller like a restaurant kitchen:

Northbound API = how waiters (applications) communicate with the kitchen:

  • "Table 5 wants the salmon, no nuts" -> "Block IP 10.0.1.50 from the network"
  • High-level requests, no need to know cooking details

Southbound API = how the head chef (controller) directs line cooks (switches):

  • "Grill salmon at 400F for 8 minutes" -> "Install flow rule: match src=10.0.1.50, action=DROP"
  • Precise, technical instructions

Clustering = having multiple kitchens ready to take over:

  • If the main kitchen catches fire, the backup kitchen continues serving
  • Customers (network traffic) barely notice the switch
  • More kitchens = higher reliability, but also more coordination overhead

For IoT specifically:

  • 99.99% uptime requires clustering (only 52 minutes of downtime per year)
  • Failover time of 3-5 seconds means brief traffic disruption during controller switchover
  • Most IoT deployments use 3-node clusters (survives 1 failure, manageable overhead)
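As a quick sanity check on those uptime numbers, the snippet below converts an availability target into allowed downtime per year. It is plain, illustrative Python with no controller dependency:

# Convert an availability target into allowed downtime per year.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # ~525,960 minutes

def downtime_minutes(availability: float) -> float:
    """Allowed downtime per year for a given availability (e.g. 0.9999)."""
    return (1.0 - availability) * MINUTES_PER_YEAR

for target in (0.999, 0.9999, 0.99999):
    print(f"{target:.4%} uptime -> {downtime_minutes(target):.1f} min/year")

# 99.9000% uptime -> ~526.0 min/year (~8.8 hours)
# 99.9900% uptime -> ~52.6 min/year
# 99.9990% uptime -> ~5.3 min/year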

Deep Dives:

  • SDN Controller Basics (Overview) - Index of all SDN controller topics
  • SDN Controller Architecture - Internal components and message flow
  • SDN Controller Comparison - Comparing OpenDaylight, ONOS, Ryu, Floodlight

Advanced Topics:

  • SDN Analytics and Implementations - Traffic engineering and network slicing
  • SDN for IoT: Variants and Challenges - IoT-specific optimizations

Architecture:

  • Edge-Fog Computing - Distributed control planes

Warning: Common Misconception: "More Controllers = Better Performance"

The Myth: Adding more controllers to a cluster always improves network performance and scalability.

The Reality: Controller clustering is about high availability, not raw performance. More controllers can actually decrease performance due to coordination overhead.

Real-World Example:

A smart city deployment tested ONOS controller scaling with 10,000 IoT devices:

  • Single controller: 50,000 flow installations/second, 15ms latency
  • 3-node cluster: 45,000 flow installations/second, 18ms latency (10% slower)
  • 5-node cluster: 38,000 flow installations/second, 25ms latency (24% slower)

Why performance degrades:

  1. State synchronization overhead: Every flow rule must be replicated to all cluster members (3x the network traffic for 3-node cluster)
  2. Consensus protocols: Cluster must agree on state changes (Raft/Paxos adds 5-10ms latency)
  3. Leader election delays: When active controller fails, cluster needs 2-5 seconds to elect new leader

The right approach:

  • Use clustering for reliability (99.99% -> 99.9999% uptime), not performance
  • Deploy 3 controllers (optimal balance: survives 1 failure, minimal overhead)
  • For performance scaling, use controller federation (divide network into domains, each with its own controller)
  • Example: Google’s B4 WAN uses federated controllers (one per datacenter) managing 100,000+ devices, rather than a single massive cluster

Bottom Line: A well-tuned single controller often outperforms a poorly configured cluster. Use clustering when availability requirements exceed 99.9%, not as a default performance optimization.

291.3 Northbound and Southbound APIs

Time: ~12 min | Difficulty: Advanced | Unit: P04.C27.U03

The controller acts as a mediator between applications (northbound) and network devices (southbound).

%% fig-cap: "Northbound and Southbound API Communication Patterns showing REST-based application integration and OpenFlow-based device control"
%% fig-alt: "Diagram showing bidirectional API communication. Top section shows Northbound APIs with three IoT applications (Security, Monitoring, Load Balancer) sending REST/gRPC requests to Controller and receiving JSON responses. Bottom section shows Southbound APIs with Controller sending OpenFlow messages (Flow-Mod, Packet-Out) to switches and receiving OpenFlow messages (Packet-In, Stats-Reply) from switches. Controller in center mediates between both interfaces."

%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#7F8C8D'}}}%%

graph TB
    subgraph Applications["Applications (Northbound)"]
        SecApp["Security App"]
        MonApp["Monitoring App"]
        LBApp["Load Balancer"]
    end

    subgraph NorthAPI["Northbound API (REST/gRPC)"]
        REST1["POST /firewall/block"]
        REST2["GET /topology"]
        REST3["POST /path/compute"]
    end

    Controller["SDN Controller<br/>(Policy Translation)"]

    subgraph SouthAPI["Southbound API (OpenFlow)"]
        OF1["Flow-Mod"]
        OF2["Packet-Out"]
        OF3["Packet-In"]
        OF4["Stats-Request"]
    end

    subgraph Switches["Network Devices (Southbound)"]
        SW1["Switch 1"]
        SW2["Switch 2"]
        SW3["Switch 3"]
    end

    SecApp -->|"REST: Block IP"| REST1
    MonApp -->|"REST: Get topology"| REST2
    LBApp -->|"gRPC: Compute path"| REST3

    REST1 --> Controller
    REST2 --> Controller
    REST3 --> Controller

    Controller -->|"Translate policy<br/>to flow rules"| OF1
    Controller --> OF2
    Controller --> OF4

    OF1 --> SW1
    OF1 --> SW2
    OF2 --> SW3
    OF4 --> SW1

    SW1 -->|"Packet-In (no match)"| OF3
    SW2 -->|"Stats-Reply"| Controller
    SW3 -->|"Barrier-Reply"| Controller

    OF3 --> Controller

    style Applications fill:#E67E22,stroke:#2C3E50,stroke-width:2px,color:#fff
    style Controller fill:#2C3E50,stroke:#16A085,stroke-width:3px,color:#fff
    style Switches fill:#16A085,stroke:#2C3E50,stroke-width:2px,color:#fff
    style NorthAPI fill:#7F8C8D,stroke:#2C3E50,stroke-width:1px,color:#333
    style SouthAPI fill:#7F8C8D,stroke:#2C3E50,stroke-width:1px,color:#333

Figure 291.1: Northbound and Southbound API Communication Patterns showing REST-based application integration and OpenFlow-based device control

291.4 Northbound APIs

Northbound APIs allow applications to interact with the controller using high-level abstractions.

291.4.1 Common API Types

1. REST API (most common)

  • HTTP-based (GET/POST/PUT/DELETE)
  • JSON payloads
  • Stateless communication

2. gRPC (modern alternative)

  • Protocol Buffers for serialization
  • Bidirectional streaming
  • Lower latency than REST

3. NETCONF (configuration management)

  • XML-based
  • Transactional operations
  • Used for device provisioning
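To illustrate NETCONF's transactional style, the sketch below uses the Python ncclient library to stage a configuration change in a candidate datastore and commit it. The host, credentials, and XML payload are placeholders, and the device must advertise the candidate capability:

from ncclient import manager  # pip install ncclient

# Placeholder device details; replace with your controller or switch.
DEVICE = {"host": "192.0.2.10", "port": 830,
          "username": "admin", "password": "admin",
          "hostkey_verify": False}

# Hypothetical config snippet; real payloads follow the device's YANG models.
CONFIG = """
<config>
  <interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
    <interface>
      <name>eth0</name>
      <description>IoT uplink</description>
    </interface>
  </interfaces>
</config>
"""

with manager.connect(**DEVICE) as m:
    m.edit_config(target="candidate", config=CONFIG)  # stage the change
    m.commit()                                        # transactional apply
    print(m.get_config(source="running"))             # verify the result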

291.4.2 Example REST API Calls

# Get network topology
curl -X GET http://controller:8181/restconf/operational/network-topology:network-topology

# Block traffic from IoT device
curl -X POST http://controller:8181/restconf/operations/firewall:block \
  -H "Content-Type: application/json" \
  -d '{"source-ip": "10.0.1.50", "action": "drop"}'

# Query flow statistics
curl -X GET http://controller:8181/restconf/operational/opendaylight-inventory:nodes/node/openflow:1/flow-statistics
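If you prefer a programmatic client over curl, the same calls can be made with Python's requests library. The endpoints mirror the curl examples above (OpenDaylight-style RESTCONF on port 8181); the credentials and the firewall:block RPC are placeholders for whatever your controller actually exposes:

import requests

BASE = "http://controller:8181/restconf"
AUTH = ("admin", "admin")  # placeholder credentials

# Get the network topology
topo = requests.get(f"{BASE}/operational/network-topology:network-topology",
                    auth=AUTH, timeout=5)
topo.raise_for_status()
print(topo.json())

# Block traffic from an IoT device (assumes a firewall:block RPC exists)
resp = requests.post(f"{BASE}/operations/firewall:block",
                     json={"source-ip": "10.0.1.50", "action": "drop"},
                     auth=AUTH, timeout=5)
print(resp.status_code, resp.text)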

291.4.3 Northbound API Comparison

| Protocol | Latency | Throughput | Complexity | Best For |
|----------|---------|--------------|------------|----------------|
| REST | 5-20ms | 10K req/sec | Low | General apps |
| gRPC | 1-5ms | 100K req/sec | Medium | High-perf apps |
| NETCONF | 10-50ms | 1K req/sec | High | Configuration |

291.5 Southbound APIs

Southbound APIs control network devices using standardized protocols.

291.5.1 Primary Protocols

1. OpenFlow (dominant standard)

  • Flow-based forwarding
  • Version 1.3+ most common in IoT
  • Supports meters, groups, multi-table pipelines

2. NETCONF (device configuration)

  • Configure device parameters
  • Firmware updates
  • State retrieval

3. OVSDB (Open vSwitch Database)

  • Manage virtual switches
  • Port configuration
  • Tunnel setup

291.5.2 OpenFlow Message Types

| Message Type | Direction | Purpose | Example Use |
|-----------------|----------------------|----------------------------|---------------------------------------------|
| Packet-In | Switch -> Controller | No matching flow rule | New IoT device sends first packet |
| Flow-Mod | Controller -> Switch | Install/modify flow rule | "Forward sensor data to analytics server" |
| Packet-Out | Controller -> Switch | Send specific packet | Controller generates ARP reply |
| Stats-Request | Controller -> Switch | Query statistics | "How many bytes on port 5?" |
| Stats-Reply | Switch -> Controller | Statistics response | "Port 5: 1.5 GB, 500K packets" |
| Barrier-Request | Controller -> Switch | Synchronization checkpoint | "Confirm all previous rules installed" |
| Barrier-Reply | Switch -> Controller | Confirmation | "All rules committed" |

291.5.3 OpenFlow Flow-Mod Example

{
  "flow": {
    "id": "sensor-to-gateway-001",
    "table_id": 0,
    "priority": 100,
    "match": {
      "in-port": 1,
      "eth-type": "0x0800",
      "ipv4-source": "10.0.1.0/24"
    },
    "instructions": {
      "apply-actions": {
        "action": [
          {"output-action": {"output-node-connector": 5}}
        ]
      }
    },
    "hard-timeout": 0,
    "idle-timeout": 300
  }
}
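To see how the message types above fit together in code, here is a minimal reactive handler in the style of Ryu's simple_switch examples: it receives a Packet-In, installs a Flow-Mod with a 300-second idle timeout, and forwards the buffered packet with a Packet-Out. It floods rather than learning MAC addresses, so treat it as a sketch of the control loop, not a production forwarder:

from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class ReactiveSketch(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def packet_in_handler(self, ev):
        msg = ev.msg
        dp = msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        in_port = msg.match['in_port']

        # Flow-Mod: match on the ingress port, flood, expire after 300s idle
        actions = [parser.OFPActionOutput(ofp.OFPP_FLOOD)]
        match = parser.OFPMatch(in_port=in_port)
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100, match=match,
                                      instructions=inst, idle_timeout=300))

        # Packet-Out: forward the packet that triggered the Packet-In
        data = msg.data if msg.buffer_id == ofp.OFP_NO_BUFFER else None
        dp.send_msg(parser.OFPPacketOut(datapath=dp, buffer_id=msg.buffer_id,
                                        in_port=in_port, actions=actions,
                                        data=data))

Run it with ryu-manager against an OpenFlow 1.3 switch (for example, a Mininet test topology) to watch the Packet-In / Flow-Mod / Packet-Out exchange in practice.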

291.6 High Availability and Clustering

Time: ~15 min | Difficulty: Advanced | Unit: P04.C27.U04

Caution: Pitfall: Assuming Switches Retain Flow Rules During Controller Failover

The Mistake: Expecting that when the active SDN controller fails and a backup takes over, all existing flow rules on switches remain intact and operational. Teams design failover assuming zero traffic disruption.

Why It Happens: Unlike traditional switches that maintain forwarding tables independently, OpenFlow switches may clear flow tables or mark flows as invalid when controller connection is lost, depending on flow timeout settings and switch implementation. The assumption that “data plane continues while control plane recovers” is only partially true.

The Fix: Configure appropriate flow timeouts: use hard timeouts for security-sensitive flows (force re-authentication), but set idle timeouts for stable traffic patterns (keeps rules while traffic flows). Install critical flows as “permanent” (no timeout) via the controller. Test failover scenarios with production-like traffic to measure actual packet loss. Most importantly, configure switches with multiple controller connections (primary + backup) so they can immediately request new master role from backup controller, reducing failover time from 30+ seconds to under 5 seconds.

For production IoT deployments, controller failure is unacceptable. Clustering provides redundancy.

%% fig-cap: "SDN Controller Clustering Architecture showing 3-node cluster with state synchronization and switch connections"
%% fig-alt: "Diagram of controller high availability setup with three controller nodes in a cluster. Controllers labeled Controller-1 (Master), Controller-2 (Slave), and Controller-3 (Slave) are interconnected with bidirectional State Sync arrows. Below, three OpenFlow switches (Switch A, B, C) connect to all three controllers via dashed lines (backup connections) and solid lines (active connections). Failure scenario shown: Controller-1 fails, Controllers 2 and 3 elect new master in 3-5 seconds. Switches reconnect automatically."

%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#7F8C8D'}}}%%

graph TB
    subgraph Cluster["Controller Cluster (High Availability)"]
        C1["Controller-1<br/>(Master)"]
        C2["Controller-2<br/>(Slave)"]
        C3["Controller-3<br/>(Slave)"]
    end

    subgraph StateSync["State Synchronization"]
        Raft["Raft Consensus<br/>(Leader Election)"]
        StateDB["Distributed State DB<br/>(Topology, Flows)"]
    end

    subgraph Network["Network Infrastructure"]
        SW1["Switch A"]
        SW2["Switch B"]
        SW3["Switch C"]
    end

    C1 <-->|"State Sync"| C2
    C2 <-->|"State Sync"| C3
    C3 <-->|"State Sync"| C1

    C1 --> Raft
    C2 --> Raft
    C3 --> Raft

    Raft --> StateDB

    C1 -.->|"Backup"| SW1
    C1 ==>|"Active"| SW2
    C1 -.->|"Backup"| SW3

    C2 ==>|"Active"| SW1
    C2 -.->|"Backup"| SW2
    C2 -.->|"Backup"| SW3

    C3 -.->|"Backup"| SW1
    C3 -.->|"Backup"| SW2
    C3 ==>|"Active"| SW3

    Failure["Failure Scenario:<br/>Controller-1 fails"]
    Recovery["Recovery:<br/>- Controllers 2&3 detect failure<br/>- New master elected (3-5s)<br/>- Switches reconnect automatically"]

    Failure -.-> C1
    Failure --> Recovery

    style C1 fill:#E67E22,stroke:#2C3E50,stroke-width:2px,color:#fff
    style C2 fill:#2C3E50,stroke:#16A085,stroke-width:2px,color:#fff
    style C3 fill:#2C3E50,stroke:#16A085,stroke-width:2px,color:#fff
    style Cluster fill:#ECF0F1,stroke:#2C3E50,stroke-width:2px
    style StateSync fill:#ECF0F1,stroke:#16A085,stroke-width:2px
    style Failure fill:#E74C3C,stroke:#2C3E50,stroke-width:2px,color:#fff
    style Recovery fill:#27AE60,stroke:#2C3E50,stroke-width:2px,color:#fff

Figure 291.2: SDN Controller Clustering Architecture showing a 3-node cluster with state synchronization and switch connections

291.7 Clustering Strategies

291.7.1 1. Active-Standby (Simplest)

  • One controller active, others on standby
  • Standby takes over if active fails
  • Failover time: 5-30 seconds (heartbeat detection + state recovery)
  • Advantage: Simple, no state conflicts
  • Disadvantage: Standby resources wasted

291.7.2 2. Active-Active (ONOS approach)

  • All controllers active, network partitioned across controllers
  • Each controller manages subset of switches
  • Failover time: 3-5 seconds (just reassign switches)
  • Advantage: Better resource utilization
  • Disadvantage: Complex state synchronization

291.7.3 3. Distributed Hash Table (ODL approach)

  • Network state distributed across cluster using consistent hashing
  • Each controller owns portion of state space
  • Failover time: 2-5 seconds
  • Advantage: Scales to large clusters
  • Disadvantage: More complex programming model

291.7.4 Clustering Comparison

| Strategy | Failover Time | Resource Efficiency | Complexity | Best For |
|----------------|---------------|---------------------|------------|-------------|
| Active-Standby | 5-30s | 50% (standby idle) | Low | Simple HA |
| Active-Active | 3-5s | 90%+ | Medium | Production |
| DHT-based | 2-5s | 85%+ | High | Large scale |
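The comparison above implicitly assumes quorum-based coordination: a cluster of n nodes can only keep making decisions while a majority is alive. The small helper below (plain Python, no controller dependency) shows why 3-node and 5-node clusters are the usual sweet spots:

def quorum(n: int) -> int:
    """Minimum nodes that must agree in a majority-based cluster of size n."""
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    """Node failures the cluster can survive while keeping quorum."""
    return n - quorum(n)  # equivalently (n - 1) // 2

for n in (1, 3, 5, 7):
    print(f"{n}-node cluster: quorum={quorum(n)}, "
          f"survives {tolerated_failures(n)} failure(s)")

# 1-node cluster: quorum=1, survives 0 failures
# 3-node cluster: quorum=2, survives 1 failure
# 5-node cluster: quorum=3, survives 2 failures
# 7-node cluster: quorum=4, survives 3 failures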

291.8 State Synchronization

Controllers in a cluster must maintain consistent view of network state.

291.8.1 What Needs Synchronization

  • Topology: Which devices are connected, which links are active
  • Flow rules: What forwarding rules are installed on each switch
  • Statistics: Traffic counts, port status
  • Application data: Custom state maintained by applications

291.8.2 Synchronization Mechanisms

1. Raft consensus (ONOS uses this)

  • Leader election ensures one controller makes decisions
  • All state changes replicated to followers
  • Guarantees consistency but adds latency (~5-10ms)

2. Eventual consistency (Cassandra-style)

  • Changes propagate asynchronously
  • Faster but risk of temporary inconsistencies
  • Acceptable for non-critical data (statistics)

3. Distributed transactions (ODL MD-SAL)

  • Two-phase commit for critical operations
  • Strong consistency guarantee
  • Higher latency (~10-20ms)

291.8.3 Consistency vs Performance Tradeoffs

| Mechanism | Consistency | Latency | Use Case |
|-----------|-------------|----------|-------------------------|
| Raft | Strong | +5-10ms | Flow rules, topology |
| Eventual | Weak | +1-2ms | Statistics, counters |
| 2PC | Strong | +10-20ms | Cross-domain operations |

291.9 Switch-Controller Connections

Switches can connect to multiple controllers for redundancy.

291.9.1 OpenFlow Auxiliary Connections

Switch configuration:
- Primary controller: 10.0.0.1:6653
- Backup controller: 10.0.0.2:6653
- Backup controller: 10.0.0.3:6653

Behavior:
- Switch connects to all three controllers
- Primary controller is "master" (can modify flow tables)
- Backup controllers are "slave" (read-only access)
- If the master fails, a backup controller requests the master role and takes over
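One way to prototype this multi-controller setup is with Mininet, which lets an Open vSwitch bridge register several remote controllers at once. A minimal sketch, assuming three controllers are already running at the placeholder addresses shown; the switch then honors whatever master/slave roles the controllers request:

#!/usr/bin/env python
from mininet.net import Mininet
from mininet.node import RemoteController, OVSSwitch
from mininet.cli import CLI

net = Mininet(switch=OVSSwitch, controller=None)

# Register three remote controllers (placeholder addresses).
for name, ip in [('c1', '10.0.0.1'), ('c2', '10.0.0.2'), ('c3', '10.0.0.3')]:
    net.addController(name, controller=RemoteController, ip=ip, port=6653)

s1 = net.addSwitch('s1')
h1, h2 = net.addHost('h1'), net.addHost('h2')
net.addLink(h1, s1)
net.addLink(h2, s1)

net.start()   # the switch connects to all controllers added above
CLI(net)      # inspect with: sh ovs-vsctl get-controller s1
net.stop()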

291.9.2 Failover Behavior

  1. The switch and the backup controllers detect the master failure (TCP connection drops or echo timeout)
  2. A backup controller, chosen by the cluster, sends OFPT_ROLE_REQUEST to the switch: “Treat me as master”
  3. The switch responds with OFPT_ROLE_REPLY confirming the role change
  4. The switch now accepts Flow-Mod messages from the new master
  5. Total time: 3-10 seconds, depending on the echo interval and election time
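In a Ryu-based backup controller, claiming mastership after a detected failure amounts to sending that role request yourself. A minimal sketch, assuming OpenFlow 1.3 and a datapath object that is already connected (the failure-detection and generation-id bookkeeping are out of scope here):

from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class FailoverSketch(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    def claim_mastership(self, datapath, generation_id=0):
        """Send OFPT_ROLE_REQUEST asking the switch to treat us as master."""
        # Real deployments must increment generation_id monotonically.
        ofp, parser = datapath.ofproto, datapath.ofproto_parser
        req = parser.OFPRoleRequest(datapath, ofp.OFPCR_ROLE_MASTER,
                                    generation_id)
        datapath.send_msg(req)

    @set_ev_cls(ofp_event.EventOFPRoleReply, MAIN_DISPATCHER)
    def role_reply_handler(self, ev):
        # OFPT_ROLE_REPLY confirms (or rejects) the requested role.
        self.logger.info("Switch %s granted role %s",
                         ev.msg.datapath.id, ev.msg.role)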

291.9.3 Controller Role Configuration

| Role | Flow-Mod | Stats | Packet-In | Use Case |
|--------|----------|-------|-----------|----------------------|
| Master | Yes | Yes | Yes | Primary controller |
| Slave | No | Yes | Optional | Backup, monitoring |
| Equal | Yes | Yes | Yes | Load balancing (rare) |

291.10 Selecting a Controller for IoT

Time: ~10 min | Difficulty: Intermediate | Unit: P04.C27.U05

Choosing the right controller depends on your deployment requirements.

291.10.1 Decision Matrix

| Requirement | Recommended Controller | Rationale |
|------------------------------|------------------------|------------------------------------------|
| Learning/Education | Ryu | Python, simplest API, best tutorials |
| Prototype/PoC | Ryu or Floodlight | Quick setup, good performance |
| Enterprise deployment | OpenDaylight | Comprehensive features, multi-protocol |
| High availability critical | ONOS | Best clustering, carrier-grade reliability |
| High performance | ONOS or Floodlight | 1M+ and 600K flows/sec respectively |
| Large scale (10K+ devices) | ONOS | Designed for scalability |
| Mixed network (IoT + legacy) | OpenDaylight | Supports most protocols |
| Cloud-native deployment | ONOS | Microservices architecture |
| On-premises/embedded | Ryu or Floodlight | Lightweight, lower resource usage |

291.10.2 Real-World Example: AT&T Domain 2.0 Migration

Background: AT&T’s global network carries 197 petabytes daily across 135,000 route miles serving 340M+ connections. Traditional hardware-based network required 18-36 months to deploy new services.

SDN Migration (2013-2020):

  • Controller Platform: ONOS (Open Network Operating System)
  • Scale: 75% of network traffic virtualized by 2020
  • Switches Managed: 65,000+ virtual and physical switches
  • Control Plane Instances: 500+ ONOS controller clusters (3-5 nodes each)

Results:

  • Service Deployment Time: 18 months -> 90 minutes (99.7% reduction)
  • Network Efficiency: 40-60% cost savings through software-defined routing
  • Reliability: 99.999% uptime (5.26 minutes downtime/year) despite centralized control
  • OPEX Reduction: $2B+ annual savings through automation and dynamic optimization

Key Technical Achievements:

  • Flow Rule Scale: Controllers manage 500K+ flow rules per cluster with <10ms flow setup latency
  • Failover Time: <2 seconds controller cluster failover with zero packet loss
  • Multi-Tenancy: Network slicing supports 50+ business units on shared infrastructure
  • Dynamic Routing: Real-time traffic engineering reroutes around congestion in <5 seconds vs. 15-30 minutes with traditional OSPF

IoT Implications: AT&T’s success demonstrates that SDN scales to carrier-grade deployments with millions of endpoints. For IoT, key lessons include:

  • Proactive Flow Installation: Pre-install rules for known traffic patterns (sensors -> gateway) to avoid PACKET_IN overhead
  • Controller Clustering: 3-5 node clusters provide high availability without sacrificing performance
  • Hierarchical Control: Regional controllers manage local switches and report to a centralized orchestrator
  • Policy-Based Management: Define high-level policies (“prioritize emergency services”) rather than per-device rules

291.11 Knowledge Check

Test your understanding of SDN APIs and high availability.

Question 1: An SDN controller cluster uses Raft consensus with 3 nodes. During a network partition, the cluster splits into [Controller-1] vs [Controller-2, Controller-3]. Which partition can continue accepting new flow installations?

Explanation: Raft consensus requires a quorum (majority of nodes) to commit new entries. With 3 nodes, quorum = floor(3/2) + 1 = 2 nodes minimum. The [Controller-2, Controller-3] partition has 2 of 3 nodes (a majority), so it maintains quorum and can continue operations. The [Controller-1] partition has only 1 of 3 (a minority) and becomes read-only. This prevents split-brain scenarios where two partitions accept conflicting updates.

Question 2: The northbound API and southbound API in an SDN controller serve different purposes. Which statement correctly describes their roles?

Explanation: SDN controllers have two interfaces: Northbound API faces upward toward applications (Traffic Engineering, Security, Load Balancer) using high-level abstractions like REST, gRPC, or NETCONF. Applications express intent like “block IP X” without knowing OpenFlow details. Southbound API faces downward toward switches using protocols like OpenFlow to install flow rules. The controller translates application intent into specific match-action rules for switches.

Question 3: A production SDN deployment uses active-active controller clustering. Adding more controllers to the cluster from 3 to 7 nodes will primarily achieve what?

Explanation: Controller clustering is primarily for high availability, not performance. A 3-node cluster tolerates 1 failure (2/3 quorum). A 7-node cluster tolerates 3 failures (4/7 quorum). However, performance may actually decrease with more nodes due to state synchronization overhead. Google’s B4 uses 3-5 node clusters for optimal balance. For performance scaling, use controller federation (divide network into domains) rather than larger clusters.

Question 4: An IoT gateway receives a packet that doesn’t match any flow rules in the switch. What sequence of events occurs in an SDN network?

Explanation: When a packet has no matching flow rule, the switch triggers reactive flow installation: (1) Switch sends Packet-In message to controller with packet header; (2) Controller queries topology database and computes best path; (3) Controller sends Flow-Mod to install rules on affected switches; (4) Controller may also send Packet-Out to forward the buffered packet. This is the fundamental SDN control loop. Typical latency: 20-50ms for reactive, <1ms if rules are proactively installed.

291.12 Summary

Key Takeaways:

  1. Northbound APIs (REST/gRPC) allow applications to program the network without OpenFlow knowledge
  2. Southbound APIs (OpenFlow/NETCONF) control network devices with standardized protocols
  3. OpenFlow messages (Packet-In, Flow-Mod, Stats-Request) form the control loop between controller and switches
  4. Clustering provides high availability (99.99%+) but at performance cost (10-25% slower)
  5. State synchronization uses Raft consensus for strong consistency or eventual consistency for performance
  6. Switch-controller connections with multiple controllers enable sub-10s failover

Practical Guidelines:

  • Use REST APIs for application integration, not direct OpenFlow manipulation
  • Deploy 3-node clusters for 99.99%+ uptime requirements
  • Configure switches with multiple controller connections (primary + 2 backups)
  • Use proactive flow installation for known IoT traffic patterns
  • Set appropriate flow timeouts: permanent for critical flows, idle-timeout for dynamic traffic

291.13 What’s Next

Now that you understand APIs and high availability:

Hands-on Practice:

  • Configure ONOS 3-node cluster and test failover
  • Write REST API client to query topology and install flows
  • Simulate controller failure with Mininet and measure recovery time