124  SDN Production Framework

In 60 Seconds

Production SDN deployments require a four-pillar framework: high-availability controller clusters (ONOS or OpenDaylight), proactive flow installation to avoid reactive latency, network segmentation for multi-tenant isolation, and continuous monitoring with rapid (sub-two-second) failover between redundant controllers.

124.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Design enterprise SDN architecture: Construct three-tier architectures with management, control, and data plane separation
  • Evaluate production controller platforms: Compare ONOS, OpenDaylight, Floodlight, and Ryu on performance, clustering, and IoT suitability
  • Apply deployment checklists: Prioritize high availability, TLS security, scalability, and monitoring requirements for production rollout
  • Select appropriate controllers: Match controller capabilities to specific IoT deployment requirements, scale, and team expertise

124.2 Prerequisites

Required Chapters:

Technical Background:

  • Control plane vs data plane
  • OpenFlow protocol
  • Network programmability

SDN Architecture Layers:

| Layer | Function | Example |
|---|---|---|
| Application | Business logic | Load balancer |
| Control | Network intelligence | SDN controller |
| Infrastructure | Forwarding | OpenFlow switch |

Estimated Time: 20 minutes

Cross-Hub Connections

This chapter connects to multiple learning resources:

Interactive Learning:

  • Simulations Hub - Try the Network Topology Visualizer to understand how SDN controllers optimize routing across different topologies
  • Videos Hub - Watch SDN deployment tutorials and controller configuration walkthroughs

Knowledge Assessment:

  • Quizzes Hub - Test your understanding of controller clustering and flow table optimization
  • Knowledge Gaps Hub - Review common misconceptions about SDN failover behavior

Reference Material:

  • Knowledge Map - See how SDN production practices connect to OpenFlow fundamentals and edge computing

Production SDN deployments require enterprise-grade considerations beyond basic SDN concepts:

  • High Availability: Multiple controllers ensure the network continues if one fails
  • Scalability: The system must handle thousands to millions of devices
  • Security: Control channels need encryption and authentication
  • Monitoring: Real-time visibility into network health and performance

This chapter introduces the frameworks and platforms that make production SDN possible. Start with the architecture overview, then explore specific controller platforms.

Common Misconception: “SDN Controller Failure Breaks All Network Traffic”

The Misconception: Many believe that when an SDN controller fails, the entire network goes offline immediately, making SDN unsuitable for production environments requiring high availability.

The Reality: OpenFlow switches maintain local flow tables that continue forwarding traffic independently of controller connectivity. Only NEW flows fail during controller outages.

Real-World Evidence from Barcelona Smart City Deployment:

Scenario: During a planned controller maintenance window, Barcelona’s SDN network (19,500 IoT sensors) experienced a 45-second controller outage while upgrading from OpenDaylight 0.8 to 0.9.

Actual Impact:

  • Existing flows: 18,200 active sensor connections (93.3%) continued operating normally through pre-installed flow rules
  • New flows: 127 new sensor boot-ups (0.65%) failed initial connection, automatically retried after controller recovery
  • Data loss: ZERO packets dropped for established flows
  • Recovery time: 8 seconds for all 127 sensors to reconnect after controller came back online
  • Total downtime: 0 seconds for 93.3% of devices, 53 seconds for 0.65% of devices

Key Lesson: Production SDN deployments use proactive flow installation (pre-populate rules for expected traffic patterns) + controller clustering (3-5 node redundancy) to achieve 99.99%+ availability. Google’s B4 WAN achieves 99.999% uptime (5 minutes/year downtime) using SDN, exceeding traditional routing reliability.

Best Practice: Install wildcard rules covering common traffic patterns (e.g., “all sensors -> gateway”) proactively. Reserve reactive PACKET_IN for unusual flows only.
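The best practice above can be sketched as a toy flow-table model. This is plain Python for illustration only — `matches`, `forward`, and the rule format are invented for this sketch, not a real switch or controller API:

```python
# Toy flow-table model: pre-installed wildcard rules keep forwarding
# even when the controller is unreachable; only packets with no
# matching rule need a PACKET_IN decision and therefore fail.

def matches(rule_match, packet):
    """A field set to None in the rule acts as a wildcard."""
    return all(v is None or packet.get(k) == v for k, v in rule_match.items())

def forward(flow_table, packet, controller_up):
    """Highest-priority matching rule wins; misses go to the controller."""
    for rule in sorted(flow_table, key=lambda r: -r["priority"]):
        if matches(rule["match"], packet):
            return rule["action"]          # line-rate forwarding, no controller
    return "PACKET_IN" if controller_up else "DROP"  # new flow needs controller

# Proactively installed wildcard rule: "all sensors -> gateway"
flow_table = [
    {"priority": 10,
     "match": {"src_subnet": "sensors", "dst": None},  # dst wildcarded
     "action": "output:gateway"},
]

sensor_pkt = {"src_subnet": "sensors", "dst": "10.0.0.1"}
unknown_pkt = {"src_subnet": "cameras", "dst": "10.0.0.9"}

print(forward(flow_table, sensor_pkt, controller_up=False))   # output:gateway
print(forward(flow_table, unknown_pkt, controller_up=False))  # DROP (new flow)
```

The sensor packet forwards during a controller outage because the wildcard rule was installed proactively; the unknown flow is the 0.65% case from the Barcelona example.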


Key Concepts
  • SDN (Software-Defined Networking): An architectural approach separating the network control plane (routing decisions) from the data plane (packet forwarding), centralizing control in a software controller for programmable network management
  • Control Plane: The network intelligence layer making routing and forwarding decisions, centralized in an SDN controller rather than distributed across individual switches as in traditional networking
  • Data Plane: The network forwarding layer physically moving packets based on rules installed by the control plane — in SDN, this is the switch hardware executing OpenFlow flow table entries
  • OpenFlow: The foundational SDN protocol enabling communication between an SDN controller and network switches, allowing the controller to install, modify, and delete flow table entries that govern packet forwarding
  • SDN Production Framework: A structured set of processes and tools covering controller cluster management, flow policy lifecycle, network monitoring, incident response, and change management for reliable SDN operations at scale
  • Controller High Availability: The configuration of SDN controller clusters with automatic failover ensuring network programming continues without manual intervention during individual controller node failures
  • Flow Policy Lifecycle: The full process from policy requirements through design, testing, staged rollout, monitoring, and retirement of SDN flow rules — analogous to software deployment lifecycle but for network policy

124.3 How It Works: ONOS Distributed Controller Cluster Operation

Step 1: Cluster Formation (Initial Deployment)

  • Three ONOS controller nodes (C1, C2, C3) initialize on separate physical servers
  • Nodes discover each other via configured IP addresses
  • Atomix consensus protocol (Raft-based) elects C1 as LEADER for network management
  • C2 and C3 become FOLLOWERS, maintaining synchronized state copies

Step 2: State Partitioning and Sharding

  • Network topology divided into “shards” - logical partitions of switches/flows
  • Shard 1 (Switches 1-50): managed by C1 (master), C2 (backup)
  • Shard 2 (Switches 51-100): managed by C2 (master), C3 (backup)
  • Shard 3 (Switches 101-150): managed by C3 (master), C1 (backup)
  • Load distribution: Each controller actively manages 1/3 of network

Step 3: Switch Connection and Master Selection

  • Switch S25 connects to cluster (tries C1, C2, C3 in configured order)
  • All three controllers accept connection, but S25 is in Shard 1
  • Cluster internally elects C1 as MASTER for S25 (C2 is BACKUP)
  • S25 receives flow rules from C1 only; C2 monitors but remains passive

Step 4: Flow Installation with Consensus

  • Application requests new flow rule via northbound API
  • C1 (master for S25) proposes rule to cluster
  • Raft consensus: C1 + C2 + C3 vote (requires 2/3 majority = quorum)
  • After consensus, C1 sends FLOW_MOD to S25
  • C2 and C3 record the flow in synchronized state store (for failover)

Step 5: Controller Failure and Mastership Migration

  • C1 crashes (hardware failure, maintenance)
  • C2 and C3 detect heartbeat timeout (1-2 seconds)
  • Cluster immediately promotes C2 to MASTER for Shard 1 (S25)
  • S25 TCP connection to C1 times out (~10s), reconnects to C2
  • Total failover time: ~12 seconds (heartbeat detection + TCP reconnect)

Step 6: Distributed State Recovery

  • C2 already has S25’s flow table state (synchronized via Raft)
  • No need to query switch or rebuild state from scratch
  • C2 resumes management where C1 left off
  • Application layer sees no interruption (northbound API calls continue)

Key Insight: Distributed clustering provides both scale and availability. Unlike active-standby (1 controller active, N idle), ONOS distributes load across all nodes while maintaining synchronized state. Each controller actively manages its shards, giving linear scaling (3 controllers = 3x throughput). Raft consensus ensures state consistency - even if 1 node has stale data, majority vote (2/3) prevails. This architecture powers carrier-grade networks handling millions of flows.

ONOS Performance Metrics (real-world production):

  • Cluster throughput: 1M+ flow setups/sec (3-node cluster)
  • Failover time: <2 seconds for mastership migration (heartbeat detection + re-election); switch TCP reconnection can add ~10 seconds, as in Step 5
  • State sync latency: ~10ms (Atomix distributed store replication)
  • Supported scale: 250+ switches, 50,000+ flows per controller node
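Steps 2 and 5 above can be modelled in a few lines. This is a toy sketch of shard mastership using the layout from Step 2 — `SHARDS` and `masters` are invented names for illustration, not ONOS APIs:

```python
# Toy model of ONOS-style shard mastership: each shard has an ordered
# candidate list (master first); on node failure, mastership migrates
# to the first surviving candidate, mirroring Step 5.

SHARDS = {                      # shard -> [master, backup, second backup]
    "shard1": ["C1", "C2", "C3"],
    "shard2": ["C2", "C3", "C1"],
    "shard3": ["C3", "C1", "C2"],
}

def masters(alive):
    """Current master per shard, given the set of live controllers."""
    return {shard: next((c for c in prefs if c in alive), None)
            for shard, prefs in SHARDS.items()}

print(masters({"C1", "C2", "C3"}))  # each node masters one shard (Step 2)
print(masters({"C2", "C3"}))        # C1 fails: shard1 migrates to C2 (Step 5)
```

Because every node already holds the synchronized state for the shards it backs up, the migration in Step 5 needs no state rebuild — only the master pointer changes.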


124.4 Enterprise SDN Architecture

Production SDN deployments require careful attention to high availability, scalability, and security. Network Function Virtualization (NFV) complements SDN by virtualizing network services that traditionally ran on dedicated hardware.

Figure 124.1: NFV infrastructure enabling software-based network functions on commodity hardware. The diagram shows virtualized network functions running on commodity servers, with the NFVI layer providing compute, storage, and networking resources; the VIM managing virtual resources; and MANO orchestrating the service lifecycle.

The following architecture illustrates a typical enterprise SDN deployment:

Enterprise SDN three-tier architecture diagram showing management plane with orchestration and policy, control plane with clustered SDN controllers, and data plane with OpenFlow switches performing packet forwarding
Figure 124.2: Enterprise SDN Three-Tier Architecture: Management, Control, and Data Planes

Alternative View - Horizontal Flow:

Horizontal flow diagram showing policy definition flowing left-to-right through controller cluster to data plane switches, with bidirectional state synchronization between controller nodes via Raft consensus
Figure 124.3: Horizontal flow view emphasizing the left-to-right control flow from policy definition through controller cluster to data plane hierarchy, with state synchronization between controller nodes shown explicitly.

Alternative View - Reactive Flow Installation:

Sequence diagram showing reactive flow installation where the first packet triggers PACKET_IN to the controller, which installs a flow rule, and subsequent packets are forwarded at line rate without controller involvement
Figure 124.4: Alternative view: Sequence diagram showing the reactive flow installation process. First packet triggers controller involvement (PACKET_IN), but subsequent packets match the installed rule and forward at line rate without controller interaction. This temporal view clarifies why SDN achieves high performance despite centralized control.

124.5 Production Deployment Checklist

Successful SDN production deployments require careful planning across multiple dimensions:

| Category | Requirement | Priority | Implementation Approach |
|---|---|---|---|
| High Availability | Controller redundancy (3+ nodes) | Critical | Active-standby or distributed clustering |
| Security | TLS for OpenFlow channels | Critical | Certificate-based authentication, encrypted control traffic |
| Scalability | Horizontal controller scaling | High | Distributed controller architecture (ONOS, OpenDaylight) |
| Performance | Flow setup latency <10ms | High | Proactive flow installation, local switch intelligence |
| Monitoring | Flow statistics collection | High | Continuous telemetry export, anomaly detection |
| Backup | Configuration state backup | Medium | Regular snapshots of flow rules and network state |
| Testing | Controller failover drills | Medium | Automated chaos testing, planned failure scenarios |
| Documentation | Network topology maps | Medium | Automated discovery and visualization tools |

124.6 Production Controller Platforms

Several mature SDN controller platforms support production IoT deployments:

124.6.1 ONOS (Open Network Operating System)

Strengths:

  • Distributed architecture with built-in clustering (Atomix/Raft consensus)
  • Carrier-grade reliability (used by AT&T, Deutsche Telekom)
  • Intent-based northbound API for high-level policy specification
  • Strong performance: 1M+ flows/sec throughput per cluster

IoT Use Cases:

  • Large-scale smart city networks (street lighting, traffic sensors)
  • Industrial IoT with strict reliability requirements
  • Multi-tenant campus networks with network slicing

Deployment Example:

# ONOS cluster deployment (3 nodes)
onos-service 192.168.1.101 start
onos-service 192.168.1.102 start
onos-service 192.168.1.103 start

# Form cluster
onos-form-cluster 192.168.1.101 192.168.1.102 192.168.1.103

# Verify cluster state
onos> nodes
id=192.168.1.101, state=READY *
id=192.168.1.102, state=READY
id=192.168.1.103, state=READY
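After forming the cluster, it is worth automating the readiness check rather than eyeballing CLI output. A hedged sketch using ONOS’s REST API — the /onos/v1/cluster endpoint, the default onos/rocks credentials, and the `status` field name are assumptions to confirm against your ONOS version:

```python
# Sketch: verify cluster health over the ONOS REST API before declaring
# the deployment ready. Endpoint path, credentials, and JSON field names
# are assumptions -- check them against your ONOS release.
import base64
import json
import urllib.request

def get_cluster_nodes(host, user="onos", password="rocks"):
    """Fetch /onos/v1/cluster from one controller; return its node list."""
    req = urllib.request.Request(f"http://{host}:8181/onos/v1/cluster")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)["nodes"]

def cluster_ready(nodes, expected=3):
    """True when the expected number of nodes all report READY."""
    return (len(nodes) == expected
            and all(n.get("status") == "READY" for n in nodes))

# Example response shape, mirroring the CLI session above:
sample = [{"id": "192.168.1.101", "status": "READY"},
          {"id": "192.168.1.102", "status": "READY"},
          {"id": "192.168.1.103", "status": "READY"}]
print(cluster_ready(sample))  # True
```

A check like this slots naturally into the monitoring and failover-drill items on the deployment checklist below.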

A 3-node ONOS cluster uses Raft consensus for state replication. Each flow installation requires majority agreement (2 of 3 nodes). Calculate consensus overhead:

\[\text{Consensus Latency} = \text{Leader propose (1ms)} + \text{Follower ACK (2ms)} + \text{Commit broadcast (1ms)} = 4 \text{ ms}\]

Without clustering, a single controller installs a flow in 1 ms; consensus raises per-flow latency to 4 ms (4× slower per flow). Latency and throughput are not the same thing, however, because consensus rounds can be pipelined rather than run one at a time:

\[\text{Single Controller} = \frac{1000 \text{ ms/sec}}{1 \text{ ms/flow}} = 1000 \text{ flows/sec}\]

\[\text{3-Node Cluster} = 3 \text{ nodes} \times \frac{1000 \text{ ms/sec}}{1 \text{ ms/flow (pipelined)}} \approx 3000 \text{ flows/sec}\]

Each node is master for its own shard and proposes flows concurrently, so the cluster reaches roughly 3× aggregate throughput despite the 4× per-flow latency – horizontal scaling compensates for consensus overhead.
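As a sanity check on these figures — assuming the 1/2/1 ms latency components from the formula above, and that consensus rounds pipeline so latency does not cap throughput — a quick back-of-envelope calculation:

```python
# Back-of-envelope consensus model (assumptions: 1/2/1 ms component
# latencies; pipelined consensus, so latency and throughput differ).
propose_ms, ack_ms, commit_ms = 1, 2, 1
consensus_latency_ms = propose_ms + ack_ms + commit_ms  # 4 ms per flow

single_rate = 1000 / 1                         # 1000 flows/sec, no clustering
serialized_node_rate = 1000 / consensus_latency_ms  # 250/sec IF fully serialized
pipelined_cluster_rate = 3 * single_rate       # ~3000/sec with pipelined shards

print(consensus_latency_ms, serialized_node_rate, pipelined_cluster_rate)
```

The serialized figure shows why naive consensus would be a bottleneck; real distributed stores pipeline and batch proposals, which is what makes the horizontal scaling work.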

124.6.2 OpenDaylight (ODL)

Strengths:

  • Modular plugin architecture (MD-SAL model-driven service abstraction layer)
  • Multi-protocol support: OpenFlow, NETCONF, RESTCONF, BGP-LS
  • Rich ecosystem of applications (NetVirt, SFC, IoTDM)
  • Java-based with strong enterprise integration

IoT Use Cases:

  • Heterogeneous IoT networks mixing protocols (Zigbee, LoRaWAN, IP)
  • Integration with legacy systems via NETCONF
  • NFV (Network Function Virtualization) for IoT edge processing

124.6.3 Floodlight

Strengths:

  • Lightweight, easy to deploy (Java-based)
  • REST API for application integration
  • Good performance for moderate scale (100K flows)
  • Active community with extensive documentation

IoT Use Cases:

  • Small-to-medium campus deployments (1,000-10,000 devices)
  • Research and prototyping environments
  • Edge SDN controllers for local intelligence

124.6.4 Ryu

Strengths:

  • Python-based, easy to customize and extend
  • Component-based architecture (clean separation of concerns)
  • Excellent for custom IoT applications
  • Well-suited for research and specialized deployments

IoT Use Cases:

  • Custom SDN applications for domain-specific IoT networks
  • Integration with Python-based IoT platforms (Home Assistant, Node-RED)
  • Machine learning integration for traffic prediction

124.7 Controller Comparison Matrix

| Feature | ONOS | OpenDaylight | Floodlight | Ryu |
|---|---|---|---|---|
| Language | Java | Java | Java | Python |
| Architecture | Distributed | Modular | Monolithic | Component-based |
| Clustering | Built-in (Atomix) | Akka-based | Limited | Manual |
| Performance | 1M+ flows/sec | 500K flows/sec | 100K flows/sec | 50K flows/sec |
| Scalability | Excellent | Good | Moderate | Limited |
| Ease of Use | Moderate | Complex | Easy | Very Easy |
| IoT Support | Strong | Very Strong (IoTDM) | Moderate | Excellent (custom) |
| Community | Active | Very Active | Active | Active |
| Best For | Carrier networks | Enterprise, NFV | Campus, research | Custom apps, prototyping |

124.8 Knowledge Check

A startup is building a smart agriculture IoT platform with 500 sensor nodes. They need rapid prototyping, custom machine-learning-based routing, and Python integration. Which SDN controller is most appropriate?

  1. ONOS – carrier-grade distributed architecture
  2. OpenDaylight – modular Java-based enterprise platform
  3. Ryu – Python-based with easy customization
  4. Floodlight – high-performance Java controller
Click for answer

Answer: 3) Ryu – Python-based with easy customization

Ryu is the best fit because it is Python-based (matching their ML stack), easy to customize for domain-specific applications, and well-suited for moderate-scale deployments. While ONOS and OpenDaylight offer more enterprise features, they are overkill for 500 nodes and harder to integrate with Python-based ML pipelines.

During a planned SDN controller maintenance window, what happens to existing network flows on OpenFlow switches?

  1. All traffic stops immediately because switches cannot forward without the controller
  2. Existing flows continue forwarding normally; only new flows are affected
  3. Switches revert to traditional routing protocols automatically
  4. All flow tables are cleared and must be reprogrammed
Click for answer

Answer: 2) Existing flows continue forwarding normally; only new flows are affected

OpenFlow switches maintain local flow tables that operate independently of controller connectivity. Pre-installed flow rules continue to match and forward packets at line rate. Only new flows that trigger PACKET_IN messages (requiring controller decision) will fail during the outage. This is why production deployments use proactive flow installation to minimize dependency on real-time controller availability.

When deploying SDN for a city-wide IoT network with 10,000+ sensors, which production requirement should be addressed FIRST?

  1. Flow statistics monitoring and dashboard setup
  2. Controller clustering with 3+ redundant nodes and TLS encryption
  3. Documentation of network topology maps
  4. Automated chaos testing and failover drills
Click for answer

Answer: 2) Controller clustering with 3+ redundant nodes and TLS encryption

High availability and security are the two critical-priority items in the production checklist. Without controller redundancy, a single controller failure affects all new flows across 10,000+ devices. Without TLS, the control channel is vulnerable to man-in-the-middle attacks that could compromise the entire network. Monitoring, documentation, and testing are important but come after foundational reliability and security.

Production SDN is like building a real airport control tower – it needs backup systems, safety rules, and round-the-clock monitoring!

124.8.1 The Sensor Squad Adventure: Building the Control Tower

The Sensor Squad decided to build the BEST traffic control system for Sensor City. But Max the Microcontroller said, “We cannot just have ONE traffic controller. What if it breaks?”

So they built a Production System with four rules:

  1. Three Controllers (Not One!): Sammy, Lila, and Bella each run their own control desk. If Sammy falls asleep, Lila and Bella keep everything running. That is called a cluster!

  2. Pre-Written Rules: Instead of asking the controller for EVERY car, they wrote the most common rules down on cards at each traffic light. “If the light sees a bus, let it through.” Now even if ALL controllers nap, traffic still flows for most vehicles!

  3. Security Guards: Every message between the controller and the traffic lights is in a secret code (TLS encryption). No sneaky hackers can change the rules!

  4. Health Monitors: Little helper robots check every controller, every light, and every road sensor every few seconds. If something looks wrong, they blow a whistle!

Max said, “NOW we have a real production system. Not just a science project!”

124.8.2 Key Words for Kids

| Word | What It Means |
|---|---|
| Controller Cluster | Multiple brain computers working together so one can rest while others work |
| Proactive Rules | Pre-written instructions so things keep working even without the boss |
| TLS Encryption | A secret code that keeps messages safe from bad guys |
| Monitoring | Constantly checking that everything is healthy, like a doctor’s checkup |

124.9 Worked Example: SDN Controller Cluster Sizing for a Smart Campus

Scenario: A university (12,000 students, 35 buildings) is deploying SDN to manage 8,500 IoT devices across its campus: 6,000 BLE beacons for indoor navigation, 1,500 IP cameras, 800 environmental sensors, and 200 access points. The network has 120 OpenFlow switches.

Step 1 – Flow Rate Estimation:

| Device Type | Count | New flows/sec (peak) | Flow table entries |
|---|---|---|---|
| BLE beacons (via gateways) | 6,000 | 50 (aggregated) | 200 |
| IP cameras (HD streams) | 1,500 | 300 | 1,500 |
| Environmental sensors | 800 | 15 | 400 |
| Access points (student traffic) | 200 | 2,000 | 12,000 |
| Total | 8,500 | 2,365 flows/sec | 14,100 |

Step 2 – Controller Selection:

  • Peak flow rate: 2,365 flows/sec (with 3x headroom for class changes = 7,095 flows/sec)
  • Ryu: 50K flows/sec – sufficient but no HA, single-threaded
  • Floodlight: 100K flows/sec – sufficient, Java-based
  • ONOS: 1M+ flows/sec – selected for HA clustering and carrier-grade resilience

Step 3 – Cluster Sizing (ONOS):

  • 3 controller nodes (minimum for fault-tolerant consensus – Raft needs 2N+1 nodes to tolerate N failures)
  • Each node: 8-core CPU, 32 GB RAM, 500 GB SSD
  • Hardware cost: 3 x GBP 4,200 = GBP 12,600
  • Annual licensing: GBP 0 (ONOS is open-source)
  • Controller-to-switch latency target: <10 ms (achieved with campus-local deployment)

Step 4 – Failover Budget:

| Failure Scenario | Recovery Time | Impact |
|---|---|---|
| 1 controller failure | 1.2 sec (Raft leader election) | No packet loss – flows cached in switches |
| 2 controller failures | 3.5 sec (degraded consensus) | 0.02% flows affected (new flows only) |
| All 3 controllers down | Switches operate on cached rules | No new flows; existing traffic continues |
| Switch failure | 0 sec (redundant paths pre-installed) | Traffic rerouted via backup path |

Key Insight: The 120 switches each have 4,000-entry TCAM flow tables (total capacity: 480,000 entries). With only 14,100 entries needed, the campus uses just 2.9% of switch capacity. This headroom allows proactive installation of backup paths, meaning the network survives even a total controller outage for hours.
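The sizing arithmetic above can be reproduced in a few lines (all numbers taken from the tables in Steps 1 and 4 and the TCAM figures just mentioned):

```python
# Reproducing the worked-example arithmetic for the smart campus.
flows_per_sec = {"ble_gateways": 50, "cameras": 300,
                 "env_sensors": 15, "access_points": 2000}
table_entries = {"ble_gateways": 200, "cameras": 1500,
                 "env_sensors": 400, "access_points": 12000}

peak = sum(flows_per_sec.values())        # 2365 flows/sec at peak
with_headroom = peak * 3                  # 7095 flows/sec (3x for class changes)

switches, tcam_per_switch = 120, 4000
total_tcam = switches * tcam_per_switch   # 480,000 entries campus-wide
needed = sum(table_entries.values())      # 14,100 entries required
utilization = needed / total_tcam         # ~2.9% of switch capacity

print(peak, with_headroom, needed, round(utilization * 100, 1))
```

Swapping in your own device counts and per-device flow rates gives a first-pass answer for the interactive sizing exercise below.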

124.9.1 Interactive: SDN Controller Cluster Sizing

Estimate controller cluster requirements and TCAM utilization for your IoT deployment.

Key Takeaway

In one sentence: Production SDN requires a four-pillar framework of controller clustering, proactive flow installation, security hardening, and continuous monitoring to achieve enterprise-grade reliability.

Remember this rule: Existing flows survive controller outages because switches maintain local flow tables – production SDN focuses on minimizing the impact of new-flow failures during failover.


Your Mission: A manufacturing company needs an SDN controller for their smart factory with these requirements:

Deployment Scenario:

  • 8,000 IoT devices: 5,000 sensors (temp, vibration, pressure), 2,000 PLCs (Programmable Logic Controllers), 1,000 IP cameras
  • 60 OpenFlow switches across 3 factory buildings
  • Mixed protocols: MQTT (sensors), Modbus/TCP (PLCs), RTSP (cameras)
  • Strict requirements: <10ms control latency, 99.99% uptime, support for future protocols

Controller Options:

  1. ONOS - Distributed clustering, carrier-grade, complex setup
  2. OpenDaylight - Multi-protocol, modular, steep learning curve
  3. Floodlight - Simple, lightweight, limited scale
  4. Ryu - Python-based, highly customizable, requires development

Step 1: Analyze Requirements vs Capabilities Create a decision matrix:

| Requirement | ONOS | OpenDaylight | Floodlight | Ryu |
|---|---|---|---|---|
| Scale (8K devices) | ✓ (1M+ flows) | ✓ (100K+) | ? (10K limit) | ? (custom) |
| HA (99.99%) | ✓ (built-in clustering) | ✓ (manual setup) | ✗ (active-standby only) | ✗ (DIY) |
| Multi-protocol | ? | ✓ (NETCONF plugins) | ? | ✓ (Python flexibility) |
| Latency <10ms | ? | ? | ? | ? |
| Team expertise | Java required | Java required | Java required | Python (easier) |

Step 2: Calculate Estimated Deployment Costs

  • ONOS: 3-node cluster, 2 weeks training, $8K consulting
  • OpenDaylight: 3-node cluster, 3 weeks training + integration, $15K consulting
  • Floodlight: Single-node, 1 week training, $2K consulting
  • Ryu: 2-node (manual HA), 1 week training + 4 weeks custom dev, $12K development

Step 3: Risk Assessment

  • Can Floodlight handle 8,000 devices? (Check: 8K flows << 100K benchmark = yes, but no HA)
  • Does factory have Java developers for ONOS/ODL maintenance?
  • Can Ryu team deliver custom multi-protocol support in 3 months?

What to Observe:

  • Floodlight is cheapest ($2K) but FAILS on 99.99% uptime requirement (no clustering)
  • ONOS wins on scale + HA but requires Java expertise
  • OpenDaylight best for multi-protocol (Modbus plugin available) but highest cost ($15K)
  • Ryu attractive for Python teams but risky timeline (custom development)

Challenge Extension:

  • Factory plans to expand to 25,000 devices in 2 years
  • Re-evaluate: Does Floodlight still work? (At even a few concurrent flows per device, 25K devices push well past the ~100K flow benchmark = NO)
  • What’s the migration cost if you start with Floodlight and outgrow it?

Expected Outcome: You’ll learn there’s no “best” controller – only the best fit for your requirements, team, and budget. Production deployments often start with simpler controllers (Floodlight for a PoC), then migrate to distributed platforms (ONOS/ODL) when scale demands it. The $7K consulting difference between ONOS and ODL often comes down to “do you need multi-protocol now, or can you add it later?”
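One way to work through Step 1 systematically is a weighted scoring matrix. The sketch below is hypothetical — the 0-2 ratings and the weights are illustrative assumptions, not figures from the scenario — but it shows how the decision matrix turns into a ranking:

```python
# Hypothetical weighted scoring for the controller decision matrix.
# Ratings: 2 = clearly meets, 1 = uncertain/partial, 0 = does not meet.
# Weights reflect the scenario's priorities (HA and scale are strict).
WEIGHTS = {"scale": 3, "ha": 3, "multi_protocol": 2, "team_fit": 2}

CONTROLLERS = {
    "ONOS":         {"scale": 2, "ha": 2, "multi_protocol": 1, "team_fit": 0},
    "OpenDaylight": {"scale": 2, "ha": 2, "multi_protocol": 2, "team_fit": 0},
    "Floodlight":   {"scale": 1, "ha": 0, "multi_protocol": 1, "team_fit": 0},
    "Ryu":          {"scale": 1, "ha": 0, "multi_protocol": 2, "team_fit": 2},
}

def score(ratings):
    """Weighted sum across all requirements."""
    return sum(WEIGHTS[k] * v for k, v in ratings.items())

ranked = sorted(CONTROLLERS, key=lambda c: -score(CONTROLLERS[c]))
for name in ranked:
    print(name, score(CONTROLLERS[name]))
```

Under these illustrative weights, the multi-protocol requirement lifts OpenDaylight to the top and the missing clustering sinks Floodlight — the same conclusion the prose analysis reaches.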


124.10 Concept Relationships

| This Concept | Relates To | Relationship Type | Why It Matters |
|---|---|---|---|
| Controller Clustering | High Availability | Redundancy Pattern | Distributed clustering (ONOS) provides both scalability (load distribution) and availability (automatic failover) - 3-node cluster tolerates 1 failure |
| Proactive Flow Installation | Controller Outage Resilience | Reliability Strategy | Pre-installed flow rules enable switches to forward traffic during controller downtime - data plane continues while control plane recovers |
| Raft Consensus | State Synchronization | Distributed Algorithm | Requires (N/2+1) quorum for decisions - prevents split-brain where conflicting controllers install different rules to same switch |
| Northbound API | SDN Applications | Abstraction Interface | REST/Python APIs decouple applications from controller internals - change ONOS to OpenDaylight without rewriting apps (if using standard API) |
| TCAM Limitation | Controller Platform Selection | Hardware Constraint | Switches with 2K-8K TCAM entries require controllers with efficient rule aggregation - ONOS/ODL have better wildcard support than basic controllers |

124.11 See Also

Next Steps - Deep Dives:

Related Concepts:


Place these steps in the correct order for establishing a production SDN framework.

Common Pitfalls

Deploying new SDN flow policies without a tested rollback procedure. When a policy change causes unexpected traffic drops, the ability to revert within 5 minutes prevents extended outages. Document and test rollback for every planned flow policy change.

Managing an SDN deployment without real-time topology visualization showing controller connectivity status, switch-to-switch links, and flow table utilization. Operating SDN blind makes incident detection slow and root cause analysis very difficult. Deploy topology visualization before production launch.

Not modeling controller CPU and memory requirements for the target device count and flow installation rate. An undersized controller cluster becomes a bottleneck during IoT fleet expansion events, delaying new device onboarding and causing flow installation timeouts.

Using only traditional network SLOs (uptime, packet loss) without SDN-specific metrics (flow installation latency p99, controller sync lag, table miss rate). SDN-specific SLOs enable early detection of degradation before it affects IoT device communication.

124.12 Summary

This chapter covered the foundational elements for production SDN deployments:

Key Takeaways:

  1. Three-Tier Architecture: Management, Control, and Data planes with clear separation of concerns

  2. High Availability: Controller clustering with 3+ nodes ensures resilience during failures

  3. Controller Platforms: ONOS for carrier-grade, OpenDaylight for enterprise/NFV, Floodlight for mid-scale, Ryu for custom applications

  4. Deployment Checklist: Critical items include TLS security, controller redundancy, monitoring, and failover testing

  5. Controller Failure Resilience: Existing flows continue via installed rules; only new flows require controller connectivity

Related Chapters:

124.13 Knowledge Check

124.14 What’s Next

| If you want to… | Read this |
|---|---|
| Review SDN production best practices | SDN Production Best Practices |
| Study SDN production case studies | SDN Production Case Studies |
| Explore SDN analytics and implementations | SDN Analytics and Implementations |
| Learn about production architecture management | Production Architecture Management |
| Study IoT reference architectures | IoT Reference Architectures |