%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#7F8C8D'}}}%%
graph TB
subgraph Apps[Application Layer]
Search[Search Index Sync<br/>High Priority]
Video[Video Distribution<br/>Medium Priority]
Backup[Data Backup<br/>Best Effort]
end
subgraph Controller[SDN Controller - CTE]
TE[Traffic Engineering]
Topo[Topology Manager]
BW[Bandwidth Allocator]
end
subgraph WAN[WAN Data Plane]
DC1[Data Center 1]
DC2[Data Center 2]
DC3[Data Center 3]
DC4[Data Center 4]
end
Apps -->|Traffic Demands| Controller
Controller -->|Flow Rules| WAN
DC1 <-->|Link Util 95%| DC2
DC2 <-->|Link Util 95%| DC3
DC3 <-->|Link Util 95%| DC4
DC4 <-->|Link Util 95%| DC1
style Apps fill:#7F8C8D,color:#fff
style Controller fill:#2C3E50,color:#fff
style WAN fill:#16A085,color:#fff
300 SDN Production Case Studies
300.1 Learning Objectives
By the end of this chapter, you will be able to:
- Analyze Google B4 WAN architecture achieving 95%+ link utilization with centralized traffic engineering
- Evaluate Barcelona smart city network slicing for 19,500 IoT sensors with differentiated QoS
- Understand Siemens industrial IoT deterministic scheduling using SDN with Time-Sensitive Networking
- Extract key lessons for applying SDN principles to your own IoT deployments
300.2 Prerequisites
Required Chapters:
- SDN Production Framework: enterprise architecture and controller platforms
Technical Background:
- Controller clustering concepts
- OpenFlow flow tables
- Network slicing and QoS
Estimated Time: 15 minutes
Interactive Learning:
- Simulations Hub: explore network topology and traffic engineering simulations
- Videos Hub: watch case study presentations from Google and Siemens
Knowledge Assessment:
- Quizzes Hub: test your understanding of SDN deployment patterns
These case studies demonstrate how major organizations solve real networking challenges with SDN:
- Google B4: How to achieve near-100% link utilization through centralized control
- Barcelona: How to run multiple isolated services on shared infrastructure
- Siemens: How to guarantee sub-millisecond timing for industrial control
Each case study includes architecture diagrams, results, and lessons you can apply to your own projects.
300.3 Case Study 1: Google B4 WAN
Background: Google operates a global WAN connecting data centers for inter-DC traffic (e.g., search index replication, video distribution to edge caches). Traditional WAN routing achieved only 30-40% average link utilization due to conservative traffic engineering.
SDN Implementation:
- Custom SDN controller (Central Traffic Engineering, CTE)
- OpenFlow-based switches with centralized path computation
- Bandwidth-aware routing with application-level priorities
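To make "bandwidth-aware routing with application-level priorities" concrete, here is a minimal sketch of a strict-priority bandwidth allocator. It is a simplification for illustration and not Google's actual algorithm: the `Demand` class, the `allocate` function, the demand figures, and the "lower number = higher priority" convention are all assumptions. Only the three application names and the 95% utilization target come from this chapter.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Demand:
    app: str
    priority: int  # assumed convention: lower number = more important
    gbps: float    # requested bandwidth (illustrative values only)


def allocate(demands: list[Demand], capacity_gbps: float,
             target_util: float = 0.95) -> dict[str, float]:
    """Grant bandwidth in priority order until usable capacity runs out."""
    remaining = capacity_gbps * target_util  # keep managed headroom free
    grants: dict[str, float] = {}
    for d in sorted(demands, key=lambda d: d.priority):
        granted = min(d.gbps, remaining)
        grants[d.app] = granted
        remaining -= granted
    return grants


demands = [
    Demand("backup", priority=3, gbps=60.0),
    Demand("search_index_sync", priority=1, gbps=30.0),
    Demand("video_distribution", priority=2, gbps=40.0),
]
grants = allocate(demands, capacity_gbps=100.0)
for app, gbps in grants.items():
    print(f"{app}: {gbps:.1f} Gbps")
```

With these made-up demands, search index sync and video distribution are fully served and the best-effort backup receives whatever is left of the 95 Gbps budget. A production controller would solve this across many links and paths at once, which this single-link sketch does not attempt.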
Architecture:
Alternative View - Utilization Comparison:
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#7F8C8D'}}}%%
graph LR
subgraph Traditional["Traditional WAN Routing"]
T_Link["Link Capacity: 100 Gbps"]
T_Used["Used: 30-40 Gbps<br/>(conservative buffers)"]
T_Waste["Wasted: 60-70 Gbps<br/>Reserved for bursts"]
end
subgraph SDN_B4["Google B4 SDN Routing"]
S_Link["Link Capacity: 100 Gbps"]
S_Used["Used: 95 Gbps<br/>(dynamic allocation)"]
S_Reserve["Reserve: 5 Gbps<br/>Managed headroom"]
end
subgraph Savings["Annual Impact"]
Cost["3.17x Efficiency Gain<br/>$10M+ deferred CAPEX"]
Time["Recovery: <2s<br/>vs 15-30 min OSPF"]
end
T_Link --> T_Used
T_Used --> T_Waste
S_Link --> S_Used
S_Used --> S_Reserve
T_Waste -.->|"SDN Transforms"| S_Used
S_Reserve --> Cost
S_Reserve --> Time
style Traditional fill:#7F8C8D,color:#fff
style SDN_B4 fill:#16A085,color:#fff
style Savings fill:#E67E22,color:#fff
style T_Waste fill:#c0392b,color:#fff
300.3.1 Results
- 95%+ link utilization (vs. 30-40% traditional routing)
- Near-zero packet loss through congestion-aware routing
- Rapid failure recovery (<2 seconds) via centralized rerouting
- Cost savings: Better utilization = fewer links needed
300.3.2 Key Lessons for IoT
- Centralized visibility enables better resource allocation
- Application-aware routing improves QoS significantly
- SDN scales to massive networks (Google’s WAN is planetary scale)
300.4 Case Study 2: Smart City IoT - Barcelona
Background: Barcelona deployed 19,500 IoT sensors across the city for smart lighting, parking, environmental monitoring, and public Wi-Fi. Traditional network management couldn’t handle dynamic traffic patterns and multi-tenant isolation.
SDN Implementation:
- OpenDaylight controller cluster (3 nodes)
- Network slicing for different city services
- QoS policies prioritizing emergency services over convenience apps
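One way to picture the slicing policy is as a lookup table that the controller translates into per-switch rules. The sketch below is hypothetical and does not use the OpenDaylight API: the `SLICES` table and the `classify` function are illustrative names, and the queue and priority numbers for the environmental slice are assumptions. The VLAN IDs and slice names follow this chapter.

```python
from __future__ import annotations

# Hypothetical slice table. VLAN IDs and slice names follow the chapter;
# the environmental priority and queue values are assumed for illustration.
SLICES = {
    100: {"name": "emergency", "priority": 1000, "queue": 1},
    200: {"name": "environmental", "priority": 500, "queue": 2},
    300: {"name": "convenience", "priority": 100, "queue": 3},
}


def classify(vlan_id: int) -> dict | None:
    """Return the slice policy for a VLAN, or None for unknown traffic."""
    return SLICES.get(vlan_id)


for vlan in (100, 300, 999):
    policy = classify(vlan)
    if policy is None:
        print(f"VLAN {vlan}: no slice, drop")
    else:
        print(f"VLAN {vlan}: {policy['name']} -> queue {policy['queue']}")
```

Dropping traffic from unknown VLANs is a design choice made here to keep slices isolated; a real deployment might instead send such traffic to a default best-effort slice.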
Network Slices:
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#7F8C8D'}}}%%
graph TB
subgraph Physical[Physical Infrastructure]
SW1[OpenFlow Switch 1]
SW2[OpenFlow Switch 2]
SW3[OpenFlow Switch 3]
end
subgraph Slice1[Emergency Services Slice - VLAN 100]
Fire[Fire Sensors]
Police[Police Systems]
end
subgraph Slice2[Environmental Monitoring - VLAN 200]
Air[Air Quality]
Noise[Noise Sensors]
end
subgraph Slice3[Convenience Services - VLAN 300]
Parking[Smart Parking]
Lighting[Street Lights]
end
Slice1 -->|High Priority<br/>QoS| Physical
Slice2 -->|Medium Priority| Physical
Slice3 -->|Best Effort| Physical
style Slice1 fill:#E67E22,color:#fff
style Slice2 fill:#16A085,color:#fff
style Slice3 fill:#7F8C8D,color:#fff
style Physical fill:#2C3E50,color:#fff
Alternative View - QoS Priority Enforcement:
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#7F8C8D'}}}%%
graph LR
subgraph Congestion["Network Congestion Event"]
BW["Available Bandwidth:<br/>100 Mbps total"]
end
subgraph Priority["QoS Priority Enforcement"]
P1["Emergency: 50 Mbps<br/>Guaranteed minimum"]
P2["Environmental: 30 Mbps<br/>Minimum reserve"]
P3["Convenience: 20 Mbps<br/>Best effort only"]
end
subgraph Result["Congestion Resolution"]
R1["Fire alert: Delivered<br/><50ms latency"]
R2["Air quality: Delivered<br/>~100ms latency"]
R3["Parking update: Queued<br/>Delayed 2 seconds"]
end
BW --> P1
BW --> P2
BW --> P3
P1 --> R1
P2 --> R2
P3 --> R3
style Congestion fill:#7F8C8D,color:#fff
style P1 fill:#E67E22,color:#fff
style P2 fill:#16A085,color:#fff
style P3 fill:#2C3E50,color:#fff
300.4.1 Results
- Energy savings: 30% reduction in street lighting costs via adaptive scheduling
- Traffic reduction: 20% fewer cars circling for parking
- Response time: Emergency services get guaranteed <50ms latency
- Multi-tenancy: Clean isolation between city departments
300.4.2 Key Lessons for IoT
- Network slicing essential for mixed-priority IoT workloads
- SDN enables dynamic policy updates without rewiring
- Centralized monitoring provides citywide visibility
300.5 Case Study 3: Industrial IoT - Siemens Factory
Background: A Siemens manufacturing plant runs 3,000 industrial IoT sensors and devices (vibration monitors, temperature sensors, robotic arms) that require deterministic latency and ultra-high reliability (99.9999% uptime).
SDN Implementation:
- ONOS controller with Time-Sensitive Networking (TSN) extensions
- Scheduled traffic for deterministic flows (robotic control)
- In-band Network Telemetry (INT) for microsecond-level monitoring
Deterministic Scheduling:
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#2C3E50', 'primaryTextColor': '#fff', 'primaryBorderColor': '#16A085', 'lineColor': '#16A085', 'secondaryColor': '#E67E22', 'tertiaryColor': '#7F8C8D'}}}%%
gantt
title Time-Sliced TSN Traffic Schedule (Deterministic)
dateFormat X
axisFormat %L ms
section Robot Control
Robot Slot 1 :active, 0, 100
Robot Slot 2 :active, 200, 300
Robot Slot 3 :active, 400, 500
section Sensor Data
Sensor Slot 1 :crit, 100, 200
Sensor Slot 2 :crit, 300, 400
section Best Effort
Monitoring 1 :done, 150, 200
Monitoring 2 :done, 350, 400
300.5.1 Results
- 99.9999% uptime (about 32 seconds of downtime per year)
- Deterministic latency: <1ms jitter for robotic control loops
- Predictive maintenance: SDN flow analytics detect anomalies before failures
- Production increase: 15% throughput improvement via optimized coordination
300.5.2 Key Lessons for IoT
- Industrial IoT demands determinism that SDN-TSN can provide
- Flow monitoring enables predictive analytics
- SDN complements rather than replaces domain-specific protocols (OPC-UA, Modbus)
300.6 Cross-Case Comparison
| Metric | Google B4 | Barcelona | Siemens |
|---|---|---|---|
| Scale | Planetary WAN | City-wide (19,500 sensors) | Factory (3,000 sensors) |
| Controller | Custom CTE | OpenDaylight | ONOS + TSN |
| Primary Goal | Utilization | Multi-tenancy | Determinism |
| Latency Target | Seconds | <50ms critical | <1ms |
| Availability | 99.999% | 99.99% | 99.9999% |
| Key Innovation | Traffic engineering | Network slicing | TSN scheduling |
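The availability figures in the table are easier to compare as yearly downtime budgets. The short calculation below converts each percentage; the function name `downtime_minutes_per_year` is just an illustrative helper, and the year is assumed to have 365 days.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Convert an availability percentage into allowed downtime per year."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for name, availability in [("Barcelona", 99.99),
                           ("Google B4", 99.999),
                           ("Siemens", 99.9999)]:
    minutes = downtime_minutes_per_year(availability)
    print(f"{name}: {availability}% -> {minutes:.2f} min/year "
          f"({minutes * 60:.0f} s)")
```

Each additional "nine" cuts the budget by a factor of ten: roughly 53 minutes per year at 99.99%, about 5 minutes at 99.999%, and about 32 seconds at 99.9999%.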
300.7 Understanding Check
Scenario: Google’s WAN connects datacenters worldwide. Traditional OSPF routing achieves 30-40% average link utilization (conservative to avoid congestion). B4 SDN achieves 95%+ utilization. Two 10 Gbps links: Link A (shortest path) at 100% capacity, Link B (alternate) at 20%.
Think about:
1. Why does traditional routing leave Link B underutilized?
2. How does a centralized controller rebalance traffic across both links?
3. Calculate the throughput improvement: 30% vs 95% utilization on 100 Gbps total capacity
Key Insight: Traditional routing forwards on the shortest path only, so Link A is overloaded (packet loss) while Link B sits mostly idle. To avoid congestion, operators overprovision and set conservative link weights, which keeps average utilization around 30%. SDN solution: the controller sees the global topology and real-time utilization. The two links carry 12 Gbps in total (10 Gbps on A plus 2 Gbps on B), so the controller shifts about 4 Gbps from Link A to Link B -> each link carries ~6 Gbps (~60%) -> no congestion and better overall utilization. Application-aware: high-priority traffic gets the low-latency path, and bulk transfers use alternate paths. Throughput on 100 Gbps of capacity: traditional (30% utilization) = 30 Gbps used, 70 Gbps idle; B4 SDN (95% utilization) = 95 Gbps used, a 95/30 ≈ 3.17x improvement. Annual savings: capacity expansion deferred by 3+ years (~$10M).
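The arithmetic in this scenario can be checked with a few lines of Python. This is a sketch under a simplifying assumption: the hypothetical `rebalance` helper spreads the total load across links in proportion to capacity, which is only one of many policies a controller could apply.

```python
from __future__ import annotations


def rebalance(loads_gbps: list[float],
              capacities_gbps: list[float]) -> list[float]:
    """Spread the total load across links in proportion to capacity."""
    total_load = sum(loads_gbps)
    total_capacity = sum(capacities_gbps)
    return [total_load * c / total_capacity for c in capacities_gbps]


capacities = [10.0, 10.0]   # Link A, Link B
before = [10.0, 2.0]        # Link A at 100%, Link B at 20%
after = rebalance(before, capacities)
utilization_after = [load / cap for load, cap in zip(after, capacities)]

# Throughput gain from the scenario: 95% vs 30% utilization.
gain = 95 / 30

print("load per link (Gbps):", after)
print("utilization per link:", utilization_after)
print("throughput gain:", round(gain, 2))
```

Both links end up carrying 6 Gbps, or 60% utilization each, and the 95/30 ratio rounds to the 3.17x figure quoted in the chapter.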
Scenario: Barcelona's smart city SDN manages emergency services (VLAN 100), environmental monitoring (VLAN 200), and parking sensors (VLAN 300). 5,000 parking sensors update simultaneously, flooding the network with a 50 Mbps traffic burst. An emergency fire alarm must still reach dispatch in <50ms.
Think about:
1. How do priority-based flow rules + queue scheduling ensure emergency latency?
2. Calculate queue servicing: strict priority Queue 1 vs best-effort Queue 3
3. Why doesn’t the parking flood starve emergency traffic?
Key Insight: Controller installs priority-based rules: Emergency (priority=1000) -> Queue 1 (strict priority, 10 Mbps reserved). Parking (priority=100) -> Queue 3 (best-effort). Parking flood: 5,000 sensors -> 50 Mbps burst -> Queue 3 fills up -> packets delayed/dropped. Emergency packet arrives: Matches VLAN 100 rule -> Queue 1 -> switch services Queue 1 BEFORE Queue 3 -> fire alarm forwarded in <1ms despite Queue 3 congestion. Isolation: Physical infrastructure shared, but performance differentiated. Emergency gets guaranteed service, parking gets leftover bandwidth. Network slicing = virtual networks on shared hardware.
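Strict-priority servicing is simple to simulate. The sketch below models a single switch port with one FIFO per queue and always drains the lowest-numbered non-empty queue first. The class name `StrictPriorityPort` and the packet labels are hypothetical; real switches implement this in hardware and often combine it with rate limits.

```python
from collections import deque


class StrictPriorityPort:
    """Toy egress port: lower queue IDs are always served first."""

    def __init__(self, queue_ids):
        # Sorted so iteration order equals service order (dicts keep order).
        self.queues = {qid: deque() for qid in sorted(queue_ids)}

    def enqueue(self, queue_id, packet):
        self.queues[queue_id].append(packet)

    def dequeue(self):
        """Return (queue_id, packet) from the best non-empty queue."""
        for qid, queue in self.queues.items():
            if queue:
                return qid, queue.popleft()
        return None  # all queues empty


port = StrictPriorityPort([1, 2, 3])

# Parking flood: 5,000 updates pile up in best-effort Queue 3.
for i in range(5000):
    port.enqueue(3, f"parking-{i}")

# The fire alarm arrives last but lands in Queue 1.
port.enqueue(1, "fire-alarm")

first = port.dequeue()
second = port.dequeue()
print(first, second)
```

Even though the alarm arrives behind 5,000 parking updates, it is the first packet to leave the port, which is why the parking flood cannot starve emergency traffic.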
Scenario: Siemens factory robotic assembly line requires <1ms jitter for closed-loop control (robot arm updates position every 1ms). Traditional Ethernet: best-effort forwarding causes 0.1-10ms variable latency (100x jitter). SDN + TSN provides deterministic scheduling.
Think about:
1. How does time-triggered scheduling eliminate jitter?
2. Calculate bounded latency: robot packet arrives at 99us, scheduled window 0-100us
3. Why can’t traditional Ethernet provide deterministic guarantees?
Key Insight: TSN pre-allocates transmission windows that are synchronized across all switches (IEEE 802.1AS clock sync, <1us accuracy). Schedule per 1000us cycle: robot control 0-100us (reserved), sensor data 100-200us, best-effort 200-1000us. Bounded latency: a robot packet arriving at 99us is still inside its reserved window, so it never queues behind sensor or best-effort traffic; a packet that just misses the window waits for the next cycle's window, at most ~900us. Delay is therefore bounded by the schedule (<1ms) rather than by competing traffic. Traditional Ethernet fails: best-effort queuing -> robot packet waits behind a bulk data transfer -> 10ms delay -> control loop misses its deadline -> robot positioning error. SDN role: the controller computes the end-to-end schedule and installs time-triggered flow rules on all switches. Result: deterministic <1ms jitter, as needed for industrial control, AR/VR, and medical devices.
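The bound on waiting time follows directly from the schedule. The sketch below computes how long a frame waits for its class's window to open, using the window boundaries from this scenario and a 1000us cycle. It is a simplified model: the `wait_until_window` helper is hypothetical, and it ignores frame transmission time and guard bands, which a real TSN gate schedule has to account for.

```python
CYCLE_US = 1000
# Window boundaries (start, end) in microseconds, taken from the scenario.
WINDOWS_US = {
    "robot": (0, 100),
    "sensor": (100, 200),
    "best_effort": (200, 1000),
}


def wait_until_window(traffic_class: str, arrival_us: int) -> int:
    """Microseconds a frame waits until its class's window is open."""
    start, end = WINDOWS_US[traffic_class]
    offset = arrival_us % CYCLE_US  # position inside the current cycle
    if start <= offset < end:
        return 0                    # window already open
    if offset < start:
        return start - offset       # window opens later in this cycle
    return CYCLE_US - offset + start  # wait for the next cycle


for arrival in (99, 100, 999):
    print(f"robot frame at {arrival}us waits "
          f"{wait_until_window('robot', arrival)}us")

worst_case = max(wait_until_window("robot", t) for t in range(CYCLE_US))
print("worst-case robot wait:", worst_case, "us")
```

In this model a robot frame arriving at 99us goes out immediately, while one arriving at 100us has just missed its window and waits 900us, which is the worst case and still under the 1ms budget.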
300.8 Summary
This chapter examined three production SDN deployments demonstrating different IoT applications:
Key Takeaways:
- Google B4: Centralized traffic engineering achieves 95%+ link utilization, a 3.17x improvement over traditional routing
- Barcelona Smart City: Network slicing isolates 19,500 sensors across emergency, environmental, and convenience services with differentiated QoS
- Siemens Factory: SDN + TSN provides 99.9999% uptime with <1ms jitter for industrial robotic control
- Common Themes: Centralized visibility, programmable policies, and application-awareness enable optimizations impossible with distributed routing
- Scale Diversity: SDN applies from factory floors to planetary-scale WANs
Related Chapters:
- SDN Production Framework: controller platforms and architecture
- SDN Production Best Practices: HA, security, monitoring, and optimization
300.9 What’s Next?
Continue to SDN Production Best Practices to learn about high availability, security hardening, and monitoring for production SDN deployments.