Stream processing analyzes data in motion rather than at rest, enabling IoT systems to make real-time decisions with millisecond latency. Windowing strategies divide infinite data streams into finite chunks: tumbling windows (non-overlapping) for periodic reports, sliding windows (overlapping) for trend detection at 5-10x memory cost, and session windows (activity-based) for grouping bursty events. A smart factory with 10,000 sensors at 100 messages/second generates 1 GB per 10-second window, making memory-aware window design critical.
Differentiate between batch processing and stream processing paradigms in IoT contexts
Classify windowing strategies (tumbling, sliding, session) by their memory cost, overlap behavior, and IoT applicability
Calculate memory and compute requirements for different windowing strategies
Design real-time IoT data pipelines with appropriate window types and stages
Handle late-arriving data and out-of-order events using watermarks and event-time semantics
Apply stream processing concepts to real-world IoT scenarios with production constraints
3.2 Introduction
A modern autonomous vehicle generates approximately 1 gigabyte of sensor data every second from LIDAR, cameras, radar, GPS, and inertial measurement units. If you attempt to store this data to disk and then query it for analysis, you’ve already crashed into the obstacle ahead. The vehicle needs to make decisions in under 10 milliseconds based on continuously arriving sensor streams.
This is the fundamental challenge that stream processing solves: processing data in motion, not data at rest. Traditional batch processing analyzes historical data after it’s been collected and stored. Stream processing analyzes data as it arrives, enabling real-time decisions, alerts, and actions.
In IoT systems, stream processing has become essential infrastructure. Whether you’re monitoring industrial equipment for failure prediction, balancing electrical grids in real-time, or detecting fraud in payment systems, the ability to process continuous data streams with low latency determines system effectiveness.
Key Takeaway
In one sentence: Process data as it arrives - waiting for batch windows means missing the moment when action could have prevented the problem.
Remember this rule: If the value of your insight degrades over time, use stream processing; if you need the complete picture, use batch.
For Beginners: Batch vs Stream
Think of batch processing like counting votes after an election closes. You wait until all ballots are collected, then count them all at once to determine the winner. This works well when you can afford to wait.
Stream processing is like a live vote counter that updates the tally in real-time as each ballot arrives. You see the current state continuously, even before voting ends. This is essential when waiting isn’t an option.
In IoT, batch processing might analyze yesterday’s temperature readings to find trends. Stream processing monitors temperature readings as they arrive to trigger an immediate alert if your server room overheats.
For Kids: Meet the Sensor Squad!
Stream processing is like having a lifeguard who watches the pool RIGHT NOW, instead of checking photos from yesterday!
3.2.1 The Sensor Squad Adventure: The Never-Ending River Race
The Sensor Squad has a new mission: watching the Data River, where information flows like water - constantly moving, never stopping! The question is: how should they watch it?
Coach Batch has one idea: “Let’s collect ALL the water in buckets throughout the day. At midnight, we’ll look at all the buckets together and see what we collected!” This is called batch processing - wait until you have a big pile of data, then look at it all at once.
But Sammy the Temperature Sensor is worried. “What if the water gets too hot RIGHT NOW? By midnight, it might be too late!”
Captain Stream has a different idea: “Let’s taste the water as it flows by - every single second!” This is stream processing - looking at each drop of data the moment it arrives.
To test both ideas, they set up an experiment at the Data River:
The Batch Team (Coach Batch, Pressi the Pressure Sensor, and Bucketbot):

- Collected water samples in buckets all day
- At midnight, they discovered: "Oh no! The water was dangerously hot at 2:15 PM!"
- But it's now midnight - that was 10 hours ago! The fish have already left the river.

The Stream Team (Sammy, Lux the Light Sensor, and Motio the Motion Detector):

- Watched the water every second
- At 2:15 PM exactly, Sammy shouted: "ALERT! Temperature spiking NOW!"
- They immediately opened the shade covers to cool the river
- The fish stayed happy and safe!
“See the difference?” explains Captain Stream. “Batch processing is great when you can WAIT - like figuring out what the most popular ice cream flavor was last month. But stream processing is essential when you need to act FAST - like knowing when the ice cream truck arrives so you can run outside!”
The Sensor Squad learned an important lesson: not all data problems are the same!
Use Batch when: You’re doing homework about last week’s weather patterns
Use Stream when: You need to know if it’s raining RIGHT NOW so you can grab an umbrella
Motio the Motion Detector added one more example: “It’s like fire alarms! A batch fire alarm would check for smoke once a day at midnight - terrible idea! A stream fire alarm checks CONSTANTLY - that’s how you stay safe!”
3.2.2 Key Words for Kids
| Word | What It Means |
|------|---------------|
| Stream Processing | Looking at data the instant it arrives - like watching a movie as it plays instead of waiting until it ends |
| Batch Processing | Collecting lots of data into a pile, then looking at it all at once later - like saving all your mail and reading it every Sunday |
| Real-Time | Happening right now, with almost no delay - like a live video call with grandma |
| Latency | How long you have to wait for something - low latency means fast, high latency means slow |
3.2.3 Try This at Home!
The Temperature Detective Game!
This experiment shows the difference between batch and stream processing:
What you need:
A glass of water
Ice cubes
A timer
A pencil and paper
Batch Method (try this first):
Put ice in the water
Set a timer for 10 minutes
DON’T touch or look at the water!
After 10 minutes, feel the water temperature
Write down: “After 10 minutes, it was _____ cold”
Stream Method (try this second):
Put new ice in fresh water
Touch the water every 30 seconds for 10 minutes
Write down the temperature feeling each time: “30 sec: a little cold, 1 min: colder, 1:30: really cold…”
What you’ll discover:
Batch method: You only know ONE thing - how cold it was at the end
Stream method: You know EXACTLY when it got coldest and when it started warming up again!
Think about it:
Which method would you use if you wanted to drink the water at the PERFECT coldness?
Which method would you use if you just wanted to know if the ice melted?
What if the ice had a dangerous chemical and you needed to know the INSTANT it started melting?
Real-world connection:
Doctors use stream processing to monitor your heartbeat - they need to know RIGHT AWAY if something is wrong!
Weather scientists use batch processing to study climate patterns from the last 100 years - there’s no rush!
Common Misconception: “Real-Time Means Instant”
The Misconception: Real-time processing means zero latency, immediate results.
Why It’s Wrong:
“Real-time” means predictable, bounded latency - not zero
Hard real-time: Guaranteed max latency (industrial control)
Dashboard “real-time”: Updates every 5 seconds (near real-time)
All called “real-time” but vastly different guarantees
The Correct Understanding:

| Term | Latency | Guarantee | Example |
|------|---------|-----------|---------|
| Hard real-time | <10ms | Deterministic | PLC control |
| Soft real-time | <100ms | Statistical (99.9%) | VoIP, online gaming |
| Near real-time | <1s | Best effort | Analytics dashboards |
| Batch | Minutes-hours | None | Daily reports |
Specify latency requirements numerically. “Real-time” is ambiguous.
3.3 Why Stream Processing for IoT?
⏱️ ~10 min | ⭐⭐ Intermediate | 📋 P10.C14.U01
IoT applications have stringent latency requirements that batch processing cannot satisfy. Consider these real-world constraints:
| Application Domain | Required Latency | Consequence of Delay |
|--------------------|------------------|----------------------|
| Autonomous vehicles | <10 milliseconds | Collision, injury, death |
| Industrial safety systems | <100 milliseconds | Equipment damage, worker injury |
| Smart grid load balancing | <1 second | Blackouts, equipment failure |
| Predictive maintenance | <1 minute | Unexpected downtime, cascading failures |
| Financial fraud detection | <5 seconds | Monetary loss, regulatory penalties |
| Healthcare monitoring | <10 seconds | Patient deterioration, delayed intervention |
Beyond latency, stream processing offers several advantages for IoT workloads:
Memory Efficiency: Process data as it arrives rather than storing massive datasets
Continuous Insights: Monitor changing conditions in real-time
Event-Driven Actions: Trigger immediate responses to critical conditions
Temporal Accuracy: Analyze data based on when events actually occurred
Infinite Data Handling: Process unbounded streams that never “complete”
Figure 3.1: Batch vs Stream Processing Timeline Comparison
Knowledge Check: Stream vs Batch Processing
Alternative View: Lambda Architecture for Hybrid Processing
This architecture diagram shows how real-world IoT systems often combine batch and stream processing in a Lambda Architecture, using each paradigm for its strengths.
Figure 3.2: Lambda Architecture combining stream and batch processing: IoT data enters through Kafka, flowing simultaneously to Speed Layer (Flink for sub-second alerts on recent data) and Batch Layer (Spark for accurate historical aggregates on all data). The Serving Layer merges both views to provide dashboards with both real-time responsiveness and historical accuracy. This hybrid approach acknowledges that stream processing optimizes for latency while batch processing optimizes for correctness and completeness.
Figure 3.3: Understanding the fundamental difference between batch and stream processing is crucial for IoT system design. Batch processing excels when historical analysis and complex computations are acceptable with hours of latency. Stream processing is essential when millisecond-to-second latency determines system value, such as fraud detection, anomaly alerts, or autonomous vehicle control. Many production IoT systems use both: stream processing for real-time alerts and batch processing for model retraining and trend analysis.
Tradeoff: Batch Processing vs Stream Processing for IoT Analytics
Option A: Batch Processing - Process accumulated data in scheduled intervals

- Latency: Minutes to hours (scheduled job completion)
- Throughput: Very high (1M+ events/second during batch window)
- Resource cost: Burst usage (pay only during processing)
- Complexity: Lower (simpler failure recovery, replay from source)
- Use case examples: Daily reports, ML model training, historical trend analysis
- Infrastructure: Spark on EMR/Dataproc ~$0.05/GB processed

Option B: Stream Processing - Process each event as it arrives

- Latency: Milliseconds to seconds (continuous)
- Throughput: High (100K-10M+ events/second sustained, depending on framework)
- Resource cost: Always-on (24/7 compute allocation)
- Complexity: Higher (state management, exactly-once semantics, late data handling)
- Use case examples: Real-time alerts, fraud detection, live dashboards
- Infrastructure: Kafka + Flink managed ~$500-2,000/month base cost
Decision Factors:
Choose Batch when: Insight value doesn’t degrade within hours, complex ML models need full dataset context, cost optimization is critical (pay per job), regulatory reports have fixed deadlines (daily/monthly)
Choose Stream when: Alert value degrades in seconds (equipment failure, security breach), user-facing dashboards require <10 second updates, event-driven actions needed (automated shutoffs), compliance requires real-time audit trails
Lambda Architecture (Hybrid): Stream for real-time approximations + batch for accurate historical corrections (most production IoT systems use this pattern)
3.3.1 Stream Processing Architecture Overview
Moving from paradigm comparison to system design, modern stream processing systems combine multiple architectural components to handle real-time IoT data at scale. Figure 3.4 contrasts the data flow patterns, while subsequent diagrams show how these concepts translate into production architectures.
Figure 3.4: The geometric representation highlights how data volume and velocity differ between paradigms. Batch systems optimize for throughput by processing large chunks efficiently, while stream systems optimize for latency by maintaining minimal buffers and processing events individually or in micro-batches.
A complete stream processing system includes three layers: ingestion (receiving data from sensors and APIs), processing (applying transformations, aggregations, and windowing), and output (writing results to databases, dashboards, or alerting systems).
Figure 3.5: End-to-end stream processing architecture from ingestion to action
Within the processing layer, data flows through a multi-stage pipeline. Each stage performs a specific transformation, with checkpointing between stages to ensure fault tolerance. Backpressure mechanisms prevent fast producers from overwhelming slow consumers.
Figure 3.6: Stream processing pipeline with fault-tolerant checkpointing
The stream processing topology defines how operators connect to transform data. Each operator performs a specific function – filtering, mapping, or aggregating – with data flowing between operators as unbounded streams. The topology forms a directed acyclic graph (DAG) that can be parallelized across multiple processing nodes.
Figure 3.7: Stream processing operator topology for scalable event processing
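As a toy illustration of such a topology, plain Python generators can stand in for streaming operators (the sensor names, readings, and 50°C threshold below are made up for the sketch; real frameworks like Flink express the same filter/map chain declaratively):

```python
def source(readings):
    # Source operator: emits raw readings into the stream
    for r in readings:
        yield r

def filter_op(stream, min_temp):
    # Filter operator: drop readings below a threshold
    for r in stream:
        if r["temperature"] >= min_temp:
            yield r

def map_op(stream):
    # Map operator: enrich each reading with a Fahrenheit value
    for r in stream:
        yield {**r, "temperature_f": r["temperature"] * 9 / 5 + 32}

readings = [{"sensor": "s1", "temperature": 20},
            {"sensor": "s2", "temperature": 95}]

# Wire operators into a linear topology: source -> filter -> map
result = list(map_op(filter_op(source(readings), min_temp=50)))
print(result)  # only the 95 degree reading survives the filter
```

Because each operator consumes and produces a stream, operators compose into a DAG and each stage can be parallelized independently, which is exactly what production frameworks exploit.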
3.4 Core Concepts
⏱️ ~15 min | ⭐⭐⭐ Advanced | 📋 P10.C14.U02
3.4.1 Event Time vs Processing Time
One of the fundamental concepts in stream processing is the distinction between event time and processing time:
Event Time: The timestamp when the event actually occurred (e.g., when a sensor took a reading)
Processing Time: The timestamp when the system processes the event (e.g., when the server receives the data)
Example: A temperature sensor in a remote oil pipeline takes a reading at 14:00:00 showing 95°C. However, due to intermittent network connectivity, this reading doesn’t reach your processing system until 14:05:23. The event time is 14:00:00 (when the temperature was actually 95°C), but the processing time is 14:05:23.
For IoT applications, event time is typically more meaningful because you want to understand when conditions actually changed, not when you learned about them. However, this creates challenges:
Out-of-order events: Events may arrive in a different order than they occurred
Late data: Events may arrive long after they occurred
Clock skew: Different sensors may have slightly different clock times
Figure 3.8: Event time versus processing time showing network and processing delays causing 5-minute difference between sensor reading timestamp and system processing timestamp
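The pipeline reading above can be made concrete: assigning that event to a 5-minute window by each timestamp shows why the choice matters (a minimal sketch; timestamps are expressed as seconds since midnight):

```python
WINDOW_S = 300  # 5-minute windows

def window_start(ts_seconds, size=WINDOW_S):
    # Align a timestamp to the start of its window
    return (ts_seconds // size) * size

event_time = 14 * 3600                       # 14:00:00 - sensor took the reading
processing_time = 14 * 3600 + 5 * 60 + 23    # 14:05:23 - system received it

# Event-time windowing puts the 95 degree reading in the 14:00-14:05 window,
# where it actually belongs; processing-time windowing misfiles it in 14:05-14:10.
assert window_start(event_time) == 14 * 3600              # 14:00:00
assert window_start(processing_time) == 14 * 3600 + 300   # 14:05:00
```

Any aggregate computed over the 14:00-14:05 window (e.g., maximum pipeline temperature) would silently miss the 95°C spike under processing-time semantics.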
Understanding Windowing
Core Concept: Windowing divides infinite data streams into finite, bounded chunks that can be aggregated and analyzed. Without windows, you cannot compute “average temperature” because the stream never ends.
Why It Matters: The window type you choose determines both your system’s latency and memory requirements. Tumbling windows (non-overlapping) use minimal memory but can miss patterns that span window boundaries. Sliding windows (overlapping) catch boundary-spanning events but multiply memory usage by the overlap factor. Session windows (activity-based) adapt to irregular data patterns but complicate capacity planning.
Key Takeaway: Match window type to your analytics goal. Use tumbling for periodic reports (hourly billing summaries), sliding for trend detection (moving averages that update smoothly), and session for activity analysis (user sessions, machine operation cycles). A 10-second sliding window with 1-second advance uses 10x the memory of a tumbling window of the same size.
Putting Numbers to It
Why does sliding window memory explode? A 10-second sliding window advancing every 1 second maintains 10 overlapping windows simultaneously. For 1,000 sensors at 10 Hz (10 readings/sec):
At 100 bytes per reading: tumbling = 10 MB, sliding = 100 MB (10x memory cost). Scale to 10,000 sensors: tumbling = 100 MB vs sliding = 1 GB. Choose window size carefully based on RAM constraints.
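These figures can be reproduced with a few lines of arithmetic (a sketch using the chapter's numbers; the model assumes every buffered reading costs `bytes_per_reading` and that a sliding window materializes `window / slide` overlapping windows of state):

```python
def window_memory_bytes(sensors, hz, window_s, bytes_per_reading, slide_s=None):
    """Estimate buffered-reading memory for one windowing strategy.

    A tumbling window holds one window of readings; a sliding window
    advancing every slide_s seconds holds window_s / slide_s
    overlapping windows' worth of state.
    """
    one_window = sensors * hz * window_s * bytes_per_reading
    if slide_s is None:                     # tumbling: no overlap
        return one_window
    overlap_factor = window_s // slide_s    # e.g. 10 s / 1 s = 10
    return one_window * overlap_factor

# Chapter numbers: 1,000 sensors at 10 Hz, 100-byte readings, 10 s window
tumbling = window_memory_bytes(1_000, 10, 10, 100)             # 10 MB
sliding = window_memory_bytes(1_000, 10, 10, 100, slide_s=1)   # 100 MB
print(tumbling // 10**6, "MB tumbling vs", sliding // 10**6, "MB sliding")
```

Scaling `sensors` to 10,000 reproduces the 100 MB vs 1 GB comparison from the text.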
Interactive: Stream Processing Window Types
Interactive: Event Time vs Processing Time
3.4.2 Windowing Strategies
With the need to bound infinite streams established, let’s examine the three primary windowing strategies in detail: tumbling, sliding, and session windows. Each makes different tradeoffs between memory usage, latency, and analytical accuracy.
3.4.2.1 Tumbling Windows
Non-overlapping, fixed-size windows that cover the entire stream without gaps.
Use Case: Calculate average temperature every 5 minutes
Figure 3.9: Tumbling windows showing non-overlapping 5-minute intervals with events assigned to single windows based on timestamp
Characteristics:
Each event belongs to exactly one window
Simple to implement and reason about
Fixed computation frequency
Best for periodic aggregations
3.4.2.2 Sliding Windows
Overlapping windows that slide by a specified interval, smaller than the window size.
Use Case: Calculate moving average of last 10 minutes, updated every 1 minute
Figure 3.10: Sliding windows showing overlapping 10-minute windows advancing by 1-minute increments with single event appearing in multiple windows
Characteristics:
Events may belong to multiple overlapping windows
Provides smoother, continuous results
Higher computational cost (more windows to maintain)
Best for trend detection and moving averages
3.5 Tradeoff: Tumbling Windows vs Sliding Windows for IoT Aggregation
Option A: Tumbling Windows (Non-overlapping)
Memory usage: Low - only 1-2 windows active per key at any time
Memory for 10K sensors, 5-min window: ~200 MB aggregate state (1 window of running stats per sensor, ~20 KB each)
Computational cost: O(n) - each event processed once
Choose Tumbling when: Memory is constrained (edge devices, many keys), exact aggregation boundaries matter (hourly billing), processing cost must be minimized, output frequency matches business requirements
Hybrid approach: Tumbling for storage (persist hourly aggregates), sliding for real-time display (smooth visualization) - Flink supports both on the same stream
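The "~20 KB of running stats per sensor" figure works because tumbling aggregation only needs constant-size incremental accumulators, not the raw readings (a sketch of the idea; the exact fields kept per sensor are an assumption):

```python
class RunningStats:
    """Constant-size per-sensor state for a tumbling window:
    count/sum/min/max accumulators instead of buffering every reading."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.min = float("inf")
        self.max = float("-inf")

    def add(self, value):
        # O(1) update per event: no reading is ever buffered
        self.count += 1
        self.total += value
        self.min = min(self.min, value)
        self.max = max(self.max, value)

    def result(self):
        return {"count": self.count,
                "avg": self.total / self.count,
                "min": self.min, "max": self.max}

stats = RunningStats()
for temp in (20.0, 25.0, 30.0):
    stats.add(temp)
print(stats.result())  # {'count': 3, 'avg': 25.0, 'min': 20.0, 'max': 30.0}
```

State size is independent of the event rate, which is why tumbling aggregation scales to many keys on memory-constrained edge devices.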
3.5.0.1 Session Windows
Dynamic windows based on activity periods, separated by gaps of inactivity.
Use Case: Group all sensor readings from a machine during an active work session
Figure 3.11: Session windows showing dynamic window sizes based on activity with 5-minute inactivity timeout separating three distinct sessions
Characteristics:
Window size varies based on data patterns
Automatically adapts to activity levels
Ideal for user behavior and machine operation cycles
Best for analyzing complete work sessions or usage periods
Key Insight: Windows Bound Infinite Streams
Figure 3.12: Windowing Transforms Unbounded Streams into Finite Aggregatable Chunks
In Plain English: You can’t compute “average temperature” over infinite data - it never ends! Windows create finite chunks you can actually process. “Average temperature in the last 5 minutes” is computable.
Figure 3.13: Window type comparison: Tumbling (fixed non-overlapping), Sliding (fixed overlapping for smooth trends), Session (variable gap-based for activity bursts)
Figure 3.14: IoT temperature sensor windowing example: Tumbling for periodic averages, Sliding for trend detection, Session for machine operation analysis
Figure 3.15: Late data handling: Events arriving after window close can update results during grace period or be dropped/side-output if grace expires
Figure 3.16: Decision tree for selecting the appropriate window type based on use case requirements
Figure 3.17: Memory footprint comparison showing how sliding windows consume significantly more memory than tumbling windows due to overlap
Try It: IoT Stream Processing with Sliding and Tumbling Windows
Objective: Simulate real-time IoT sensor stream processing in Python using asyncio. This demonstrates tumbling windows (non-overlapping aggregation for periodic reports), sliding windows (overlapping for smooth trend detection), and event-time vs processing-time semantics. All concepts map directly to Apache Flink/Kafka Streams equivalents.
Tumbling Window – non-overlapping, fixed-size windows where each reading belongs to exactly one window:
```python
class TumblingWindow:
    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.current_start = None
        self.buffer = []

    def process(self, reading):
        event_time = reading["event_time"]
        # Align the event to the start of its window
        window_start = (event_time // self.window_seconds) * self.window_seconds
        if self.current_start is not None and window_start != self.current_start:
            # Window boundary crossed: emit the closed window, start a new one
            result = self._emit()
            self.buffer = [reading]
            self.current_start = window_start
            return result
        self.current_start = window_start
        self.buffer.append(reading)
        return None

    def _emit(self):
        temps = [r["temperature"] for r in self.buffer]
        return {
            "count": len(temps),
            "avg_temp": round(sum(temps) / len(temps), 1),
            "min_temp": min(temps),
            "max_temp": max(temps),
        }
```
Sliding Window – overlapping windows with configurable slide interval; each reading can appear in multiple windows:
```python
from collections import deque

class SlidingWindow:
    def __init__(self, window_seconds, slide_seconds):
        self.window_seconds = window_seconds
        self.slide_seconds = slide_seconds
        self.buffer = deque()
        self.last_emit = 0

    def process(self, reading):
        self.buffer.append(reading)
        now = reading["event_time"]
        # Evict readings older than the window
        cutoff = now - self.window_seconds
        while self.buffer and self.buffer[0]["event_time"] < cutoff:
            self.buffer.popleft()
        # Emit once per slide interval
        if now - self.last_emit >= self.slide_seconds:
            self.last_emit = now
            temps = [r["temperature"] for r in self.buffer]
            return {
                "readings": len(temps),
                "moving_avg": round(sum(temps) / len(temps), 1),
            }
        return None
```
Key difference: A tumbling window buffers at most one window's worth of readings, with no overlap. Frameworks that materialize each sliding window separately hold window_size / slide_interval overlapping windows' worth of state - the overlap factor behind the memory multiplication discussed earlier - in exchange for smoother trends. In production, Flink/Kafka Streams manage this state automatically.
What to Observe:
Tumbling windows produce one result per 5-second interval – each reading counted exactly once (ideal for billing, periodic reports)
Sliding windows update every 2 seconds with a 10-second lookback – readings appear in up to 5 overlapping windows (ideal for smooth dashboards, anomaly detection)
The sliding window uses ~5x more memory than the tumbling window because of the overlap factor (10s / 2s = 5)
Event time is when the sensor measured, arrival time includes network delay – using event time ensures correct aggregation even with delayed readings
Late-arriving data (readings that arrive after their window closed) would need a grace period or side-output in production systems
3.5.1 Interactive: Window Memory Calculator
Use this calculator to explore how windowing parameters affect memory requirements for your IoT stream processing deployment. Adjust the sliders to match your scenario and observe how memory scales with different window strategies.
3.5.2 Worked Example: Kafka Windowing for Industrial Sensor Analytics
Real-world stream processing requires careful window design to balance memory usage, latency, and analytical accuracy. This example demonstrates how to calculate resource requirements for different windowing strategies in a high-throughput industrial IoT scenario.
Worked Example: Stream Processing Window Design
Context: A smart factory with 10,000 sensors monitoring industrial equipment for predictive maintenance and anomaly detection.
Given:
| Parameter | Value |
|-----------|-------|
| Number of sensors | 10,000 devices |
| Message rate per sensor | 100 messages/second |
| Total message rate | 1,000,000 messages/second (1M msg/sec) |
| Message size (JSON payload) | 100 bytes |
| Analytics goal | Detect anomalies within 10-second windows |
| Available memory per Kafka Streams instance | 16 GB |
| Required latency | <10 seconds for anomaly detection |
Problem: Design an appropriate windowing strategy that fits within memory constraints while meeting latency requirements.
3.5.3 Step 1: Calculate Base Data Rate
First, determine how much data flows through the system:
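Using the parameters from the Given table, the arithmetic is a few lines (a sketch; unit conversions use decimal MB/GB as the chapter does):

```python
sensors = 10_000
msgs_per_sensor = 100   # messages/second per sensor
msg_bytes = 100         # JSON payload size
window_s = 10           # anomaly-detection window

msgs_per_sec = sensors * msgs_per_sensor       # total ingest rate
bytes_per_sec = msgs_per_sec * msg_bytes       # raw data rate
bytes_per_window = bytes_per_sec * window_s    # data held per window

print(f"{msgs_per_sec:,} msg/s, {bytes_per_sec / 10**6:.0f} MB/s, "
      f"{bytes_per_window / 10**9:.0f} GB per 10 s window")
# 1,000,000 msg/s, 100 MB/s, 1 GB per 10 s window
```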
Tumbling windows are non-overlapping, so at any moment we maintain:

- Current window: Actively receiving events (1 GB)
- Processing window: Computing aggregations from the just-closed window (1 GB)

Critical Finding: With a 1-second slide, we use 10 GB of the 16 GB available - 62.5% utilization. This leaves limited headroom for:

- State store overhead (~20%)
- Garbage collection spikes
- Traffic bursts
3.5.6 Step 4: Calculate Optimized Configuration
For production safety, target 50% memory utilization:
If latest event has timestamp 14:00:30 and max out-of-orderness is 10s, watermark is 14:00:20
Watermark advancing to 14:00:10 means “I believe all events before 14:00:10 have arrived”
Windows close when watermark passes their end time
Note: “max out-of-orderness” (watermark delay) and “allowed lateness” are different concepts – the former controls watermark position, the latter controls how long after window close late events are still accepted
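The watermark rule above can be sketched in a few lines (an illustrative model, not a specific framework API; times are seconds since 14:00:00, and `max_out_of_orderness` is the 10-second bound from the text):

```python
def watermark(max_event_time_seen, max_out_of_orderness):
    """Watermark = latest event time seen minus the out-of-orderness bound."""
    return max_event_time_seen - max_out_of_orderness

def window_closed(window_end, current_watermark):
    """A window [start, end) closes once the watermark passes its end time."""
    return current_watermark >= window_end

# From the text: latest event at 14:00:30 with a 10 s bound -> watermark 14:00:20
wm = watermark(30, 10)
assert wm == 20
assert window_closed(10, wm)        # window [14:00:00, 14:00:10) has closed
assert not window_closed(30, wm)    # window ending 14:00:30 stays open
```

Flink's bounded-out-of-orderness watermark strategy implements essentially this rule per source partition.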
Step 4: Handle Late Arrivals
Event arrives with timestamp 14:00:05 when watermark is already 14:00:20
Window [14:00:00 - 14:00:10) already closed and emitted results
Late event goes to side output or triggers result update (depending on allowed-lateness config)
If allowed-lateness permits, late events can trigger updated results
Why This Matters: Processing-time windowing (based on arrival time) is simpler but produces incorrect results when network delays cause events to arrive out of order. Event-time windowing with watermarks ensures correct results at the cost of a bounded amount of added latency - windows wait for the watermark before closing.
3.7 Concept Check
🏷️ Label the Diagram
3.8 Summary
Stream processing enables IoT systems to analyze data in motion rather than at rest, which is essential when insight value degrades over time. The key concepts covered in this chapter include:
Batch vs stream processing: Batch optimizes for throughput and completeness; stream optimizes for latency and continuous insight. Lambda Architecture combines both.
Event time vs processing time: Event time reflects when data was generated; processing time reflects when it was received. Event-time semantics ensure correct results despite network delays.
Windowing strategies: Tumbling windows (non-overlapping, low memory, periodic reports), sliding/hopping windows (overlapping, higher memory, smooth trend detection), and session windows (activity-based, variable size, bursty workloads).
Memory-latency tradeoff: Sliding windows with overlap factor N require N times the memory of tumbling windows. Production systems should target 50% memory utilization for headroom.
Watermarks and late data: Watermarks track event-time progress to determine when windows can close. Late events (arriving after window close) are handled via side outputs or allowed-lateness policies.
The worked example demonstrated that a 10,000-sensor factory at 1M messages/second generates 1 GB per 10-second window, making window strategy selection a critical design decision with direct memory and latency consequences.
3.9 Concept Relationships
Stream processing fundamentals connect across IoT domains:
Requires: Big Data Overview – understanding data volume/velocity motivates streaming
1. Using Batch Processing Patterns for Streaming Problems
Developers familiar with batch processing often apply batch patterns (read all data, process, write results) to streaming — buffering events until a large batch accumulates before processing. This introduces unnecessary latency (minutes to hours) for problems that could be solved event-by-event with millisecond latency. Identify stream processing problems by their nature (continuous data, need for immediate response) and use streaming patterns from the start.
2. Conflating Event Time With Processing Time
Using system time (when the processor receives the event) instead of event time (when the sensor generated it) produces incorrect temporal analytics. A temperature aggregation using processing time will show wrong results when events arrive out of order or delayed. Always use event timestamps embedded in sensor data for time-based calculations.
3. Stateful Computations Without Recovery Planning
Streaming pipelines maintaining state (running averages, session tracking, anomaly detection models) lose all state on failure without checkpointing. Recovering from failure by reprocessing all historical data may take hours for data streams that have been running for months. Implement incremental checkpointing to limit recovery time to seconds or minutes.
4. Not Setting Consumer Lag Alerting
Consumer lag (how far behind the stream processor is from the latest events) is the leading indicator of streaming pipeline degradation. A growing lag means the processor is falling behind the ingestion rate and will eventually drop events or run indefinitely behind real-time. Set up lag alerting with a threshold of 10,000 events or 60 seconds before the pipeline becomes unacceptably delayed.
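A minimal sketch of such a lag check, using the thresholds from the text (the offset and timestamp values are illustrative; a real deployment would read offsets from the broker, e.g. via Kafka's consumer-group tooling, rather than pass them in by hand):

```python
LAG_EVENTS_THRESHOLD = 10_000   # events behind before alerting
LAG_SECONDS_THRESHOLD = 60      # seconds behind before alerting

def lag_alert(latest_offset, committed_offset,
              latest_event_ts, last_processed_ts):
    """Alert when the consumer trails by too many events or too much time."""
    event_lag = latest_offset - committed_offset
    time_lag = latest_event_ts - last_processed_ts
    return event_lag > LAG_EVENTS_THRESHOLD or time_lag > LAG_SECONDS_THRESHOLD

assert not lag_alert(105_000, 100_000, 1000.0, 970.0)  # 5k events, 30 s: fine
assert lag_alert(120_000, 100_000, 1000.0, 970.0)      # 20k events behind
assert lag_alert(101_000, 100_000, 1000.0, 930.0)      # 70 s behind
```

Tracking both dimensions matters: a low-volume stream can be only a few events behind yet minutes stale, and vice versa.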
Continue to Stream Processing Architectures to learn about Apache Kafka, Apache Flink, and Spark Streaming, including architecture comparisons and technology selection guidance.