3  Stream Processing Fundamentals

| Chapter | Topic |
|---------|-------|
| Stream Processing | Overview of batch vs. stream processing for IoT |
| Fundamentals | Windowing strategies, watermarks, and event-time semantics |
| Architectures | Kafka, Flink, and Spark Streaming comparison |
| Pipelines | End-to-end ingestion, processing, and output design |
| Challenges | Late data, exactly-once semantics, and backpressure |
| Pitfalls | Common mistakes and production worked examples |
| Basic Lab | ESP32 circular buffers, windows, and event detection |
| Advanced Lab | CEP, pattern matching, and anomaly detection on ESP32 |
| Game & Summary | Interactive review game and module summary |
| Interoperability | Four levels of IoT interoperability |
| Interop Fundamentals | Technical, syntactic, semantic, and organizational layers |
| Interop Standards | SenML, JSON-LD, W3C WoT, and oneM2M |
| Integration Patterns | Protocol adapters, gateways, and ontology mapping |
In 60 Seconds

Stream processing analyzes data in motion rather than at rest, enabling IoT systems to make real-time decisions with millisecond latency. Windowing strategies divide infinite data streams into finite chunks: tumbling windows (non-overlapping) for periodic reports, sliding windows (overlapping) for trend detection at 5-10x memory cost, and session windows (activity-based) for grouping bursty events. A smart factory with 10,000 sensors at 100 messages/second generates 1 GB per 10-second window, making memory-aware window design critical.

3.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Differentiate between batch processing and stream processing paradigms in IoT contexts
  • Classify windowing strategies (tumbling, sliding, session) by their memory cost, overlap behavior, and IoT applicability
  • Calculate memory and compute requirements for different windowing strategies
  • Design real-time IoT data pipelines with appropriate window types and stages
  • Handle late-arriving data and out-of-order events using watermarks and event-time semantics
  • Apply stream processing concepts to real-world IoT scenarios with production constraints

3.2 Introduction

A modern autonomous vehicle generates approximately 1 gigabyte of sensor data every second from LIDAR, cameras, radar, GPS, and inertial measurement units. If you attempt to store this data to disk and then query it for analysis, you’ve already crashed into the obstacle ahead. The vehicle needs to make decisions in under 10 milliseconds based on continuously arriving sensor streams.

This is the fundamental challenge that stream processing solves: processing data in motion, not data at rest. Traditional batch processing analyzes historical data after it’s been collected and stored. Stream processing analyzes data as it arrives, enabling real-time decisions, alerts, and actions.

In IoT systems, stream processing has become essential infrastructure. Whether you’re monitoring industrial equipment for failure prediction, balancing electrical grids in real-time, or detecting fraud in payment systems, the ability to process continuous data streams with low latency determines system effectiveness.

Key Takeaway

In one sentence: Process data as it arrives - waiting for batch windows means missing the moment when action could have prevented the problem.

Remember this rule: If the value of your insight degrades over time, use stream processing; if you need the complete picture, use batch.

Think of batch processing like counting votes after an election closes. You wait until all ballots are collected, then count them all at once to determine the winner. This works well when you can afford to wait.

Stream processing is like a live vote counter that updates the tally in real-time as each ballot arrives. You see the current state continuously, even before voting ends. This is essential when waiting isn’t an option.

In IoT, batch processing might analyze yesterday’s temperature readings to find trends. Stream processing monitors temperature readings as they arrive to trigger an immediate alert if your server room overheats.

Stream processing is like having a lifeguard who watches the pool RIGHT NOW, instead of checking photos from yesterday!

3.2.1 The Sensor Squad Adventure: The Never-Ending River Race

The Sensor Squad has a new mission: watching the Data River, where information flows like water - constantly moving, never stopping! The question is: how should they watch it?

Coach Batch has one idea: “Let’s collect ALL the water in buckets throughout the day. At midnight, we’ll look at all the buckets together and see what we collected!” This is called batch processing - wait until you have a big pile of data, then look at it all at once.

But Sammy the Temperature Sensor is worried. “What if the water gets too hot RIGHT NOW? By midnight, it might be too late!”

Captain Stream has a different idea: “Let’s taste the water as it flows by - every single second!” This is stream processing - looking at each drop of data the moment it arrives.

To test both ideas, they set up an experiment at the Data River:

The Batch Team (Coach Batch, Pressi the Pressure Sensor, and Bucketbot):

  • Collected water samples in buckets all day
  • At midnight, they discovered: “Oh no! The water was dangerously hot at 2:15 PM!”
  • But it’s now midnight - that was 10 hours ago! The fish have already left the river.

The Stream Team (Sammy, Lux the Light Sensor, and Motio the Motion Detector):

  • Watched the water every second
  • At 2:15 PM exactly, Sammy shouted: “ALERT! Temperature spiking NOW!”
  • They immediately opened the shade covers to cool the river
  • The fish stayed happy and safe!

“See the difference?” explains Captain Stream. “Batch processing is great when you can WAIT - like figuring out what the most popular ice cream flavor was last month. But stream processing is essential when you need to act FAST - like knowing when the ice cream truck arrives so you can run outside!”

The Sensor Squad learned an important lesson: not all data problems are the same!

  • Use Batch when: You’re doing homework about last week’s weather patterns
  • Use Stream when: You need to know if it’s raining RIGHT NOW so you can grab an umbrella

Motio the Motion Detector added one more example: “It’s like fire alarms! A batch fire alarm would check for smoke once a day at midnight - terrible idea! A stream fire alarm checks CONSTANTLY - that’s how you stay safe!”

3.2.2 Key Words for Kids

| Word | What It Means |
|------|---------------|
| Stream Processing | Looking at data the instant it arrives - like watching a movie as it plays instead of waiting until it ends |
| Batch Processing | Collecting lots of data into a pile, then looking at it all at once later - like saving all your mail and reading it every Sunday |
| Real-Time | Happening right now, with almost no delay - like a live video call with grandma |
| Latency | How long you have to wait for something - low latency means fast, high latency means slow |

3.2.3 Try This at Home!

The Temperature Detective Game!

This experiment shows the difference between batch and stream processing:

What you need:

  • A glass of water
  • Ice cubes
  • A timer
  • A pencil and paper

Batch Method (try this first):

  1. Put ice in the water
  2. Set a timer for 10 minutes
  3. DON’T touch or look at the water!
  4. After 10 minutes, feel the water temperature
  5. Write down: “After 10 minutes, it was _____ cold”

Stream Method (try this second):

  1. Put new ice in fresh water
  2. Touch the water every 30 seconds for 10 minutes
  3. Write down the temperature feeling each time: “30 sec: a little cold, 1 min: colder, 1:30: really cold…”

What you’ll discover:

  • Batch method: You only know ONE thing - how cold it was at the end
  • Stream method: You know EXACTLY when it got coldest and when it started warming up again!

Think about it:

  • Which method would you use if you wanted to drink the water at the PERFECT coldness?
  • Which method would you use if you just wanted to know if the ice melted?
  • What if the ice had a dangerous chemical and you needed to know the INSTANT it started melting?

Real-world connection:

  • Doctors use stream processing to monitor your heartbeat - they need to know RIGHT AWAY if something is wrong!
  • Weather scientists use batch processing to study climate patterns from the last 100 years - there’s no rush!

The Misconception: Real-time processing means zero latency, immediate results.

Why It’s Wrong:

  • “Real-time” means predictable, bounded latency - not zero
  • Hard real-time: Guaranteed max latency (industrial control)
  • Soft real-time: Statistical latency bounds (streaming)
  • Near real-time: Low but not guaranteed (analytics)
  • Zero latency violates physics (processing takes time)

Real-World Example:

  • Stock trading “real-time” system: 1-50ms latency depending on strategy (not instant)
  • Factory alarm “real-time”: 10ms guaranteed (hard real-time)
  • Dashboard “real-time”: Updates every 5 seconds (near real-time)
  • All called “real-time” but vastly different guarantees

The Correct Understanding:

| Term | Latency | Guarantee | Example |
|------|---------|-----------|---------|
| Hard real-time | <10ms | Deterministic | PLC control |
| Soft real-time | <100ms | Statistical (99.9%) | VoIP, online gaming |
| Near real-time | <1s | Best effort | Analytics dashboards |
| Batch | Minutes-hours | None | Daily reports |

Specify latency requirements numerically. “Real-time” is ambiguous.

3.3 Why Stream Processing for IoT?

⏱️ ~10 min | ⭐⭐ Intermediate | 📋 P10.C14.U01

IoT applications have stringent latency requirements that batch processing cannot satisfy. Consider these real-world constraints:

| Application Domain | Required Latency | Consequence of Delay |
|--------------------|------------------|----------------------|
| Autonomous vehicles | <10 milliseconds | Collision, injury, death |
| Industrial safety systems | <100 milliseconds | Equipment damage, worker injury |
| Smart grid load balancing | <1 second | Blackouts, equipment failure |
| Predictive maintenance | <1 minute | Unexpected downtime, cascading failures |
| Financial fraud detection | <5 seconds | Monetary loss, regulatory penalties |
| Healthcare monitoring | <10 seconds | Patient deterioration, delayed intervention |

Beyond latency, stream processing offers several advantages for IoT workloads:

  1. Memory Efficiency: Process data as it arrives rather than storing massive datasets
  2. Continuous Insights: Monitor changing conditions in real-time
  3. Event-Driven Actions: Trigger immediate responses to critical conditions
  4. Temporal Accuracy: Analyze data based on when events actually occurred
  5. Infinite Data Handling: Process unbounded streams that never “complete”

Timeline comparison diagram showing batch processing waiting to accumulate data for hours before analysis versus stream processing analyzing each data point immediately as it arrives, with latency differences illustrated from hours down to milliseconds

Figure 3.1: Batch vs Stream Processing Timeline Comparison

This architecture diagram shows how real-world IoT systems often combine batch and stream processing in a Lambda Architecture, using each paradigm for its strengths.

Figure 3.2: Lambda Architecture combining stream and batch processing: IoT data enters through Kafka, flowing simultaneously to Speed Layer (Flink for sub-second alerts on recent data) and Batch Layer (Spark for accurate historical aggregates on all data). The Serving Layer merges both views to provide dashboards with both real-time responsiveness and historical accuracy. This hybrid approach acknowledges that stream processing optimizes for latency while batch processing optimizes for correctness and completeness.
Artistic comparison of batch and stream processing architectures showing batch processing as periodic data trucks delivering accumulated data to a warehouse, versus stream processing as a continuous conveyor belt with real-time analysis stations processing individual items as they flow through
Figure 3.3: Understanding the fundamental difference between batch and stream processing is crucial for IoT system design. Batch processing excels when historical analysis and complex computations are acceptable with hours of latency. Stream processing is essential when millisecond-to-second latency determines system value, such as fraud detection, anomaly alerts, or autonomous vehicle control. Many production IoT systems use both: stream processing for real-time alerts and batch processing for model retraining and trend analysis.
Tradeoff: Batch Processing vs Stream Processing for IoT Analytics

Option A: Batch Processing - Process accumulated data in scheduled intervals

  • Latency: Minutes to hours (scheduled job completion)
  • Throughput: Very high (1M+ events/second during batch window)
  • Resource cost: Burst usage (pay only during processing)
  • Complexity: Lower (simpler failure recovery, replay from source)
  • Use case examples: Daily reports, ML model training, historical trend analysis
  • Infrastructure: Spark on EMR/Dataproc ~$0.05/GB processed

Option B: Stream Processing - Process each event as it arrives

  • Latency: Milliseconds to seconds (continuous)
  • Throughput: High (100K-10M+ events/second sustained, depending on framework)
  • Resource cost: Always-on (24/7 compute allocation)
  • Complexity: Higher (state management, exactly-once semantics, late data handling)
  • Use case examples: Real-time alerts, fraud detection, live dashboards
  • Infrastructure: Kafka + Flink managed ~$500-2,000/month base cost

Decision Factors:

  • Choose Batch when: Insight value doesn’t degrade within hours, complex ML models need full dataset context, cost optimization is critical (pay per job), regulatory reports have fixed deadlines (daily/monthly)
  • Choose Stream when: Alert value degrades in seconds (equipment failure, security breach), user-facing dashboards require <10 second updates, event-driven actions needed (automated shutoffs), compliance requires real-time audit trails
  • Lambda Architecture (Hybrid): Stream for real-time approximations + batch for accurate historical corrections (most production IoT systems use this pattern)

3.3.1 Stream Processing Architecture Overview

Moving from paradigm comparison to system design, modern stream processing systems combine multiple architectural components to handle real-time IoT data at scale. Figure 3.4 contrasts the data flow patterns, while subsequent diagrams show how these concepts translate into production architectures.

Geometric diagram contrasting batch and stream processing data flows with batch showing large blocks of data moving through discrete processing stages versus stream showing continuous small data elements flowing through parallel processing pipelines with real-time aggregation outputs
Figure 3.4: The geometric representation highlights how data volume and velocity differ between paradigms. Batch systems optimize for throughput by processing large chunks efficiently, while stream systems optimize for latency by maintaining minimal buffers and processing events individually or in micro-batches.

A complete stream processing system includes three layers: ingestion (receiving data from sensors and APIs), processing (applying transformations, aggregations, and windowing), and output (writing results to databases, dashboards, or alerting systems).

Comprehensive stream processing architecture diagram showing data ingestion layer with multiple sources, stream processing engine with operators and state management, and output sinks for databases, dashboards, and alerting systems
Figure 3.5: End-to-end stream processing architecture from ingestion to action

Within the processing layer, data flows through a multi-stage pipeline. Each stage performs a specific transformation, with checkpointing between stages to ensure fault tolerance. Backpressure mechanisms prevent fast producers from overwhelming slow consumers.
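Backpressure can be illustrated with a bounded queue: when the consumer falls behind, the producer blocks on insertion instead of letting memory grow without bound. A minimal standard-library sketch (illustrative only; real engines like Flink propagate backpressure through network buffers rather than a single in-process queue):

```python
import queue
import threading

# Bounded queue: the bound (100) is an illustrative choice.
buffer = queue.Queue(maxsize=100)

def producer(n_events):
    for i in range(n_events):
        buffer.put({"seq": i})  # blocks when the queue is full (backpressure)

def consumer(results, n_events):
    for _ in range(n_events):
        event = buffer.get()    # blocks when the queue is empty
        results.append(event["seq"])
        buffer.task_done()

results = []
t1 = threading.Thread(target=producer, args=(1000,))
t2 = threading.Thread(target=consumer, args=(results, 1000))
t1.start(); t2.start()
t1.join(); t2.join()
# All 1,000 events arrive in order; memory never exceeds 100 buffered events.
```

The key property is that the fast side slows down automatically; no events are dropped and buffering stays bounded.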

Multi-stage stream processing pipeline showing data flow from IoT sensors through ingestion, parsing, enrichment, aggregation, and output stages with backpressure handling and checkpointing for fault tolerance
Figure 3.6: Stream processing pipeline with fault-tolerant checkpointing

The stream processing topology defines how operators connect to transform data. Each operator performs a specific function – filtering, mapping, or aggregating – with data flowing between operators as unbounded streams. The topology forms a directed acyclic graph (DAG) that can be parallelized across multiple processing nodes.

Stream processing topology diagram showing directed acyclic graph of operators including source, filter, map, window aggregate, and sink nodes with parallel partitions for scalable processing of IoT event streams
Figure 3.7: Stream processing operator topology for scalable event processing

3.4 Core Concepts

⏱️ ~15 min | ⭐⭐⭐ Advanced | 📋 P10.C14.U02

3.4.1 Event Time vs Processing Time

One of the fundamental concepts in stream processing is the distinction between event time and processing time:

  • Event Time: The timestamp when the event actually occurred (e.g., when a sensor took a reading)
  • Processing Time: The timestamp when the system processes the event (e.g., when the server receives the data)

Example: A temperature sensor in a remote oil pipeline takes a reading at 14:00:00 showing 95°C. However, due to intermittent network connectivity, this reading doesn’t reach your processing system until 14:05:23. The event time is 14:00:00 (when the temperature was actually 95°C), but the processing time is 14:05:23.

For IoT applications, event time is typically more meaningful because you want to understand when conditions actually changed, not when you learned about them. However, this creates challenges:

  • Out-of-order events: Events may arrive in a different order than they occurred
  • Late data: Events may arrive long after they occurred
  • Clock skew: Different sensors may have slightly different clock times
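Watermarks are the standard tool for coping with these challenges: a watermark asserts "no event older than time T is still expected." A minimal sketch, assuming the common bounded-out-of-orderness heuristic (maximum event time seen minus a fixed delay bound, as in Flink's bounded-out-of-orderness watermark strategy); the class and parameter names here are illustrative, not a real engine API:

```python
class BoundedOutOfOrdernessWatermark:
    """Track the highest event time seen; the watermark trails it
    by a fixed out-of-orderness bound."""
    def __init__(self, max_delay_seconds):
        self.max_delay = max_delay_seconds
        self.max_event_time = float("-inf")

    def observe(self, event_time):
        self.max_event_time = max(self.max_event_time, event_time)

    def current_watermark(self):
        # "No event with a timestamp at or before this is still expected."
        return self.max_event_time - self.max_delay

wm = BoundedOutOfOrdernessWatermark(max_delay_seconds=10)
for t in [100, 105, 103, 112]:   # event times arriving out of order
    wm.observe(t)
print(wm.current_watermark())    # 112 - 10 = 102
```

A window ending at time 102 or earlier can now safely close; anything older that still arrives is "late data" and needs a grace period or side output.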

Flowchart illustrating the difference between event time when a sensor reading actually occurred at 14:00:00 and processing time when the system received it at 14:05:23, showing network delays and intermittent connectivity causing the 5-minute gap

Figure 3.8: Event time versus processing time showing network and processing delays causing 5-minute difference between sensor reading timestamp and system processing timestamp
Understanding Windowing

Core Concept: Windowing divides infinite data streams into finite, bounded chunks that can be aggregated and analyzed. Without windows, you cannot compute “average temperature” because the stream never ends.

Why It Matters: The window type you choose determines both your system’s latency and memory requirements. Tumbling windows (non-overlapping) use minimal memory but can miss patterns that span window boundaries. Sliding windows (overlapping) catch boundary-spanning events but multiply memory usage by the overlap factor. Session windows (activity-based) adapt to irregular data patterns but complicate capacity planning.

Key Takeaway: Match window type to your analytics goal. Use tumbling for periodic reports (hourly billing summaries), sliding for trend detection (moving averages that update smoothly), and session for activity analysis (user sessions, machine operation cycles). A 10-second sliding window with 1-second advance uses 10x the memory of a tumbling window of the same size.

Why does sliding window memory explode? A 10-second sliding window advancing every 1 second maintains 10 overlapping windows simultaneously. For 1,000 sensors at 10 Hz (10 readings/sec):

\[\text{Tumbling (10s)}: 1{,}000 \times 10\text{ Hz} \times 10\text{ s} = 100{,}000\text{ readings stored}\]

\[\text{Sliding (10s / 1s)}: 100{,}000 \times 10\text{ overlap factor} = 1{,}000{,}000\text{ readings stored}\]

At 100 bytes per reading: tumbling = 10 MB, sliding = 100 MB (10x memory cost). Scale to 10,000 sensors: tumbling = 100 MB vs sliding = 1 GB. Choose window size carefully based on RAM constraints.
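The arithmetic above can be wrapped in a small helper for exploring other configurations (a sketch of the chapter's formula, assuming the slide interval evenly divides the window size and ignoring state-store overhead):

```python
def window_memory_bytes(sensors, hz, window_s, slide_s, bytes_per_reading=100):
    """Raw buffered-reading estimate: readings per window times the
    number of concurrently overlapping windows."""
    readings_per_window = sensors * hz * window_s
    overlap = window_s // slide_s   # tumbling: slide == window -> overlap = 1
    return readings_per_window * overlap * bytes_per_reading

# 1,000 sensors at 10 Hz, 100-byte readings:
tumbling = window_memory_bytes(1_000, 10, 10, 10)  # no overlap
sliding  = window_memory_bytes(1_000, 10, 10, 1)   # 10x overlap
print(tumbling // 1_000_000, "MB")  # 10 MB
print(sliding // 1_000_000, "MB")   # 100 MB
```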

3.4.2 Windowing Strategies

With the need to bound infinite streams established, let’s examine the three primary windowing strategies in detail: tumbling, sliding, and session windows. Each makes different tradeoffs between memory usage, latency, and analytical accuracy.

3.4.2.1 Tumbling Windows

Non-overlapping, fixed-size windows that cover the entire stream without gaps.

Use Case: Calculate average temperature every 5 minutes

Diagram showing non-overlapping tumbling windows in 5-minute intervals, with sensor events assigned to exactly one window based on their timestamp, creating distinct time buckets for periodic aggregation

Figure 3.9: Tumbling windows showing non-overlapping 5-minute intervals with events assigned to single windows based on timestamp

Characteristics:

  • Each event belongs to exactly one window
  • Simple to implement and reason about
  • Fixed computation frequency
  • Best for periodic aggregations

3.4.2.2 Sliding Windows

Overlapping windows that slide by a specified interval, smaller than the window size.

Use Case: Calculate moving average of last 10 minutes, updated every 1 minute

Diagram showing overlapping sliding windows with 10-minute size advancing every 1 minute, illustrating how a single event at time T appears in 10 consecutive windows to enable smooth moving average calculations

Figure 3.10: Sliding windows showing overlapping 10-minute windows advancing by 1-minute increments with single event appearing in multiple windows

Characteristics:

  • Events may belong to multiple overlapping windows
  • Provides smoother, continuous results
  • Higher computational cost (more windows to maintain)
  • Best for trend detection and moving averages

3.5 Tradeoff: Tumbling Windows vs Sliding Windows for IoT Aggregation

Option A: Tumbling Windows (Non-overlapping)

  • Memory usage: Low - only 1-2 windows active per key at any time
  • Memory for 10K sensors, 5-min window: ~200 MB aggregate state (1 window of running stats per sensor, ~20 KB each)
  • Computational cost: O(n) - each event processed once
  • Output frequency: Fixed (every window_size interval)
  • Latency to insight: Up to window_size delay for boundary events
  • Use cases: Periodic reports (hourly averages), batch-like aggregations, memory-constrained edge devices

Option B: Sliding Windows (Overlapping)

  • Memory usage: High - (window_size / slide_interval) windows active per key
  • Memory for 10K sensors, 5-min window, 30s slide: ~2 GB aggregate state (10 overlapping windows per sensor)
  • Computational cost: O(n * overlap_factor) - each event processed multiple times
  • Output frequency: Every slide_interval (more granular)
  • Latency to insight: Maximum slide_interval delay
  • Use cases: Real-time anomaly detection, moving averages, dashboards requiring smooth updates

Decision Factors:

  • Choose Tumbling when: Memory is constrained (edge devices, many keys), exact aggregation boundaries matter (hourly billing), processing cost must be minimized, output frequency matches business requirements
  • Choose Sliding when: Smooth real-time visualization needed, anomaly detection requires continuous evaluation, users expect frequent dashboard updates (every 30s), memory budget allows 5-10x higher usage
  • Hybrid approach: Tumbling for storage (persist hourly aggregates), sliding for real-time display (smooth visualization) - Flink supports both on the same stream

3.5.0.1 Session Windows

Dynamic windows based on activity periods, separated by gaps of inactivity.

Use Case: Group all sensor readings from a machine during an active work session

Diagram showing variable-length session windows where events are grouped based on activity bursts, with 5-minute inactivity gaps closing one session and starting a new one, creating three distinct sessions of different durations

Figure 3.11: Session windows showing dynamic window sizes based on activity with 5-minute inactivity timeout separating three distinct sessions

Characteristics:

  • Window size varies based on data patterns
  • Automatically adapts to activity levels
  • Ideal for user behavior and machine operation cycles
  • Best for analyzing complete work sessions or usage periods
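These characteristics can be captured in a few lines, in the same style as the TumblingWindow and SlidingWindow examples later in this section. This is an illustrative sketch (class name and output fields are ours): a new session starts whenever the gap since the previous reading exceeds the inactivity timeout.

```python
class SessionWindow:
    """Gap-based session window: readings separated by more than
    gap_seconds of inactivity fall into different sessions."""
    def __init__(self, gap_seconds):
        self.gap_seconds = gap_seconds
        self.buffer = []

    def process(self, reading):
        result = None
        if (self.buffer and
                reading["event_time"] - self.buffer[-1]["event_time"] > self.gap_seconds):
            result = self._emit()   # inactivity gap: close the previous session
            self.buffer = []
        self.buffer.append(reading)
        return result

    def _emit(self):
        return {"events": len(self.buffer),
                "duration": self.buffer[-1]["event_time"] - self.buffer[0]["event_time"]}

w = SessionWindow(gap_seconds=300)   # 5-minute inactivity timeout
sessions = []
for t in [0, 60, 120, 1000, 1030]:   # the 880 s gap splits the stream
    out = w.process({"event_time": t})
    if out:
        sessions.append(out)
print(sessions)   # [{'events': 3, 'duration': 120}]
```

Note that, unlike tumbling and sliding windows, the window boundaries here are determined entirely by the data, which is why session state is harder to budget for in advance.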

Key Insight: Windows Bound Infinite Streams
Conceptual diagram showing how windowing transforms an infinite continuous stream of sensor data into finite, bounded chunks that can be aggregated, with arrows illustrating the transformation from unbounded to windowed data
Figure 3.12: Windowing Transforms Unbounded Streams into Finite Aggregatable Chunks

In Plain English: You can’t compute “average temperature” over infinite data - it never ends! Windows create finite chunks you can actually process. “Average temperature in the last 5 minutes” is computable.

Comparison chart showing three stream processing window types side by side - tumbling windows as fixed non-overlapping blocks producing periodic results, sliding windows as overlapping intervals for smooth trend detection, and session windows as variable-length periods defined by activity gaps
Figure 3.13: Window type comparison: Tumbling (fixed non-overlapping), Sliding (fixed overlapping for smooth trends), Session (variable gap-based for activity bursts)
IoT sensor windowing example showing temperature readings flowing into three window types: 5-minute tumbling window producing periodic average reports for dashboard, 5-minute sliding window with 1-minute advance detecting temperature trends for anomaly alerts, and session window tracking machine operation periods for audit logging.
Figure 3.14: IoT temperature sensor windowing example: Tumbling for periodic averages, Sliding for trend detection, Session for machine operation analysis
Stream processing late data handling sequence showing normal event flow arriving on time versus late arrivals due to network delay. When events arrive after window close time, they either update the window during the grace period (triggering aggregate recomputation) or are dropped/sent to side output if grace period has expired.
Figure 3.15: Late data handling: Events arriving after window close can update results during grace period or be dropped/side-output if grace expires
Figure 3.16: Decision tree for selecting the appropriate window type based on use case requirements
Figure 3.17: Memory footprint comparison showing how sliding windows consume significantly more memory than tumbling windows due to overlap

Objective: Simulate real-time IoT sensor stream processing in plain Python. This demonstrates tumbling windows (non-overlapping aggregation for periodic reports), sliding windows (overlapping for smooth trend detection), and event-time vs processing-time semantics. All concepts map directly to Apache Flink/Kafka Streams equivalents.

Tumbling Window – non-overlapping, fixed-size windows where each reading belongs to exactly one window:

from collections import deque  # used by the SlidingWindow example below

class TumblingWindow:
    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.current_start = None
        self.buffer = []

    def process(self, reading):
        event_time = reading["event_time"]
        # Align the reading to the start of its window
        window_start = int(event_time / self.window_seconds) * self.window_seconds
        if self.current_start is not None and window_start != self.current_start:
            # Window boundary crossed: emit the closed window, start a new one
            result = self._emit()
            self.buffer = [reading]
            self.current_start = window_start
            return result
        self.current_start = window_start
        self.buffer.append(reading)
        return None

    def _emit(self):
        temps = [r["temperature"] for r in self.buffer]
        return {"count": len(temps),
                "avg_temp": round(sum(temps) / len(temps), 1),
                "min_temp": min(temps), "max_temp": max(temps)}

Sliding Window – overlapping windows with configurable slide interval; each reading can appear in multiple windows:

class SlidingWindow:
    def __init__(self, window_seconds, slide_seconds):
        self.window_seconds = window_seconds
        self.slide_seconds = slide_seconds
        self.buffer = deque()
        self.last_emit = 0

    def process(self, reading):
        self.buffer.append(reading)
        now = reading["event_time"]
        # Evict readings older than window
        cutoff = now - self.window_seconds
        while self.buffer and self.buffer[0]["event_time"] < cutoff:
            self.buffer.popleft()
        if now - self.last_emit >= self.slide_seconds:
            self.last_emit = now
            temps = [r["temperature"] for r in self.buffer]
            return {"readings": len(temps),
                    "moving_avg": round(sum(temps)/len(temps), 1)}
        return None

Key difference: Tumbling windows buffer one window’s worth of data with no overlap. Sliding windows maintain window_size / slide_interval overlapping windows, so engines that materialize each window buffer proportionally more readings, in exchange for smoother trends. In production, Flink/Kafka Streams manage this automatically.

What to Observe:

  1. Tumbling windows produce one result per 5-second interval – each reading counted exactly once (ideal for billing, periodic reports)
  2. Sliding windows update every 2 seconds with a 10-second lookback – readings appear in up to 5 overlapping windows (ideal for smooth dashboards, anomaly detection)
  3. The sliding window uses ~5x more memory than the tumbling window because of the overlap factor (10s / 2s = 5)
  4. Event time is when the sensor measured, arrival time includes network delay – using event time ensures correct aggregation even with delayed readings
  5. Late-arriving data (readings that arrive after their window closed) would need a grace period or side-output in production systems
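Observation 5 can be sketched as follows. This is an illustrative grace-period ("allowed lateness") window, not a framework API: the max-event-time variable stands in for a watermark, and in Flink the same idea corresponds to allowed lateness with a side output for too-late events.

```python
class GracefulWindow:
    """Keep closed windows updatable for grace_seconds past their
    close time; route anything later to a side output."""
    def __init__(self, window_seconds, grace_seconds):
        self.window_seconds = window_seconds
        self.grace_seconds = grace_seconds
        self.windows = {}        # window_start -> list of readings
        self.side_output = []    # readings too late to include
        self.max_event_time = 0  # stand-in for a watermark

    def process(self, reading):
        t = reading["event_time"]
        self.max_event_time = max(self.max_event_time, t)
        window_start = int(t / self.window_seconds) * self.window_seconds
        window_close = window_start + self.window_seconds
        if self.max_event_time > window_close + self.grace_seconds:
            # Grace period expired for this window: side output, don't update
            self.side_output.append(reading)
        else:
            self.windows.setdefault(window_start, []).append(reading)

w = GracefulWindow(window_seconds=10, grace_seconds=5)
for t in [3, 8, 12, 7, 21, 2]:   # 7 is late but within grace; 2 is too late
    w.process({"event_time": t})
```

After this run, the reading at t=7 is retroactively included in the [0, 10) window (its result would be recomputed), while the reading at t=2 lands in the side output because the stream has already advanced past the window's close plus grace.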

3.5.1 Interactive: Window Memory Calculator

Use this calculator to explore how windowing parameters affect memory requirements for your IoT stream processing deployment. Adjust the sliders to match your scenario and observe how memory scales with different window strategies.

3.5.2 Worked Example: Kafka Windowing for Industrial Sensor Analytics

Real-world stream processing requires careful window design to balance memory usage, latency, and analytical accuracy. This example demonstrates how to calculate resource requirements for different windowing strategies in a high-throughput industrial IoT scenario.

Worked Example: Stream Processing Window Design

Context: A smart factory with 10,000 sensors monitoring industrial equipment for predictive maintenance and anomaly detection.

Given:

| Parameter | Value |
|-----------|-------|
| Number of sensors | 10,000 devices |
| Message rate per sensor | 100 messages/second |
| Total message rate | 1,000,000 messages/second (1M msg/sec) |
| Message size (JSON payload) | 100 bytes |
| Analytics goal | Detect anomalies within 10-second windows |
| Available memory per Kafka Streams instance | 16 GB |
| Required latency | <10 seconds for anomaly detection |

Problem: Design an appropriate windowing strategy that fits within memory constraints while meeting latency requirements.


3.5.3 Step 1: Calculate Base Data Rate

First, determine how much data flows through the system:

\[\text{Data Rate} = \text{Messages/sec} \times \text{Message Size}\] \[\text{Data Rate} = 1,000,000 \times 100 \text{ bytes} = 100 \text{ MB/sec}\]

For a 10-second window: \[\text{Data per Window} = 100 \text{ MB/sec} \times 10 \text{ sec} = 1 \text{ GB}\]

Key Insight: Each 10-second window must buffer approximately 1 GB of sensor data before computing aggregations.
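The same calculation as executable Python, using the numbers from the Given table:

```python
sensors = 10_000      # devices
rate_hz = 100         # messages/second per sensor
msg_bytes = 100       # JSON payload size
window_s = 10         # anomaly-detection window

data_rate = sensors * rate_hz * msg_bytes   # bytes/second through the system
data_per_window = data_rate * window_s      # bytes buffered per 10 s window
print(data_rate // 1_000_000, "MB/sec")        # 100 MB/sec
print(data_per_window // 1_000_000_000, "GB")  # 1 GB
```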


3.5.4 Step 2: Analyze Tumbling Window Memory Requirements

Tumbling windows are non-overlapping, so at any moment we maintain:

  • Current window: Actively receiving events (1 GB)
  • Processing window: Computing aggregations from the just-closed window (1 GB)

\[\text{Memory (Tumbling)} = 2 \times \text{Window Size Data} = 2 \times 1 \text{ GB} = 2 \text{ GB}\]

| Metric | Value | Status |
|---|---|---|
| Active windows | 2 (current + processing) | |
| Memory per window | 1 GB | |
| Total memory required | 2 GB | Within 16 GB budget |
| Maximum latency | 10 seconds (window size) | Meets requirement |
| Output frequency | Every 10 seconds | |

3.5.5 Step 3: Analyze Sliding Window Memory Requirements

Sliding windows overlap, with each event potentially belonging to multiple windows. For a 10-second window with 1-second slide:

\[\text{Overlapping Windows} = \frac{\text{Window Size}}{\text{Slide Interval}} = \frac{10 \text{ sec}}{1 \text{ sec}} = 10 \text{ windows}\]

Each event appears in 10 different windows simultaneously:

\[\text{Memory (Sliding 1s)} = 10 \times 1 \text{ GB} = 10 \text{ GB}\]

| Slide Interval | Overlapping Windows | Memory Required | Status |
|---|---|---|---|
| 1 second | 10 | 10 GB | Within 16 GB budget |
| 2 seconds | 5 | 5 GB | Acceptable |
| 5 seconds | 2 | 2 GB | Optimal |
| 10 seconds | 1 | 1 GB | Same as tumbling |
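The overlap and memory figures above follow from two short formulas. A minimal sketch, using illustrative names rather than a framework API:

```java
// Sketch: memory required by overlapping (hopping) windows.
// overlap = window size / slide interval; memory = overlap x bytes per window.
public class WindowMemory {
    public static int overlappingWindows(int windowSec, int slideSec) {
        return windowSec / slideSec;               // e.g. 10s window / 1s slide = 10
    }

    public static long memoryBytes(long bytesPerWindow, int overlappingWindows) {
        return bytesPerWindow * overlappingWindows;
    }

    public static void main(String[] args) {
        long perWindow = 1_000_000_000L;           // 1 GB per window from Step 1
        for (int slide : new int[]{1, 2, 5, 10}) {
            int n = overlappingWindows(10, slide);
            System.out.println(slide + "s slide -> " + n + " windows, "
                    + memoryBytes(perWindow, n) / 1_000_000_000L + " GB");
        }
    }
}
```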

Critical Finding: With a 1-second slide, we use 10 GB of the 16 GB available (62.5% utilization). This leaves limited headroom for:

  • State store overhead (~20%)
  • Garbage collection spikes
  • Traffic bursts


3.5.6 Step 4: Calculate Optimized Configuration

For production safety, target 50% memory utilization:

\[\text{Target Memory} = 0.5 \times 16 \text{ GB} = 8 \text{ GB}\]

For sliding windows:

\[\text{Max Overlapping Windows} = \frac{8 \text{ GB}}{1 \text{ GB/window}} = 8 \text{ windows}\]

\[\text{Optimal Slide} = \frac{10 \text{ sec}}{8} \approx 1.25 \text{ sec} \rightarrow \text{Round to } 2 \text{ sec}\]

With a 2-second slide:

  • Overlapping windows: 5
  • Memory: 5 GB (31% utilization)
  • Latency: 2 seconds for updated results
  • Safety margin: 11 GB for spikes
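The optimization above reduces to three arithmetic steps, sketched here in an illustrative helper class:

```java
// Sketch: derive the slide interval that keeps memory under a target budget.
// Plain arithmetic; names are illustrative, not a framework API.
public class SlideOptimizer {
    // Production target, e.g. 50% of a 16 GB instance = 8 GB
    public static long targetMemoryBytes(long totalMemoryBytes, double utilization) {
        return (long) (totalMemoryBytes * utilization);
    }

    // How many overlapping windows fit in the target, e.g. 8 GB / 1 GB = 8
    public static int maxOverlappingWindows(long targetBytes, long bytesPerWindow) {
        return (int) (targetBytes / bytesPerWindow);
    }

    // Smallest slide that respects the overlap limit, e.g. 10 s / 8 = 1.25 s
    public static double minSlideSeconds(int windowSec, int maxWindows) {
        return (double) windowSec / maxWindows;
    }
}
```

Rounding the 1.25 s result up to a 2 s slide gives 5 overlapping windows and the 5 GB figure above.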


3.5.7 Step 5: Consider Session Windows Alternative

For sensors with irregular reporting patterns (e.g., motion-activated sensors), session windows may be more efficient:

| Scenario | Active Sessions | Events/Session | Memory |
|---|---|---|---|
| All sensors active | 10,000 | Variable | ~1-5 GB |
| 50% sensors active | 5,000 | Variable | ~0.5-2.5 GB |
| Sparse events (10%) | 1,000 | Variable | ~100 MB |

Session windows are ideal when activity is bursty rather than continuous, which significantly reduces average memory usage compared with fixed windows.
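The session-forming rule (a new session starts whenever the inactivity gap is exceeded) can be sketched in a few lines. This is an illustrative simplification of what a session-window implementation does, not a Kafka Streams API:

```java
// Sketch: group sorted event timestamps into sessions separated by an
// inactivity gap, mimicking session-window formation. Illustrative only.
public class SessionGrouper {
    // Returns the number of sessions when consecutive events more than
    // gapSeconds apart start a new session. Timestamps must be sorted.
    public static int countSessions(long[] sortedTimestampsSec, long gapSeconds) {
        if (sortedTimestampsSec.length == 0) return 0;
        int sessions = 1;
        for (int i = 1; i < sortedTimestampsSec.length; i++) {
            if (sortedTimestampsSec[i] - sortedTimestampsSec[i - 1] > gapSeconds) {
                sessions++;  // inactivity gap exceeded: previous session closes
            }
        }
        return sessions;
    }
}
```

For example, a motion sensor emitting at seconds 0, 1, 2, 10, 11, and 30 with a 5-second gap forms three sessions; memory is only held for sessions that are currently open.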


3.5.8 Final Answer: Window Strategy Comparison

| Strategy | Slide | Memory | Latency | Use Case | Recommendation |
|---|---|---|---|---|---|
| Tumbling | 10s | 2 GB | 10s max | Periodic batch analytics | Simple, predictable |
| Sliding (1s) | 1s | 10 GB | 1s | Real-time dashboards | High memory cost |
| Sliding (2s) | 2s | 5 GB | 2s | Balanced monitoring | Recommended |
| Sliding (5s) | 5s | 2 GB | 5s | Moderate latency OK | Memory-efficient |
| Session | Variable | 0.1-5 GB | Gap-based | Sparse/bursty events | Event-driven scenarios |

3.5.9 Implementation in Kafka Streams

// Tumbling window: 10-second non-overlapping windows
KTable<Windowed<String>, SensorStats> tumblingStats = sensorStream
    .groupByKey()
    .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(10)))
    .aggregate(
        SensorStats::new,
        (key, value, stats) -> stats.update(value),
        Materialized.<String, SensorStats, WindowStore<Bytes, byte[]>>as("tumbling-store")
            .withValueSerde(sensorStatsSerde)
    );

// Hopping window: 10-second window, 2-second advance (optimized)
// Note: Kafka Streams calls advance-based overlapping windows "hopping windows"
// using TimeWindows.advanceBy(), not SlidingWindows (which is event-driven)
KTable<Windowed<String>, SensorStats> hoppingStats = sensorStream
    .groupByKey()
    .windowedBy(TimeWindows.ofSizeAndGrace(
        Duration.ofSeconds(10),    // Window size
        Duration.ofSeconds(30))    // Grace period for late data
        .advanceBy(Duration.ofSeconds(2)))  // 2-second hop interval
    .aggregate(
        SensorStats::new,
        (key, value, stats) -> stats.update(value),
        Materialized.<String, SensorStats, WindowStore<Bytes, byte[]>>as("hopping-store")
            .withValueSerde(sensorStatsSerde)
            .withCachingEnabled()  // Reduce state store writes
    );

// Session window: 5-second inactivity gap closes session
KTable<Windowed<String>, SensorStats> sessionStats = sensorStream
    .groupByKey()
    .windowedBy(SessionWindows.ofInactivityGapWithNoGrace(Duration.ofSeconds(5)))
    .aggregate(
        SensorStats::new,
        (key, value, stats) -> stats.update(value),
        (key, stats1, stats2) -> stats1.merge(stats2),  // Merge sessions
        Materialized.<String, SensorStats, SessionStore<Bytes, byte[]>>as("session-store")
            .withValueSerde(sensorStatsSerde)
    );

3.5.10 Key Insights

  1. Memory scales linearly with overlap: Sliding windows with N overlaps require N times the memory of tumbling windows

  2. Latency-memory trade-off: Faster updates (smaller slide) require more memory for concurrent windows

  3. Production headroom: Target 50% memory utilization to handle traffic spikes and GC overhead

  4. Session windows excel for sparse data: When sensors report irregularly, session windows dramatically reduce memory versus fixed windows

  5. Kafka configuration matters:

    • state.dir should use fast SSD storage for RocksDB
    • cache.max.bytes.buffering controls memory for state store caching
    • num.stream.threads should match available CPU cores
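A minimal sketch of these settings as a Kafka Streams `Properties` object. The property keys are real Kafka Streams configuration names (note that `cache.max.bytes.buffering` has been superseded by `statestore.cache.max.bytes` in newer releases); the application id, broker address, and values are placeholder examples for this scenario, not recommendations:

```java
import java.util.Properties;

// Sketch: configuration knobs from the key insights above.
// Keys are Kafka Streams config names; values are scenario placeholders.
public class StreamsConfigExample {
    public static Properties tunedConfig() {
        Properties props = new Properties();
        props.put("application.id", "factory-anomaly-detector");  // example id
        props.put("bootstrap.servers", "broker:9092");            // example address
        props.put("state.dir", "/mnt/ssd/kafka-streams");         // RocksDB on fast SSD
        props.put("cache.max.bytes.buffering", "536870912");      // 512 MB state cache
        props.put("num.stream.threads", "8");                     // match CPU cores
        return props;
    }
}
```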

3.6 How It Works: Event-Time Window Processing Step-by-Step

Understanding how stream processors handle event-time windows with late data:

Step 1: Assign Event Timestamps

// Extract event time from payload (Apache Flink Java API)
WatermarkStrategy<SensorReading> strategy = WatermarkStrategy
    .<SensorReading>forBoundedOutOfOrderness(Duration.ofSeconds(10))
    .withTimestampAssigner((reading, timestamp) -> reading.getEventTime());

Step 2: Assign Windows Based on Event Time

  • Event with timestamp 14:00:05 goes into window [14:00:00 - 14:00:10) for 10-second tumbling windows
  • Multiple events may arrive out of order but are bucketed by their event time
  • Key insight: Window assignment uses event time, not arrival time

Step 3: Track Watermarks

Watermark = max(event_time_seen) - max_out_of_orderness
  • If latest event has timestamp 14:00:30 and max out-of-orderness is 10s, watermark is 14:00:20
  • Watermark advancing to 14:00:10 means “I believe all events before 14:00:10 have arrived”
  • Windows close when watermark passes their end time
  • Note: “max out-of-orderness” (watermark delay) and “allowed lateness” are different concepts – the former controls watermark position, the latter controls how long after window close late events are still accepted
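The watermark rule above can be captured in a few lines. This is an illustrative tracker, not the Flink API, which manages watermarks internally:

```java
// Sketch: watermark = max event time seen - max out-of-orderness bound.
// Illustrative class; real stream processors maintain this per partition.
public class WatermarkTracker {
    private long maxEventTimeMs = Long.MIN_VALUE;
    private final long maxOutOfOrdernessMs;

    public WatermarkTracker(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
    }

    // Called for every arriving event; only the maximum timestamp matters
    public void observe(long eventTimeMs) {
        maxEventTimeMs = Math.max(maxEventTimeMs, eventTimeMs);
    }

    public long currentWatermarkMs() {
        return maxEventTimeMs - maxOutOfOrdernessMs;
    }
}
```

With a 10 s bound, observing an event at 14:00:30 advances the watermark to 14:00:20; a later out-of-order event at 14:00:05 does not move it backward.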

Step 4: Handle Late Arrivals

  • Event arrives with timestamp 14:00:05 when watermark is already 14:00:20
  • Window [14:00:00 - 14:00:10) already closed and emitted results
  • Late event goes to side output or triggers result update (depending on allowed-lateness config)
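The routing decision for an arriving event can be sketched as a pure function of the window end, the current watermark, and the allowed lateness. Names here are illustrative; real frameworks make this decision internally:

```java
// Sketch: classify an event relative to its window and the watermark.
// ON_TIME: window still open; LATE_UPDATE: within allowed lateness;
// SIDE_OUTPUT: too late, diverted for separate handling.
public class LateDataRouter {
    public enum Route { ON_TIME, LATE_UPDATE, SIDE_OUTPUT }

    public static Route route(long windowEndMs, long watermarkMs, long allowedLatenessMs) {
        if (watermarkMs < windowEndMs) return Route.ON_TIME;
        if (watermarkMs < windowEndMs + allowedLatenessMs) return Route.LATE_UPDATE;
        return Route.SIDE_OUTPUT;
    }
}
```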

Step 5: Emit Window Results

  • Window closes when watermark >= window_end_time
  • Results emitted: {window: [14:00:00-14:00:10), count: 47, avg: 22.3°C}
  • If allowed-lateness permits, late events can trigger updated results

Why This Matters: Processing-time windowing (based on arrival) is simpler but produces incorrect results when network delays cause events to arrive out of order. Event-time windowing with watermarks ensures correctness at the cost of bounded latency.


3.8 Summary

Stream processing enables IoT systems to analyze data in motion rather than at rest, which is essential when insight value degrades over time. The key concepts covered in this chapter include:

  • Batch vs stream processing: Batch optimizes for throughput and completeness; stream optimizes for latency and continuous insight. Lambda Architecture combines both.
  • Event time vs processing time: Event time reflects when data was generated; processing time reflects when it was received. Event-time semantics ensure correct results despite network delays.
  • Windowing strategies: Tumbling windows (non-overlapping, low memory, periodic reports), sliding/hopping windows (overlapping, higher memory, smooth trend detection), and session windows (activity-based, variable size, bursty workloads).
  • Memory-latency tradeoff: Sliding windows with overlap factor N require N times the memory of tumbling windows. Production systems should target 50% memory utilization for headroom.
  • Watermarks and late data: Watermarks track event-time progress to determine when windows can close. Late events (arriving after window close) are handled via side outputs or allowed-lateness policies.

The worked example demonstrated that a 10,000-sensor factory at 1M messages/second generates 1 GB per 10-second window, making window strategy selection a critical design decision with direct memory and latency consequences.

3.9 Concept Relationships

Stream processing fundamentals connect across IoT domains: windowing, watermarks, and event-time semantics underpin the architecture, pipeline, and challenges material covered later in this module.

These connections show streaming as infrastructure, not an isolated technique.

3.11 Common Pitfalls

Developers familiar with batch processing often apply batch patterns (read all data, process, write results) to streaming — buffering events until a large batch accumulates before processing. This introduces unnecessary latency (minutes to hours) for problems that could be solved event-by-event with millisecond latency. Identify stream processing problems by their nature (continuous data, need for immediate response) and use streaming patterns from the start.

Using system time (when the processor receives the event) instead of event time (when the sensor generated it) produces incorrect temporal analytics. A temperature aggregation using processing time will show wrong results when events arrive out of order or delayed. Always use event timestamps embedded in sensor data for time-based calculations.

Streaming pipelines maintaining state (running averages, session tracking, anomaly detection models) lose all state on failure without checkpointing. Recovering from failure by reprocessing all historical data may take hours for data streams that have been running for months. Implement incremental checkpointing to limit recovery time to seconds or minutes.

Consumer lag (how far behind the stream processor is from the latest events) is the leading indicator of streaming pipeline degradation. A growing lag means the processor is falling behind the ingestion rate and will eventually drop events or run indefinitely behind real-time. Set up lag alerting with a threshold of 10,000 events or 60 seconds before the pipeline becomes unacceptably delayed.
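The alerting rule described here reduces to a simple predicate, sketched with the thresholds from the text (illustrative names, not a monitoring API):

```java
// Sketch: consumer-lag alerting rule from the pitfall above.
// Fires when the processor falls more than 10,000 events or 60 seconds behind.
public class LagAlert {
    public static boolean shouldAlert(long lagEvents, long lagSeconds) {
        return lagEvents > 10_000 || lagSeconds > 60;
    }
}
```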

3.12 What’s Next

| If you want to… | Read this |
|---|---|
| Learn stream processing architecture patterns | Stream Processing Architectures |
| Understand specific streaming challenges | Handling Stream Processing Challenges |
| Build streaming pipelines step by step | Building IoT Streaming Pipelines |
| Get hands-on with stream processing in a lab | Lab: Stream Processing |

Continue to Stream Processing Architectures to learn about Apache Kafka, Apache Flink, and Spark Streaming, including architecture comparisons and technology selection guidance.