3  Stream Processing Fundamentals

| Chapter | Topic |
|---------|-------|
| Stream Processing | Overview of batch vs. stream processing for IoT |
| Fundamentals | Windowing strategies, watermarks, and event-time semantics |
| Architectures | Kafka, Flink, and Spark Streaming comparison |
| Pipelines | End-to-end ingestion, processing, and output design |
| Challenges | Late data, exactly-once semantics, and backpressure |
| Pitfalls | Common mistakes and production worked examples |
| Basic Lab | ESP32 circular buffers, windows, and event detection |
| Advanced Lab | CEP, pattern matching, and anomaly detection on ESP32 |
| Game & Summary | Interactive review game and module summary |
| Interoperability | Four levels of IoT interoperability |
| Interop Fundamentals | Technical, syntactic, semantic, and organizational layers |
| Interop Standards | SenML, JSON-LD, W3C WoT, and oneM2M |
| Integration Patterns | Protocol adapters, gateways, and ontology mapping |
In 60 Seconds

Stream processing analyzes data in motion rather than at rest, enabling IoT systems to make real-time decisions with millisecond latency. Windowing strategies divide infinite data streams into finite chunks: tumbling windows (non-overlapping) for periodic reports, sliding windows (overlapping) for trend detection at 5-10x memory cost, and session windows (activity-based) for grouping bursty events. A smart factory with 10,000 sensors at 100 messages/second generates 1 GB per 10-second window, making memory-aware window design critical.

3.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Differentiate between batch processing and stream processing paradigms in IoT contexts
  • Classify windowing strategies (tumbling, sliding, session) by their memory cost, overlap behavior, and IoT applicability
  • Calculate memory and compute requirements for different windowing strategies
  • Design real-time IoT data pipelines with appropriate window types and stages
  • Handle late-arriving data and out-of-order events using watermarks and event-time semantics
  • Apply stream processing concepts to real-world IoT scenarios with production constraints

3.2 Introduction

A modern autonomous vehicle generates approximately 1 gigabyte of sensor data every second from LIDAR, cameras, radar, GPS, and inertial measurement units. If you attempt to store this data to disk and then query it for analysis, you’ve already crashed into the obstacle ahead. The vehicle needs to make decisions in under 10 milliseconds based on continuously arriving sensor streams.

This is the fundamental challenge that stream processing solves: processing data in motion, not data at rest. Traditional batch processing analyzes historical data after it’s been collected and stored. Stream processing analyzes data as it arrives, enabling real-time decisions, alerts, and actions.

In IoT systems, stream processing has become essential infrastructure. Whether you’re monitoring industrial equipment for failure prediction, balancing electrical grids in real-time, or detecting fraud in payment systems, the ability to process continuous data streams with low latency determines system effectiveness.

Key Takeaway

In one sentence: Process data as it arrives - waiting for batch windows means missing the moment when action could have prevented the problem.

Remember this rule: If the value of your insight degrades over time, use stream processing; if you need the complete picture, use batch.

Think of batch processing like counting votes after an election closes. You wait until all ballots are collected, then count them all at once to determine the winner. This works well when you can afford to wait.

Stream processing is like a live vote counter that updates the tally in real-time as each ballot arrives. You see the current state continuously, even before voting ends. This is essential when waiting isn’t an option.

In IoT, batch processing might analyze yesterday’s temperature readings to find trends. Stream processing monitors temperature readings as they arrive to trigger an immediate alert if your server room overheats.

Stream processing is like having a lifeguard who watches the pool RIGHT NOW, instead of checking photos from yesterday!

3.2.1 The Sensor Squad Adventure: The Never-Ending River Race

The Sensor Squad has a new mission: watching the Data River, where information flows like water - constantly moving, never stopping! The question is: how should they watch it?

Coach Batch has one idea: “Let’s collect ALL the water in buckets throughout the day. At midnight, we’ll look at all the buckets together and see what we collected!” This is called batch processing - wait until you have a big pile of data, then look at it all at once.

But Sammy the Temperature Sensor is worried. “What if the water gets too hot RIGHT NOW? By midnight, it might be too late!”

Captain Stream has a different idea: “Let’s taste the water as it flows by - every single second!” This is stream processing - looking at each drop of data the moment it arrives.

To test both ideas, they set up an experiment at the Data River:

The Batch Team (Coach Batch, Pressi the Pressure Sensor, and Bucketbot):

  • Collected water samples in buckets all day
  • At midnight, they discovered: “Oh no! The water was dangerously hot at 2:15 PM!”
  • But it’s now midnight - that was 10 hours ago! The fish have already left the river.

The Stream Team (Sammy, Lux the Light Sensor, and Motio the Motion Detector):

  • Watched the water every second
  • At 2:15 PM exactly, Sammy shouted: “ALERT! Temperature spiking NOW!”
  • They immediately opened the shade covers to cool the river
  • The fish stayed happy and safe!

“See the difference?” explains Captain Stream. “Batch processing is great when you can WAIT - like figuring out what the most popular ice cream flavor was last month. But stream processing is essential when you need to act FAST - like knowing when the ice cream truck arrives so you can run outside!”

The Sensor Squad learned an important lesson: not all data problems are the same!

  • Use Batch when: You’re doing homework about last week’s weather patterns
  • Use Stream when: You need to know if it’s raining RIGHT NOW so you can grab an umbrella

Motio the Motion Detector added one more example: “It’s like fire alarms! A batch fire alarm would check for smoke once a day at midnight - terrible idea! A stream fire alarm checks CONSTANTLY - that’s how you stay safe!”

3.2.2 Key Words for Kids

| Word | What It Means |
|------|---------------|
| Stream Processing | Looking at data the instant it arrives - like watching a movie as it plays instead of waiting until it ends |
| Batch Processing | Collecting lots of data into a pile, then looking at it all at once later - like saving all your mail and reading it every Sunday |
| Real-Time | Happening right now, with almost no delay - like a live video call with grandma |
| Latency | How long you have to wait for something - low latency means fast, high latency means slow |

3.2.3 Try This at Home!

The Temperature Detective Game!

This experiment shows the difference between batch and stream processing:

What you need:

  • A glass of water
  • Ice cubes
  • A timer
  • A pencil and paper

Batch Method (try this first):

  1. Put ice in the water
  2. Set a timer for 10 minutes
  3. DON’T touch or look at the water!
  4. After 10 minutes, feel the water temperature
  5. Write down: “After 10 minutes, it was _____ cold”

Stream Method (try this second):

  1. Put new ice in fresh water
  2. Touch the water every 30 seconds for 10 minutes
  3. Write down the temperature feeling each time: “30 sec: a little cold, 1 min: colder, 1:30: really cold…”

What you’ll discover:

  • Batch method: You only know ONE thing - how cold it was at the end
  • Stream method: You know EXACTLY when it got coldest and when it started warming up again!

Think about it:

  • Which method would you use if you wanted to drink the water at the PERFECT coldness?
  • Which method would you use if you just wanted to know if the ice melted?
  • What if the ice had a dangerous chemical and you needed to know the INSTANT it started melting?

Real-world connection:

  • Doctors use stream processing to monitor your heartbeat - they need to know RIGHT AWAY if something is wrong!
  • Weather scientists use batch processing to study climate patterns from the last 100 years - there’s no rush!

The Misconception: Real-time processing means zero latency, immediate results.

Why It’s Wrong:

  • “Real-time” means predictable, bounded latency - not zero
  • Hard real-time: Guaranteed max latency (industrial control)
  • Soft real-time: Statistical latency bounds (streaming)
  • Near real-time: Low but not guaranteed (analytics)
  • Zero latency violates physics (processing takes time)

Real-World Example:

  • Stock trading “real-time” system: 1-50ms latency depending on strategy (not instant)
  • Factory alarm “real-time”: 10ms guaranteed (hard real-time)
  • Dashboard “real-time”: Updates every 5 seconds (near real-time)
  • All called “real-time” but vastly different guarantees

The Correct Understanding:

| Term | Latency | Guarantee | Example |
|------|---------|-----------|---------|
| Hard real-time | <10ms | Deterministic | PLC control |
| Soft real-time | <100ms | Statistical (99.9%) | VoIP, online gaming |
| Near real-time | <1s | Best effort | Analytics dashboards |
| Batch | Minutes-hours | None | Daily reports |

Specify latency requirements numerically. “Real-time” is ambiguous.

3.3 Why Stream Processing for IoT?

⏱️ ~10 min | ⭐⭐ Intermediate | 📋 P10.C14.U01

IoT applications have stringent latency requirements that batch processing cannot satisfy. Consider these real-world constraints:

| Application Domain | Required Latency | Consequence of Delay |
|--------------------|------------------|----------------------|
| Autonomous vehicles | <10 milliseconds | Collision, injury, death |
| Industrial safety systems | <100 milliseconds | Equipment damage, worker injury |
| Smart grid load balancing | <1 second | Blackouts, equipment failure |
| Predictive maintenance | <1 minute | Unexpected downtime, cascading failures |
| Financial fraud detection | <5 seconds | Monetary loss, regulatory penalties |
| Healthcare monitoring | <10 seconds | Patient deterioration, delayed intervention |

Beyond latency, stream processing offers several advantages for IoT workloads:

  1. Memory Efficiency: Process data as it arrives rather than storing massive datasets
  2. Continuous Insights: Monitor changing conditions in real-time
  3. Event-Driven Actions: Trigger immediate responses to critical conditions
  4. Temporal Accuracy: Analyze data based on when events actually occurred
  5. Infinite Data Handling: Process unbounded streams that never “complete”

Timeline comparison diagram showing batch processing waiting to accumulate data for hours before analysis versus stream processing analyzing each data point immediately as it arrives, with latency differences illustrated from hours down to milliseconds

Figure 3.1: Batch vs Stream Processing Timeline Comparison

This architecture diagram shows how real-world IoT systems often combine batch and stream processing in a Lambda Architecture, using each paradigm for its strengths.

Figure 3.2: Lambda Architecture combining stream and batch processing: IoT data enters through Kafka, flowing simultaneously to Speed Layer (Flink for sub-second alerts on recent data) and Batch Layer (Spark for accurate historical aggregates on all data). The Serving Layer merges both views to provide dashboards with both real-time responsiveness and historical accuracy. This hybrid approach acknowledges that stream processing optimizes for latency while batch processing optimizes for correctness and completeness.
Artistic comparison of batch and stream processing architectures showing batch processing as periodic data trucks delivering accumulated data to a warehouse, versus stream processing as a continuous conveyor belt with real-time analysis stations processing individual items as they flow through
Figure 3.3: Understanding the fundamental difference between batch and stream processing is crucial for IoT system design. Batch processing excels when historical analysis and complex computations are acceptable with hours of latency. Stream processing is essential when millisecond-to-second latency determines system value, such as fraud detection, anomaly alerts, or autonomous vehicle control. Many production IoT systems use both: stream processing for real-time alerts and batch processing for model retraining and trend analysis.
Tradeoff: Batch Processing vs Stream Processing for IoT Analytics

Option A: Batch Processing - Process accumulated data in scheduled intervals

  • Latency: Minutes to hours (scheduled job completion)
  • Throughput: Very high (1M+ events/second during batch window)
  • Resource cost: Burst usage (pay only during processing)
  • Complexity: Lower (simpler failure recovery, replay from source)
  • Use case examples: Daily reports, ML model training, historical trend analysis
  • Infrastructure: Spark on EMR/Dataproc ~$0.05/GB processed

Option B: Stream Processing - Process each event as it arrives

  • Latency: Milliseconds to seconds (continuous)
  • Throughput: High (100K-10M+ events/second sustained, depending on framework)
  • Resource cost: Always-on (24/7 compute allocation)
  • Complexity: Higher (state management, exactly-once semantics, late data handling)
  • Use case examples: Real-time alerts, fraud detection, live dashboards
  • Infrastructure: Kafka + Flink managed ~$500-2,000/month base cost

Decision Factors:

  • Choose Batch when: Insight value doesn’t degrade within hours, complex ML models need full dataset context, cost optimization is critical (pay per job), regulatory reports have fixed deadlines (daily/monthly)
  • Choose Stream when: Alert value degrades in seconds (equipment failure, security breach), user-facing dashboards require <10 second updates, event-driven actions needed (automated shutoffs), compliance requires real-time audit trails
  • Lambda Architecture (Hybrid): Stream for real-time approximations + batch for accurate historical corrections (most production IoT systems use this pattern)

3.3.1 Stream Processing Architecture Overview

Moving from paradigm comparison to system design, modern stream processing systems combine multiple architectural components to handle real-time IoT data at scale. Figure 3.4 contrasts the data flow patterns, while subsequent diagrams show how these concepts translate into production architectures.

Geometric diagram contrasting batch and stream processing data flows with batch showing large blocks of data moving through discrete processing stages versus stream showing continuous small data elements flowing through parallel processing pipelines with real-time aggregation outputs
Figure 3.4: The geometric representation highlights how data volume and velocity differ between paradigms. Batch systems optimize for throughput by processing large chunks efficiently, while stream systems optimize for latency by maintaining minimal buffers and processing events individually or in micro-batches.

A complete stream processing system includes three layers: ingestion (receiving data from sensors and APIs), processing (applying transformations, aggregations, and windowing), and output (writing results to databases, dashboards, or alerting systems).

Comprehensive stream processing architecture diagram showing data ingestion layer with multiple sources, stream processing engine with operators and state management, and output sinks for databases, dashboards, and alerting systems
Figure 3.5: End-to-end stream processing architecture from ingestion to action

Within the processing layer, data flows through a multi-stage pipeline. Each stage performs a specific transformation, with checkpointing between stages to ensure fault tolerance. Backpressure mechanisms prevent fast producers from overwhelming slow consumers.
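Backpressure can be illustrated with a bounded queue: when the consumer falls behind, the producer blocks on insertion instead of letting memory grow without bound. A minimal standard-library sketch (illustrative only; real engines like Flink propagate backpressure through network buffers rather than a single in-process queue):

```python
import queue
import threading

# Bounded queue: the bound (100) is an illustrative choice.
buffer = queue.Queue(maxsize=100)

def producer(n_events):
    for i in range(n_events):
        buffer.put({"seq": i})  # blocks when the queue is full (backpressure)

def consumer(results, n_events):
    for _ in range(n_events):
        event = buffer.get()    # blocks when the queue is empty
        results.append(event["seq"])
        buffer.task_done()

results = []
t1 = threading.Thread(target=producer, args=(1000,))
t2 = threading.Thread(target=consumer, args=(results, 1000))
t1.start(); t2.start()
t1.join(); t2.join()
# All 1,000 events arrive in order; memory never exceeds 100 buffered events.
```

The key property is that the fast side slows down automatically; no events are dropped and buffering stays bounded.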

Multi-stage stream processing pipeline showing data flow from IoT sensors through ingestion, parsing, enrichment, aggregation, and output stages with backpressure handling and checkpointing for fault tolerance
Figure 3.6: Stream processing pipeline with fault-tolerant checkpointing

The stream processing topology defines how operators connect to transform data. Each operator performs a specific function – filtering, mapping, or aggregating – with data flowing between operators as unbounded streams. The topology forms a directed acyclic graph (DAG) that can be parallelized across multiple processing nodes.

Stream processing topology diagram showing directed acyclic graph of operators including source, filter, map, window aggregate, and sink nodes with parallel partitions for scalable processing of IoT event streams
Figure 3.7: Stream processing operator topology for scalable event processing

3.4 Core Concepts

⏱️ ~15 min | ⭐⭐⭐ Advanced | 📋 P10.C14.U02

3.4.1 Event Time vs Processing Time

One of the fundamental concepts in stream processing is the distinction between event time and processing time:

  • Event Time: The timestamp when the event actually occurred (e.g., when a sensor took a reading)
  • Processing Time: The timestamp when the system processes the event (e.g., when the server receives the data)

Example: A temperature sensor in a remote oil pipeline takes a reading at 14:00:00 showing 95°C. However, due to intermittent network connectivity, this reading doesn’t reach your processing system until 14:05:23. The event time is 14:00:00 (when the temperature was actually 95°C), but the processing time is 14:05:23.

For IoT applications, event time is typically more meaningful because you want to understand when conditions actually changed, not when you learned about them. However, this creates challenges:

  • Out-of-order events: Events may arrive in a different order than they occurred
  • Late data: Events may arrive long after they occurred
  • Clock skew: Different sensors may have slightly different clock times
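Watermarks are the standard tool for coping with these challenges: a watermark asserts "no event older than time T is still expected." A minimal sketch, assuming the common bounded-out-of-orderness heuristic (maximum event time seen minus a fixed delay bound, as in Flink's bounded-out-of-orderness watermark strategy); the class and parameter names here are illustrative, not a real engine API:

```python
class BoundedOutOfOrdernessWatermark:
    """Track the highest event time seen; the watermark trails it
    by a fixed out-of-orderness bound."""
    def __init__(self, max_delay_seconds):
        self.max_delay = max_delay_seconds
        self.max_event_time = float("-inf")

    def observe(self, event_time):
        self.max_event_time = max(self.max_event_time, event_time)

    def current_watermark(self):
        # "No event with a timestamp at or before this is still expected."
        return self.max_event_time - self.max_delay

wm = BoundedOutOfOrdernessWatermark(max_delay_seconds=10)
for t in [100, 105, 103, 112]:   # event times arriving out of order
    wm.observe(t)
print(wm.current_watermark())    # 112 - 10 = 102
```

A window ending at time 102 or earlier can now safely close; anything older that still arrives is "late data" and needs a grace period or side output.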

Flowchart illustrating the difference between event time when a sensor reading actually occurred at 14:00:00 and processing time when the system received it at 14:05:23, showing network delays and intermittent connectivity causing the 5-minute gap

Figure 3.8: Event time versus processing time showing network and processing delays causing 5-minute difference between sensor reading timestamp and system processing timestamp
Understanding Windowing

Core Concept: Windowing divides infinite data streams into finite, bounded chunks that can be aggregated and analyzed. Without windows, you cannot compute “average temperature” because the stream never ends.

Why It Matters: The window type you choose determines both your system’s latency and memory requirements. Tumbling windows (non-overlapping) use minimal memory but can miss patterns that span window boundaries. Sliding windows (overlapping) catch boundary-spanning events but multiply memory usage by the overlap factor. Session windows (activity-based) adapt to irregular data patterns but complicate capacity planning.

Key Takeaway: Match window type to your analytics goal. Use tumbling for periodic reports (hourly billing summaries), sliding for trend detection (moving averages that update smoothly), and session for activity analysis (user sessions, machine operation cycles). A 10-second sliding window with 1-second advance uses 10x the memory of a tumbling window of the same size.

Why does sliding window memory explode? A 10-second sliding window advancing every 1 second maintains 10 overlapping windows simultaneously. For 1,000 sensors at 10 Hz (10 readings/sec):

\[\text{Tumbling (10s)}: 1{,}000 \times 10\text{ Hz} \times 10\text{ s} = 100{,}000\text{ readings stored}\]

\[\text{Sliding (10s / 1s)}: 100{,}000 \times 10\text{ overlap factor} = 1{,}000{,}000\text{ readings stored}\]

At 100 bytes per reading: tumbling = 10 MB, sliding = 100 MB (10x memory cost). Scale to 10,000 sensors: tumbling = 100 MB vs sliding = 1 GB. Choose window size carefully based on RAM constraints.
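The arithmetic above can be wrapped in a small helper for exploring other configurations (a sketch of the chapter's formula, assuming the slide interval evenly divides the window size and ignoring state-store overhead):

```python
def window_memory_bytes(sensors, hz, window_s, slide_s, bytes_per_reading=100):
    """Raw buffered-reading estimate: readings per window times the
    number of concurrently overlapping windows."""
    readings_per_window = sensors * hz * window_s
    overlap = window_s // slide_s   # tumbling: slide == window -> overlap = 1
    return readings_per_window * overlap * bytes_per_reading

# 1,000 sensors at 10 Hz, 100-byte readings:
tumbling = window_memory_bytes(1_000, 10, 10, 10)  # no overlap
sliding  = window_memory_bytes(1_000, 10, 10, 1)   # 10x overlap
print(tumbling // 1_000_000, "MB")  # 10 MB
print(sliding // 1_000_000, "MB")   # 100 MB
```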

3.4.2 Windowing Strategies

With the need to bound infinite streams established, let’s examine the three primary windowing strategies in detail: tumbling, sliding, and session windows. Each makes different tradeoffs between memory usage, latency, and analytical accuracy.

3.4.2.1 Tumbling Windows

Non-overlapping, fixed-size windows that cover the entire stream without gaps.

Use Case: Calculate average temperature every 5 minutes

Diagram showing non-overlapping tumbling windows in 5-minute intervals, with sensor events assigned to exactly one window based on their timestamp, creating distinct time buckets for periodic aggregation

Figure 3.9: Tumbling windows showing non-overlapping 5-minute intervals with events assigned to single windows based on timestamp

Characteristics:

  • Each event belongs to exactly one window
  • Simple to implement and reason about
  • Fixed computation frequency
  • Best for periodic aggregations

3.4.2.2 Sliding Windows

Overlapping windows that slide by a specified interval, smaller than the window size.

Use Case: Calculate moving average of last 10 minutes, updated every 1 minute

Diagram showing overlapping sliding windows with 10-minute size advancing every 1 minute, illustrating how a single event at time T appears in 10 consecutive windows to enable smooth moving average calculations

Figure 3.10: Sliding windows showing overlapping 10-minute windows advancing by 1-minute increments with single event appearing in multiple windows

Characteristics:

  • Events may belong to multiple overlapping windows
  • Provides smoother, continuous results
  • Higher computational cost (more windows to maintain)
  • Best for trend detection and moving averages

3.5 Tradeoff: Tumbling Windows vs Sliding Windows for IoT Aggregation

Option A: Tumbling Windows (Non-overlapping)

  • Memory usage: Low - only 1-2 windows active per key at any time
  • Memory for 10K sensors, 5-min window: ~200 MB aggregate state (1 window of running stats per sensor, ~20 KB each)
  • Computational cost: O(n) - each event processed once
  • Output frequency: Fixed (every window_size interval)
  • Latency to insight: Up to window_size delay for boundary events
  • Use cases: Periodic reports (hourly averages), batch-like aggregations, memory-constrained edge devices

Option B: Sliding Windows (Overlapping)

  • Memory usage: High - (window_size / slide_interval) windows active per key
  • Memory for 10K sensors, 5-min window, 30s slide: ~2 GB aggregate state (10 overlapping windows per sensor)
  • Computational cost: O(n * overlap_factor) - each event processed multiple times
  • Output frequency: Every slide_interval (more granular)
  • Latency to insight: Maximum slide_interval delay
  • Use cases: Real-time anomaly detection, moving averages, dashboards requiring smooth updates

Decision Factors:

  • Choose Tumbling when: Memory is constrained (edge devices, many keys), exact aggregation boundaries matter (hourly billing), processing cost must be minimized, output frequency matches business requirements
  • Choose Sliding when: Smooth real-time visualization needed, anomaly detection requires continuous evaluation, users expect frequent dashboard updates (every 30s), memory budget allows 5-10x higher usage
  • Hybrid approach: Tumbling for storage (persist hourly aggregates), sliding for real-time display (smooth visualization) - Flink supports both on the same stream

3.5.0.1 Session Windows

Dynamic windows based on activity periods, separated by gaps of inactivity.

Use Case: Group all sensor readings from a machine during an active work session

Diagram showing variable-length session windows where events are grouped based on activity bursts, with 5-minute inactivity gaps closing one session and starting a new one, creating three distinct sessions of different durations

Figure 3.11: Session windows showing dynamic window sizes based on activity with 5-minute inactivity timeout separating three distinct sessions

Characteristics:

  • Window size varies based on data patterns
  • Automatically adapts to activity levels
  • Ideal for user behavior and machine operation cycles
  • Best for analyzing complete work sessions or usage periods
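These characteristics can be captured in a few lines, in the same style as the TumblingWindow and SlidingWindow examples later in this section. This is an illustrative sketch (class name and output fields are ours): a new session starts whenever the gap since the previous reading exceeds the inactivity timeout.

```python
class SessionWindow:
    """Gap-based session window: readings separated by more than
    gap_seconds of inactivity fall into different sessions."""
    def __init__(self, gap_seconds):
        self.gap_seconds = gap_seconds
        self.buffer = []

    def process(self, reading):
        result = None
        if (self.buffer and
                reading["event_time"] - self.buffer[-1]["event_time"] > self.gap_seconds):
            result = self._emit()   # inactivity gap: close the previous session
            self.buffer = []
        self.buffer.append(reading)
        return result

    def _emit(self):
        return {"events": len(self.buffer),
                "duration": self.buffer[-1]["event_time"] - self.buffer[0]["event_time"]}

w = SessionWindow(gap_seconds=300)   # 5-minute inactivity timeout
sessions = []
for t in [0, 60, 120, 1000, 1030]:   # the 880 s gap splits the stream
    out = w.process({"event_time": t})
    if out:
        sessions.append(out)
print(sessions)   # [{'events': 3, 'duration': 120}]
```

Note that, unlike tumbling and sliding windows, the window boundaries here are determined entirely by the data, which is why session state is harder to budget for in advance.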

Key Insight: Windows Bound Infinite Streams
Conceptual diagram showing how windowing transforms an infinite continuous stream of sensor data into finite, bounded chunks that can be aggregated, with arrows illustrating the transformation from unbounded to windowed data
Figure 3.12: Windowing Transforms Unbounded Streams into Finite Aggregatable Chunks

In Plain English: You can’t compute “average temperature” over infinite data - it never ends! Windows create finite chunks you can actually process. “Average temperature in the last 5 minutes” is computable.

Comparison chart showing three stream processing window types side by side - tumbling windows as fixed non-overlapping blocks producing periodic results, sliding windows as overlapping intervals for smooth trend detection, and session windows as variable-length periods defined by activity gaps
Figure 3.13: Window type comparison: Tumbling (fixed non-overlapping), Sliding (fixed overlapping for smooth trends), Session (variable gap-based for activity bursts)
IoT sensor windowing example showing temperature readings flowing into three window types: 5-minute tumbling window producing periodic average reports for dashboard, 5-minute sliding window with 1-minute advance detecting temperature trends for anomaly alerts, and session window tracking machine operation periods for audit logging.
Figure 3.14: IoT temperature sensor windowing example: Tumbling for periodic averages, Sliding for trend detection, Session for machine operation analysis
Stream processing late data handling sequence showing normal event flow arriving on time versus late arrivals due to network delay. When events arrive after window close time, they either update the window during the grace period (triggering aggregate recomputation) or are dropped/sent to side output if grace period has expired.
Figure 3.15: Late data handling: Events arriving after window close can update results during grace period or be dropped/side-output if grace expires
Figure 3.16: Decision tree for selecting the appropriate window type based on use case requirements
Figure 3.17: Memory footprint comparison showing how sliding windows consume significantly more memory than tumbling windows due to overlap

Objective: Simulate real-time IoT sensor stream processing in plain Python. This demonstrates tumbling windows (non-overlapping aggregation for periodic reports), sliding windows (overlapping for smooth trend detection), and event-time vs processing-time semantics. All concepts map directly to Apache Flink/Kafka Streams equivalents.

Tumbling Window – non-overlapping, fixed-size windows where each reading belongs to exactly one window:

from collections import deque  # used by the SlidingWindow example below

class TumblingWindow:
    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.current_start = None
        self.buffer = []

    def process(self, reading):
        event_time = reading["event_time"]
        # Align the reading to the start of its window
        window_start = int(event_time / self.window_seconds) * self.window_seconds
        if self.current_start is not None and window_start != self.current_start:
            # Window boundary crossed: emit the closed window, start a new one
            result = self._emit()
            self.buffer = [reading]
            self.current_start = window_start
            return result
        self.current_start = window_start
        self.buffer.append(reading)
        return None

    def _emit(self):
        temps = [r["temperature"] for r in self.buffer]
        return {"count": len(temps),
                "avg_temp": round(sum(temps) / len(temps), 1),
                "min_temp": min(temps), "max_temp": max(temps)}

Sliding Window – overlapping windows with configurable slide interval; each reading can appear in multiple windows:

class SlidingWindow:
    def __init__(self, window_seconds, slide_seconds):
        self.window_seconds = window_seconds
        self.slide_seconds = slide_seconds
        self.buffer = deque()
        self.last_emit = 0

    def process(self, reading):
        self.buffer.append(reading)
        now = reading["event_time"]
        # Evict readings older than window
        cutoff = now - self.window_seconds
        while self.buffer and self.buffer[0]["event_time"] < cutoff:
            self.buffer.popleft()
        if now - self.last_emit >= self.slide_seconds:
            self.last_emit = now
            temps = [r["temperature"] for r in self.buffer]
            return {"readings": len(temps),
                    "moving_avg": round(sum(temps)/len(temps), 1)}
        return None

Key difference: Tumbling windows buffer one window’s worth of data with no overlap. Sliding windows maintain window_size / slide_interval overlapping windows, so engines that materialize each window buffer proportionally more readings, in exchange for smoother trends. In production, Flink/Kafka Streams manage this automatically.

What to Observe:

  1. Tumbling windows produce one result per 5-second interval – each reading counted exactly once (ideal for billing, periodic reports)
  2. Sliding windows update every 2 seconds with a 10-second lookback – readings appear in up to 5 overlapping windows (ideal for smooth dashboards, anomaly detection)
  3. The sliding window uses ~5x more memory than the tumbling window because of the overlap factor (10s / 2s = 5)
  4. Event time is when the sensor measured, arrival time includes network delay – using event time ensures correct aggregation even with delayed readings
  5. Late-arriving data (readings that arrive after their window closed) would need a grace period or side-output in production systems
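Observation 5 can be sketched as follows. This is an illustrative grace-period ("allowed lateness") window, not a framework API: the max-event-time variable stands in for a watermark, and in Flink the same idea corresponds to allowed lateness with a side output for too-late events.

```python
class GracefulWindow:
    """Keep closed windows updatable for grace_seconds past their
    close time; route anything later to a side output."""
    def __init__(self, window_seconds, grace_seconds):
        self.window_seconds = window_seconds
        self.grace_seconds = grace_seconds
        self.windows = {}        # window_start -> list of readings
        self.side_output = []    # readings too late to include
        self.max_event_time = 0  # stand-in for a watermark

    def process(self, reading):
        t = reading["event_time"]
        self.max_event_time = max(self.max_event_time, t)
        window_start = int(t / self.window_seconds) * self.window_seconds
        window_close = window_start + self.window_seconds
        if self.max_event_time > window_close + self.grace_seconds:
            # Grace period expired for this window: side output, don't update
            self.side_output.append(reading)
        else:
            self.windows.setdefault(window_start, []).append(reading)

w = GracefulWindow(window_seconds=10, grace_seconds=5)
for t in [3, 8, 12, 7, 21, 2]:   # 7 is late but within grace; 2 is too late
    w.process({"event_time": t})
```

After this run, the reading at t=7 is retroactively included in the [0, 10) window (its result would be recomputed), while the reading at t=2 lands in the side output because the stream has already advanced past the window's close plus grace.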

3.5.1 Interactive: Window Memory Calculator

Use this calculator to explore how windowing parameters affect memory requirements for your IoT stream processing deployment. Adjust the sliders to match your scenario and observe how memory scales with different window strategies.

3.5.2 Worked Example: Kafka Windowing for Industrial Sensor Analytics

Real-world stream processing requires careful window design to balance memory usage, latency, and analytical accuracy. This example demonstrates how to calculate resource requirements for different windowing strategies in a high-throughput industrial IoT scenario.

Worked Example: Stream Processing Window Design

Context: A smart factory with 10,000 sensors monitoring industrial equipment for predictive maintenance and anomaly detection.

Given:

| Parameter | Value |
|-----------|-------|
| Number of sensors | 10,000 devices |
| Message rate per sensor | 100 messages/second |
| Total message rate | 1,000,000 messages/second (1M msg/sec) |
| Message size (JSON payload) | 100 bytes |
| Analytics goal | Detect anomalies within 10-second windows |
| Available memory per Kafka Streams instance | 16 GB |
| Required latency | <10 seconds for anomaly detection |

Problem: Design an appropriate windowing strategy that fits within memory constraints while meeting latency requirements.


3.5.3 Step 1: Calculate Base Data Rate

First, determine how much data flows through the system:

\[\text{Data Rate} = \text{Messages/sec} \times \text{Message Size}\] \[\text{Data Rate} = 1,000,000 \times 100 \text{ bytes} = 100 \text{ MB/sec}\]

For a 10-second window: \[\text{Data per Window} = 100 \text{ MB/sec} \times 10 \text{ sec} = 1 \text{ GB}\]

Key Insight: Each 10-second window must buffer approximately 1 GB of sensor data before computing aggregations.
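The same calculation as executable Python, using the numbers from the Given table:

```python
sensors = 10_000      # devices
rate_hz = 100         # messages/second per sensor
msg_bytes = 100       # JSON payload size
window_s = 10         # anomaly-detection window

data_rate = sensors * rate_hz * msg_bytes   # bytes/second through the system
data_per_window = data_rate * window_s      # bytes buffered per 10 s window
print(data_rate // 1_000_000, "MB/sec")        # 100 MB/sec
print(data_per_window // 1_000_000_000, "GB")  # 1 GB
```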


3.5.4 Step 2: Analyze Tumbling Window Memory Requirements

Tumbling windows are non-overlapping, so at any moment we maintain:

  • Current window: Actively receiving events (1 GB)
  • Processing window: Computing aggregations from the just-closed window (1 GB)

\[\text{Memory (Tumbling)} = 2 \times \text{Window Size Data} = 2 \times 1 \text{ GB} = 2 \text{ GB}\]

| Metric | Value | Status |
|---|---|---|
| Active windows | 2 (current + processing) | |
| Memory per window | 1 GB | |
| Total memory required | 2 GB | Within 16 GB budget |
| Maximum latency | 10 seconds (window size) | Meets requirement |
| Output frequency | Every 10 seconds | |

3.5.5 Step 3: Analyze Sliding Window Memory Requirements

Sliding windows overlap, with each event potentially belonging to multiple windows. For a 10-second window with 1-second slide:

\[\text{Overlapping Windows} = \frac{\text{Window Size}}{\text{Slide Interval}} = \frac{10 \text{ sec}}{1 \text{ sec}} = 10 \text{ windows}\]

Each event appears in 10 different windows simultaneously:

\[\text{Memory (Sliding 1s)} = 10 \times 1 \text{ GB} = 10 \text{ GB}\]

| Slide Interval | Overlapping Windows | Memory Required | Status |
|---|---|---|---|
| 1 second | 10 | 10 GB | Within 16 GB budget |
| 2 seconds | 5 | 5 GB | Acceptable |
| 5 seconds | 2 | 2 GB | Optimal |
| 10 seconds | 1 | 1 GB | Same as tumbling |
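The overlap and memory figures above follow from two short formulas. A minimal sketch, using illustrative names rather than a framework API:

```java
// Sketch: memory required by overlapping (hopping) windows.
// overlap = window size / slide interval; memory = overlap x bytes per window.
public class WindowMemory {
    public static int overlappingWindows(int windowSec, int slideSec) {
        return windowSec / slideSec;               // e.g. 10s window / 1s slide = 10
    }

    public static long memoryBytes(long bytesPerWindow, int overlappingWindows) {
        return bytesPerWindow * overlappingWindows;
    }

    public static void main(String[] args) {
        long perWindow = 1_000_000_000L;           // 1 GB per window from Step 1
        for (int slide : new int[]{1, 2, 5, 10}) {
            int n = overlappingWindows(10, slide);
            System.out.println(slide + "s slide -> " + n + " windows, "
                    + memoryBytes(perWindow, n) / 1_000_000_000L + " GB");
        }
    }
}
```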

Critical Finding: With a 1-second slide, we use 10 GB of the 16 GB available (62.5% utilization). This leaves limited headroom for:

  • State store overhead (~20%)
  • Garbage collection spikes
  • Traffic bursts


3.5.6 Step 4: Calculate Optimized Configuration

For production safety, target 50% memory utilization:

\[\text{Target Memory} = 0.5 \times 16 \text{ GB} = 8 \text{ GB}\]

For sliding windows:

\[\text{Max Overlapping Windows} = \frac{8 \text{ GB}}{1 \text{ GB/window}} = 8 \text{ windows}\]

\[\text{Optimal Slide} = \frac{10 \text{ sec}}{8} \approx 1.25 \text{ sec} \rightarrow \text{Round to } 2 \text{ sec}\]

With a 2-second slide:

  • Overlapping windows: 5
  • Memory: 5 GB (31% utilization)
  • Latency: 2 seconds for updated results
  • Safety margin: 11 GB for spikes
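The optimization above reduces to three arithmetic steps, sketched here in an illustrative helper class:

```java
// Sketch: derive the slide interval that keeps memory under a target budget.
// Plain arithmetic; names are illustrative, not a framework API.
public class SlideOptimizer {
    // Production target, e.g. 50% of a 16 GB instance = 8 GB
    public static long targetMemoryBytes(long totalMemoryBytes, double utilization) {
        return (long) (totalMemoryBytes * utilization);
    }

    // How many overlapping windows fit in the target, e.g. 8 GB / 1 GB = 8
    public static int maxOverlappingWindows(long targetBytes, long bytesPerWindow) {
        return (int) (targetBytes / bytesPerWindow);
    }

    // Smallest slide that respects the overlap limit, e.g. 10 s / 8 = 1.25 s
    public static double minSlideSeconds(int windowSec, int maxWindows) {
        return (double) windowSec / maxWindows;
    }
}
```

Rounding the 1.25 s result up to a 2 s slide gives 5 overlapping windows and the 5 GB figure above.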


3.5.7 Step 5: Consider Session Windows Alternative

For sensors with irregular reporting patterns (e.g., motion-activated sensors), session windows may be more efficient:

| Scenario | Active Sessions | Events/Session | Memory |
|---|---|---|---|
| All sensors active | 10,000 | Variable | ~1-5 GB |
| 50% sensors active | 5,000 | Variable | ~0.5-2.5 GB |
| Sparse events (10%) | 1,000 | Variable | ~100 MB |

Session windows are ideal when activity is bursty rather than continuous, which significantly reduces average memory usage compared with fixed windows.
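The session-forming rule (a new session starts whenever the inactivity gap is exceeded) can be sketched in a few lines. This is an illustrative simplification of what a session-window implementation does, not a Kafka Streams API:

```java
// Sketch: group sorted event timestamps into sessions separated by an
// inactivity gap, mimicking session-window formation. Illustrative only.
public class SessionGrouper {
    // Returns the number of sessions when consecutive events more than
    // gapSeconds apart start a new session. Timestamps must be sorted.
    public static int countSessions(long[] sortedTimestampsSec, long gapSeconds) {
        if (sortedTimestampsSec.length == 0) return 0;
        int sessions = 1;
        for (int i = 1; i < sortedTimestampsSec.length; i++) {
            if (sortedTimestampsSec[i] - sortedTimestampsSec[i - 1] > gapSeconds) {
                sessions++;  // inactivity gap exceeded: previous session closes
            }
        }
        return sessions;
    }
}
```

For example, a motion sensor emitting at seconds 0, 1, 2, 10, 11, and 30 with a 5-second gap forms three sessions; memory is only held for sessions that are currently open.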


3.5.8 Final Answer: Window Strategy Comparison

| Strategy | Slide | Memory | Latency | Use Case | Recommendation |
|---|---|---|---|---|---|
| Tumbling | 10s | 2 GB | 10s max | Periodic batch analytics | Simple, predictable |
| Sliding (1s) | 1s | 10 GB | 1s | Real-time dashboards | High memory cost |
| Sliding (2s) | 2s | 5 GB | 2s | Balanced monitoring | Recommended |
| Sliding (5s) | 5s | 2 GB | 5s | Moderate latency OK | Memory-efficient |
| Session | Variable | 0.1-5 GB | Gap-based | Sparse/bursty events | Event-driven scenarios |

3.5.9 Implementation in Kafka Streams

// Tumbling window: 10-second non-overlapping windows
KTable<Windowed<String>, SensorStats> tumblingStats = sensorStream
    .groupByKey()
    .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofSeconds(10)))
    .aggregate(
        SensorStats::new,
        (key, value, stats) -> stats.update(value),
        Materialized.<String, SensorStats, WindowStore<Bytes, byte[]>>as("tumbling-store")
            .withValueSerde(sensorStatsSerde)
    );

// Hopping window: 10-second window, 2-second advance (optimized)
// Note: Kafka Streams calls advance-based overlapping windows "hopping windows"
// using TimeWindows.advanceBy(), not SlidingWindows (which is event-driven)
KTable<Windowed<String>, SensorStats> hoppingStats = sensorStream
    .groupByKey()
    .windowedBy(TimeWindows.ofSizeAndGrace(
        Duration.ofSeconds(10),    // Window size
        Duration.ofSeconds(30))    // Grace period for late data
        .advanceBy(Duration.ofSeconds(2)))  // 2-second hop interval
    .aggregate(
        SensorStats::new,
        (key, value, stats) -> stats.update(value),
        Materialized.<String, SensorStats, WindowStore<Bytes, byte[]>>as("hopping-store")
            .withValueSerde(sensorStatsSerde)
            .withCachingEnabled()  // Reduce state store writes
    );

// Session window: 5-second inactivity gap closes session
KTable<Windowed<String>, SensorStats> sessionStats = sensorStream
    .groupByKey()
    .windowedBy(SessionWindows.ofInactivityGapWithNoGrace(Duration.ofSeconds(5)))
    .aggregate(
        SensorStats::new,
        (key, value, stats) -> stats.update(value),
        (key, stats1, stats2) -> stats1.merge(stats2),  // Merge sessions
        Materialized.<String, SensorStats, SessionStore<Bytes, byte[]>>as("session-store")
            .withValueSerde(sensorStatsSerde)
    );

3.5.10 Key Insights

  1. Memory scales linearly with overlap: Sliding windows with N overlaps require N times the memory of tumbling windows

  2. Latency-memory trade-off: Faster updates (smaller slide) require more memory for concurrent windows

  3. Production headroom: Target 50% memory utilization to handle traffic spikes and GC overhead

  4. Session windows excel for sparse data: When sensors report irregularly, session windows dramatically reduce memory versus fixed windows

  5. Kafka configuration matters:

    • state.dir should use fast SSD storage for RocksDB
    • cache.max.bytes.buffering controls memory for state store caching
    • num.stream.threads should match available CPU cores
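A minimal sketch of these settings as a Kafka Streams `Properties` object. The property keys are real Kafka Streams configuration names (note that `cache.max.bytes.buffering` has been superseded by `statestore.cache.max.bytes` in newer releases); the application id, broker address, and values are placeholder examples for this scenario, not recommendations:

```java
import java.util.Properties;

// Sketch: configuration knobs from the key insights above.
// Keys are Kafka Streams config names; values are scenario placeholders.
public class StreamsConfigExample {
    public static Properties tunedConfig() {
        Properties props = new Properties();
        props.put("application.id", "factory-anomaly-detector");  // example id
        props.put("bootstrap.servers", "broker:9092");            // example address
        props.put("state.dir", "/mnt/ssd/kafka-streams");         // RocksDB on fast SSD
        props.put("cache.max.bytes.buffering", "536870912");      // 512 MB state cache
        props.put("num.stream.threads", "8");                     // match CPU cores
        return props;
    }
}
```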

3.6 How It Works: Event-Time Window Processing Step-by-Step

Understanding how stream processors handle event-time windows with late data:

Step 1: Assign Event Timestamps

// Extract event time from payload (Apache Flink Java API)
WatermarkStrategy<SensorReading> strategy = WatermarkStrategy
    .<SensorReading>forBoundedOutOfOrderness(Duration.ofSeconds(10))
    .withTimestampAssigner((reading, timestamp) -> reading.getEventTime());

Step 2: Assign Windows Based on Event Time

  • Event with timestamp 14:00:05 goes into window [14:00:00 - 14:00:10) for 10-second tumbling windows
  • Multiple events may arrive out of order but are bucketed by their event time
  • Key insight: Window assignment uses event time, not arrival time

Step 3: Track Watermarks

Watermark = max(event_time_seen) - max_out_of_orderness
  • If latest event has timestamp 14:00:30 and max out-of-orderness is 10s, watermark is 14:00:20
  • Watermark advancing to 14:00:10 means “I believe all events before 14:00:10 have arrived”
  • Windows close when watermark passes their end time
  • Note: “max out-of-orderness” (watermark delay) and “allowed lateness” are different concepts – the former controls watermark position, the latter controls how long after window close late events are still accepted
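The watermark rule above can be captured in a few lines. This is an illustrative tracker, not the Flink API, which manages watermarks internally:

```java
// Sketch: watermark = max event time seen - max out-of-orderness bound.
// Illustrative class; real stream processors maintain this per partition.
public class WatermarkTracker {
    private long maxEventTimeMs = Long.MIN_VALUE;
    private final long maxOutOfOrdernessMs;

    public WatermarkTracker(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
    }

    // Called for every arriving event; only the maximum timestamp matters
    public void observe(long eventTimeMs) {
        maxEventTimeMs = Math.max(maxEventTimeMs, eventTimeMs);
    }

    public long currentWatermarkMs() {
        return maxEventTimeMs - maxOutOfOrdernessMs;
    }
}
```

With a 10 s bound, observing an event at 14:00:30 advances the watermark to 14:00:20; a later out-of-order event at 14:00:05 does not move it backward.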

Step 4: Handle Late Arrivals

  • Event arrives with timestamp 14:00:05 when watermark is already 14:00:20
  • Window [14:00:00 - 14:00:10) already closed and emitted results
  • Late event goes to side output or triggers result update (depending on allowed-lateness config)
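The routing decision for an arriving event can be sketched as a pure function of the window end, the current watermark, and the allowed lateness. Names here are illustrative; real frameworks make this decision internally:

```java
// Sketch: classify an event relative to its window and the watermark.
// ON_TIME: window still open; LATE_UPDATE: within allowed lateness;
// SIDE_OUTPUT: too late, diverted for separate handling.
public class LateDataRouter {
    public enum Route { ON_TIME, LATE_UPDATE, SIDE_OUTPUT }

    public static Route route(long windowEndMs, long watermarkMs, long allowedLatenessMs) {
        if (watermarkMs < windowEndMs) return Route.ON_TIME;
        if (watermarkMs < windowEndMs + allowedLatenessMs) return Route.LATE_UPDATE;
        return Route.SIDE_OUTPUT;
    }
}
```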

Step 5: Emit Window Results

  • Window closes when watermark >= window_end_time
  • Results emitted: {window: [14:00:00-14:00:10), count: 47, avg: 22.3°C}
  • If allowed-lateness permits, late events can trigger updated results

Why This Matters: Processing-time windowing (based on arrival) is simpler but produces incorrect results when network delays cause events to arrive out of order. Event-time windowing with watermarks ensures correctness at the cost of bounded latency.


3.8 Summary

Stream processing enables IoT systems to analyze data in motion rather than at rest, which is essential when insight value degrades over time. The key concepts covered in this chapter include:

  • Batch vs stream processing: Batch optimizes for throughput and completeness; stream optimizes for latency and continuous insight. Lambda Architecture combines both.
  • Event time vs processing time: Event time reflects when data was generated; processing time reflects when it was received. Event-time semantics ensure correct results despite network delays.
  • Windowing strategies: Tumbling windows (non-overlapping, low memory, periodic reports), sliding/hopping windows (overlapping, higher memory, smooth trend detection), and session windows (activity-based, variable size, bursty workloads).
  • Memory-latency tradeoff: Sliding windows with overlap factor N require N times the memory of tumbling windows. Production systems should target 50% memory utilization for headroom.
  • Watermarks and late data: Watermarks track event-time progress to determine when windows can close. Late events (arriving after window close) are handled via side outputs or allowed-lateness policies.

The worked example demonstrated that a 10,000-sensor factory at 1M messages/second generates 1 GB per 10-second window, making window strategy selection a critical design decision with direct memory and latency consequences.

3.9 Concept Relationships

Stream processing fundamentals connect across IoT domains: windowing, watermarks, and event-time semantics underpin the architecture, pipeline, and challenges material covered later in this module.

These connections show streaming as infrastructure, not an isolated technique.

3.11 Common Pitfalls

Developers familiar with batch processing often apply batch patterns (read all data, process, write results) to streaming — buffering events until a large batch accumulates before processing. This introduces unnecessary latency (minutes to hours) for problems that could be solved event-by-event with millisecond latency. Identify stream processing problems by their nature (continuous data, need for immediate response) and use streaming patterns from the start.

Using system time (when the processor receives the event) instead of event time (when the sensor generated it) produces incorrect temporal analytics. A temperature aggregation using processing time will show wrong results when events arrive out of order or delayed. Always use event timestamps embedded in sensor data for time-based calculations.

Streaming pipelines maintaining state (running averages, session tracking, anomaly detection models) lose all state on failure without checkpointing. Recovering from failure by reprocessing all historical data may take hours for data streams that have been running for months. Implement incremental checkpointing to limit recovery time to seconds or minutes.

Consumer lag (how far behind the stream processor is from the latest events) is the leading indicator of streaming pipeline degradation. A growing lag means the processor is falling behind the ingestion rate and will eventually drop events or run indefinitely behind real-time. Set up lag alerting with a threshold of 10,000 events or 60 seconds before the pipeline becomes unacceptably delayed.
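The alerting rule described here reduces to a simple predicate, sketched with the thresholds from the text (illustrative names, not a monitoring API):

```java
// Sketch: consumer-lag alerting rule from the pitfall above.
// Fires when the processor falls more than 10,000 events or 60 seconds behind.
public class LagAlert {
    public static boolean shouldAlert(long lagEvents, long lagSeconds) {
        return lagEvents > 10_000 || lagSeconds > 60;
    }
}
```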

3.12 What’s Next

| If you want to… | Read this |
|---|---|
| Learn stream processing architecture patterns | Stream Processing Architectures |
| Understand specific streaming challenges | Handling Stream Processing Challenges |
| Build streaming pipelines step by step | Building IoT Streaming Pipelines |
| Get hands-on with stream processing in a lab | Lab: Stream Processing |

Continue to Stream Processing Architectures to learn about Apache Kafka, Apache Flink, and Spark Streaming, including architecture comparisons and technology selection guidance.