16  Wi-Fi and Sensing Privacy

16.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Identify Wi-Fi Privacy Leaks: Explain how Wi-Fi probe requests and MAC addresses enable tracking
  • Explain Sensing De-anonymization: Describe how motion sensor data creates unique behavioral fingerprints
  • Assess MAC Randomization: Evaluate the effectiveness of MAC address randomization as a privacy defense
  • Recognize Side-Channel Attacks: Analyze how motion sensors enable keystroke and activity inference
In 60 Seconds

Wi-Fi probe requests broadcast device MAC addresses, so passive listeners can track devices even without network association. Combined with accelerometer, gyroscope, and ambient sensor data, Wi-Fi observations create behavioral fingerprints that re-identify individuals even in supposedly anonymous datasets. Defending against Wi-Fi tracking requires MAC address randomization, probe suppression, and user awareness.

Key Concepts

  • Wi-Fi Probe Requests: Broadcast messages sent by Wi-Fi-enabled devices searching for known networks; contain device MAC address and historically remembered SSIDs, enabling passive tracking.
  • MAC Address Randomization: Privacy feature in modern operating systems rotating the MAC address used in probe requests; reduces passive tracking effectiveness.
  • Behavioral Fingerprinting: Creating unique individual profiles from sensor data patterns (Wi-Fi probe history, accelerometer signatures, app usage timing) enabling re-identification.
  • Passive Tracking Infrastructure: Networks of sniffing sensors deployed in retail, transit, and public spaces capturing Wi-Fi probes and Bluetooth advertisements for movement analytics.
  • Sensor Fusion for Identification: Combining multiple ambient sensor streams to create unique behavioral signatures more reliable than any single sensor for individual identification.
  • De-anonymization: Process of linking supposedly anonymous data back to specific individuals using auxiliary information or behavioral patterns.
  • Sensing Privacy Attack: Adversarial technique using ambient sensing infrastructure (smart speakers, cameras with microphones, Wi-Fi sensing) to infer sensitive information about co-located individuals.

Privacy and compliance for IoT are about protecting people’s personal information and following the laws that govern data collection. Think of it like the rules a doctor follows to keep medical records confidential. IoT devices in homes, workplaces, and public spaces collect sensitive data about people’s lives, and there are strict requirements about how this data must be handled.

“Did you know that your phone shouts its name to every Wi-Fi access point nearby?” Sammy the Sensor revealed. “It sends out probe requests saying ‘Hey, are you my favorite coffee shop Wi-Fi?’ And anyone listening can track your movements based on these signals!”

Max the Microcontroller explained further. “Wi-Fi sensing goes even deeper. By analyzing how Wi-Fi signals bounce off objects – including people – researchers can detect human presence, count people in a room, track movements, and even recognize gestures. All without cameras! The Wi-Fi router in your home could potentially tell if someone is sleeping, walking, or cooking.”

“Motion sensors on your phone are also privacy risks,” Lila the LED added. “Accelerometers and gyroscopes can reveal your walking pattern, which is unique like a fingerprint. Researchers have even shown that phone sensors can pick up what you are typing on a nearby keyboard through vibration analysis!”

“The defense is MAC address randomization, which modern phones now support,” Bella the Battery said. “Instead of broadcasting the same identifier everywhere, your phone uses a random fake address each time. But some apps and networks find ways around this. The key takeaway: sensors are everywhere, and they can reveal more about you than you might think!”

16.2 Prerequisites

Before diving into this chapter, you should be familiar with:

  • Knowledge Gaps Tracker: Common confusion points include assuming MAC randomization prevents tracking (probe requests still leak configured SSIDs). Document your gaps here for targeted review.

  • Networking foundations (Bluetooth, Wi-Fi): Understand how wireless protocols expose device identifiers and enable tracking

16.3 Introduction

Beyond GPS location, mobile devices leak privacy through Wi-Fi and Bluetooth signals. Even without connecting to networks, devices continuously broadcast probe requests containing unique identifiers. Motion sensors provide another tracking vector through behavioral fingerprinting.

16.4 Wi-Fi-Based Privacy Leaks

Wi-Fi connections reveal sensitive information:

  • MAC Address: Permanent device identifier, enables tracking across locations
  • WLAN Fingerprints: Nearby visible access points create a location signature (even without connecting)
  • Preferred Network Lists: SSIDs stored on the device are broadcast in probe requests, revealing frequented locations
  • Social Relationships: Shared configured networks between devices indicate social connections
Figure 16.1: Wi-Fi Probe Request Privacy Attacks: Device Tracking and Social Graph Inference. [Diagram: probe requests from mobile devices expose MAC addresses, configured SSID lists, and timing patterns, enabling device tracking, social graph inference, and location history reconstruction by passive observers.]

16.4.1 Wi-Fi Attack Scenarios

  1. Retail Tracking: Stores track MAC addresses to analyze foot traffic
  2. Social Graph Inference: Shared Wi-Fi configs reveal family, coworkers
  3. Location History: Scanned network list reveals travel patterns
  4. De-anonymization: MAC address + Wi-Fi fingerprint = unique identifier
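
The social graph scenario above can be sketched as a set-overlap computation. This is a minimal illustration, assuming each observed device has already been reduced to the set of SSIDs seen in its probe requests. The device labels, SSIDs, and the 0.3 threshold are all hypothetical.

```python
from itertools import combinations

def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two SSID sets: |A & B| / |A | B| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def likely_links(devices: dict[str, set[str]], threshold: float = 0.3):
    """Return (device_a, device_b, score) for pairs whose SSID overlap meets the threshold."""
    links = []
    for (name_a, ssids_a), (name_b, ssids_b) in combinations(sorted(devices.items()), 2):
        score = jaccard(ssids_a, ssids_b)
        if score >= threshold:
            links.append((name_a, name_b, round(score, 2)))
    return links

# Hypothetical observations; every name here is invented for illustration.
observed = {
    "device_1": {"SmithFamily", "AcmeCorp-Staff", "CityGym"},
    "device_2": {"SmithFamily", "CityGym", "UniCampus"},
    "device_3": {"AirportFree", "HotelGuest"},
}

print(likely_links(observed))  # [('device_1', 'device_2', 0.5)]
```

Distinctive SSIDs carry most of the signal in a comparison like this, which is one reason generic network names leak less than names such as a family surname.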

16.5 MAC Address Randomization

Modern devices attempt to protect privacy by randomizing MAC addresses. However, this defense has significant limitations.

16.5.1 Why MAC Randomization Fails

MAC Randomization Limitations
  1. Probe request timing: Even with a randomized MAC, probe request patterns (timing, order, RSSI) fingerprint devices (87% re-identification accuracy)
  2. SSID leakage: A device using a randomized MAC may still broadcast its configured SSID list, and unique network combinations identify users
  3. Association fallback: Many devices revert to a persistent per-network MAC upon connection, enabling cross-session tracking on that network
  4. Bluetooth co-tracking: BLE address randomization is not synchronized with Wi-Fi randomization, so correlated signals de-anonymize devices
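
The SSID leakage limitation can be sketched in a few lines. Randomized addresses set the locally administered bit (0x02 in the first octet), so an observer can tell which addresses are random, then link them when they probe for the same SSID combination. The probe records, MAC values, and the `min_ssids` cutoff below are hypothetical.

```python
from collections import defaultdict

def is_locally_administered(mac: str) -> bool:
    """True if the locally administered bit (0x02 of the first octet) is set,
    which is how randomized MAC addresses are marked."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)

def link_by_ssid_set(probes, min_ssids: int = 3):
    """Group source MACs that probed for the same SSID combination.

    probes: iterable of (mac, frozenset_of_ssids) pairs.
    Returns only the combinations seen from more than one MAC.
    """
    groups = defaultdict(set)
    for mac, ssids in probes:
        if len(ssids) >= min_ssids:  # small sets are too common to be distinctive
            groups[ssids].add(mac)
    return {ssids: macs for ssids, macs in groups.items() if len(macs) > 1}

# Hypothetical captures: two random MACs carrying the same three-network list.
networks = frozenset({"HomeNet-5G", "AcmeCorp", "CityGym"})
probes = [
    ("da:a1:19:00:00:01", networks),
    ("7e:3c:55:10:20:30", networks),
    ("f0:18:98:aa:bb:cc", frozenset({"AirportFree"})),
]

linked = link_by_ssid_set(probes)
```

Here `linked` holds one group containing both randomized addresses, which is the sense in which the SSID list acts as a second identifier that survives MAC rotation.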

16.6 Mobile Sensing De-anonymization

Even “anonymized” datasets can be de-anonymized using behavioral patterns from motion sensors.

16.6.1 Data Sparsity Creates Unique Patterns

Mobile sensing data exhibits unique patterns that serve as fingerprints:

  • Activity correlations (gym after train ride)
  • Temporal patterns (coffee at 8am daily)
  • Location sequences (home then gym then work)

16.6.2 Auxiliary Information Attacks

Attackers can observe targets to collect samples:

  • Public social media check-ins
  • Physical observation
  • Social engineering

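Netflix Prize Lesson: Researchers showed that knowing just 8 of a subscriber's movie ratings, with rating dates accurate to within 14 days, was enough to uniquely identify 99% of records in the “anonymized” Netflix Prize dataset. Cross-referencing public IMDb reviews then linked some of those records to named individuals.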

Mobile Sensing is Worse: Broader range of activities and stronger correlations make de-anonymization easier.
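
The effect of auxiliary information can be illustrated with a toy matching routine. This is a sketch under stated assumptions: traces are reduced to sets of (place, hour) points, and the pseudonyms and data are invented. It shows how each extra observed point shrinks the set of candidate records.

```python
def matching_users(traces: dict[str, set[tuple[str, int]]],
                   known_points: set[tuple[str, int]]) -> list[str]:
    """Pseudonyms whose trace contains every (place, hour) point the attacker observed."""
    return sorted(uid for uid, points in traces.items() if known_points <= points)

# Hypothetical "anonymized" dataset keyed by pseudonym.
traces = {
    "user_a": {("home", 7), ("cafe", 8), ("office", 9), ("gym", 18)},
    "user_b": {("home", 7), ("cafe", 8), ("office", 9), ("bar", 20)},
    "user_c": {("home", 8), ("train", 9), ("office", 10), ("gym", 18)},
}

one_point = matching_users(traces, {("cafe", 8)})
two_points = matching_users(traces, {("cafe", 8), ("gym", 18)})

print(one_point)   # ['user_a', 'user_b']
print(two_points)  # ['user_a']
```

One observed coffee stop leaves two candidates; adding a single gym visit isolates one record, even though the dataset contains no names.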

16.7 Motion Sensor Side-Channel Attacks

Zero-Permission Sensors

Mobile apps can access motion sensors (accelerometer, gyroscope) without requesting runtime permissions, enabling side-channel attacks that infer user behavior and keystrokes.

16.7.1 Motion Sensor Attack Capabilities

Attack | Sensor | Accuracy | Implication
Keystroke inference | Accelerometer | 70-80% | PIN/password theft
Activity recognition | Accelerometer + Gyro | 90%+ | Behavior profiling
Indoor location | Accelerometer + Gyro | Room-level | Retail tracking
Speech detection | Accelerometer | Limited | Eavesdropping
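
To make the activity recognition row concrete, here is a toy sketch of one feature such systems might use: step frequency estimated from accelerometer magnitude. It is not a real classifier. The signal is synthetic, the function names are hypothetical, and real sensor data would need filtering before a crossing count like this is meaningful.

```python
import math

def step_frequency_hz(magnitudes: list[float], sample_rate_hz: float) -> float:
    """Estimate step frequency by counting upward crossings of the mean."""
    mean = sum(magnitudes) / len(magnitudes)
    crossings = sum(
        1 for prev, cur in zip(magnitudes, magnitudes[1:]) if prev < mean <= cur
    )
    duration_s = len(magnitudes) / sample_rate_hz
    return crossings / duration_s

def synthetic_walk(freq_hz: float = 1.8, sample_rate_hz: float = 50.0,
                   seconds: float = 10.0) -> list[float]:
    """Gravity (9.81 m/s^2) plus a sinusoidal bounce standing in for footsteps."""
    n = int(sample_rate_hz * seconds)
    return [
        9.81 + 2.0 * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz + 0.5)
        for i in range(n)
    ]

cadence = step_frequency_hz(synthetic_walk(), sample_rate_hz=50.0)
print(f"{cadence * 60:.0f} steps per minute")
```

Because this needs nothing beyond raw accelerometer samples, any app that can read the sensor can compute features like it in the background.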

16.9 Case Study: London Trash Bins (2013) and the Wi-Fi Tracking Debate

In 2013, a company called Renew installed Wi-Fi tracking devices in 12 recycling bins on the streets of London’s financial district. The bins carried digital advertising screens on the outside and a small Wi-Fi module inside that captured MAC addresses from passing smartphones.

What the bins collected: Over a single week, the 12 bins recorded 4.009 million device detections from an estimated 946,016 unique phones. The company mapped pedestrian flows, dwell times at shops, and commuter routes through the financial district.

The scale of the privacy breach:

Metric | Value
Unique phones tracked in 1 week | 946,016
Average detections per phone | 4.2 times per week
Repeat visitors identified | 68% (same phone seen at 2+ bins)
Commuter routes mapped | 12,400+ regular patterns
Data collected without consent | 100% of tracked individuals

Regulatory response: The City of London ordered the tracking stopped within days of public disclosure. The UK Information Commissioner’s Office investigated. The incident is often cited as an illustration of why the GDPR treats device identifiers that can single out an individual, such as MAC addresses, as personal data whose collection requires a lawful basis.

Technical lesson: The entire tracking infrastructure cost under $500 per bin. A passive Wi-Fi receiver on a Raspberry Pi Zero can capture probe requests within a 50-meter radius. The barrier to Wi-Fi surveillance is essentially zero, making protocol-level defenses (MAC randomization, probe request suppression) the only viable protection.

Current status (2024+): Despite MAC randomization in iOS 14+ and Android 10+, commercial Wi-Fi analytics companies (e.g., Cisco Meraki, Aruba Analytics, Purple Wi-Fi) still operate by analyzing connected clients, probe request patterns, and signal strength triangulation. GDPR requires signage and opt-out mechanisms, but compliance varies widely.

Common Mistake: Believing MAC Randomization Prevents All Tracking

The Mistake: Users and developers assume MAC address randomization fully prevents Wi-Fi tracking.

Why It Fails: As detailed in the MAC Randomization Limitations section above, timing patterns, SSID lists, association behavior, and BLE correlation all undermine randomization. Retail trackers combine these vectors for multi-factor fingerprinting that achieves 72% re-identification even with MAC randomization enabled.

Real Example: The London Renew bins (2013) tracked 946,016 unique phones in one week. Modern systems use timing analysis to re-identify devices even after MAC changes.

Correct Approach: Disable Wi-Fi when not actively using it and minimize configured networks. A VPN adds a layer of protection for traffic after you connect, but it does not hide probe requests or MAC addresses.

Scenario: A European shopping mall (40,000 visitors/day) deploys a Wi-Fi analytics system to measure foot traffic, dwell times, and store-to-store flows. The mall operator wants to understand which defense mechanisms actually protect visitor privacy and which are theater. You are hired to evaluate the real-world effectiveness of four proposed privacy measures.

Step 1: Baseline Tracking Capability

The mall has 120 Wi-Fi access points covering 25,000 square meters. Without any privacy measures, the system captures:

Raw data collected per visit (average 47 minutes):
  MAC address:          1 (persistent identifier)
  Probe requests:       23 (average, broadcasting SSIDs)
  Association events:    1.4 (connecting to mall Wi-Fi)
  Signal strength readings: 847 (triangulated positions)
  Configured SSIDs revealed: 8.3 (average per device)

Tracking accuracy: 3-5 meter indoor positioning
Re-visit identification: 94% (same MAC seen on return visits)
Social group detection: 78% (devices moving together)

Step 2: Evaluate Defense 1 – MAC Address Randomization (iOS 14+/Android 10+)

Test with 500 volunteer devices over 1 week:

Metric | Without Randomization | With Randomization | Relative Reduction
Unique MACs per visit | 1 | 1.3 (multiple random) | N/A
Cross-visit re-identification | 94% | 67% | 29%
Probe request SSID leakage | 8.3 SSIDs | 2.1 SSIDs (iOS suppresses) | 75%
Timing pattern fingerprinting | N/A | 87% re-identification | New vector
Combined re-identification | 94% | 72% | 23%

Finding: MAC randomization reduces naive tracking by 29% but timing patterns (probe request intervals, burst patterns) and SSID combinations still enable 72% re-identification. Randomization is necessary but insufficient.
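
The reduction figures in the table above are relative reductions, not percentage-point differences. A short check, using only the numbers from the table (the helper name is hypothetical):

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which a value fell, relative to its starting level."""
    return (before - after) / before * 100

cross_visit = relative_reduction(94, 67)   # cross-visit re-identification
ssid_leak = relative_reduction(8.3, 2.1)   # SSIDs leaked per device
combined = relative_reduction(94, 72)      # combined re-identification

print(round(cross_visit), round(ssid_leak), round(combined))  # 29 75 23
print(94 - 67, 94 - 72)  # 27 22 (the same changes in percentage points)
```

Read either way, roughly seven in ten devices remain re-identifiable after randomization in this scenario.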

Step 3: Evaluate Defense 2 – Hashed MAC Aggregation

The mall operator proposes hashing MAC addresses before storage: SHA-256(MAC + daily_salt).

Attack: With only 2^48 possible MAC addresses (48-bit address space), a full rainbow table is storage-impractical, but a targeted attack against common manufacturers takes under 1 second of GPU time. Even with daily salt rotation, an attacker who obtains one day’s salt can reverse all hashes for that day.

Calculation:

Rainbow table for 48-bit MAC space:
  Possible MACs: 2^48 = 281 trillion
  SHA-256 hash rate (RTX 4090): 22 billion/second
  Time to compute full table: 281T / 22B = 12,773 seconds = 3.5 hours
  Storage: 48 bytes per entry x 2^48 = ~13.5 PB (impractical for full table)

Practical attack (targeted):
  Common OUI prefixes (Apple, Samsung, Google): ~200 prefixes
  Reduces search space to 200 x 2^24 = 3.35 billion MACs
  Hash time: 3.35B / 22B = 0.15 seconds
  Storage: 3.35B x 48 bytes = 161 GB (fits on one SSD)

Finding: Hashing MACs provides minimal protection against targeted attacks. The limited address space makes reversal trivial for common device manufacturers.
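
The calculation above can be reproduced and varied with a short script. This is a sketch, not a benchmark: the hash rate and entry size are the figures quoted in the calculation, the OUI `00:11:22` and the salt string are placeholders, and the demonstration search is capped at 2^16 suffixes so that it runs quickly on a CPU.

```python
import hashlib

HASH_RATE = 22e9       # SHA-256 hashes per second (figure quoted in the text)
BYTES_PER_ENTRY = 48   # table entry size assumed in the text

def attack_cost(candidates: int, hash_rate: float = HASH_RATE,
                bytes_per_entry: int = BYTES_PER_ENTRY) -> tuple[float, int]:
    """Return (seconds to hash every candidate, bytes to store the table)."""
    return candidates / hash_rate, candidates * bytes_per_entry

def hash_mac(mac: str, salt: str) -> str:
    """SHA-256(MAC + daily_salt), the scheme proposed by the mall operator."""
    return hashlib.sha256((mac + salt).encode()).hexdigest()

def recover_mac(target_hash: str, salt: str, oui: str, suffix_space: int = 2**16):
    """Enumerate device suffixes under one OUI until the hash matches."""
    for n in range(suffix_space):
        suffix = f"{n:06x}"
        mac = f"{oui}:{suffix[0:2]}:{suffix[2:4]}:{suffix[4:6]}"
        if hash_mac(mac, salt) == target_hash:
            return mac
    return None

full_seconds, full_bytes = attack_cost(2**48)
targeted_seconds, targeted_bytes = attack_cost(200 * 2**24)
print(f"full space: {full_seconds / 3600:.1f} h, {full_bytes / 1e15:.1f} PB")
print(f"200 OUIs:   {targeted_seconds:.2f} s, {targeted_bytes / 1e9:.0f} GB")

salt = "example-daily-salt"  # placeholder: the attacker is assumed to know the salt
leaked = hash_mac("00:11:22:00:12:34", salt)
print(recover_mac(leaked, salt, oui="00:11:22"))  # 00:11:22:00:12:34
```

Using the exact value of 2^48 rather than the rounded 281 trillion gives a slightly longer full-space time than the hand calculation, but the conclusion is unchanged: adjusting the OUI count or hash rate shifts the targeted attack by fractions of a second, not by orders of magnitude that would matter to an attacker.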


Step 4: Evaluate Defense 3 – Differential Privacy Noise Injection

Add Laplace noise to aggregate counts before reporting:

True count at Store A (1 PM): 347 visitors
Epsilon (privacy budget): 1.0
Sensitivity: 1 (adding/removing one person changes count by 1)
Noise scale: 1/epsilon = 1.0

Noisy count = 347 + Laplace(0, 1.0) = 347 +/- 3.0 (95% interval; standard deviation ~1.4)
Reported: 346 visitors (useful for business, privacy-preserving)

True dwell time at Store A: 8.2 minutes (average)
Noisy dwell time = 8.2 +/- 0.3 minutes
Reported: 8.0 minutes (still useful)

Privacy guarantee: With epsilon=1.0, the probability of any reported output changes by at most a factor of e^1 ≈ 2.72 depending on whether a specific individual's data is included. An attacker observing the output therefore gains only bounded evidence about whether any one person was present. For aggregate foot traffic, this provides meaningful privacy while preserving business utility.
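
A minimal sketch of the Laplace mechanism for the visitor count follows. It uses only the Python standard library and samples Laplace noise as the difference of two exponential draws. The function names are hypothetical, and a production system would need a vetted differential privacy library rather than this illustration.

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw from Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count: int, epsilon: float, rng: random.Random,
                sensitivity: float = 1.0) -> float:
    """Add Laplace noise with scale = sensitivity / epsilon to a count."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)

def interval_half_width(scale: float, confidence: float = 0.95) -> float:
    """Half-width t with P(|noise| <= t) = confidence, from P(|X| > t) = exp(-t/scale)."""
    return scale * math.log(1.0 / (1.0 - confidence))

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
print(round(noisy_count(347, epsilon=1.0, rng=rng)))
print(round(interval_half_width(1.0), 2))  # 3.0
```

With a noise scale of 1.0, about 95% of reported counts fall within 3 of the true value and the standard deviation is about 1.4, so a count of 347 stays useful for business reporting.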

Step 5: Recommended Architecture

Layer | Mechanism | What It Protects | Residual Risk
Device | MAC randomization | Direct MAC tracking | Timing fingerprints
Collection | No raw MAC storage | Identity persistence | Session-level tracking
Aggregation | k-anonymity (k >= 20) | Individual trajectories | Small-group patterns
Reporting | Differential privacy (epsilon = 1.0) | Aggregate statistics | High-epsilon queries
Governance | 15-minute data TTL | Historical re-identification | Real-time window
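
Two of the layers above, the aggregation threshold and the retention limit, can be sketched as simple filters. This is a simplification: suppressing small cells is only one ingredient of k-anonymity, and the zone names, record layout, and function names are hypothetical.

```python
def suppress_small_cells(counts: dict[tuple[str, str], int], k: int = 20) -> dict:
    """Drop any (zone, time_window) cell observed for fewer than k devices."""
    return {cell: n for cell, n in counts.items() if n >= k}

def expire(records: list[tuple[float, str]], now: float,
           ttl_seconds: float = 15 * 60) -> list[tuple[float, str]]:
    """Keep only (timestamp, zone) records younger than the TTL."""
    return [(ts, zone) for ts, zone in records if now - ts < ttl_seconds]

# Hypothetical aggregate counts per (zone, time window).
counts = {
    ("food_court", "13:00"): 347,
    ("store_a", "13:00"): 42,
    ("service_corridor", "13:00"): 3,  # too few devices to publish safely
}
publishable = suppress_small_cells(counts)

# Hypothetical raw sightings: (seconds since start, zone).
sightings = [(0.0, "entrance"), (600.0, "store_a"), (1000.0, "food_court")]
retained = expire(sightings, now=1000.0)
```

The three-device corridor cell is dropped before reporting, and the sighting older than 15 minutes is discarded, which limits both small-group exposure and historical linkage.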

Result: The layered defense reduces re-identification from 94% (no defense) to less than 5% (all layers combined). The mall retains useful aggregate analytics (foot traffic within +/- 2%, dwell times within +/- 0.5 minutes) while achieving GDPR compliance through data minimization and purpose limitation.

Key lesson: No single privacy mechanism is sufficient for Wi-Fi tracking defense. MAC randomization fails against timing analysis, hashing fails against brute-force enumeration of the small MAC address space, and differential privacy alone cannot protect individual trajectories. Only layered defenses with strict data retention limits achieve meaningful privacy at scale.

16.10 Comprehensive Protection Framework

Effective mobile privacy protection requires multiple layers. The following decision framework summarizes available controls:

Risk Vector | Default Protection | Enhanced Protection | When to Use Enhanced
Wi-Fi MAC tracking | MAC randomization (iOS 14+/Android 10+) | Remove configured networks when not needed | Public spaces, retail stores
Probe request SSIDs | Limited SSID broadcast (iOS) | Use generic network names (“Home” not “John’s House”) | Always
Motion sensors | No permission required | Reduce sampling rate, user notification | Privacy-critical apps
BLE beacons | Opt-in only | Disable Bluetooth when not needed | Untrusted environments

Wi-Fi Defenses:

  1. Disable Wi-Fi when not actively using it
  2. Remove unused network configurations
  3. Use generic SSID names (avoid “JohnsHome”)
  4. Verify MAC randomization is enabled

Sensor Privacy:

  1. Review app sensor permissions
  2. Use browsers with motion sensor restrictions
  3. Monitor app background activity
  4. Prefer apps with transparent data practices

Behavioral Privacy:

  1. Vary daily routines when possible
  2. Limit public social media check-ins
  3. Be aware of patterns in aggregate data
  4. Review what third-party SDKs apps contain

16.10.1 Wi-Fi Signal Attenuation and Detection Range

In free space, received Wi-Fi signal strength follows the Friis transmission equation, which enables distance estimation from received signal strength (RSS).

\[P_r = P_t \times G_t \times G_r \times \left(\frac{\lambda}{4\pi d}\right)^2\]

where \(P_r\) is received power, \(P_t\) is transmitted power, \(G_t\) and \(G_r\) are antenna gains, \(\lambda\) is wavelength, and \(d\) is distance.

Transmit power, frequency, and distance together determine the range at which a Wi-Fi tracker can detect a device. Indoors, an additional 20–30 dB of attenuation (from walls, people, and multipath fading) is added to the free-space calculation.
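
The Friis equation with an added indoor loss term can be evaluated directly. This is a sketch with unity antenna gains assumed by default. The function name is hypothetical, and whether a frame is actually detected depends on the receiver's sensitivity, which is not specified here.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def received_power_dbm(tx_power_mw: float, freq_hz: float, distance_m: float,
                       indoor_loss_db: float = 0.0,
                       tx_gain: float = 1.0, rx_gain: float = 1.0) -> float:
    """Friis free-space received power in dBm, minus a flat indoor loss in dB."""
    wavelength = SPEED_OF_LIGHT / freq_hz
    rx_mw = tx_power_mw * tx_gain * rx_gain * (wavelength / (4 * math.pi * distance_m)) ** 2
    return 10 * math.log10(rx_mw) - indoor_loss_db

free_space = received_power_dbm(100, 2.4e9, 20)
indoors = received_power_dbm(100, 2.4e9, 20, indoor_loss_db=25)
print(f"{free_space:.1f} dBm free space, {indoors:.1f} dBm with 25 dB indoor loss")
```

Each doubling of distance costs about 6 dB in this model, so the indoor loss term usually dominates the difference between theoretical and practical tracking range.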

Key insight: At default settings (100 mW, 2.4 GHz, 25 dB indoor loss), beacons reliably detect smartphones within approximately 20m indoors. Shopping malls space beacons 15m apart to achieve >95% coverage. MAC randomization provides limited protection because probe request timing patterns remain trackable. The only effective defense is disabling Wi-Fi when not actively using it.

16.11 Summary

Wi-Fi and sensing create additional privacy attack vectors:

Wi-Fi Privacy Leaks:

  • MAC addresses enable cross-location tracking
  • Probe requests broadcast configured network lists
  • SSID combinations create unique fingerprints
  • Social relationships inferred from shared networks

MAC Randomization Limitations:

  • Timing patterns still fingerprint devices (87% accuracy)
  • SSIDs still broadcast during probing
  • Persistent per-network MAC used upon association
  • Bluetooth not synchronized

Sensing De-anonymization:

  • Motion patterns create behavioral fingerprints
  • Activity correlations unique to individuals
  • No permissions required for accelerometer/gyroscope
  • 70-80% keystroke inference accuracy

Key Takeaway: Even “anonymized” mobile data is highly identifiable. Privacy protection requires preventing data collection, not just anonymization.

Common Pitfalls

MAC randomization reduces Wi-Fi probe tracking but does not eliminate it. Timing patterns of probes, remembered SSID lists, and re-association patterns can still partially identify devices.

Wi-Fi probe capture from public spaces collects data from all Wi-Fi-enabled devices in range, not just those on your network. Deploying retail foot traffic analytics, occupancy sensing, or “anonymous” movement tracking without disclosure creates significant privacy compliance risk.

Accelerometer data patterns (gait signatures), touchscreen pressure signatures, and keystroke dynamics can identify individuals with high accuracy without any location data. Privacy engineering must consider behavioral fingerprinting from all sensor types, not just traditionally privacy-sensitive ones.

“Anonymous” analytics showing aggregate foot traffic patterns can still enable re-identification of individuals with distinctive patterns (unique commute times, unusual routes). Apply privacy threat modeling to aggregate analytics deployments to identify re-identification risks.

16.12 What’s Next

If you want to… | Read this
Review the complete mobile privacy series | Mobile Privacy Overview
Learn about location tracking risks specifically | Location Privacy
Study secure coding for IoT data protection | Secure Data and Software
Apply privacy-by-design patterns | Privacy by Design Patterns
Understand zero trust security architecture | Zero Trust Fundamentals