13  Mobile Privacy

Learning Objectives

After completing this chapter series, you will be able to:

  • Analyze mobile data collection practices and evaluate the effectiveness of Android/iOS permission models for privacy protection
  • Apply privacy leak detection techniques including data flow analysis, taint tracking, and static analysis
  • Assess location privacy risks including de-anonymization attacks and the limitations of k-anonymity for mobility data
  • Evaluate Wi-Fi and motion sensor privacy threats including probe request tracking and zero-permission sensing attacks

In 60 Seconds

Mobile IoT privacy spans four interconnected challenges: data collection from diverse sensors, location tracking through GPS and passive Wi-Fi sensing, privacy leak detection in apps, and behavioral fingerprinting from sensor fusion. Building privacy-respecting mobile IoT requires addressing all four dimensions with appropriate technical controls and user transparency.

Key Concepts

  • Mobile IoT Privacy: Privacy engineering discipline addressing the unique challenges of always-carried, multi-sensor devices that collect rich personal context continuously.
  • Sensor Diversity: Mobile IoT devices integrate GPS, accelerometer, Wi-Fi, Bluetooth, camera, microphone, NFC, and biometric sensors creating comprehensive personal data profiles.
  • Contextual Data: Rich behavioral context derived from mobile sensor fusion including activity recognition, location patterns, social proximity, and health indicators.
  • Privacy Threat Surface: Complete set of ways a mobile IoT system can expose personal data — through collection, transmission, storage, analysis, sharing, and inference.
  • Privacy-Utility Trade-off: Tension between data richness needed for mobile IoT functionality and privacy protection; privacy-preserving techniques reduce risk while maintaining acceptable utility.
  • User Control: Technical mechanisms giving users visibility into data collection and meaningful ability to restrict, review, and delete their personal data.
  • Platform Privacy Controls: OS-level privacy features (iOS privacy labels, Android permission model) providing baseline controls that IoT app developers must design within and beyond.

Privacy and compliance for IoT are about protecting people’s personal information and following the laws that govern data collection. Think of it like the rules a doctor follows to keep medical records confidential. IoT devices in homes, workplaces, and public spaces collect sensitive data about people’s lives, and there are strict requirements about how this data must be handled.

“Your smartphone is the most powerful sensor device you carry,” Sammy the Sensor said. “It has GPS, accelerometers, cameras, microphones, Wi-Fi scanners, Bluetooth, and cellular radios – all collecting data about you constantly!”

Max the Microcontroller added, “And in IoT systems, your phone often acts as the gateway between you and your smart devices. When you use an app to control your smart home, the phone is in the middle of every conversation. That makes mobile privacy absolutely critical.”

“The four big privacy risks on mobile are data collection, location tracking, Wi-Fi sensing, and permission leaks,” Lila the LED explained. “Apps often ask for way more permissions than they need. Why does a flashlight app need access to your contacts? Why does a weather app need your microphone? Each unnecessary permission is a potential privacy leak.”

“This chapter series covers all four areas,” Bella the Battery said. “You will learn how mobile data is collected and shared, how your location can be tracked through GPS and cell towers, how Wi-Fi signals reveal your movements, and how to detect when apps leak your private data. Knowledge is your first defense against mobile privacy threats!”

13.1 Overview

Mobile devices generate vast amounts of sensitive user data through sensors, location services, Wi-Fi connections, and cellular networks. Understanding how this data is collected, shared, and potentially leaked is crucial for protecting user privacy in IoT ecosystems where mobile phones often serve as gateways. This topic is covered in four focused chapters.

13.1.1 Mobile Data Collection and Permissions

Learn what data mobile devices collect and how Android/iOS permission models attempt to control access. Topics include:

  • Mobile sensor data types (location, accelerometer, microphone)
  • Android permission tiers (normal, dangerous, special)
  • Permission risk assessment and combination dangers
  • Why permission models fail to protect privacy

13.1.2 Privacy Leak Detection

Discover techniques for detecting unauthorized data exfiltration from mobile apps. Topics include:

  • Data Flow Analysis (DFA) from sources to sinks
  • Capability leaks through shared User IDs
  • TaintDroid dynamic taint tracking
  • Static analysis with LeakMiner
  • Comparing static vs dynamic analysis trade-offs

13.1.3 Location Privacy Leaks

Understand why location data is especially dangerous and how de-anonymization attacks work. Topics include:

  • What location traces reveal about individuals
  • De-anonymization using home and work inference
  • K-anonymity requirements (K >= 5,000 for mobility)
  • Why differential privacy fails for trajectories
  • Location privacy defenses

13.1.4 Wi-Fi and Sensing Privacy

Learn how Wi-Fi and motion sensors create additional tracking vectors. Topics include:

  • Wi-Fi probe request privacy leaks
  • MAC address randomization limitations
  • Mobile sensing de-anonymization
  • Zero-permission motion sensor attacks
  • Comprehensive protection frameworks

13.2 Key Takeaways

  • 4 spatiotemporal points uniquely identify 95% of individuals
  • Mobile sensing requires K >= 5,000 for anonymity (1,000x more than movie ratings)
  • 73% of apps send data to third-party tracking companies
  • Motion sensors require zero permissions but enable 70-80% keystroke inference
  • MAC randomization fails due to timing patterns and SSID leakage

13.3 Learning Path

[Figure: Learning path for the four chapters in the mobile privacy series — data collection and permissions → privacy leak detection → location privacy leaks → Wi-Fi and sensing privacy]

13.4 Prerequisites

Before starting this series, you should be familiar with the concepts from the Introduction to Privacy chapter.

Worked Example: Fitness App Privacy Audit

Scenario: A fitness tracker company develops an Android/iOS companion app for their wearable device. Before launch, conduct a comprehensive privacy audit covering all four areas: data collection, leak detection, location privacy, and Wi-Fi/sensing privacy.

Given:

  • App requests 8 permissions: Location (always), Camera, Microphone, Contacts, Storage, Phone State, Bluetooth, Motion sensors
  • Backend: Cloud analytics, third-party ad SDK, social features (find friends)
  • User base: 500,000+ downloads expected in first year
  • Regulatory: Must comply with GDPR (EU users), CCPA (California)

Step 1: Data Collection Audit (Permission Necessity)

Permission | Stated Purpose | Actual Usage Found | Necessity Verdict
Location (always) | “Track outdoor runs” | GPS sampled every 5 sec even when app backgrounded | OVER-COLLECTION: Change to “only while using”
Camera | “Scan food barcodes” | Used once per session, appropriate | NECESSARY
Microphone | Not disclosed | Analytics SDK records 3-sec audio clips for “engagement analysis” | PRIVACY LEAK: Remove or get explicit consent
Contacts | “Find friends” | Entire contact list hashed and uploaded to match with other users | OVER-SHARING: Should be opt-in, not auto-upload
Storage | “Save workout photos” | Legitimate | NECESSARY
Phone State (IMEI) | “Device identification” | IMEI sent to 4 analytics endpoints for cross-app tracking | PRIVACY VIOLATION: Replace with instance ID
Bluetooth | “Connect to fitness tracker” | Legitimate, scoped to pairing | NECESSARY
Motion sensors | “Step counting” | Legitimate, no permission required (design flaw) | NECESSARY (but zero-permission risk)

Findings:

  • 3 permissions are over-broad (Location, Microphone, Phone State)
  • 1 permission used without disclosure (Microphone for analytics)
  • 1 permission enables third-party tracking (Phone State → IMEI → cross-app profiles)

Step 2: Privacy Leak Detection (Dynamic Analysis)

Run TaintDroid-style analysis for 24 hours:

Data Source | Taint Label | Sink Detected | Consent? | Verdict
getLastLocation() | LOCATION | HttpURLConnection → ads.tracker.com | No | LEAK
getLastLocation() | LOCATION | HttpURLConnection → api.fitness.com | Yes (ToS) | OK
getDeviceId() (IMEI) | DEVICE_ID | HttpURLConnection → analytics.thirdparty.com | No | LEAK
getContacts() | CONTACTS | HttpURLConnection → social.fitness.com | Implicit (auto-upload) | LEAK (no explicit consent)
Camera frames | CAMERA | Local storage only | N/A | OK
Microphone samples | AUDIO | HttpURLConnection → engagement.analytics.com | No disclosure | LEAK

Results:

  • 4 confirmed privacy leaks (Location, IMEI, Contacts, Audio)
  • Combined taint: Location + IMEI creates persistent tracking profile
  • Third-party SDK (ad network) receives tainted data without user awareness
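The source-to-sink detection behind these findings can be illustrated with a toy taint-propagation sketch. This is hypothetical illustration code, not TaintDroid's VM-level implementation — real systems attach labels to values inside the Dalvik/ART interpreter rather than in application code.

```python
# Toy taint tracking: values read from privacy-sensitive "sources" carry
# labels; a leak is reported when a labeled value reaches a network "sink".
# Hypothetical sketch -- TaintDroid does this at the VM level, not in app code.

class Tainted:
    def __init__(self, value, labels):
        self.value = value
        self.labels = set(labels)  # e.g. {"LOCATION"}

    def __add__(self, other):
        # Taint propagates through any computation involving tainted data
        other_labels = other.labels if isinstance(other, Tainted) else set()
        other_value = other.value if isinstance(other, Tainted) else other
        return Tainted(self.value + other_value, self.labels | other_labels)

def get_last_location():  # source: returns tainted data
    return Tainted("37.77,-122.41", {"LOCATION"})

def http_post(url, payload):  # sink: checks for taint before transmission
    labels = payload.labels if isinstance(payload, Tainted) else set()
    if labels:
        return f"LEAK: {sorted(labels)} -> {url}"
    return "OK"

msg = Tainted("pos=", set()) + get_last_location()  # taint propagates into msg
print(http_post("https://ads.tracker.com", msg))
# -> LEAK: ['LOCATION'] -> https://ads.tracker.com
```

Because the label travels with the value through string concatenation, the leak is detected even though the location never reaches the sink in its original form.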

Step 3: Location Privacy Assessment

User “Alice” has 180 days of GPS traces (outdoor runs):

Home inference:
  Most frequent start location (6:00-9:00 AM): (37.7749, -122.4194) → Census block 060750201001
  Appears in 156/180 days → High confidence home location

Work inference:
  Most frequent midday location (12:00-13:00): (37.7849, -122.4094) → Census block 060750502003
  Appears in 142/180 weekdays → High confidence work location

De-anonymization risk:
  Cross-reference home census block with voter registration → 847 residents
  Cross-reference work census block with LinkedIn data → 2,100 employees in tech companies
  Intersection (lives in block A, works in block B): ~3-5 individuals
  Add 2-3 more spatiotemporal points (gym, grocery) → Unique identification (95% confidence)

Finding: Even “anonymized” user ID can be de-anonymized with location traces. K-anonymity requires k≥5,000 for mobility data, but current system has k≈3-5.
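The home/work inference heuristic above reduces to a frequency count over time-windowed GPS fixes. The sketch below uses a synthetic trace and an illustrative rounding granularity; the trace format `(hour, lat, lon)` is an assumption for demonstration.

```python
# Sketch of the home/work inference heuristic: bucket GPS fixes by rounded
# coordinates and take the most frequent location within a time window.
# Trace format and rounding precision are illustrative assumptions.
from collections import Counter

def infer_anchor_location(trace, hour_range, precision=3):
    """trace: list of (hour, lat, lon); returns the most frequent rounded fix
    observed within [hour_range[0], hour_range[1])."""
    lo, hi = hour_range
    counts = Counter(
        (round(lat, precision), round(lon, precision))
        for hour, lat, lon in trace
        if lo <= hour < hi
    )
    location, _ = counts.most_common(1)[0]
    return location

# Synthetic trace: morning fixes cluster at "home", midday fixes at "work"
trace = (
    [(7, 37.7749, -122.4194)] * 156     # morning fixes near home
    + [(12, 37.7849, -122.4094)] * 142  # midday fixes near work
    + [(18, 37.7649, -122.4294)] * 20   # evening gym visits
)
home = infer_anchor_location(trace, (6, 9))    # -> (37.775, -122.419)
work = infer_anchor_location(trace, (12, 13))  # -> (37.785, -122.409)
```

A few dozen lines of code and six months of traces are enough to recover the two census blocks that drive the de-anonymization attack.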

Step 4: Wi-Fi and Motion Sensor Privacy

Wi-Fi Probe Request Analysis:

Phone broadcasts SSIDs: "Home-WiFi-5G", "TechCorp-Guest", "Alice's iPhone", "Starbucks"
Privacy leakage:
  - "Home-WiFi-5G" → Likely contains address or name
  - "TechCorp-Guest" → Reveals employer
  - "Alice's iPhone" → Personal hotspot name (identity)
  - Combination of 4 SSIDs → Unique fingerprint (k=1)
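The fingerprinting claim above (k=1 for the four-SSID combination) can be made concrete: the anonymity set k is simply the number of observed devices probing for the same SSID combination. The observation log below is hypothetical.

```python
# A probed-SSID set as a fingerprint: k = number of devices in the observed
# population that broadcast the same SSID combination.
# The device population here is a hypothetical observation log.

def anonymity_set_size(target_ssids, observed_devices):
    """Count devices whose probed-SSID set exactly matches the target's."""
    target = frozenset(target_ssids)
    return sum(1 for ssids in observed_devices.values()
               if frozenset(ssids) == target)

observed_devices = {
    "device_a": {"Home-WiFi-5G", "TechCorp-Guest", "Alice's iPhone", "Starbucks"},
    "device_b": {"Starbucks", "TechCorp-Guest"},  # only common SSIDs
    "device_c": {"Starbucks"},
}

k = anonymity_set_size(
    {"Home-WiFi-5G", "TechCorp-Guest", "Alice's iPhone", "Starbucks"},
    observed_devices,
)
print(k)  # -> 1: the four-SSID combination is unique, so the device is trackable
```

Individually common SSIDs ("Starbucks") provide cover, but the *combination* of home, work, and personal-hotspot names collapses the anonymity set to one.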

Motion Sensor Side-Channel:

# Accelerometer accessible without permission in native apps (Android/iOS)
# Keystroke inference attack simulation

from sklearn.ensemble import RandomForestClassifier

def infer_keystrokes_from_motion(accel_data, training_data, labels):
    """Infer typed keys from phone vibrations during typing.

    accel_data: feature vectors extracted from the victim's accelerometer
    training_data, labels: samples the attacker collected on a similar device
    """
    # Train model on labeled accelerometer data (attacker collects samples)
    model = RandomForestClassifier()
    model.fit(training_data, labels)  # published attacks reach 70-80% accuracy on PINs

    # Infer keystrokes from victim's accelerometer data
    return model.predict(accel_data)

# Result: Motion sensors enable keystroke inference WITHOUT permission
# Mitigation: Browsers now require permission (iOS 12.2+), but native apps do not

Step 5: Comprehensive Remediation Plan

Issue | Fix | Implementation Time | Risk Reduction
Location: “always” → “while using” | Change permission request | 2 hours | Eliminates background tracking (17,280 daily samples → 0 when inactive)
Remove IMEI, use instance ID | Replace getDeviceId() with UUID.randomUUID() | 4 hours | Eliminates cross-app tracking
Add consent for contact upload | GDPR-compliant opt-in dialog | 3 days | Enables legal basis for social feature
Remove microphone audio collection | Delete analytics SDK audio module | 1 hour | Stops undisclosed audio recording
Add consent for ad location sharing | Set AdMob.setLocationEnabled(false) by default | 1 hour | Stops location sharing with ad network
Coarsen location to city-level for public leaderboards | Aggregate GPS to city-level before social features | 2 days | Reduces de-anonymization risk (k=1 → k≈50,000)
Implement MAC randomization verification | Test Wi-Fi probe behavior on user devices | 1 day | Verifies privacy controls work
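The "coarsen to city-level" remediation can be as simple as snapping coordinates to a coarse grid before any social feature sees them. The grid size below is an illustrative choice — 0.1° of latitude is roughly 11 km — not a prescribed standard.

```python
# Sketch of the "coarsen to city-level" fix: snap each GPS fix to the center
# of a coarse grid cell before it leaves the device.
# 0.1 degree (~11 km of latitude) is an illustrative grid size.

def coarsen(lat, lon, grid_deg=0.1):
    """Snap a GPS fix to a grid of cell size grid_deg degrees."""
    snap = lambda x: round(round(x / grid_deg) * grid_deg, 4)
    return snap(lat), snap(lon)

print(coarsen(37.7749, -122.4194))  # -> (37.8, -122.4)
# Every user inside the same ~11 km cell now reports an identical location,
# growing the anonymity set from k=1 toward city-scale k.
```

The trade-off is explicit: leaderboards still work at city granularity, but the precise traces that enabled home/work inference never leave the device.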

Step 6: Post-Remediation Verification

Privacy Metric | Before | After | Improvement
Permissions requested | 8 (all dangerous) | 5 (necessary only) | 37% reduction
Background location samples/day | 17,280 (every 5 sec) | 0 (only when active) | 100% elimination
Third-party data sharing | 4 SDKs receive PII | 1 SDK (with consent) | 75% reduction
De-anonymization risk (k-anonymity) | k≈3-5 | k≈50,000 (city-level) | 10,000× improvement
GDPR compliance violations | 4 (no legal basis) | 0 (consent-based) | Full compliance
User trust score (survey) | 32% “trust with privacy” | 78% “trust with privacy” | 144% increase

Result: The comprehensive audit identified 4 major privacy leaks, over-collection of 3 permissions, and inadequate consent mechanisms. Remediation reduced third-party data sharing by 75%, eliminated background tracking, and achieved full GDPR compliance—while maintaining all core app functionality.

Key Insight: Mobile privacy audits must cover all four areas (permissions, leaks, location, Wi-Fi/sensing) because privacy violations often span multiple attack surfaces. A permission audit alone would have missed the zero-permission motion sensor risk. A network traffic audit alone would have missed the MAC address fingerprinting. Comprehensive audits require both static code analysis AND dynamic runtime monitoring across all data flows.

Use this matrix to prioritize privacy risks and allocate remediation resources:

Data Type | Sensitivity | Re-identification Risk | Regulatory Impact | Recommended Action | Remediation Priority
GPS location (precise) | Critical | 95% with 4 points | GDPR Art. 5/6 (personal data requiring lawful basis) | Coarsen to city-level OR explicit opt-in + 30-day retention | P0 (immediate)
IMEI / Device ID | High | 100% persistent across apps | GDPR Art. 5 minimization violation | Replace with per-install UUID | P0 (immediate)
Contacts list | High | Social graph inference | GDPR Art. 5/6 + CCPA | Opt-in only, never auto-upload | P0 (immediate)
Microphone (undisclosed) | Critical | Conversation capture | Wiretap laws + GDPR | Remove OR explicit disclosure + consent | P0 (immediate)
Wi-Fi SSIDs (probe requests) | Medium | Unique fingerprint (k≈1) | GDPR Art. 5(1)(c) | Verify MAC randomization, suppress SSID broadcast | P1 (within 1 month)
Motion sensors (accelerometer) | Medium | 70% keystroke inference | No direct regulation (zero-permission) | Reduce sampling rate, add user notice | P2 (within 3 months)
Camera (disclosed use) | Low | User-initiated only | None (legitimate purpose) | No action needed | P3 (monitor)

Risk Calculation Formula:

Privacy Risk Score = (Sensitivity × Re-ID Risk × Regulatory Multiplier) / Consent Quality

Where:
  Sensitivity: 1 (low) to 10 (critical) - based on data type
  Re-ID Risk: 0.0 to 1.0 - probability of re-identification
  Regulatory Multiplier: 1.0 (no regulation) to 3.0 (strict regulation, e.g., GDPR + wiretap laws)
  Consent Quality: 0.2 (no consent) to 1.0 (explicit opt-in)

Example (GPS location with no consent):
  Risk = (10 × 0.95 × 3.0) / 0.2 = 142.5 (CRITICAL)

Example (Camera with user-initiated consent):
  Risk = (3 × 0.1 × 1.0) / 1.0 = 0.3 (LOW)
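The formula translates directly into code; the sketch below reproduces both worked examples. Function and parameter names are illustrative, not part of any standard API.

```python
# The privacy risk formula above, directly in code.
# Function and parameter names are illustrative.

def privacy_risk_score(sensitivity, reid_risk, regulatory_multiplier, consent_quality):
    """Risk = (Sensitivity x Re-ID Risk x Regulatory Multiplier) / Consent Quality."""
    return (sensitivity * reid_risk * regulatory_multiplier) / consent_quality

# GPS location with no consent (first example above)
print(round(privacy_risk_score(10, 0.95, 3.0, 0.2), 1))  # -> 142.5 (CRITICAL)

# Camera with user-initiated consent (second example above)
print(round(privacy_risk_score(3, 0.1, 1.0, 1.0), 1))    # -> 0.3 (LOW)
```

Dividing by consent quality means that obtaining genuine opt-in consent reduces the computed risk fivefold relative to no consent — a useful property for prioritizing remediation work.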


Decision Tree for Permission Requests:

START: Should I request this permission?

1. Is it required for core app functionality?
   NO  → Don't request it (privacy by design)
   YES → Continue

2. Can I achieve the same result with less sensitive data?
   YES → Use the less sensitive alternative
   NO  → Continue

3. Can I process the data on-device without cloud transmission?
   YES → Prefer edge processing (data minimization)
   NO  → Continue

4. Is this permission "dangerous" (requires runtime consent)?
   YES → Continue
   NO  → OK to use, but document in privacy policy

5. Have I disclosed the specific use case in privacy policy?
   NO  → Add disclosure before requesting
   YES → Continue

6. Will third-party SDKs access this permission?
   YES → Audit SDK privacy policy, consider removal
   NO  → OK to request

7. Can I use "while using" instead of "always" (for location)?
   YES → Use most restrictive option (privacy by default)
   NO  → Document why "always" is necessary

8. Request permission with clear explanation of benefit to user
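The decision tree can be encoded as a chain of early-return checks, which makes it easy to embed in a pre-release review script. This is a sketch — the answer keys are illustrative names, not a standard checklist format.

```python
# The permission decision tree above as a chain of early-return checks.
# The answers dict and its keys are illustrative, not a standard API.

def permission_decision(answers):
    """answers: dict of booleans keyed by the questions in the tree above."""
    if not answers["core_functionality"]:
        return "Don't request (privacy by design)"
    if answers["less_sensitive_alternative"]:
        return "Use the less sensitive alternative"
    if answers["on_device_processing"]:
        return "Prefer edge processing (data minimization)"
    if not answers["disclosed_in_privacy_policy"]:
        return "Add disclosure before requesting"
    if answers["third_party_sdk_access"]:
        return "Audit SDK privacy policy first"
    return "Request with clear explanation of user benefit"

verdict = permission_decision({
    "core_functionality": True,
    "less_sensitive_alternative": False,
    "on_device_processing": False,
    "disclosed_in_privacy_policy": True,
    "third_party_sdk_access": False,
})
print(verdict)  # -> Request with clear explanation of user benefit
```

Each early return corresponds to a branch in the tree, so a permission request only survives to the final line after passing every privacy-by-design gate.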

Third-Party SDK Audit Checklist:

Before integrating any SDK, verify its actual data collection behavior rather than relying on vendor claims (see the verification techniques later in this section).

Common High-Risk SDK Behaviors:

SDK Type | Common Privacy Risk | Detection Method | Mitigation
Ad Networks | Location tracking without user awareness | Network traffic capture → lat/lon in POST body | Disable location sharing: SDK.setLocationEnabled(false)
Analytics | IMEI collection for cross-app tracking | Static analysis → getDeviceId() calls | Replace with SDK-provided anonymous ID
Social | Contact list auto-upload | Network traffic → contacts JSON in requests | Make feature opt-in, not default
Crash Reporting | PII in stack traces | Review crash logs for email/name/phone | Scrub PII before transmission

Key Insight: Mobile privacy requires continuous monitoring because third-party SDKs update independently and can introduce new privacy leaks. Establish a quarterly SDK privacy audit process, not just one-time pre-launch review.

Common Mistake: Trusting “Privacy-Friendly” SDK Claims Without Verification

The Mistake: Developers integrate third-party SDKs (analytics, ads, crash reporting) based on vendor claims of “privacy-first” or “GDPR-compliant” without independent verification.

Why It Fails:

  • SDKs can collect data beyond documented APIs (background location, contact lists, IMEI)
  • “GDPR-compliant” often means “we have a privacy policy” not “we minimize data collection”
  • SDK updates can introduce new data collection without app developer awareness
  • Many SDKs share data with dozens of fourth-party partners (data brokers, ad exchanges)

Real-World Example:

  • Facebook apps (2018): Collected call logs and SMS metadata on Android without clear disclosure to users
  • Sensor Tower SDK (2020): Embedded tracking code in VPN apps, violating user expectations
  • X-Mode SDK (2021): Location data broker SDK embedded in prayer and weather apps, sold location data to military contractors

Consequences:

  • Legal: App developer is the data controller under GDPR, liable for SDK violations
  • Regulatory: Multi-million euro fines possible for app developers due to SDK privacy violations (even if SDK vendor at fault)
  • Reputational: “Your app sends data to 37 third parties” headlines damage user trust

How to Verify SDK Privacy:

  1. Static Analysis: Decompile SDK and search for sensitive API calls

    # Extract SDK and scan for privacy-sensitive API calls
    unzip analytics-sdk.aar
    grep -r "getLastLocation\|getDeviceId\|getLine1Number" .
  2. Network Traffic Analysis: Run app with MITM proxy (mitmproxy)

    # Set mitmproxy as the test device's Wi-Fi proxy, then capture all
    # network requests originating from the SDK
    mitmproxy
    # Look for: lat/lon, IMEI, contact hashes in POST bodies
  3. Exodus Privacy Scan: Use automated SDK analysis tool

    # Scan APK for known tracker SDKs
    https://reports.exodus-privacy.eu.org/
    # Output: "37 trackers detected: Google Analytics, Facebook, Mixpanel..."
  4. Differential Analysis: Compare app behavior with/without SDK

    # Measure permission requests and network traffic
    # WITHOUT SDK: 5 permissions, 20 HTTP requests/hour
    # WITH SDK: 8 permissions, 95 HTTP requests/hour (75 from SDK!)

Correct Approach:

  • Integrate only essential SDKs (question whether analytics is truly necessary)
  • Pin SDK versions (don’t auto-update without privacy re-audit)
  • Use SDK privacy features (setLocationEnabled(false), anonymous IDs)
  • Conduct quarterly audits (SDK updates introduce new risks)
  • Provide SDK opt-out (let users disable analytics/ads)
  • Disclose all SDKs in privacy policy with links to their privacy policies

Key Insight: You cannot delegate privacy responsibility to SDK vendors. Under GDPR, the app developer is the data controller and is liable for all data collection, including by third-party code. Treat SDK integration as security-critical and privacy-critical, not just a “few lines of initialization code.”

13.5 Knowledge Check

Location entropy quantifies the unpredictability of a user’s location history.

Location Entropy: \[H(\text{Location}) = -\sum_{i=1}^{n} p_i \log_2 p_i\]

where \(p_i\) is the probability of being at location \(i\).

Re-Identification Theorem (de Montjoye et al., Nature 2013): With location data, the probability of unique identification follows:

\[P(\text{unique} \mid k \text{ points}) \approx 1 - e^{-\lambda k}\]

where \(\lambda\) depends on dataset resolution and population density.

Working through an example:

Given: Smartphone location traces with 4 spatiotemporal points

Step 1: Calculate location entropy for one user

User Alice visits during one week:

  • Home (37.7749, -122.4194): 50% of time (\(p_1 = 0.5\))
  • Work (37.7849, -122.4094): 35% of time (\(p_2 = 0.35\))
  • Gym (37.7649, -122.4294): 10% of time (\(p_3 = 0.1\))
  • Coffee shop (37.7549, -122.4394): 5% of time (\(p_4 = 0.05\))

\[H = -[0.5 \log_2(0.5) + 0.35 \log_2(0.35) + 0.1 \log_2(0.1) + 0.05 \log_2(0.05)]\]
\[H = -[0.5(-1) + 0.35(-1.51) + 0.1(-3.32) + 0.05(-4.32)]\]
\[H = 0.5 + 0.53 + 0.33 + 0.22 = 1.58 \text{ bits}\]
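The entropy arithmetic can be checked in a few lines of code:

```python
# Verifying the location entropy calculation above.
import math

def location_entropy(probabilities):
    """H = -sum(p_i * log2(p_i)), in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

H = location_entropy([0.5, 0.35, 0.1, 0.05])
print(round(H, 2))  # -> 1.58 bits for Alice's weekly location distribution
```

For comparison, a uniform distribution over the same four locations would give 2 bits — Alice's habits cost her 0.42 bits of unpredictability.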

Step 2: Calculate uniqueness probability

For the 1.5M-user dataset (de Montjoye study):

  • \(\lambda \approx 0.35\) (empirical parameter)
  • \(k = 4\) spatiotemporal points

\[P(\text{unique}) = 1 - e^{-0.35 \times 4} = 1 - e^{-1.4} = 1 - 0.247 = 0.753\]

Result: With just 4 location points, 75% probability of unique identification. Study found actual rate was 95% in practice.

Step 3: Mobile app collection rate

Typical fitness app:

  • GPS sampling: every 5 seconds during exercise
  • 30-minute run: \(\frac{30 \times 60}{5} = 360 \text{ points}\)
  • Points needed for 95% uniqueness: 4
  • Excess factor: \(\frac{360}{4} = 90\times\) more data than needed for re-identification
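Steps 2 and 3 above reduce to a one-line model plus some arithmetic:

```python
# Steps 2-3 above in code: uniqueness probability and the over-collection factor.
import math

def p_unique(lam, k):
    """P(unique | k points) = 1 - e^(-lambda * k)."""
    return 1 - math.exp(-lam * k)

print(round(p_unique(0.35, 4), 3))  # -> 0.753: ~75% unique with just 4 points

# A 30-minute run sampled every 5 seconds:
points = 30 * 60 // 5
print(points, points // 4)  # -> 360 points collected, 90x more than needed
```

The model underestimates the empirical 95% rate, but even the conservative figure shows that a single workout yields two orders of magnitude more points than re-identification requires.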

Step 4: K-anonymity requirement

To achieve k = 5,000 anonymity for location data (Golle & Partridge, 2009):

  • Spatial resolution: coarsen to city level (10+ km radius)
  • Temporal resolution: daily aggregates only (no hourly)
  • Retention: maximum 7 days


In practice: Mobile location entropy is extremely low (1-3 bits) because humans are creatures of habit. This makes re-identification trivial even with “anonymized” data. IoT apps collecting GPS every few seconds generate 90x more location data than needed for 95% re-identification, making user tracking inevitable without aggressive privacy controls.

Common Pitfalls

Mobile privacy actually involves distinct but interconnected problems: collection privacy (what is gathered), transmission privacy (how it travels), storage privacy (where it persists), and inference privacy (what can be derived). Privacy controls adequate for one dimension may be completely inadequate for another.

iOS and Android provide privacy controls (permissions, privacy labels, tracking transparency) that represent a floor, not a ceiling. Privacy-respecting mobile IoT apps need additional application-level controls beyond what the OS provides, including granular consent, data minimization in design, and user-accessible data deletion.

Privacy behavior changes between OS versions as platforms add new restrictions and controls. Permission behavior, background collection limits, and MAC randomization work differently across iOS and Android versions. Test privacy controls across the target OS version range in your user base.

Privacy problems embedded in feature designs are expensive to fix after development. “Show users nearby friends” features designed without considering location data privacy create privacy violations that are hard to remediate without removing the feature. Include privacy review in feature design, not just code review.

13.6 What’s Next

After completing the Mobile Privacy series, continue to Secure Data and Software to learn how to prevent web application vulnerabilities, implement secure coding practices, and protect IoT protocols.
