Scenario: A smart thermostat collects temperature readings every 15 minutes. Analyze what can be inferred from 6 months of “innocuous” temperature data.
Data Collected:
Format: timestamp, indoor_temp, outdoor_temp, heating_on
Example: 2024-01-15 08:00, 18.5C, 2C, true
Total data points: 6 months x 30 days x 96 readings/day = 17,280 readings
Data volume: 17,280 x 20 bytes = 345,600 bytes ~ 338 KiB (tiny!)
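The arithmetic above can be checked with a few lines of Python. This is only a sanity check of the scenario's own figures; the 20 bytes per reading is the value used in the text, not a measured size.

```python
# Back-of-the-envelope dataset size, using the figures from the scenario.
MONTHS = 6
DAYS_PER_MONTH = 30          # simplification used in the text
READINGS_PER_DAY = 24 * 4    # one reading every 15 minutes
BYTES_PER_READING = 20       # per-reading size taken from the text

readings = MONTHS * DAYS_PER_MONTH * READINGS_PER_DAY
size_bytes = readings * BYTES_PER_READING

print(readings)                     # total number of readings
print(size_bytes)                   # total size in bytes
print(round(size_bytes / 1024, 1))  # total size in KiB
```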
What Can Be Inferred:
1. Occupancy Patterns (from temperature fluctuations)
Analysis:
- Temperature drops 3C at night -> bedtime ~11 PM
- Temperature rises sharply at 6:30 AM -> wake-up time
- No heating 8 AM - 6 PM weekdays -> house empty during work
- Weekend pattern different -> home on weekends
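A minimal sketch of how little code this kind of inference needs, assuming readings arrive as (timestamp, heating_on) pairs. The function names, the 10% threshold, and the synthetic data are all hypothetical and chosen for illustration. Treating "heating off" as "nobody home" is itself an assumption that only holds in cold weather.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def heating_rate_by_hour(readings):
    """Fraction of weekday readings with heating on, per hour of day."""
    on = defaultdict(int)
    total = defaultdict(int)
    for ts, heating_on in readings:
        if ts.weekday() >= 5:          # skip Saturday (5) and Sunday (6)
            continue
        total[ts.hour] += 1
        on[ts.hour] += int(heating_on)
    return {hour: on[hour] / total[hour] for hour in sorted(total)}

def likely_empty_hours(readings, threshold=0.1):
    """Weekday hours in which heating is on in at most `threshold` of readings."""
    rates = heating_rate_by_hour(readings)
    return [hour for hour, rate in rates.items() if rate <= threshold]

# Synthetic data (made up): two weeks of 15-minute readings,
# heating off between 08:00 and 18:00 on weekdays only.
start = datetime(2024, 1, 15)          # a Monday
synthetic = []
for i in range(14 * 96):
    ts = start + timedelta(minutes=15 * i)
    heating_on = not (ts.weekday() < 5 and 8 <= ts.hour < 18)
    synthetic.append((ts, heating_on))

print(likely_empty_hours(synthetic))
```

On this synthetic input the function returns the hours 8 through 17, which is the "house empty during work" window described above.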
Privacy Impact:
- Burglars know when house is empty (8 AM - 6 PM weekdays)
- Stalkers know daily routine
- Insurance companies detect "risky" empty periods
2. Health Conditions (from unusual patterns)
Analysis:
- Sudden 24/7 home presence (week of Jan 20-27) -> illness or vacation spent at home
- Temperature raised to 22C continuously -> elderly or medical condition
- Erratic heating patterns at night -> insomnia or shift work
Privacy Impact:
- Employers could check whether reported sick days match actual time at home
- Insurance companies infer pre-existing conditions
- Landlords detect unauthorized occupants
3. Socioeconomic Status (from heating behavior)
Analysis:
- Heating set to 16C (vs typical 20C) -> low income, saving money
- Temperature never below 21C -> wealthy, not price-sensitive
- Heating off during peak pricing hours -> sophisticated, cost-conscious
Privacy Impact:
- Utility companies offer different rates based on inferred income
- Marketers target ads (luxury vs budget products)
- Potential discrimination in services
De-anonymization Risk:
Combine temperature data with:
+ Public property records -> know exact address
+ Social media posts -> "on vacation Jan 20-27" confirms identity
+ Utility company records -> link to specific meter
Result: "Anonymous" thermostat ID linked to real person with full behavioral profile
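The linkage step can be sketched as a simple date-range match. This is a toy illustration under stated assumptions: the candidate names and dates are invented, and the helper functions are hypothetical. A real attack would combine several weak signals like this one rather than rely on a single match.

```python
from datetime import date

def overlap_days(a_start, a_end, b_start, b_end):
    """Number of days shared by two inclusive date ranges."""
    latest_start = max(a_start, b_start)
    earliest_end = min(a_end, b_end)
    return max(0, (earliest_end - latest_start).days + 1)

def best_match(anomaly, candidates):
    """Return the candidate whose public date range overlaps the anomaly most."""
    scores = {
        name: overlap_days(anomaly[0], anomaly[1], start, end)
        for name, (start, end) in candidates.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]

# Unusual week seen in the "anonymous" thermostat data.
anomaly = (date(2024, 1, 20), date(2024, 1, 27))

# Invented public posts mentioning date ranges.
candidates = {
    "person_a": (date(2024, 1, 20), date(2024, 1, 27)),
    "person_b": (date(2024, 1, 25), date(2024, 2, 2)),
    "person_c": (date(2024, 3, 1), date(2024, 3, 8)),
}

print(best_match(anomaly, candidates))
```

With these invented inputs, `person_a` matches all 8 days of the anomaly window, while the other two candidates overlap for 3 and 0 days.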
Mitigation Strategies:
| Strategy | Privacy Benefit | Utility Cost | Recommendation |
|---|---|---|---|
| Aggregate to hourly | Medium (reduces granularity) | None | DO THIS |
| Add noise (+/- 2C) | High (masks real values) | Medium (reduces accuracy) | For shared data only |
| Delete after 7 days | Very High (limits exposure) | High (no long-term insights) | For high-sensitivity |
| Edge processing only | Very High (data never leaves home) | None | BEST PRACTICE |
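The first three strategies can be sketched in a few lines, assuming readings are (timestamp, indoor_temp) pairs. All function names and sample values here are hypothetical. Note that the bounded uniform noise mirrors the "+/- 2C" row only; it is not a formal differential-privacy mechanism, and averaging many noisy readings would still recover the underlying pattern.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta

def aggregate_hourly(readings):
    """Replace 15-minute readings with one average temperature per hour."""
    buckets = defaultdict(list)
    for ts, temp_c in readings:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(temp_c)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

def add_noise(hourly, max_noise_c=2.0, rng=None):
    """Add uniform noise in [-max_noise_c, +max_noise_c] to each value."""
    rng = rng or random.Random()
    return {hour: temp_c + rng.uniform(-max_noise_c, max_noise_c)
            for hour, temp_c in hourly.items()}

def purge_old(readings, now, keep_days=7):
    """Keep only readings from the last `keep_days` days."""
    cutoff = now - timedelta(days=keep_days)
    return [(ts, temp_c) for ts, temp_c in readings if ts >= cutoff]

# Made-up sample readings.
sample = [
    (datetime(2024, 1, 15, 8, 0), 18.0),
    (datetime(2024, 1, 15, 8, 15), 18.5),
    (datetime(2024, 1, 15, 8, 30), 19.0),
    (datetime(2024, 1, 15, 8, 45), 19.5),
    (datetime(2024, 1, 15, 9, 0), 20.0),
]

hourly = aggregate_hourly(sample)                 # two hourly averages
noisy = add_noise(hourly, rng=random.Random(0))   # seeded for repeatability
recent = purge_old(sample, now=datetime(2024, 1, 22, 8, 30))

print(hourly)
print(noisy)
print(len(recent))
```

Edge processing, the fourth strategy, is an architectural choice rather than a transformation: the same functions would run on the device, and only the derived output (if anything) would leave the home.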
Key Insight: Even “boring” temperature data reveals intimate behavioral patterns when collected over time. This is the fundamental challenge of IoT privacy: always-on sensors record far more about daily life than their stated purpose suggests.