17  Mobile Privacy Leak Detection

17.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Implement Data Flow Analysis: Track information flow from sources to sinks to detect unauthorized data exfiltration
  • Diagnose Capability Leaks: Analyze how malicious apps exploit permission sharing vulnerabilities
  • Apply TaintDroid Concepts: Explain real-time taint tracking for dynamic privacy leak detection
  • Compare Analysis Approaches: Distinguish between static and dynamic analysis trade-offs

In 60 Seconds

Privacy leak detection identifies when mobile IoT apps transmit personal data to unexpected third parties, log sensitive information, or expose data through insecure channels. Detecting leaks requires network traffic analysis, static code analysis, and dynamic runtime monitoring to catch inadvertent data exposures before or after deployment.

Key Concepts

  • Privacy Leak: Inadvertent transmission of personal data to unauthorized parties through insecure channels, over-broad data sharing, or implementation errors.
  • Network Traffic Analysis: Technique for detecting privacy leaks by capturing and analyzing outbound network traffic from mobile IoT applications to identify unexpected data recipients.
  • Static Code Analysis: Examination of application source code or compiled binaries to identify privacy-risky patterns like hardcoded device IDs, insecure logging, or broad permission requests.
  • Dynamic Analysis: Runtime monitoring of application behavior to detect privacy leaks that only manifest during execution and are invisible to static code inspection.
  • Third-Party SDK Leakage: Privacy risk from advertising, analytics, and social media SDKs embedded in IoT apps that collect and transmit user data independently of the primary app.
  • Log Privacy: Risk of sensitive data written to application logs being accessible to other apps, crash reporting systems, or cloud log aggregation services.
  • TLS Traffic Inspection: Technique using SSL/TLS interception proxies (mitmproxy, Charles Proxy) to decrypt and analyze HTTPS traffic for privacy leak detection.

Privacy and compliance for IoT are about protecting people’s personal information and following the laws that govern data collection. Think of it like the rules a doctor follows to keep medical records confidential. IoT devices in homes, workplaces, and public spaces collect sensitive data about people’s lives, and there are strict requirements about how this data must be handled.

“Privacy leaks are like leaky water pipes,” Sammy the Sensor explained. “Data flows out of your device through tiny cracks you cannot see. Leak detection tools help you find and fix those cracks!”

Max the Microcontroller described the methods. “Static analysis looks at an app’s code WITHOUT running it – like reading a blueprint to find design flaws. Dynamic analysis runs the app and watches what data it actually sends – like turning on the faucet and checking for drips. Both are needed because some leaks only happen when the app is running.”

“Taint analysis is particularly clever,” Lila the LED said. “It ‘colors’ sensitive data – like your phone number or location – with an invisible marker. Then it tracks where that colored data flows. If your phone number ends up in a network packet heading to an advertising server, the taint tracker catches it red-handed!”

“Network traffic analysis watches everything going in and out of your phone,” Bella the Battery added. “Tools like packet sniffers capture all the data your apps send. You would be surprised how much personal information flows to servers you have never heard of. Regular leak checks should be part of every IoT app’s testing process!”

17.2 Prerequisites

Before diving into this chapter, explore these companion resources:

  • Simulations Hub: Try the TaintDroid simulation to see real-time taint tracking in action. Experiment with different data flows from sources (GPS, contacts) to sinks (network, SMS) and observe how taint labels propagate through variables and method calls.

  • Videos Hub: Watch curated videos on Android permission models, data flow analysis techniques, and real-world privacy breach case studies. See demonstrations of static analysis tools (FlowDroid, LeakMiner) and dynamic analysis with TaintDroid.

17.3 Introduction

Even when users grant app permissions, they expect their data to be used for the app’s stated purpose. Privacy leak detection identifies when apps violate this trust by sending sensitive data to unauthorized destinations. This chapter covers the techniques used to detect these violations.

17.4 Data Flow Analysis (DFA)

Objective: Monitor app behavior to detect when privacy-sensitive information leaves the device without user consent.

Methodology:

  • Identify sources (sensors, contacts, location services)
  • Identify sinks (network, SMS, external storage)
  • Trace data flow paths from sources to sinks
  • Flag paths without explicit user consent as privacy leaks

How It Works: Data Flow Analysis Step-by-Step

Data Flow Analysis identifies privacy leaks by tracing how sensitive information moves through an app from collection to transmission. Here’s the complete process:

Step 1: Source Identification The analyzer scans the app’s code for privacy-sensitive APIs: getLastLocation() (GPS), getContacts() (contacts list), getDeviceId() (IMEI), getExternalStorage() (files). Each source is tagged with a sensitivity label (LOCATION, CONTACTS, DEVICE_ID).

Step 2: Sink Identification Next, the analyzer finds data exit points: HttpURLConnection.connect() (network transmission), sendTextMessage() (SMS), FileOutputStream to external storage, ContentProvider.insert() to shared databases.

Step 3: Path Tracing The analyzer builds a control flow graph showing all possible execution paths from each source to each sink. For example: getLastLocation() → loc variable → JSONObject.put("location", loc) → httpPost.setEntity(json) → httpClient.execute(httpPost).

Step 4: Consent Verification For each source-to-sink path, check if user consent was obtained. If the path getContacts() → httpClient.execute() exists but no consent dialog appears in the code flow, the analyzer flags it as a privacy leak.

Step 5: Reporting Generate a report listing all detected leaks with: source API, sink destination, data type, consent status, and code line numbers for manual verification.

Why This Matters: Apps can request permissions legitimately (e.g., location access to show local weather) but then secretly exfiltrate that data to advertising networks. DFA catches these violations by analyzing actual data flows, not just permission requests.
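The five steps above can be sketched as a toy path scanner. The API names and the flat call-path representation are deliberate simplifications; real tools such as FlowDroid work on Dalvik bytecode and full control flow graphs:

```python
# Toy source-to-sink scanner (illustrative API names, not real analysis).

SOURCES = {"getLastLocation": "LOCATION", "getContacts": "CONTACTS",
           "getDeviceId": "DEVICE_ID"}
SINKS = {"HttpURLConnection.connect", "sendTextMessage"}
CONSENT_CALLS = {"showConsentDialog"}

def find_leaks(call_path):
    """Flag a source-to-sink path as a leak when no consent call
    appears between the source and the sink (Steps 1-4 above)."""
    leaks = []
    active = []  # [label, consent_seen_since_source]
    for call in call_path:
        if call in SOURCES:
            active.append([SOURCES[call], False])
        elif call in CONSENT_CALLS:
            for entry in active:        # consent covers pending sources
                entry[1] = True
        elif call in SINKS:
            for label, consented in active:
                if not consented:
                    leaks.append((label, call))
    return leaks

# Contacts reach the network with no consent dialog in the path: flagged.
print(find_leaks(["getContacts", "buildJson", "HttpURLConnection.connect"]))
# -> [('CONTACTS', 'HttpURLConnection.connect')]
# Consent precedes the sink: clean.
print(find_leaks(["getLastLocation", "showConsentDialog",
                  "HttpURLConnection.connect"]))  # -> []
```

A real analyzer must also handle aliasing, inter-procedural flows, and implicit flows; this sketch only captures the source/sink/consent bookkeeping.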

Static analysis workflow showing APK decompilation to bytecode, control flow graph construction, identification of privacy sources and sinks, path analysis between them, and detection of potential privacy leaks without code execution
Figure 17.1: Static analysis for privacy
Layered diagram showing data flow analysis for privacy: sources (contacts, location, photos) flow through processing (analytics SDKs), reach sinks (network transmission, storage), with taint tracking revealing privacy leaks
Figure 17.2: Data Flow Analysis: Privacy Sources, Processing, and Sensitive Sinks

Privacy Leak: Any path from source to sink without user consent.

Privacy leak case study showing location data collected from getLastLocation API being transmitted to advertising network without user consent, demonstrating unauthorized data flow from sensitive source to third-party sink
Figure 17.3: Privacy leak example 1
Privacy leak case study demonstrating contacts list access via getContacts API being exfiltrated through HTTP POST to analytics server, revealing social graph information without explicit user permission
Figure 17.4: Privacy leak example 2

This view maps the complete attack surface for mobile IoT companion apps, showing how attackers can extract user data:

Mobile privacy attack surface diagram showing four threat layers: network traffic interception, unencrypted storage databases, inter-process communication vulnerabilities, and unauthorized sensor access
Figure 17.5: Mobile Privacy Attack Surface

Defense Priority: Focus on app-level defenses first (permission minimization, secure SDK selection), then network security (certificate pinning, encrypted protocols), then on-device protections (secure storage, anti-tampering).

17.5 Capability Leaks

Explicit Capability Leak: Malicious app hijacks permissions granted to other trusted apps.

Mechanism:

  • Android allows apps signed with the same certificate to declare the same sharedUserId
  • Apps with the same User ID share permissions and process space
  • Malicious app can exploit trusted app’s permissions

Note: sharedUserId was deprecated in Android 10 (API level 29) but remains relevant for legacy apps and educational understanding of capability leak attacks.

Example:

  1. Trusted app has location permission
  2. Malicious app from same developer shares User ID
  3. Malicious app gains location access without requesting permission
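A minimal sketch of this mechanism, assuming a toy permission store keyed by user ID (app and function names are illustrative, not Android APIs):

```python
# Toy model of the capability leak in steps 1-3 above: permissions are
# granted per *user ID*, so apps sharing a sharedUserId share every grant.

granted = {}  # uid -> set of granted permissions

def install(app_name, uid, requested_permissions):
    """Install an app under a UID; grants accumulate per UID, not per app."""
    granted.setdefault(uid, set()).update(requested_permissions)
    return uid

def check_permission(uid, permission):
    """Permission checks see only the UID, never the individual app."""
    return permission in granted.get(uid, set())

trusted = install("TrustedWeatherApp", 10042, {"ACCESS_FINE_LOCATION"})
# The malicious app declares the same sharedUserId and requests nothing:
malicious = install("MaliciousApp", 10042, set())

# It passes the location check without ever asking the user:
print(check_permission(malicious, "ACCESS_FINE_LOCATION"))  # True
```

The leak is structural: because the check is keyed on the shared UID, the OS cannot distinguish which of the two apps is actually requesting the data.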

17.6 Dynamic Analysis: TaintDroid

TaintDroid is a modified Android OS that tracks sensitive data flows in real-time.

17.6.1 How TaintDroid Works

TaintDroid privacy leak detection flow: sensitive data is tagged at sources, tracked through app processing, generates alerts when data reaches network sinks, enabling real-time taint analysis
Figure 17.6: TaintDroid Real-Time Taint Tracking and Privacy Leak Detection Flow

Key Features:

  1. Automatic Labeling: Data from privacy sources automatically tagged
  2. Transitive Tainting: Labels propagate through variables, files, IPC
  3. Multi-level Granularity:
    • Variable-level taint tracking
    • Message-level taint tracking
    • Method-level taint tracking
    • File-level taint tracking
  4. Logging: When tainted data exits (network, SMS), log:
    • Data labels (what privacy-sensitive data)
    • Application responsible
    • Destination (IP address, phone number)
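Transitive tainting through files (feature 2 above) can be sketched as follows; TaintDroid implements this inside the Dalvik VM, while this toy version uses a side table:

```python
# Sketch of file-level transitive taint propagation: labels written with
# data to a file reattach on read, so a leak is still detectable after a
# save/load round trip.

file_taint = {}  # filename -> set of taint labels

def write_file(name, data, taint):
    """Record the union of all labels ever written to this file."""
    file_taint[name] = file_taint.get(name, set()) | set(taint)

def read_file(name):
    """Return (data, labels): reads re-acquire the file's taint."""
    return "cached-data", file_taint.get(name, set())

write_file("loc_cache.txt", "37.77,-122.41", {"LOCATION"})
_, labels = read_file("loc_cache.txt")
print(labels)  # {'LOCATION'} -- taint survives the file round trip
```

Without this propagation, an app could launder sensitive data simply by writing it to disk and reading it back before transmission.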

17.6.2 Challenges Addressed by TaintDroid

  • Resource Constraints: Lightweight implementation for smartphones
  • Third-Party Trust: Monitors all apps, including untrusted ones
  • Dynamic Context: Tracks context-based privacy information
  • Information Sharing: Monitors data sharing between apps

Performance comparison chart showing overhead of privacy protection mechanisms: taint tracking adds 10-30% CPU overhead, encryption consumes additional battery, and real-time monitoring impacts app responsiveness
Figure 17.7: Privacy protection performance impact

17.7 Static Analysis

Advantages over Dynamic Analysis:

  • Covers all possible code paths (not just executed ones)
  • No runtime overhead
  • Can be performed offline
  • Finds leaks that rarely execute

Disadvantages:

  • Takes longer to analyze
  • May have false positives
  • Cannot handle dynamic code loading

17.7.1 LeakMiner Approach

LeakMiner static analysis pipeline showing four stages: APK decompilation, control flow graph construction, taint propagation analysis, and privacy leak report generation
Figure 17.8: LeakMiner Static Analysis Pipeline: APK Decompilation to Privacy Leak Detection

17.7.2 Static vs Dynamic Analysis Comparison

| Aspect | Static Analysis | Dynamic Analysis (TaintDroid) |
|---|---|---|
| Coverage | All code paths | Only executed paths |
| Runtime overhead | None | 10-30% CPU |
| False positives | Higher | Lower |
| Dynamic code | Cannot analyze | Fully tracked |
| When to use | Pre-deployment review | Runtime monitoring |

17.8 Code Example: Simple Taint Tracker for IoT Data Flows

This Python implementation demonstrates the core taint-tracking concept from TaintDroid. It labels sensitive data at sources, propagates labels through operations, and detects unauthorized flows to network sinks:

class TaintTracker:
    """Simplified taint tracking for IoT app privacy analysis."""

    def __init__(self):
        self.variables = {}      # {name: {"value": ..., "taint": set()}}
        self.allowed_sinks = {}  # {sink: set of allowed labels}
        self.leaks = []

    def source(self, var_name, value, label):
        self.variables[var_name] = {"value": value, "taint": {label}}

    def combine(self, dest, var_a, var_b):
        """Taint union -- key TaintDroid insight."""
        a = self.variables.get(var_a, {"value": None, "taint": set()})
        b = self.variables.get(var_b, {"value": None, "taint": set()})
        self.variables[dest] = {
            "value": f"{a['value']}|{b['value']}",
            "taint": a["taint"] | b["taint"],
        }

    def send_to_sink(self, sink_name, var_name):
        # Untracked variables carry no taint (avoids KeyError on unknown names)
        var = self.variables.get(var_name, {"value": None, "taint": set()})
        unauthorized = var["taint"] - self.allowed_sinks.get(sink_name, set())
        if unauthorized:
            self.leaks.append({"sink": sink_name, "unauthorized": unauthorized})
        return ("LEAK" if unauthorized else "OK"), unauthorized

# Weather app with hidden ad tracking
tracker = TaintTracker()
tracker.allowed_sinks = {"weather-api.com": {"LOCATION"}, "ad-network.com": set()}
tracker.source("gps_loc", "37.7749,-122.4194", "LOCATION")
tracker.source("device_id", "353456789012345", "IMEI")

print(tracker.send_to_sink("weather-api.com", "gps_loc"))  # ('OK', set())
tracker.combine("ad_payload", "gps_loc", "device_id")       # Union: LOCATION + IMEI
print(tracker.send_to_sink("ad-network.com", "ad_payload")) # ('LEAK', {'LOCATION','IMEI'})

The taint union in combine() is the key insight: when the app concatenates IMEI and location into one payload, the result inherits both labels. Even if the IMEI alone seems harmless, combining it with location creates a persistent tracking profile – exactly the pattern TaintDroid was designed to catch.

Scenario: A smart home company launches “HomeGuard,” an Android companion app for their security camera and door lock products. Before the v2.0 release, the security team must audit the app for privacy leaks. The app requests 8 permissions: camera, microphone, location, contacts, storage, phone state, Bluetooth, and Wi-Fi state.

Step 1: Permission Necessity Audit

Evaluate whether each permission is actually needed for the app’s stated functionality:

| Permission | Stated Purpose | Audit Finding | Verdict |
|---|---|---|---|
| Camera | QR code scanning for device setup | Used once during setup, then continuously by analytics SDK | Over-retained |
| Microphone | Two-way audio with camera | Legitimate, used only during live view | Necessary |
| Location | “Improve service” (vague) | Sent to 3 ad SDKs every 15 minutes even when app is backgrounded | Privacy leak |
| Contacts | Not disclosed | Read by analytics SDK, hashed and uploaded for “social graph” | Undisclosed leak |
| Storage | Save video clips | Legitimate | Necessary |
| Phone state (IMEI) | “Device identification” | IMEI sent to 4 different analytics endpoints | Over-collection |
| Bluetooth | Device discovery | Legitimate, scoped to setup flow | Necessary |
| Wi-Fi state | Network detection | Legitimate for local device communication | Necessary |

Step 2: Static Analysis with FlowDroid

Run FlowDroid on the decompiled APK (42,000 methods across app code + 6 third-party SDKs):

| Source | Sink | SDK Responsible | Consent Given? | Classification |
|---|---|---|---|---|
| getLastLocation() | HttpURLConnection to ads.tracker.com | AdMob SDK | No explicit consent | Privacy leak |
| getLastLocation() | HttpURLConnection to analytics.mix.com | Mixpanel SDK | No explicit consent | Privacy leak |
| getDeviceId() (IMEI) | HttpURLConnection to graph.facebook.com | Facebook SDK | Buried in ToS page 23 | Privacy leak |
| getContacts() | HttpURLConnection to analytics.mix.com | Mixpanel SDK | Not disclosed anywhere | Privacy leak |
| getContacts() | Local SQLite | App code | N/A (local storage) | Not a leak |
| Camera frames | HttpURLConnection to api.homeguard.com | App code | User-initiated (live view) | Legitimate |

FlowDroid found 12 source-to-sink paths, of which 4 are confirmed privacy leaks, 2 are legitimate app functionality, and 6 are false positives (paths that exist in code but are unreachable at runtime).

Step 3: Dynamic Analysis with Network Traffic Capture

Run the app for 24 hours with a MITM proxy (mitmproxy with SSL unpinning via Frida) while simulating normal usage:

| Destination | Data Sent | Frequency | Encrypted? | User Aware? |
|---|---|---|---|---|
| api.homeguard.com | Camera stream, device status | On user action | TLS 1.3 | Yes |
| ads.tracker.com | GPS lat/lng, device model, Android ID | Every 15 min | TLS 1.2 | No |
| analytics.mix.com | IMEI hash, contact count, app events | Every 5 min | TLS 1.2 | No |
| graph.facebook.com | IMEI, installed apps list, Wi-Fi SSID | Every 30 min | TLS 1.3 | No |
| crashes.google.com | Stack traces (may contain user data) | On crash | TLS 1.3 | Partial |

Key finding: Over 24 hours, the app made 287 requests to third-party analytics servers versus 34 requests to the company’s own API. The device transmitted the user’s GPS location 96 times to advertising networks without any in-app disclosure.

Step 4: Taint Analysis Summary

Combining static and dynamic results:

| Data Type | Sources | Unauthorized Sinks | Risk Level | GDPR Impact |
|---|---|---|---|---|
| GPS location | getLastLocation() | 2 ad networks | Critical | Article 6 violation (no legal basis) |
| Contacts | getContacts() | 1 analytics SDK | Critical | Article 9 violation (social graph) |
| IMEI | getDeviceId() | 4 analytics endpoints | High | Article 5(1)(c) violation (minimization) |
| Camera access | Camera API | 1 analytics SDK (background) | High | Article 5(1)(b) violation (purpose limitation) |

Step 5: Remediation

| Issue | Fix | Time | Risk Reduction |
|---|---|---|---|
| Remove contacts permission | Delete Mixpanel social graph feature | 2 hours | Eliminates contacts leak entirely |
| Replace IMEI with instance ID | Use UUID.randomUUID() per install | 4 hours | Eliminates cross-app tracking |
| Add consent dialog for analytics | GDPR-compliant opt-in before SDK init | 3 days | Enables legal basis for analytics |
| Remove location from ad SDK | Set AdMob.setLocationEnabled(false) | 1 hour | Stops 96 daily location transmissions |
| Audit all 6 SDKs quarterly | Integrate Exodus Privacy into CI/CD | 2 days | Catches new leaks from SDK updates |

Outcome: The v2.0 release shipped with 3 permissions removed (location, contacts, phone state), a GDPR-compliant consent dialog, and instance-based IDs replacing IMEI. Third-party network requests dropped from 287/day to 41/day. The app passed an independent privacy audit, enabling EU market distribution.
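The instance-ID remediation can be sketched as follows (a file stands in for Android's SharedPreferences; not production code):

```python
import json
import os
import tempfile
import uuid

# Sketch of "replace IMEI with instance ID": generate a random UUID on
# first launch and persist it. The ID is stable across app restarts but
# resets on reinstall, so it cannot be linked across apps or installs
# the way a hardware IMEI can.

def get_instance_id(path):
    """Return the persisted instance ID, creating one on first call."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)["id"]
    new_id = str(uuid.uuid4())  # random, not derived from hardware
    with open(path, "w") as f:
        json.dump({"id": new_id}, f)
    return new_id

store = os.path.join(tempfile.mkdtemp(), "instance_id.json")
first = get_instance_id(store)
print(first == get_instance_id(store))  # True -- stable across launches
```

Because the identifier carries no hardware entropy, an ad network receiving it learns nothing that survives a reinstall.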

Choose the right analysis approach based on development stage, resource constraints, and threat model:

| Analysis Type | When to Use | Strengths | Limitations | Best For |
|---|---|---|---|---|
| Static Analysis | Pre-deployment, code review | Examines ALL code paths (100% coverage), no runtime overhead, finds leaks in error handlers and rarely-executed code | False positives (~20-40%), cannot analyze dynamic code loading, obfuscation defeats analysis | Security audit before release, regulatory compliance review, developer training |
| Dynamic Analysis | Runtime monitoring, post-deployment | No false positives (only real leaks), handles dynamic code, analyzes third-party libraries, detects actual network exfiltration | Only tests executed paths (~40-60% coverage), runtime overhead (10-30% CPU), requires representative test scenarios | Production monitoring, incident response, user-reported privacy concerns, third-party SDK verification |
| Hybrid | Comprehensive security review | Combines static path discovery with dynamic validation, reduces false positives, achieves higher coverage | Requires both toolchains, longer analysis time, more complex setup | Pre-release security certification, government/military deployments, medical device validation |

Decision Tree:

START: Which analysis approach should I use?

1. Do I have access to source code?
   NO  → Must use dynamic analysis (TaintDroid, network capture)
   YES → Continue

2. What is my development stage?
   PRE-ALPHA (design) → Static analysis (find architectural flaws)
   BETA (testing) → Hybrid (static + targeted dynamic tests)
   PRODUCTION (live) → Dynamic analysis (real-world monitoring)

3. What is my threat model?
   MALICIOUS THIRD-PARTY SDK → Dynamic analysis (network traffic)
   DEVELOPER ERROR → Static analysis (code review)
   SOPHISTICATED ATTACKER → Hybrid (comprehensive)

4. What are my resource constraints?
   LOW (startup) → Static analysis (one-time setup, free tools)
   MEDIUM (growing company) → Hybrid (automated CI/CD integration)
   HIGH (enterprise) → Continuous dynamic monitoring

5. What is my risk tolerance?
   VERY LOW (medical, finance) → Hybrid + manual review
   MODERATE (consumer apps) → Static OR dynamic (pick one)
   HIGH (internal tools) → Lightweight static scan only

Tool Selection Guide:

| Tool | Type | Cost | Platform | Coverage | Best Use Case |
|---|---|---|---|---|---|
| FlowDroid | Static | Free (academic) | Android | 95% precision | Pre-deployment audit, finds all potential leaks |
| TaintDroid | Dynamic | Free (research OS) | Android | 100% accuracy, 40-60% coverage | Research, proof-of-concept leak demonstration |
| Frida | Dynamic | Free | Android/iOS | Depends on hooks | Runtime monitoring, SDK behavior analysis |
| mitmproxy | Network | Free | Any | Network-only | Third-party data exfiltration detection |
| Exodus Privacy | Static | Free (web) | Android | Known trackers only | Quick SDK tracker scan (37+ known SDKs) |
| AppSweep | Static | Free | Android | Medium precision | Continuous Integration (CI) integration |
| Guardsquare | Static | Commercial | Android/iOS | High precision | Enterprise-grade pre-deployment scanning |

Combining Static and Dynamic Analysis:

Optimal workflow for comprehensive privacy assurance:

PHASE 1: Pre-Development (Architecture)
  ├─ Static analysis on architecture docs
  ├─ Threat modeling (STRIDE/LINDDUN)
  └─ Privacy-by-design review

PHASE 2: Development (Code Review)
  ├─ Static analysis in CI/CD pipeline (every commit)
  ├─ Developer training on identified patterns
  └─ Automated blocking of high-severity findings

PHASE 3: Testing (Validation)
  ├─ Dynamic analysis on beta builds (1 week monitoring)
  ├─ Network traffic baseline establishment
  ├─ Third-party SDK behavior verification
  └─ Cross-reference static findings with dynamic results

PHASE 4: Pre-Release (Certification)
  ├─ Hybrid analysis (static paths + dynamic validation)
  ├─ External security audit (independent review)
  ├─ Penetration testing (adversarial scenarios)
  └─ Privacy Impact Assessment (PIA) documentation

PHASE 5: Production (Monitoring)
  ├─ Dynamic analysis on production builds (sampling 1% users)
  ├─ Anomaly detection (unexpected network destinations)
  ├─ User-reported privacy complaint investigation
  └─ Quarterly re-audit (static analysis on major updates)

Cost-Benefit Analysis:

| Scenario | Static Only | Dynamic Only | Hybrid | Cost | Risk Reduction |
|---|---|---|---|---|---|
| Startup (limited budget) | ✓ | | | $0-500 | 60% of leaks found |
| Consumer app (moderate risk) | | ✓ | | $0-2,000 | 50% of leaks found (only tested paths) |
| Enterprise IoT (high stakes) | | | ✓ | $5,000-20,000 | 85-95% of leaks found |
| Medical device (critical) | | | ✓ + Manual | $50,000+ | 95%+ (regulatory requirement) |

Key Insight: No single analysis technique provides complete coverage. Static analysis finds all potential paths but has false positives. Dynamic analysis finds only real leaks but misses untested paths. For production IoT apps, use static analysis pre-deployment and dynamic monitoring in production for comprehensive leak detection across the entire app lifecycle.

Common Mistake: Relying Only on Permissions for Privacy Protection

The Mistake: Developers assume that Android/iOS permission models provide sufficient privacy protection, and that users granting permission implies consent for all uses of that data.

Why It Fails:

  • Permissions are too coarse-grained (granting “Location” doesn’t distinguish “for maps” vs “for advertising”)
  • Users cannot see WHO receives data after permission grant (third-party SDKs are invisible)
  • Zero-permission sensors (accelerometer, gyroscope) require no consent but enable serious attacks
  • Implicit consent (pre-checked boxes) doesn’t meet GDPR Article 7 “freely given” standard
  • Permission fatigue causes users to blindly accept all requests

Real-World Example:

  • App: Flashlight app requests ONLY “Camera” permission (to control LED flash)
  • User thinks: “Of course a flashlight needs camera access to turn on the LED”
  • Actual behavior: App continuously captures photos every 30 seconds and uploads to ad network
  • Privacy violation: User granted camera permission for flashlight function, NOT for photo surveillance
  • Legal outcome: The FTC charged a flashlight app developer with deceptive data practices, reaching a settlement in 2013

Why Permissions Are Insufficient:

| Permission | What User Thinks | What Apps Can Do | Actual Privacy Risk |
|---|---|---|---|
| Location (Always) | “App knows where I am when using it” | Continuous background tracking, 17,280 samples/day, exact home/work addresses, 95% de-anonymization with 4 points | CRITICAL |
| Contacts | “App can see my friends to add them” | Hash entire contact list, upload to server, infer social graph, cross-reference with other datasets | HIGH |
| Microphone | “App can use voice commands” | Record 24/7, send audio to third parties, infer sensitive conversations, enable eavesdropping | CRITICAL |
| Motion Sensors | NO PERMISSION REQUIRED | 70-80% keystroke inference (PINs, passwords), activity tracking, gait-based identity fingerprinting | MEDIUM |
| Storage | “App can save my photos” | Read ALL files, including other apps’ data, sensitive documents, personal photos | HIGH |

What Users Don’t Know After Granting Permission:

  • Third-party data sharing: Camera permission for QR code scanning → Analytics SDK gets camera access too
  • Background operation: Location “while using” permission → Many apps interpret “while using” as “while running in background”
  • Data retention: Microphone permission for voice command → Recording stored indefinitely, used for ML training
  • Derivative insights: Motion sensor data → Insurance company infers “user walks with limp” → Higher health premiums

Correct Approach:

  1. Purpose Limitation:

    // WRONG: Generic permission request
    requestPermissions(new String[]{Manifest.permission.CAMERA}, REQUEST_CAMERA);
    
    // RIGHT: Explain specific purpose in dialog
    new AlertDialog.Builder(context)
        .setTitle("Camera Permission")
        .setMessage("This app needs camera access to scan QR codes for device setup. Your camera will NOT be used for any other purpose.")
        .setPositiveButton("Allow", (dialog, which) -> requestCameraPermission())
        .setNegativeButton("Deny", null)
        .show();
  2. Granular Consent:

    // Separate toggles for each purpose
    data class PrivacyConsent(
        val allowLocationForMaps: Boolean = false,
        val allowLocationForWeather: Boolean = false,
        val allowLocationForAdvertising: Boolean = false,  // Likely stays false
        val allowContactsForFriends: Boolean = false,
        val allowContactsForMarketing: Boolean = false      // Likely stays false
    )
  3. Runtime Purpose Verification:

    fun useCamera(purpose: CameraPurpose) {
        when (purpose) {
            CameraPurpose.QR_SCAN -> {
                if (!consentManager.hasConsent(Purpose.CAMERA_QR)) {
                    throw SecurityException("No consent for QR scanning")
                }
                // Proceed with QR scan
            }
            CameraPurpose.ANALYTICS -> {
                // Analytics SDK tries to use camera
                throw SecurityException("Analytics cannot access camera")
            }
        }
    }
  4. Audit Third-Party SDK Access:

    # Use Frida to hook permission use and log which library called it
    frida -U -f com.example.app -l hook-permissions.js --no-pause
    # Output: "Analytics SDK (com.thirdparty.analytics) accessed Location at 14:23:05"

Key Insight: Permissions are a necessary but insufficient privacy control. They establish that an app CAN access data, but not WHEN, WHY, or WHO receives it. Robust privacy requires: granular purpose-specific consent, runtime access control, third-party SDK restrictions, and continuous monitoring. Treat permissions as the first layer of defense, not the only layer.

Concept Relationships
| Concept | Builds On | Enables | Contrasts With |
|---|---|---|---|
| Data Flow Analysis | Control flow graphs, taint propagation | Privacy leak detection, GDPR compliance verification | Permission-based access control (DFA tracks actual usage) |
| TaintDroid | Variable-level taint tracking, Dalvik VM modification | Runtime privacy monitoring, third-party SDK auditing | Static analysis (TaintDroid sees actual execution paths) |
| Static Analysis | Abstract syntax trees, symbolic execution | Pre-deployment security review, finding all potential paths | Dynamic analysis (static covers unexecuted code) |
| Capability Leaks | Android permission model, sharedUserId | Privilege escalation attacks, permission bypasses | Normal permission grants (capability leaks are implicit grants) |

Key Insight: DFA, TaintDroid, and static analysis are complementary techniques that address different phases of the app lifecycle—design, runtime, and deployment—working together to provide comprehensive privacy leak detection.

Information leakage quantifies how much sensitive information flows from sources to sinks, measured in bits of entropy reduction.

\[L = H(S) - H(S|O)\]

where \(L\) is the information leakage, \(H(S)\) is the entropy of the sensitive data source, and \(H(S|O)\) is the conditional entropy after observing the output.

Working through an example: Given: A mobile app accesses GPS location (latitude, longitude with 6 decimal places = ~0.1 meter precision). The location is transmitted to an advertising network.

Step 1: Calculate source entropy - GPS coordinate space: Earth surface area ≈ \(5.1 \times 10^{14}\) m² - Precision: 0.1m × 0.1m = 0.01 m² per coordinate - Possible locations: \(5.1 \times 10^{14} / 0.01 = 5.1 \times 10^{16}\) - Source entropy: \(H(S) = \log_2(5.1 \times 10^{16}) \approx 55.5\) bits

Step 2: Calculate leakage with perfect transmission - If GPS coordinates transmitted exactly: \(H(S|O) = 0\) bits - Information leakage: \(L = 55.5 - 0 = 55.5\) bits

Step 3: Calculate leakage with coarse location (city-level) - If only city name is transmitted (e.g., “San Francisco”): - Number of cities worldwide: ~10,000 major cities - Remaining entropy after observing city: \(H(S|O) = \log_2(5.1 \times 10^{16} / 10000) \approx 55.5 - 13.3 = 42.2\) bits (uncertainty within the city) - Information leakage: \(L = 55.5 - 42.2 = 13.3\) bits (knowing the city)

Result: Transmitting precise GPS coordinates leaks 55.5 bits of location information – enough to pinpoint a user to within 0.1 meters. Even city-level coarsening still leaks 13.3 bits (which city you are in). To limit leakage to <5 bits, you would need continental-level granularity.

In practice: TaintDroid tracks data from sources (GPS, microphone) to sinks (network). The information leakage rate helps quantify privacy risk: higher leakage = greater ability to de-anonymize users. Data flow analysis combined with entropy calculations reveals which app behaviors pose the highest re-identification risk, enabling prioritized remediation.
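The worked example can be checked numerically, using the same constants as the steps above (~5.1 × 10¹⁴ m² surface area, 0.1 m × 0.1 m cells, ~10,000 major cities):

```python
import math

# Numerical check of the information-leakage worked example.
EARTH_AREA_M2 = 5.1e14
CELL_AREA_M2 = 0.1 * 0.1
N_CELLS = EARTH_AREA_M2 / CELL_AREA_M2        # ~5.1e16 possible locations

H_source = math.log2(N_CELLS)                 # H(S) ~ 55.5 bits
leak_exact = H_source - 0.0                   # exact coords: H(S|O) = 0
H_given_city = math.log2(N_CELLS / 10_000)    # uncertainty within a city
leak_city = H_source - H_given_city           # ~ 13.3 bits

print(f"H(S)             = {H_source:.1f} bits")
print(f"Leak (exact GPS) = {leak_exact:.1f} bits")
print(f"Leak (city only) = {leak_city:.1f} bits")
```

Note that the city-level leakage reduces to log₂(10,000) ≈ 13.3 bits: coarsening to one of N buckets leaks exactly log₂(N) bits under this uniform model.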

17.8.1 Interactive: Information Leakage Calculator

Experiment with different precision levels and coarsening strategies to see how information leakage changes.

17.9 See Also

17.10 Summary

Privacy leak detection mechanisms identify unauthorized data exfiltration:

Data Flow Analysis:

  • Tracks information from privacy-sensitive sources to network sinks
  • Operates at method-level (coarse) or variable-level (fine) granularity
  • Flags any path from source to sink without user consent

TaintDroid (Dynamic Analysis):

  • Real-time taint tracking within Android’s Dalvik VM
  • Automatic propagation through variables, methods, files, IPC
  • Detects third-party libraries transmitting data without the user’s awareness

Static Analysis:

  • Offline path analysis identifying potential leaks from code structure
  • Covers all code paths, not just executed ones
  • Higher false positive rate but comprehensive coverage

Key Takeaway: Use static analysis for comprehensive pre-deployment review, dynamic analysis (TaintDroid) for runtime monitoring. Neither alone is sufficient—combine both for robust privacy protection.

Common Pitfalls

Embedded SDKs (analytics, crash reporting, advertising) collect and transmit data independently of the primary application code. Teams often don’t audit what third-party SDKs collect. Review privacy policies and network behavior of every embedded SDK before including it in IoT applications.

Debug logs capturing device IDs, location coordinates, or user behavior are commonly left in production builds. These logs may be readable by other apps (on Android), uploaded to crash reporting services, or stored in cloud log aggregation. Implement log redaction for production builds.
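A minimal redaction sketch for production logging, assuming illustrative (not exhaustive) patterns for coordinates, IMEIs, and email addresses:

```python
import re

# Scrub common sensitive patterns before a log line reaches logcat or a
# crash reporter. Patterns are illustrative; a real deployment needs a
# vetted, tested pattern set.
REDACTIONS = [
    (re.compile(r"-?\d{1,3}\.\d{4,},\s*-?\d{1,3}\.\d{4,}"), "<GPS>"),  # lat,lng
    (re.compile(r"\b\d{15}\b"), "<IMEI>"),                             # 15-digit IMEI
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),               # email
]

def redact(line):
    """Replace each sensitive match with a placeholder."""
    for pattern, placeholder in REDACTIONS:
        line = pattern.sub(placeholder, line)
    return line

print(redact("user 353456789012345 at 37.7749,-122.4194 (alice@example.com)"))
# -> user <IMEI> at <GPS> (<EMAIL>)
```

Wrapping the app's logger with a function like this ensures redaction happens centrally rather than relying on every developer to remember it.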

Teams who check for “using HTTPS” believe their traffic is private. But HTTPS encryption doesn’t prevent the destination from receiving and misusing data. Traffic analysis must check both that transport is encrypted AND that the destination is appropriate for the data being sent.

Privacy leak testing performed once at development doesn’t catch leaks introduced by SDK updates, backend changes, or new feature additions. Implement automated privacy traffic analysis as part of CI/CD pipelines to continuously detect privacy regression.
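One lightweight sketch of such a CI/CD gate: diff the destinations observed in a test-run traffic capture (e.g. exported from mitmproxy) against an approved baseline and fail the build on anything new. Host names follow the chapter's HomeGuard example; the capture/export step is assumed:

```python
# Hypothetical CI/CD privacy-regression gate.
APPROVED_DESTINATIONS = {"api.homeguard.com", "crashes.google.com"}

def privacy_regression(observed_hosts):
    """Return the set of unapproved destinations (empty set = pass)."""
    return set(observed_hosts) - APPROVED_DESTINATIONS

# Hosts extracted from a test run's traffic capture:
capture = ["api.homeguard.com", "ads.tracker.com", "api.homeguard.com"]
unexpected = privacy_regression(capture)
if unexpected:
    print(f"FAIL: new third-party destinations: {sorted(unexpected)}")
else:
    print("PASS: no new destinations")
```

Because an SDK update that adds a new analytics endpoint changes the observed host set, this check catches exactly the regression class the pitfall describes.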

17.11 What’s Next

Now that you can detect privacy leaks, the next chapter explores Location Privacy Leaks where you’ll understand why location data is especially dangerous and how de-anonymization attacks work even on “anonymized” datasets.

Continue to Location Privacy Leaks
