What is Differential Privacy?
A mathematical framework for sharing aggregate information about a dataset while provably protecting the privacy of individual entries.
Differential privacy adds carefully calibrated noise to query results, allowing useful statistical analysis while strictly limiting how much the output can reveal about any one individual's contribution.
How It Works
- Random noise is added to query results
- The noise is scaled to the most any one individual could change the result, so it masks that individual's data
- But it is small enough that aggregate statistics remain accurate
- The result is a mathematical privacy guarantee whose strength is set by the epsilon (ε) parameter
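The steps above can be sketched with the Laplace mechanism, one common way to add calibrated noise to a numeric query. This is a minimal illustration, not a production implementation: the function names (laplace_noise, private_count), the sample ages, and the choice of ε = 1.0 are all assumptions made for the example, and naive floating-point noise like this is known to be unsuitable for real deployments.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that satisfy `predicate`."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one person changes a count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only so the example is repeatable
ages = [23, 35, 41, 29, 52, 67, 38, 45, 31, 58]  # made-up data
noisy = private_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
print(f"noisy count of people aged 40+: {noisy:.2f}")  # the true count is 5
```

Any single answer may be off by a unit or two, but the noise averages out to zero, which is why aggregate statistics over many people stay accurate while one person's presence is masked.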
Real-World Uses
- Apple: Uses differential privacy in iOS to collect usage statistics
- Google: RAPPOR system for Chrome usage data
- US Census: Applied differential privacy to 2020 Census data
The Epsilon Problem
- Epsilon (ε) bounds how much privacy can be lost: smaller values mean stronger privacy
- There's no universal agreement on what epsilon value is "private enough"
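- Apple has reported ε values of roughly 4 to 8 for some features, while academic researchers often recommend ε < 1
- Companies may claim differential privacy while using ε values so large that the guarantee offers little practical protection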
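To make the gap between these ε values concrete, here is a small sketch that compares them. It relies on two standard facts: under ε-differential privacy, one person's data can change the probability of any output by a factor of at most e^ε, and for a count query the Laplace noise scale is 1/ε. The helper names (laplace_scale, max_probability_ratio) and the list of ε values are assumptions chosen for illustration.

```python
import math


def laplace_scale(epsilon, sensitivity=1.0):
    """Noise scale the Laplace mechanism uses for a given epsilon."""
    return sensitivity / epsilon


def max_probability_ratio(epsilon):
    """Upper bound on how much one person's data can shift any output's probability."""
    return math.exp(epsilon)


# Compare a strict setting (0.1), a common academic target (1.0),
# and the looser values discussed above (4.0 and 8.0).
for eps in (0.1, 1.0, 4.0, 8.0):
    print(
        f"epsilon={eps:>4}: noise scale={laplace_scale(eps):6.3f}, "
        f"max probability ratio={max_probability_ratio(eps):10.1f}"
    )
```

At ε = 1 the bound is about 2.7, while at ε = 8 it is close to 3,000, so the formal guarantee becomes far weaker even though both settings can be described as differentially private.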
Related Terms
Data Minimization
A privacy principle that organizations should collect only the minimum amount of personal data necessary for a specific purpose, and retain it only as long as needed. This reduces privacy risks by limiting exposure in case of breaches or misuse.
Pseudonymity
The state of using a consistent fake identity rather than your real name. Unlike anonymity, pseudonymity allows building reputation and history while protecting real-world identity from casual observers.