What is Data Poisoning?
A technique where individuals or groups deliberately feed incorrect, misleading, or adversarial data to AI training datasets, surveillance systems, or data brokers in order to corrupt their models, reduce their accuracy, or pollute personal profiles, often as a form of privacy defense.
Also known as: AI Data Poisoning, Training Data Contamination, Adversarial Data
Data poisoning flips the script on surveillance: instead of trying to hide your data, you flood the system with garbage data so that your real information becomes harder to find or use.
Types of Data Poisoning
Anti-Surveillance Poisoning
- AdNauseam: Browser extension that clicks on every ad, poisoning your advertising profile with random interests
- TrackMeNot: Generates random search queries to obscure your real search history
- FaceCloak: Substitutes real Facebook profile data with fake information
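The decoy-query idea behind tools like TrackMeNot can be illustrated with a minimal sketch. This is not TrackMeNot's actual algorithm: the `DECOY_TOPICS` list, the `mix_with_decoys` and `real_fraction` helpers, and the ratio of nine decoys per real query are all hypothetical choices made for illustration.

```python
import random

# Hypothetical decoy vocabulary; a real tool would draw on a much larger,
# regularly changing source of plausible queries.
DECOY_TOPICS = [
    "weather radar", "banana bread recipe", "used kayak prices",
    "how tall is mount fuji", "local bus schedule", "cheap guitar strings",
]

def mix_with_decoys(real_queries, decoys_per_real, rng):
    """Return a shuffled list of (query, is_real) pairs with decoys added."""
    log = [(q, True) for q in real_queries]
    for _ in range(decoys_per_real * len(real_queries)):
        log.append((rng.choice(DECOY_TOPICS), False))
    rng.shuffle(log)
    return log

def real_fraction(log):
    """Share of logged queries that reflect genuine interest."""
    return sum(1 for _, is_real in log if is_real) / len(log)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
log = mix_with_decoys(["symptoms of migraine", "tenant rights eviction"], 9, rng)
print(f"{len(log)} queries logged, {real_fraction(log):.0%} real")
# prints: 20 queries logged, 10% real
```

An observer who sees only the log has to separate two genuine queries from eighteen decoys. In practice, timing and phrasing patterns may still give the real ones away, which is why effectiveness depends on how realistic the decoys are.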
Anti-AI Training
- Nightshade: Tool by University of Chicago researchers that subtly alters images so AI models trained on them produce corrupted outputs
- Glaze: Protects artists' styles by adding imperceptible changes that confuse AI style-copying
- Artists deliberately publish "poison pill" images that look normal to humans but degrade AI training
Data Broker Pollution
- Signing up for services with deliberately false information
- Using fake names, addresses, and phone numbers for non-essential accounts
- Creating multiple conflicting profiles to reduce the accuracy of broker databases
Adversarial Machine Learning
- Introducing carefully crafted data points that cause AI models to misclassify or behave incorrectly
- Targeted poisoning can make facial recognition systems fail on specific individuals
- Can be used defensively (protecting privacy) or offensively (attacking systems)
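To make the idea of "carefully crafted data points" concrete, here is a toy sketch of poisoning by data injection against a nearest-centroid classifier. The one-dimensional dataset, the four injected points, and the helper names are assumptions chosen so the arithmetic is easy to follow; real attacks on real models are far more involved.

```python
from statistics import mean

def train_centroids(samples):
    """Fit a nearest-centroid model: one mean feature value per label."""
    by_label = {}
    for x, label in samples:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def accuracy(centroids, test_set):
    """Fraction of test samples classified correctly."""
    return sum(predict(centroids, x) == y for x, y in test_set) / len(test_set)

# Clean data: label 0 clusters near 1.5, label 1 clusters near 8.5.
clean = [(0, 0), (1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1), (10, 1)]
# Crafted points: far-right values mislabeled as 0 drag that centroid to 10.75.
poison = [(20, 0)] * 4
test_set = [(1, 0), (4, 0), (6, 1), (9, 1)]

clean_model = train_centroids(clean)
poisoned_model = train_centroids(clean + poison)

print(accuracy(clean_model, test_set))     # 1.0
print(accuracy(poisoned_model, test_set))  # 0.5
```

Four injected points are enough here because the training set is tiny and the model is a simple average. The same mechanism applies whether the goal is defensive or offensive, and it also explains why outlier filtering is a common countermeasure.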
Effectiveness
Pros
- Low cost and accessible to individuals
- Can be effective against broad surveillance when many people participate (poisoning is a numbers game)
- Tools like Nightshade give individuals power against massive AI corporations
- Degrades the value of mass data collection
Cons
- Sophisticated systems can detect and filter poisoned data
- Individual poisoning has limited impact against systems with billions of data points
- May violate the terms of service of the platforms or services involved
- Effectiveness varies greatly by target system
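The "numbers game" can be put into rough arithmetic. The sketch below assumes a simplified model in which impact depends only on the share of poisoned records in a dataset. The figures (one billion clean records, 1,000 records per person, a 1% target share) are illustrative assumptions, not measurements of any real system.

```python
import math

def poisoned_share(poisoned_records, clean_records):
    """Fraction of the combined dataset that is poisoned."""
    return poisoned_records / (poisoned_records + clean_records)

def contributors_needed(target_share, clean_records, records_per_person):
    """People required, each adding records_per_person, to reach target_share."""
    # Solve p / (p + clean) >= target_share for p, then split p among people.
    needed = target_share * clean_records / (1 - target_share)
    return math.ceil(needed / records_per_person)

CLEAN = 1_000_000_000  # assumed dataset size

# One person contributing 1,000 poisoned records barely registers.
print(f"{poisoned_share(1_000, CLEAN):.6%}")

# Reaching a 1% share takes coordinated effort at this scale.
print(contributors_needed(0.01, CLEAN, 1_000))  # 10102
```

Under these assumptions a lone contributor accounts for roughly one record in a million, while a 1% share takes about ten thousand participants. Targeted attacks on a narrow concept may need far less data than this whole-dataset view suggests, which is one reason effectiveness varies so much by target.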
The Ethics
Data poisoning raises interesting questions:
- Is it ethical to corrupt AI training data when companies scrape your work without consent?
- Is it self-defense to pollute data broker profiles with false information?
- Does feeding false data to surveillance systems constitute resistance or fraud?
Many privacy advocates argue that data poisoning is a legitimate form of digital self-defense in a world where consent is routinely ignored and opting out is rarely a practical option.
Related Terms
AI Scraping
The large-scale collection of text, images, code, and personal data from the internet by AI companies to train machine learning models — often without consent or compensation.
Anti-Forensics
Techniques used to prevent, disrupt, or mislead digital forensic investigations by destroying evidence or making analysis difficult.
Model Training Data
The massive datasets of text, images, code, and other content used to train AI models — often containing personal information scraped from the internet without individual consent.
Obfuscation
Techniques for disguising encrypted traffic to look like normal, unencrypted traffic, used to bypass censorship systems that block VPNs and Tor.
Prompt Injection
A security vulnerability in AI systems where an attacker manipulates the input to override the AI's instructions, potentially extracting private data or making the system perform unintended actions.