What is Data Poisoning?

Q: What is Data Poisoning?

Data poisoning is a technique in which individuals or groups deliberately feed incorrect, misleading, or adversarial data to AI training datasets, surveillance systems, or data brokers. The goal is to corrupt models, reduce their accuracy, or pollute personal profiles as a form of privacy defense.

Data poisoning flips the script on surveillance: instead of trying to hide your data, you flood the system with garbage data so that your real information becomes much harder to find or use.

Types of Data Poisoning

Anti-Surveillance Poisoning

AdNauseam: Browser extension that blocks ads and silently clicks each one in the background, poisoning your advertising profile with indiscriminate interests
TrackMeNot: Generates random search queries to obscure your real search history
FaceCloak: Substitutes real Facebook profile data with fake information
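
The obfuscation idea behind query-noise tools can be sketched in a few lines. This is a minimal illustration and not TrackMeNot's actual implementation: the function name mix_with_decoys, the decoy vocabulary, and the ratio of three decoys per real query are all assumptions made for the example. It mixes real queries with randomly chosen decoys so that an observer sees a stream in which real queries are a minority.

```python
import random

# Hypothetical decoy vocabulary; a real tool would need a much larger,
# regularly refreshed source so the decoys are harder to filter out.
DECOY_TERMS = ["weather radar", "banana bread recipe", "used kayak prices",
               "train timetable", "how to prune roses", "chess openings"]

def mix_with_decoys(real_queries, decoys_per_query=3, seed=None):
    """Return a shuffled stream containing each real query plus random decoys."""
    rng = random.Random(seed)
    real = list(real_queries)
    stream = list(real)
    for _ in real:
        # Add a fixed number of decoys for every real query.
        stream.extend(rng.choice(DECOY_TERMS) for _ in range(decoys_per_query))
    rng.shuffle(stream)  # hide which positions hold the real queries
    return stream

real = ["symptoms of flu", "divorce lawyer near me"]
stream = mix_with_decoys(real, seed=1)
real_share = len(real) / len(stream)  # 2 real queries out of 8 total
print(stream)
print(f"real queries make up {real_share:.0%} of the stream")
```

With three decoys per real query, real queries are a quarter of what the observer sees. Raising decoys_per_query lowers that share further, at the cost of more traffic.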

Anti-AI Training

Nightshade: Tool by University of Chicago researchers that subtly alters images so AI models trained on them produce corrupted outputs
Glaze: Tool from the same University of Chicago team that protects artists' styles by adding imperceptible changes that interfere with AI style mimicry
Artists deliberately publish "poison pill" images that look normal to humans but degrade AI training

Data Broker Pollution

Signing up for services with deliberately false information
Using fake names, addresses, and phone numbers for non-essential accounts
Creating multiple conflicting profiles to reduce the accuracy of broker databases

Adversarial Machine Learning

Introducing carefully crafted data points that cause AI models to misclassify or behave incorrectly
Targeted poisoning can make facial recognition systems fail on specific individuals
The same techniques can be used defensively (protecting privacy) or offensively (attacking systems)
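
A toy example can show how a few crafted training points cause misclassification. The sketch below assumes a deliberately simple one-dimensional nearest-centroid classifier; the data, the function names, and the choice of classifier are illustrative assumptions, not a description of any real system or attack tool. Three mislabeled points are injected into the training set, which drags one class centroid far enough to flip most predictions.

```python
def train_centroids(samples):
    """samples: list of (value, label) pairs. Returns {label: mean value}."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, value):
    """Assign the label whose centroid is closest to value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))

def accuracy(centroids, test_set):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(predict(centroids, v) == y for v, y in test_set)
    return correct / len(test_set)

clean = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1)]
poison = [(20, 0), (20, 0), (20, 0)]  # crafted points carrying the wrong label
test_set = [(0, 0), (4, 0), (6, 1), (10, 1)]

clean_acc = accuracy(train_centroids(clean), test_set)
poisoned_acc = accuracy(train_centroids(clean + poison), test_set)
print(clean_acc, poisoned_acc)  # 1.0 0.25
```

The poisoned points move the class 0 centroid from 2 to 11, past the class 1 centroid at 8, so three of the four test points are misclassified. Real models are far less fragile than this toy, which is why practical attacks need carefully optimized data.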

Effectiveness

Pros

Low cost and accessible to individuals
Can be effective against broad surveillance when many people participate, since the impact grows with the share of poisoned data
Tools like Nightshade give individuals power against massive AI corporations
Degrades the value of mass data collection
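
How much poisoning matters depends on the share of poisoned records. As a rough worked example, assume a system that estimates a simple average from its records; the function name poisoned_mean and all of the numbers below are hypothetical and chosen only to show the scaling.

```python
def poisoned_mean(clean_mean, clean_count, poison_value, poison_count):
    """Mean of a dataset after adding poison_count records of poison_value."""
    total = clean_mean * clean_count + poison_value * poison_count
    return total / (clean_count + poison_count)

# Hypothetical setup: one million clean records averaging 10.0,
# and poisoners who all submit the value 110.0.
small = poisoned_mean(10.0, 1_000_000, 110.0, 100)      # about 0.01% poisoned
large = poisoned_mean(10.0, 1_000_000, 110.0, 250_000)  # 20% poisoned
print(round(small, 4), round(large, 4))
```

A hundred poisoned records barely move the estimate, while a coordinated effort that reaches a fifth of the dataset triples it. Systems that learn something more complex than an average behave differently, but the dependence on the poisoned share is the same basic intuition.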

Cons

Sophisticated systems can detect and filter poisoned data
Individual poisoning has limited impact against systems with billions of data points
May violate terms of service
Effectiveness varies greatly by target system
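
One simple way a system might filter crude poisoned data is robust outlier detection. The sketch below uses the modified z-score based on the median absolute deviation, with the commonly used constant 0.6745 and cutoff 3.5; the function name filter_outliers and the sample readings are assumptions for illustration, and real pipelines typically use more elaborate defenses.

```python
from statistics import median

def filter_outliers(values, threshold=3.5):
    """Drop values whose modified z-score (median/MAD based) exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)  # no spread to measure against; keep everything
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]

# Six plausible readings plus two crude poisoned values.
readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0, 60.0]
kept = filter_outliers(readings)
print(kept)
```

The two extreme values are dropped because they sit far from the median relative to the typical spread. Poisoned data that stays inside the normal range would pass a filter like this, which is one reason effectiveness varies so much between targets.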

The Ethics

Data poisoning raises interesting questions:

Is it ethical to corrupt AI training data when companies scrape your work without consent?
Is it self-defense to pollute data broker profiles with false information?
Does feeding false data to surveillance systems constitute resistance or fraud?

Many privacy advocates argue that data poisoning is a legitimate form of digital self-defense in a world where consent is routinely ignored and opting out is often impossible.

Related Terms

AI Scraping

Anti-Forensics

Model Training Data

Obfuscation

Prompt Injection
