Nightshade: Make Images Unsuitable for Model Training
Key topics
The cat-and-mouse game between AI model trainers and artists is heating up with Nightshade, a tool that "poisons" images to make them unusable for model training. Commenters are divided on its effectiveness, with some calling it "snake oil" that will ultimately benefit industry, while others see it as a potential catalyst for artists to gain leverage against AI labs. As one commenter pointed out, the arms race between model robustness and image "poisoning" techniques may lead to unexpected security implications for unified models. The debate underscores the ongoing tension between AI development and artistic ownership.
Snapshot generated from the HN discussion
Discussion Activity
Moderate engagement
- First comment: 1h after posting
- Peak period: 10 comments in 1-2h
- Avg per period: 4.1 comments
Based on 29 loaded comments
Key moments
- Story posted: Jan 4, 2026 at 7:32 AM EST (5d ago)
- First comment: Jan 4, 2026 at 8:35 AM EST (1h after posting)
- Peak activity: 10 comments in 1-2h (hottest window of the conversation)
- Latest activity: Jan 4, 2026 at 3:37 PM EST (5d ago)
Want the full context? Jump to the original sources.
Read the primary article or dive into the live Hacker News thread when you're ready.
- https://news.ycombinator.com/item?id=46364338
- https://news.ycombinator.com/item?id=35224219
We’ve seen this arms race before and know who wins. It’s all snake oil imo
It's kinda funny in a way because effectively they're helping iron out ways in which these models "see" differently to humans. Every escalation will in the end just help make the models more robust...
That they are disclosing the tools rather than e.g. creating a network service makes this even easier.
It's all to benefit industry, whether the academics realize it or not.
In fact I would say the opposite is true. LLMs must protect against this as a security measure in unified models or things the LLM 'sees' may be faked.
If for example someone could trick you into seeing a $1 bill as a $10 it would be considered a huge failure on your part and it would be trained out of you if you wanted to remain employed.
I haven't and I don't know who wins. Who wins?
Adversarial examples aren't snake oil, if that's what you meant. There's a rich literature on both producing and bypassing them that has accumulated over the years, and while I haven't kept abreast of it, my recollection is that the bottom line is like that for online security: there's never a good reason not to make sure your system is up to date and protected from attacks, even if there exist attacks that can bypass any defense.
Where in this case attack and defense can both describe what artists want to do with their work.
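For readers who haven't met that literature, here is a minimal FGSM-style sketch of what "producing" an adversarial example means in practice; the toy model, inputs, and epsilon below are placeholders of mine, not anything taken from Nightshade.

```python
# Minimal FGSM-style sketch (toy model and values, nothing Nightshade-specific):
# an adversarial example is the input nudged along the sign of the loss
# gradient with respect to the pixels, keeping the change small.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-in classifier; any differentiable image model works the same way.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 10),
)
model.eval()

def fgsm_perturb(image, label, epsilon=0.03):
    """Return image + epsilon * sign(dLoss/dImage), clamped to valid pixel range."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    return (image + epsilon * image.grad.sign()).clamp(0.0, 1.0).detach()

x = torch.rand(1, 3, 64, 64)    # placeholder image batch in [0, 1]
y = torch.tensor([0])           # placeholder label
x_adv = fgsm_perturb(x, y)
print((x_adv - x).abs().max())  # the perturbation never exceeds epsilon
```

The "bypassing" side of that same literature (adversarial training, input preprocessing) is roughly the arms race other commenters are describing.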
Don't confuse attempting to make AI misclassify an image as a security measure.
And yes, this is snake oil and the AI wins every time.
At the end of the day a human has to be able to interpret the image, and I'd add another constraint of not thinking it looks ugly. This puts a very hard floor on what a poisoner can put in an image before the human gets sick. In a rapid-turnaround GAN you hit that noise floor really quickly.
I could imagine you could make one that was effective against multiple recognizers, but not in general.
I'd also guess it'd be easy to get rid of this vulnerability on the model side.
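As a rough way to put a number on the "noise floor" described a few comments up, here is a back-of-the-envelope PSNR sketch; the ~40 dB figure in the code comments is a common rule of thumb I'm assuming, not a number from the thread or from Nightshade.

```python
# Back-of-the-envelope sketch: PSNR as a crude proxy for how visible a
# perturbation is. For 8-bit images, values above roughly 40 dB usually look
# untouched; the lower the PSNR, the closer the poison gets to the point
# where a human finds the image noticeably degraded.
import numpy as np

def psnr(original, perturbed, max_val=255.0):
    mse = np.mean((original.astype(np.float64) - perturbed.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

img = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)  # placeholder "artwork"
noise = np.random.normal(0.0, 8.0, img.shape)                    # stand-in perturbation
noisy = np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(img, noisy):.1f} dB")
```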
This is just grandstanding. Half the people from this lab will go on to work for AI companies.
175 years of history would disagree with you: https://en.wikipedia.org/wiki/Security_through_obscurity
Never mind that the more people try to corrupt a model, the more likely that future models will catch these corruption attempts as security and trust/safety issues to fix and work around.
The next Nightshade will eventually be viewed as malware to a model and then worked around, reconstructing around the attempt to break a model.
> You can crop it, resample it, compress it, smooth out pixels, or add noise, and the effects of the poison will remain. You can take screenshots, or even photos of an image displayed on a monitor, and the shade effects remain
If this becomes prevalent enough, you could create a lightweight classifier to detect "poisonous" images, then use some kind of neural network (probably an autoencoder) to "fix" them. Training such networks won't be too difficult, since you can create as many positive/negative samples as you want by using this tool.
It's also not obvious to me what happens with cartoon style art. Something that looks like white noise might be acceptable on an oil painting but not something with flat colors and clean lines.
- https://news.ycombinator.com/item?id=38013151
- https://news.ycombinator.com/item?id=37990750
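Here is a minimal sketch of the classifier-plus-autoencoder idea above, assuming one can run the poisoning tool locally to build (clean, shaded) training pairs; the architecture and every name are illustrative, not any published anti-Nightshade pipeline.

```python
# Sketch of the commenter's idea (architecture and names are illustrative):
# run the poisoning tool over clean images to get (clean, shaded) pairs, then
# train a small autoencoder to map shaded images back toward the originals.
# A "poison detector" classifier could be trained the same way on binary labels.
import torch
import torch.nn as nn

class CleanupAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = CleanupAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder batch: in practice `shaded` would be images run through the tool
# and `clean` the untouched originals, both scaled to [0, 1].
shaded = torch.rand(8, 3, 128, 128)
clean = torch.rand(8, 3, 128, 128)

for step in range(3):  # illustrative training steps only
    optimizer.zero_grad()
    loss = loss_fn(model(shaded), clean)
    loss.backward()
    optimizer.step()
    print(step, round(loss.item(), 4))
```

Whether a cleanup network like this would remove the actual poison rather than just its visible traces is exactly the robustness question the rest of the thread argues about.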
I don't know what Nightshade is supposed to do, but the fact that it doesn't affect the synthetic labeling of data at all leads me to believe image model trainers will give close to zero consideration to what it does when training new models.