Adversarial Confusion Attacks: Making GPT-5 Hallucinate
Posted 3 months ago · researchgate.net · Research story
Sentiment: calm / neutral · Debate: 10/100
Key topics
AI Safety
Large Language Models
Adversarial Attacks
A research paper describes 'adversarial confusion attacks' that cause multimodal large language models (LLMs) such as GPT-5 to hallucinate; the HN community showed interest in the implications for AI safety.
Snapshot generated from the HN discussion
Discussion Activity
- Light discussion
- First comment: N/A
- Peak period: Start (1 comment)
- Avg / period: 1
Key moments
- Story posted: Oct 6, 2025 at 10:21 AM EDT (3 months ago)
- First comment: Oct 6, 2025 at 10:21 AM EDT (0s after posting)
- Peak activity: 1 comment in the opening window, the hottest period of the conversation
- Latest activity: Oct 6, 2025 at 10:21 AM EDT (3 months ago)
ID: 45491757 · Type: story · Last synced: 11/17/2025, 11:06:32 AM
Want the full context? Read the primary article or dive into the live Hacker News thread.
The attack successfully fooled GPT-5, producing structured hallucinations from an adversarial image.
The ultimate goal is to prevent AI agents from reliably operating on websites by embedding adversarial "confusion images" into their visual environments.
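The snapshot does not describe the paper's actual optimization procedure or target models, but attacks of this general kind are usually built by gradient-based perturbation of an image against a surrogate vision model. The sketch below illustrates that idea only: a PGD-style loop that maximizes a surrogate classifier's predictive entropy so its output becomes maximally "confused". The surrogate (torchvision's ResNet-50), the entropy objective, and every hyperparameter are illustrative assumptions, not the paper's method.

```python
# Minimal sketch of a PGD-style "confusion" perturbation against a surrogate
# vision model. All choices here (ResNet-50 surrogate, entropy-maximisation
# loss, eps/alpha/steps) are illustrative assumptions, not the paper's method.
import torch
import torch.nn.functional as F
from torchvision import models


def confusion_attack(image, model, eps=8 / 255, alpha=2 / 255, steps=40):
    """Return a perturbed copy of `image` (N,3,H,W in [0,1]) that maximises
    the surrogate model's predictive entropy within an L_inf ball of eps."""
    x_adv = image.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        probs = F.softmax(model(x_adv), dim=1)
        # Higher predictive entropy = a more "confused" surrogate output.
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()
        grad = torch.autograd.grad(entropy, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                # ascend on entropy
            x_adv = image + (x_adv - image).clamp(-eps, eps)   # project to eps ball
            x_adv = x_adv.clamp(0, 1)                          # keep valid pixels
    return x_adv.detach()


if __name__ == "__main__":
    # Pretrained weights are downloaded on first use; ImageNet normalization
    # is omitted here for brevity.
    surrogate = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
    clean = torch.rand(1, 3, 224, 224)         # stand-in for a real screenshot
    adv = confusion_attack(clean, surrogate)
    print((adv - clean).abs().max())           # perturbation stays within eps
```

In a real "confusion image" setting, the perturbed image would then be embedded in a web page in the hope that the perturbation transfers and degrades a multimodal LLM's reading of the rendered page; whether that transfer succeeds depends entirely on the target model and is exactly what the paper evaluates against GPT-5.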