Doublespeak: In-Context Representation Hijacking
Source: mentaleap.ai (research story)
Topics: AI, Representation Learning, Adversarial Attacks
Story posted: Dec 22, 2025 at 2:16 PM EST
First comment: Dec 28, 2025 at 5:12 PM EST
Latest activity: Dec 29, 2025 at 5:41 AM EST
Hacker News ID: 46357686 (type: story; last synced 12/29/2025, 5:50:29 AM)
It makes me wonder how DeepSeek avoids commenting politically on China. I have heard anecdotes that it will be writing out a long reply, then presumably generates some forbidden phrase, abandons the output, and replaces it all with an error message. So presumably the safeguard could be a separate, trivial, non-LLM-based post-filter, which would make it immune to the Doublespeak attack?
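
To make the idea concrete, here is a minimal sketch of that kind of non-LLM post-filter sitting on a streamed reply. Everything in it is an assumption for illustration: the blocklist, the error text, and the names `stream_with_post_filter` and `render` are hypothetical, and nothing here is known to be how DeepSeek actually does it.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical blocklist and error text, purely illustrative.
BLOCKED_PHRASES = ("forbidden phrase",)
ERROR_MESSAGE = "Sorry, I can't help with that."

Event = Tuple[str, str]  # ("append", chunk) or ("replace", full_text)


def stream_with_post_filter(
    chunks: Iterable[str],
    blocked: Tuple[str, ...] = BLOCKED_PHRASES,
    error: str = ERROR_MESSAGE,
) -> Iterator[Event]:
    """Pass chunks through until the accumulated output contains a blocked
    phrase, then emit a single "replace" event and stop streaming."""
    seen = ""
    for chunk in chunks:
        seen += chunk
        # Check the whole output so far, so a phrase split across two
        # chunks is still caught.
        lowered = seen.lower()
        if any(phrase in lowered for phrase in blocked):
            yield ("replace", error)
            return
        yield ("append", chunk)


def render(events: Iterable[Event]) -> str:
    """Simulate the client: append chunks, or swap the whole reply out."""
    shown = ""
    for kind, text in events:
        shown = text if kind == "replace" else shown + text
    return shown


if __name__ == "__main__":
    reply = ["A long answer that ", "mentions a forbid", "den phrase", " near the end."]
    print(render(stream_with_post_filter(reply)))
```

This reproduces the anecdote: the reader sees the reply grow, then the whole thing is replaced by the error. Note that a filter like this only looks at the literal output text, so it fires no matter how the prompt was worded, but it also only catches phrases that actually appear in the output.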