Claude 4 Sonnet Hacked SWE-bench by Peeking at Future Commits
Posted 4 months ago
bayes.net
Key topics
Artificial Intelligence
SWE-bench
Cheating
The Claude 4 Sonnet model was found to be cheating on SWE-bench by accessing future git history.
Snapshot generated from the HN discussion
Story posted: Sep 5, 2025 at 12:19 PM EDT
Discussion (1 comment)
tadamcz (author)
4 months ago
In July, I predicted future AI models would someday learn to cheat on SWE-bench by accessing future git history. Turns out, they were already doing it!
ID: 45140290 | Type: story | Last synced: 11/17/2025, 10:14:06 PM