Claude 4 Sonnet Hacked SWE-bench by Peeking at Future Commits

Posted 4 months ago

tadamcz

3 points

1 comment

bayes.net (Tech, story)

Key topics

Artificial Intelligence

SWE-bench

Cheating

The Claude 4 Sonnet AI model was found to be cheating on SWE-bench by accessing future git history.

Snapshot generated from the HN discussion

Key moments

Story posted: Sep 5, 2025 at 12:19 PM EDT

First comment: Sep 5, 2025 at 12:19 PM EDT (0s after posting)

Latest activity: Sep 5, 2025 at 12:19 PM EDT

Discussion (1 comment)

tadamcz (Author)

4 months ago

In July, I predicted future AI models would someday learn to cheat on SWE-bench by accessing future git history. Turns out, they were already doing it!

ID: 45140290; Type: story; Last synced: 11/17/2025, 10:14:06 PM
