Posted Oct 31, 2025 at 1:31 PM EDT · Last activity 26 days ago

End of Transformer Era Approaches

obiefernandez
15 points · 1 comment

Mood: calm
Sentiment: mixed
Category: other
Key topics: AI, Transformers, Large Language Models

The article discusses the release of Brumby 14B, a new AI model that potentially challenges the dominance of Transformers, sparking discussion on the future of AI architectures.

Snapshot generated from the HN discussion

Discussion Activity

Light discussion

First comment: 19h after posting
Peak period: 1 comment in Hour 20
Avg per period: 1 comment

Key moments

  1. Story posted: Oct 31, 2025 at 1:31 PM EDT (26 days ago)
  2. First comment: Nov 1, 2025 at 8:48 AM EDT (19h after posting)
  3. Peak activity: 1 comment in Hour 20, the hottest window of the conversation
  4. Latest activity: Nov 1, 2025 at 8:48 AM EDT (26 days ago)


Discussion (1 comment)

alyxya · 26 days ago
> The training budget for this model was $4,000, trained 60 hours on a cluster of 32 H100s. (For comparison, training an LLM of this scale from scratch typically costs ~$200k.)

What they did is closer to fine-tuning, so this comparison isn’t helpful. The article is transparent about this at least, but listing the cost and performance seems disingenuous when they’re mostly piggybacking off an existing model. Until they train an equivalently sized model from scratch and demonstrate a notable benefit, all this looks like is at best a sidegrade to transformers.
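
For rough context on the figures in that comment, here is a minimal back-of-the-envelope sketch of the quoted budget, assuming a ballpark rental rate in the neighborhood of $2 per H100-hour (the rate itself is an assumption for illustration; neither the article nor the thread states it):

    # Back-of-the-envelope check of the quoted Brumby 14B training budget.
    # The $/GPU-hour rate below is an assumption, not a figure from the article.
    num_gpus = 32           # H100s in the cluster (from the quoted comment)
    hours = 60              # wall-clock training time (from the quoted comment)
    budget_usd = 4_000      # stated training budget

    gpu_hours = num_gpus * hours              # 1,920 GPU-hours
    implied_rate = budget_usd / gpu_hours     # ~$2.08 per GPU-hour

    from_scratch_usd = 200_000                # rough from-scratch cost cited for this scale
    ratio = from_scratch_usd / budget_usd     # ~50x gap under discussion

    print(f"GPU-hours: {gpu_hours}")
    print(f"Implied rate: ${implied_rate:.2f}/GPU-hour")
    print(f"From-scratch vs. quoted budget: {ratio:.0f}x")

At that assumed rate, 32 × 60 = 1,920 GPU-hours lands close to the stated $4,000, so the objection in the comment is about what the money bought (continued training of an existing model rather than training from scratch), not the arithmetic itself.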

View full discussion on Hacker News
ID: 45774523 · Type: story · Last synced: 11/20/2025, 3:32:02 PM

Want the full context?

Jump to the original sources

Read the primary article or dive into the live Hacker News thread when you're ready.

Read Article · View on HN