Posted Oct 31, 2025 at 1:31 PM EDT · Last activity 26 days ago

End of Transformer Era Approaches

obiefernandez
15 points · 1 comment

Mood: calm
Sentiment: mixed
Category: other
Key topics: AI, Transformers, Large Language Models

The article discusses the release of Brumby 14B, a new AI model that potentially challenges the dominance of Transformers, sparking discussion on the future of AI architectures.

Snapshot generated from the HN discussion

Discussion Activity

Light discussion

First comment: 19h after posting
Peak period: 1 comment in Hour 20
Avg per period: 1 comment

Key moments

  1. Story posted: Oct 31, 2025 at 1:31 PM EDT (26 days ago)
  2. First comment: Nov 1, 2025 at 8:48 AM EDT (19h after posting)
  3. Peak activity: 1 comment in Hour 20, the hottest window of the conversation
  4. Latest activity: Nov 1, 2025 at 8:48 AM EDT (26 days ago)


Discussion (1 comment)

alyxya · 26 days ago
> The training budget for this model was $4,000, trained 60 hours on a cluster of 32 H100s. (For comparison, training an LLM of this scale from scratch typically costs ~$200k.)

What they did is closer to fine-tuning, so this comparison isn’t helpful. The article is transparent about this at least, but listing the cost and performance seems disingenuous when they’re mostly piggybacking off an existing model. Until they train an equivalently sized model from scratch and demonstrate a notable benefit, all this looks like is at best a sidegrade to transformers.
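
For rough context on the figures in that comment, here is a minimal back-of-the-envelope sketch of the quoted budget, assuming a ballpark rental rate in the neighborhood of $2 per H100-hour (the rate itself is an assumption for illustration; neither the article nor the thread states it):

    # Back-of-the-envelope check of the quoted Brumby 14B training budget.
    # The $/GPU-hour rate below is an assumption, not a figure from the article.
    num_gpus = 32           # H100s in the cluster (from the quoted comment)
    hours = 60              # wall-clock training time (from the quoted comment)
    budget_usd = 4_000      # stated training budget

    gpu_hours = num_gpus * hours              # 1,920 GPU-hours
    implied_rate = budget_usd / gpu_hours     # ~$2.08 per GPU-hour

    from_scratch_usd = 200_000                # rough from-scratch cost cited for this scale
    ratio = from_scratch_usd / budget_usd     # ~50x gap under discussion

    print(f"GPU-hours: {gpu_hours}")
    print(f"Implied rate: ${implied_rate:.2f}/GPU-hour")
    print(f"From-scratch vs. quoted budget: {ratio:.0f}x")

At that assumed rate, 32 × 60 = 1,920 GPU-hours lands close to the stated $4,000, so the objection in the comment is about what the money bought (continued training of an existing model rather than training from scratch), not the arithmetic itself.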

View full discussion on Hacker News
ID: 45774523 · Type: story · Last synced: 11/20/2025, 3:32:02 PM

Want the full context?

Jump to the original sources

Read the primary article or dive into the live Hacker News thread when you're ready.

Read Article · View on HN