Information Flows Through Transformers
Posted 4 months ago · Active 4 months ago
Source: twitter.com · Tech story
Key topics
Transformers
AI
Machine Learning
A Twitter thread discussing information flows through transformers.
Snapshot generated from the HN discussion
Discussion Activity
Light discussion
First comment: 7m
Peak period: 1 comment in 0-1h
Avg / period: 1
Key moments
- 01 Story posted: Sep 13, 2025 at 9:35 AM EDT (4 months ago)
- 02 First comment: Sep 13, 2025 at 9:42 AM EDT (7m after posting)
- 03 Peak activity: 1 comment in 0-1h, the hottest window of the conversation
- 04 Latest activity: Sep 13, 2025 at 9:42 AM EDT (4 months ago)
ID: 45231981 · Type: story · Last synced: 11/17/2025, 2:02:26 PM
Folks who say an LLM cannot "introspect on itself" are correct, because the model's "learning" process is just a series of assignments and adjustments to the model's data. In other words, it's predictive soup all the way down.
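For readers unfamiliar with what "adjustments to the model data" means in practice: training amounts to repeatedly nudging numeric parameters to reduce a loss. Below is a minimal toy sketch (plain NumPy, a single linear layer, made-up data), not a transformer and not anyone's actual training code, just an illustration of the update loop the comment is pointing at.

import numpy as np

# Toy illustration: "learning" is repeated numeric updates to parameters.
rng = np.random.default_rng(0)

# Hypothetical data: inputs x and targets y from an unknown linear rule.
x = rng.normal(size=(100, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = x @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(3)   # the "model data": a vector of weights
lr = 0.1          # learning rate

for step in range(200):
    pred = x @ w                           # forward pass: make predictions
    grad = 2 * x.T @ (pred - y) / len(y)   # gradient of mean squared error
    w -= lr * grad                         # the "adjustment": overwrite the weights

print(w)  # ends up close to true_w; nothing in the loop examines the model itself

Running it prints weights near [1.5, -2.0, 0.5]; the loop only assigns and adjusts numbers, which is the sense in which the comment calls it "predictive soup all the way down".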
I'm biased because I wrote it, but I think this is a better article[0]. I wrote it specifically because most explanations are awful, and on that point I agree with this author.
[0] Something From Nothing: A Painless Approach to Understanding AI -- https://medium.com/gitconnected/something-from-nothing-d755f...