How the Brain Parses Language

Posted 26 days ago, active 19 days ago

mylifeandtimes

106 points

57 comments

quantamagazine.org | Research | story | High profile

calm, positive

Debate

20/100

Key topics

Brain Function

Language Processing

Behavioral Science

The debate around cognitive scientist Evelina Fedorenko's research on how the brain parses language has sparked a lively discussion, with some commenters questioning the article's focus on her personal life rather than her research. While some, like tgv, argue that the human language network is fundamentally different from Large Language Models (LLMs), others, like dr_dshiv and Terretta, suggest that there may be similarities between the two, with Terretta even proposing that adding symbols for rhythm and tone to LLM input could make them more similar to human language processing. The conversation has highlighted the complexities of language processing and the ongoing quest to understand the intricacies of the human brain.

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment

22m

Peak period

72-84h

Avg / period

9.8

Comment distribution: 88 data points

Based on 88 loaded comments

Key moments

01 Story posted: Dec 8, 2025 at 7:46 AM EST (26 days ago)
02 First comment: Dec 8, 2025 at 8:09 AM EST (22m after posting)
03 Peak activity: 50 comments in 72-84h (hottest window of the conversation)
04 Latest activity: Dec 15, 2025 at 5:57 AM EST (19 days ago)

Discussion (57 comments)

Showing 88 comments

moralIsYouLie

26 days ago

1 reply

reads like a collection of HN comments by commenters who like to build "chapter 1" textbook agents using instant-noodle "training tools". "and what would be the time complexity?"

I can't do this anymore.

Al-Khwarizmi

23 days ago

1 reply

Ev Fedorenko is a highly recognized cognitive scientist who has been studying how humans parse language for years.

Of course this doesn't mean one shouldn't question what she says (that would be an obvious authority fallacy), but I do think it's fair to say that if you want to question it, the argument should be more elaborate than "this sounds like she has no idea of the topic".

Timwi

23 days ago

2 replies

I'm not the person you responded to, but I found the article unreadable because it kept going on about Ev’s life instead of her research. I'm sure her research is valuable and insightful, but with this style of reporting it is both inaccessible to me, and it gives me the (probably flawed) impression that her research isn't the part of her life that's supposed to be important or impressive.

jimbokun

23 days ago

This is meant for a lay audience so you should probably just read her research papers.

Also:

> it gives me the (probably flawed) impression that her research isn't the part of her life that's supposed to be important or impressive.

I don't see this at all in the article. There's just some human interest content to make her research more approachable.

mcswell

22 days ago

FWIW, that's sort of the way a lot of physics books (not textbooks) approach the subject: Einstein/ Heisenberg/ Bohr/ Pauli/ Feynman/ Oppenheimer was this kind of person, oh, and by the way he came up with this theory of X. Apparently a lot of people like that way of presenting science, but it's not for everyone.

dr_dshiv

23 days ago

2 replies

> It almost sounds like you’re saying there’s essentially an LLM inside everyone’s brain. Is that what you’re saying?

>Pretty much. I think the language network is very similar in many ways to early LLMs, which learn the regularities of language and how words relate to each other. It’s not so hard to imagine, right?

Yet, completely glosses over the role of rhythm in parsing language. LLMs aren’t rhythmic at all, are they? Maybe each token production is a cycle, though… hmm…

GolDDranks

23 days ago

2 replies

I think it's obvious that she means that it's something _like_ LLMs in some aspects. You are correct that rhythm and intonation are very important in parsing language. It's clear that the human language network is not like an LLM in that sense. However, it _is_ a bit like an _early_ LLM (remember GPT2?) in the sense that it can produce and parse language without necessarily making much deeper sense of it.

tgv

23 days ago

1 reply

However ... language production and perception are quite separated in our heads. There's basically no parallel to LLMs. Note that the article doesn't give any, and is extremely vague about the biological underpinnings of language.

GolDDranks

23 days ago

1 reply

> language production and perception are quite separated in our heads

Do you have any evidence for this?

I am a former linguistics student (got my master's), and, after years of absence from academia, I'm interested in the current state of affairs. So: "quite separated in our heads". Evidence for? Against?

tgv

23 days ago

1 reply

Aphasia, and general measures of "normal" performance.

There are various kinds of aphasia, often linked to specific brain areas (Wernicke's and Broca's are well-known). And M/EEG and fMRI research suggests similar distinctions. It is difficult to reconcile with the idea that there is only one language system.

And you will also have noticed that your skills in perception and production differ. You can read/listen better than write/speak. Timing, ambiguity and errors in perception and production differ.

And more logically: the tasks are very different. In perception, you have to perceive the structure and meaning from a highly ambiguous, but ordered input, while during production, meaning is given (in non-linear order), and you have to find a way to fit it in a linear, grammatical order.

GolDDranks

19 days ago

Ah, totally agreed. At least there is a clear auditory / motor part in the tasks that seems quite separate.

However, I also find it unlikely that the networks are totally separate, and I wonder if there is any evidence of areas that encode the "core/abstract" linguistic de/serialization (multidimensional and messy internal semantic information ←→ linear morphophonological information) both ways, or at least a mechanism that manages to use gained input network competence to "train" or "manage" output network competence.

Why? Because even though, as you say, there is a differing performance in perception and production, there is also plenty of evidence of gaining linguistic competence from input, and then managing to convert that to performance in output.

Terretta

22 days ago

> It's clear that the human language network is not like LLM in that sense.

Is it though? If rhythm or tone changes meaning, then just add symbols for rhythm and tone to LLM input and train it. You'll get not just words out that differ based on those additional symbols wrapping words, but you'll also get the rhythm and tone symbols in the output.
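
To make the proposal above concrete, here is a minimal sketch of what "symbols for rhythm and tone wrapping words" could look like as a token stream. Everything in it is an assumption for illustration: the `<tone:...>` and `<pause>` symbol formats, the `AnnotatedWord` tuple, and the `to_tokens` helper are hypothetical and do not come from any real tokenizer or from the article.

```python
from typing import List, Tuple

# Hypothetical annotation: (word, tone label or "", pause after the word?)
AnnotatedWord = Tuple[str, str, bool]


def to_tokens(words: List[AnnotatedWord]) -> List[str]:
    """Flatten annotated words into one token stream with prosody symbols."""
    tokens: List[str] = []
    for text, tone, pause_after in words:
        if tone:
            tokens.append(f"<tone:{tone}>")  # assumed symbol format
        tokens.append(text)
        if pause_after:
            tokens.append("<pause>")  # assumed rhythm marker
    return tokens


# Same words, different intonation on the last word.
statement = to_tokens([("you", "", False), ("did", "", False), ("that", "falling", True)])
question = to_tokens([("you", "", False), ("did", "", False), ("that", "rising", True)])
print(statement)
print(question)
```

The two sequences share every word but differ in one prosody symbol, so a model trained on such streams would at least see the distinction. Whether that makes it resemble human processing is a separate question the sketch does not address.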

coldtea

21 days ago

>Yet, completely glosses over the role of rhythm in parsing language.

If you're talking about speech cadence/rhythm, then we also parse written language which doesn't have that. And we're quite capable of parsing a monotone robotic voice speaking with a monotonous mechanical rhythm too.

tcsenpai

23 days ago

2 replies

> But what if our neurobiological reality includes a system that behaves something like an LLM?

It almost seems like we got inspiration from our brain to build neural networks!

seanmcdirmid

23 days ago

2 replies

It isn’t clear though. Neural networks were inspired by the brain, but transformers? It is totally plausible but do we really think just in words?

coldtea

21 days ago

>Neural networks were inspired by the brain, but transformers? It is totally plausible but do we really think just in words?

LLMs might be trained via words, but as a backend transformers are not just for words.

They're for high dimensional structured sequences. And, we too, might not think in words, but I bet we do think using multi-dimensional sequences/vectors.

To make an analogy, transformers are not working on:

  Vector<Word>

but

  Vector<ContextualizedEmbedding>

where words just happen to be a handy training set we use.
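
The distinction in the analogy above can be shown with a toy sketch. It is not how a transformer works internally: the two-dimensional vectors are made up, and the `contextualize` function simply blends each word's vector with the sentence mean as a crude stand-in for learned attention. The only point is that the same word ends up with a different vector depending on its neighbours.

```python
from typing import Dict, List

Vector = List[float]

# Toy static lookup table; the values are invented for illustration.
STATIC: Dict[str, Vector] = {
    "river": [1.0, 0.0],
    "money": [0.0, 1.0],
    "bank": [0.5, 0.5],
}


def contextualize(words: List[str]) -> List[Vector]:
    """Blend each word's static vector with the mean of the whole sentence.

    Real transformers learn the mixing weights; a fixed 50/50 blend is
    used here only to show that context changes the representation.
    """
    vecs = [STATIC[w] for w in words]
    dim = len(vecs[0])
    mean = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return [[0.5 * v[i] + 0.5 * mean[i] for i in range(dim)] for v in vecs]


river_bank = contextualize(["river", "bank"])[1]
money_bank = contextualize(["money", "bank"])[1]
print(river_bank)  # [0.625, 0.375]
print(money_bank)  # [0.375, 0.625]
```

In the static table "bank" has one fixed vector, while after contextualizing it leans toward "river" in one sentence and toward "money" in the other.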

SAI_Peregrinus

22 days ago

> It is totally plausible but do we really think just in words?

I find that proposition totally implausible. Some people certainly report only thinking in words & having a continuous inner monologue, but I'm not one of them. I think, then I describe my thoughts in words if I'm speaking or writing or thinking about speaking or writing.

coldtea

21 days ago

We've been making the same metaphor ("that's how the brain works") with each new major technology we come up with...

alfanick

23 days ago

2 replies

Anecdotal data, based on a sample of 1 (aka me). I'm originally Polish, but I would say my mother tongue is English. I also learned Latin as a kid/teen. Then learning any other languages is much easier, I also learned German and some Swiss German dialects. I can also do Spanish, Italian, French, Dutch, Czech, some Serbo-Croatian. I think being Polish makes learning languages easy - as we have a lot of creations in Polish that do not translate easily to other languages. I think in my case it's the same part of brain that processes both human language and computer language.

I also learned to think in hmm "concepts", and then apply a language of my choice to express them. It's a fun skill to have :) Obviously works of Chomsky are great, especially exploring if language evolves mind or is it the other way around, does mind evolve language? [let's skip his rather controversial political views lately].

Tor3

23 days ago

1 reply

I speak several languages too, though definitely not as many as you do. I'm also in the process of learning a completely new one, at an advanced age relative to when I last learned a new one (I was in my thirties then). To me, my brain most definitely doesn't process human language the way it handles computer language. It's about as different as it can get. The latter is "learning", the former is "burn patterns into the brain", and learning a language can take years, at least at this age. Computer languages? Those can be picked up in as little as a weekend, and getting proficient isn't a multi-year or decade long process. It feels totally different for me (I've been learning new computer languages at the same time as I've been trying to get up to speed with a new human language).

vkazanov

23 days ago

1 reply

Computer languages are much simpler than human languages, and they also operate in similar kinds of logical ways. I definitely remember how hard it was to go from Pascal to C to C++ to Python to Prolog to Haskell to SQL... until at some point nothing was new.

Tor3

23 days ago

To me, working with a computer language involves specific thinking, constructing stuff in my mind. But human language is nothing of the sort, though it's possible to kind of do the same if I sit down and try to polish a written sentence. But talking in, and understanding a conversation is as far from this as I can imagine. And the learning process is so extremely different.

mzs

23 days ago

1 reply

I completely understand! I'm also Polish American. I have to say it helps when mother's side of family is Gdańsk+west and father's Lublin+east. My wife's family is all from Warsaw area and I had to translate for my father-in-law during a holiday to Władysławowo-Hel (probably helps my aunt's father's side is Kashubian too, mmm... dessert first).

I was blown-away on holiday to Croatia. It was so unexpectedly relatively easily understandable after Czechia, Austria, and Slovenia. I was all, "What just happened!? Shouldn't this be something more like Italian?"

It took only a month for me to be able to communicate in Ukrainian with my ESL students, you're totally right about Cyrillic. And I too think in concepts but switch my brain to express them externally via language, whatever that language may be at the moment. I am terrible at translating OTOH, so unnatural!

But it has its limits, I got to a point after German and Norwegian that I thought I harbored a super-power. Then I went to school in Hungary ;) I also had an ESL student from Lithuania, yep incomprehensible.

alfanick

22 days ago

Kudos to you too. Finnish, Hungarian and Estonian I wouldn’t be able to comprehend ;) these are different beasts

liampulles

23 days ago

6 replies

What I'm curious about is what the language parts of the human brain look like for babies and toddlers. Humans obviously have a bunch of languages they can speak, and toddlers pick up the language that their guardians speak around their home, so there seems to be machinery there that is for the task of "online" learning.

lapcat

23 days ago

1 reply

I think this quote may speak to the question:

> The brain’s general object-recognition machinery is at the same level of abstractness as the language network. It’s not so different from some higher-level visual areas such as the inferotemporal cortex storing bits of object shapes, or the fusiform face area storing a basic face template.

In other words, it sounds like the brain may start with the same basic methods of pattern matching for many different contexts, but then different areas of the brain specialize in looking for patterns in specific contexts such as vision or language.

This seems to align with the research of Jenny Saffran, for example, who has studied how babies recognize language, arguing that this is largely statistical pattern matching.
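
As a rough illustration of what "statistical pattern matching" can mean here, the sketch below computes transitional probabilities between adjacent syllables, a statistic commonly discussed in this line of infant research. The syllable stream and its two nonsense "words" are assumptions chosen for the example, and the `transitional_probabilities` helper is hypothetical, not taken from any published code.

```python
from collections import Counter
from typing import Dict, List, Tuple


def transitional_probabilities(syllables: List[str]) -> Dict[Tuple[str, str], float]:
    """Estimate P(next syllable | current syllable) from adjacent pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Only syllables that have a successor can start a pair.
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


# Unsegmented stream built from two nonsense words, "bi da ku" and "pa do ti".
stream = "bi da ku pa do ti bi da ku bi da ku pa do ti pa do ti bi da ku".split()
tp = transitional_probabilities(stream)

print(tp[("bi", "da")])  # within a word: 1.0
print(tp[("ku", "pa")])  # across a word boundary: lower than 1.0
```

Within a "word" the next syllable is fully predictable, while across a boundary it is not, so dips in transitional probability are a plausible cue for where words begin and end.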

mullsork

23 days ago

In the series Babies by Netflix some of her research on this topic is covered. Season 1 Episode 4 "First Words."

Anon84

23 days ago

3 replies

Me too! Babies' and toddlers' brains are like sponges. We started teaching my baby 3 languages since birth (essentially I have always spoken with her in my native language, my wife in hers, and she gets English from living in the US). She’s not even 4 yet and fully fluent in all three and seamlessly jumps back and forth between them. (To my surprise, she doesn’t mix words from the different languages in the same sentence)

mcswell

22 days ago

2 replies

There's a lot more to language learning than being a "sponge". Virtually all the grammar we learn is productive/ creative--that is, we apply it to new words, and say things we never heard anyone say before. And the grammar is implicit in what we hear, so children need to extract it in a form that can be generalized to new thoughts and words.

coldtea

21 days ago

>Virtually all the grammar we learn is productive/ creative--that is, we apply it to new words, and say things we never heard anyone say before.

That's downstream of the sponge phase. So much so, that initially we only absorb and don't talk yet.

mbg721

22 days ago

This is why learning Latin the way I did (very methodically and technically, with no real speaking/responding) makes you good at parsing it, but not at speaking it. There are schools today where it's taught as if it were a spoken language.

phkahler

23 days ago

>> To my surprise, she doesn’t mix words from the different languages in the same sentence

I knew two brothers that would mix words from different languages while speaking to each other because they shared the same set of languages and presumably used the best words to express their thoughts.

Your daughter probably knows other people generally speak and understand one language at a time and just conforms because it's most effective.

I'm not sure if or at what age it might be good to start mixing languages with others who can.

fellowniusmonk

23 days ago

If you look at the rate of "new" word use after the first spoken word it's very clear that word acquisition and categorizing occurs for a long period before that first word is ever spoken.

Speaking to babies is incredibly important for linguistics but probably for all types of complex brain function, I don't think there is an upper bound on how many words we should expose children to.

griffzhowl

23 days ago

1 reply

One part of the story I found fascinating is the overlap in infants' brains of the areas involved in tool use and hierarchical syntax. These diverge and specialize in adults. The homologous brain region in primates is involved in motor planning.

It's an interesting hint at the deeper evolutionary origins of language in the ability to plan complex actions, providing a neural basis for the observation that language and action planning have this common structure of an overall goal that can be decomposed into a structure of subgoals, which we see formalized in computer programs too.

This is an older reference (1991) where I first heard about it. There are more recent studies reinforcing various aspects of it, but I didn't find one that was as comprehensive:

https://doi.org/10.1017/S0140525X00071235

mcswell

22 days ago

1 reply

"overlap in infants' brains of the areas involved in tool use and hierarchical syntax"---you didn't see that in the Quanta article, right? I went back and looked, but can't find it mentioned anywhere.

griffzhowl

21 days ago

Not from the quanta article. By "story" I just meant the general story of the neural basis for language. What I mentioned in the comment is from the article I linked there

trebligdivad

23 days ago

I'd like one stage further - what are the genetics of this area? How does a dedicated brain area like this get encoded - (Hopefully the Allen Institute might dig on this one?); but if we can find how the areas are encoded in the DNA we could presumably see how they evolved, but then perhaps also spot other areas?

lukeinator42

23 days ago

It's an interesting area of research, there is even some evidence that language experienced in utero affects speech perception: https://doi.org/10.1111/apa.12098.

fuzzfactor

22 days ago

At that age, I always figured "Why parse?"

lapcat

23 days ago

1 reply

I wouldn't read too much into the LLM analogy. The interview is disappointingly short, filled with a bunch of unnecessarily tall photographs, and the interviewer, the one who brought up LLMs and ChatGPT and has a history of writing AI articles (https://www.quantamagazine.org/authors/john-pavlus/), almost seemed to have an agenda to contextualize the research in this way. In general, except in a hostile context, interviewees tend to be agreeable and cooperative with interviewers, which means that interviews can be steered in a predetermined way, probably for clickbait here.

In any case, there's a key disanalogy:

> Unlike a large language model, the human language network doesn’t string words into plausible-sounding patterns with nobody home; instead, it acts as a translator between external perceptions (such as speech, writing and sign language) and representations of meaning encoded in other parts of the brain (including episodic memory and social cognition, which LLMs don’t possess).

adamzwasserman

23 days ago

1 reply

The disanalogy you quote might actually be the key insight. What if language operates at two levels, like Kahneman's System 1/2?

Level 1: Nearly autonomic — pattern-matched language that acts directly on the nervous system. Evidence: how insults land before you "process" them, how fluent speakers produce speech faster than conscious deliberation allows, and the entire body of work on hypnotic suggestion, which relies on language bypassing conscious evaluation entirely.

Level 2: The conscious formulation you describe — the translator between perception and meaning.

LLMs might be decent models of Level 1 but have nothing corresponding to Level 2. Fedorenko's "glorified parser" could be the Level 1 system.

lapcat

23 days ago

1 reply

> LLMs might be decent models of Level 1

I don't think so. Fast speakers and hypnotized people are still clearly conscious and "at home" inside, vastly more "human" than any LLM. Deliberation and evaluation imply thinking before you speak but do not imply that you can't otherwise think while you speak.

adamzwasserman

23 days ago

2 replies

The body of knowledge on Ericksonian hypnotherapy is pretty clear that the effect of language on Level 1 is orthogonal to, and sometimes even opposed to, conscious processes.

I became interested after being medically hypnotized for kidney stone pain. As the hypnotist spoke, I was consciously thinking: "this is dumb, it will never work." And yet it did.

That's exactly your point — I was fully conscious and "at home" the whole time, yet something was processing and acting on the language independently. The question is whether that something shares any computational properties with LLMs, not whether the whole system does.

throaway123213

23 days ago

1 reply

"It's exactly your point — I was fully conscious and "at home" the whole time, yet something was processing and acting on the language independently."

It's unclear what you're referring to here. You were conscious, & you wanted to think the thought "this is dumb, it will never work." & you thought that. What was the independent process?

adamzwasserman

23 days ago

The hypnotism worked. There was an unconscious process at work whose relationship to the words was completely different from my conscious reaction.

lapcat

23 days ago

1 reply

I think you're creating a false dichotomy between meta-thinking and mere reflex, when in fact most conscious thinking is neither of those.

My understanding is that a hypnotized person is very focused on the hypnotist and suggestible but can otherwise carry on a relatively normal conversation with the hypnotist. And certainly an unhypnotized chattering person is still conscious, aware of the context as well as the subject of their speech. You may find the speech dull and tedious, may even call it "mindless" as insult, yet it's honestly impossible to dispute that there's an active human mind at work.

adamzwasserman

23 days ago

1 reply

I don't think we're far apart. My claim isn't that Level 1 is "mere reflex". It's that language can produce effects at a level that operates independently of (and sometimes in opposition to) conscious evaluation. The hypnosis example is just a clean demonstration of that separation.

Whether LLMs are useful models for studying that level is an empirical question. They're not conscious, but they do learn statistical regularities in language structure, which may be exactly what Level 1 is optimized for.

lapcat

23 days ago

> language can produce effects at a level that operates independently of (and sometimes in opposition to) conscious evaluation

I don't think this is a particularly interesting claim if "conscious evaluation" is understood so strictly that it excludes an ordinary motormouth.

qqxufo1

23 days ago

1 reply

If the brain's language network is only for "packaging words" and not for actual logic or reasoning, why does writing or speaking our messy thoughts out loud suddenly make them feel more logical? Is language actually helping us think, or is it just a filter that forces our chaotic ideas into a structure we can finally understand?

mcswell

22 days ago

That's a really good question. I don't have an answer, or even the beginning of an answer, but I would hazard a guess that there's a feedback loop. So listening to yourself talk (or even better, putting your thoughts down in print) is sort of like listening to someone else talk, which puts new ideas into your mind, or causes you to organize the ones you already have.

Doing mathematical proofs might be an extreme example of that--a mathematician has (I am told) an intuition--a thought--but has to work it out rigorously. Once they've done that, the intuition becomes much clearer. I guess.

netfortius

23 days ago

1 reply

Every time I read something like this reminds me of Maturana (of autopoiesis fame), who was among the first scientists from where I started gaining an interest in these areas. Relevant to his view, in the area of language, is the following:

"We human beings are living systems that exist in language. This means that although we exist as human beings in language and although our cognitive domains (domains of adequate actions) as such take place in the domain of languaging, our languaging takes place through our operation as living systems. Accordingly, in what follows I shall consider what takes place in language[,] as language arises as a biological phenomenon from the operation of living systems in recurrent interactions with conservation of organization and adaptation through their co-ontogenic structural drift, and thus show language as a consequence of the same mechanism that explains the phenomena of cognition:"

fellowniusmonk

23 days ago

I mean there are many questions about the fundamentals of reality.

But state change is undisputed and state change is the bedrock of syntax and syntax is the bedrock of semantics.

So... the universe is filled with the diffuse "meaning simple" of state change, not some list of particles but an even more extreme version of mereological nihilism.

And so we see meaning not as a thing that exists in separate spheres, not particles or information as primitive but meaning itself.

And meaning keeps bootstrapping its own increases in complexity from RNG star emissions to RNA/DNA to human brain structures, and syntactic and semantic complexity keeps getting more complex and dense.

And it's all tied to entropy because information is entropic.

No panpsychism needed, we're just the meaning creating and densifying organ of the universe on accident, and humans emit complex meaning like stars emit photons and if stars generate a zone of potential energy humans generate a zone of causal leverage.

adamzwasserman

23 days ago

7 replies

There's an interesting falsifiable prediction lurking here. If the language network is essentially a parser/decoder that exploits statistical regularities in language structure, then languages with richer morphological marking (more redundant grammatical signals) should be "easier" to parse — the structure is more explicitly marked in the signal itself.

French has obligatory subject-verb agreement, gender marking on articles/adjectives, and rich verbal morphology. English has largely shed these. If you trained identical neural networks on French vs English corpora, holding everything else constant, you might expect French models to hit certain capability thresholds earlier — not because of anything about the network, but because the language itself carries more redundant structural information per token.

This would support Fedorenko's view that the language network is revealing structure already present in language, rather than constructing it. The "LLM in your head" isn't doing the thinking — it's a lookup/decode system optimized for whatever linguistic code you learned.

(Disclosure: I'm running this exact experiment. Preregistration: https://osf.io/sj48b)
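
A toy way to see what "more redundant grammatical signals" means: count how many tokens change when a sentence goes from singular to plural. This is only an illustration on two hand-picked sentence pairs, not the preregistered experiment, and the `changed_positions` helper is hypothetical. It also looks at written tokens only; several of the French plural markers here are silent in speech.

```python
from typing import List


def changed_positions(singular: List[str], plural: List[str]) -> int:
    """Count aligned tokens that differ between a singular and a plural sentence."""
    if len(singular) != len(plural):
        raise ValueError("sentences must be token-aligned")
    return sum(s != p for s, p in zip(singular, plural))


english = changed_positions(
    "the small cat sleeps".split(),
    "the small cats sleep".split(),
)
french = changed_positions(
    "le petit chat dort".split(),
    "les petits chats dorment".split(),
)

print(english)  # 2: noun and verb change
print(french)   # 4: article, adjective, noun and verb all change
```

In this pair, written French signals number on every token while English signals it on two, which is the kind of redundancy the hypothesis above relies on. A real test would need corpus-level measures rather than single sentences.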

fellowniusmonk

23 days ago

2 replies

Dyslexia seems to be more of an issue in English than other languages right?

But also, maybe the difficulty of parsing recruits other/executive function and is beneficial in other ways? The per phoneme efficiency of English is supposed to be quite high as an emergent trade language.

adamzwasserman

23 days ago

1 reply

The dyslexia point is interesting; yes, English orthography causes more reading disorders than languages with more regular spelling-to-sound mappings (Italian, Finnish, etc.). That's consistent with the parser having to work harder when the signal is noisier.

Your intuition about "slower more intentional parsing" connects to something I'm exploring: we may parse language at two levels simultaneously; a fast, nearly autonomic level (think: how insults land before you consciously process them) and a slower deliberate level. Whether those levels interact differently across languages is an open question.

tgv

22 days ago

First: dyslexia has little to do with parsing, which is generally understood to relate to structure/relations between words.

Second: multiple levels of language processing have been identified, although it's not at all clear how well separated they are. The higher levels (semantics, pragmatics) are by necessity lagging behind the lower (phonetics, syntax). The higher levels also seem more "deliberate."

coldtea

21 days ago

>Dyslexia seems to be more of an issue in English than other languages right?

I don't think so. It's medicalization or pathologization of dyslexia that's probably more of a thing in English. Same way many issues get medicalized and whole cottage industries and jobs grow around them.

Grosvenor

23 days ago

1 reply

And we have those French/English text corpora in the form of Canadian law. All laws in Canada at the federal level are written in English and French.

This was used to build the first modern language translation systems, testing them going from English->French->English.

You could do something similar here, understanding that your language is quite stilted legalese.

Edit: there might be other countries with similar rules in place that you could source test data from as well.

adamzwasserman

23 days ago

1 reply

Incredibly, I had not thought to use that data set.

Now I will. Thanks.

seszett

22 days ago

Belgian federal law is also written in Dutch, French and German, by the way.

But no English so you might not be interested.

tgv

23 days ago

1 reply

There are more differences between English and French than you just described, and they can affect your measurement. Even the corpora you use cannot be the same. There isn't "ceteris paribus" (holding everything else constant). The outcome of the experiment doesn't say anything about the hypothesis.

You're also going to use an artificial neural network to make claims about the human brain? That distance is too large to bridge with a few assumptions.

BTW, nobody believes our language faculties are doing the thinking. There are however, obviously, connections to thought: not only the concepts/meaning, but possibly sharing neural structures, such as the feedback mechanism that allows us to monitor ourselves.

adamzwasserman

22 days ago

1 reply

The confound concern is fair: no cross-linguistic comparison is perfectly controlled. The bet is that the effect size (if any) will be large enough to be informative despite the noise. But you're right that it's not ceteris paribus in a strict sense.

Your proposal is interesting though. Synthetic manipulation of morphology within a single language. Have you seen this done? The challenge I'd anticipate is that "genderized English" wouldn't have natural text to train on, so you'd need to generate it somehow, which introduces its own artifacts. But comparing French vs artificially gender-neutralized French might be feasible with existing parallel corpora. Worth thinking about as a follow-up.

On the neural network → brain distance: agreed it's a leap. The claim isn't that transformers are brains, but that if both are extracting structure from language, they might reveal something about what structure is there to extract. Fedorenko's own comparison to "early LLMs" suggests she thinks the analogy has some merit.

tgv

22 days ago

1 reply

> The bet is that the effect size (if any) will be large enough to be informative despite the noise.

But you have no grounds to ascribe it to the posited difference.

> Have you seen this done?

Not in LLMs, but there have been experiments with regularizing languages, and getting people to learn them in Second Language Acquisition (L2) studies. But what I've seen is inconclusive and sometimes outright contradictory.

> Fedorenko's own comparison to "early LLMs" suggests she thinks the analogy has some merit.

I don't think she can seriously entertain that thought. We simply no practically nothing about language processes in the brain. What we know about the hardware is very different from LLMs, early or not.

Just to give an indication of what we don't know: the Stroop effect (https://en.wikipedia.org/wiki/Stroop_effect) is almost 100 years old. We have no idea what causes it. There's no working model of word recognition. There are only vague suggestions about the origin of the delay. We have no clue how the visual signals for the color and the letters are separated, where they join again, and how that's related to linguistic knowledge. And that's after almost 100 years of a great deal of research. If you go to Google Scholar and type "Stroop task", you'll get 197.000 (!) hits. That's nearly 200k articles etc. resulting in no knowledge whatsoever about a very simple, artificial task.

adamzwasserman

21 days ago

On effect size: my primary goal at this stage is falsification. If French and English models show no meaningful differences at matched compute, that's informative: it would support the scaling hypothesis. If they do differ, I'll need to be careful about causal claims, but it would at least challenge the "transformers are magic" framing that treats architecture as the main story.
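
For the effect-size comparison above, one common summary is Cohen's d (the difference in means divided by the pooled standard deviation). The sketch below is only an illustration: the function name `cohens_d` and the idea of feeding it per-run scores from each model are assumptions, not something described in this thread.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample variance.

    `a` and `b` are hypothetical per-run scores (for example, one evaluation
    score per training seed) for the two models being compared.
    """
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two observations")
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(a) - mean(b)) / sqrt(pooled_var)
```

A d near zero would fit the "no meaningful difference" outcome. A large d says only that the two groups differ, not why, which is the confound concern raised earlier in the thread.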

The L2 regularization and information theory pointers are helpful; they will go on my reading list. If you have favorites, I'll start there.

On the "we know nothing" point: I'm sympathetic. The Stroop example is exactly why I'm skeptical of strong claims in either direction. 197k papers and no mechanism suggests language processing has properties we don't yet have frameworks to describe. That's not mysticism. It's just acknowledging the gap between phenomenon and explanation.

retrac

23 days ago

2 replies

That presumes that languages with little morphology do not have equivalent structures at work elsewhere doing the same kind of heavy lifting.

One classic finding in linguistics is that languages with lots of morphology tend to have freer word order. Latin has lots of morphology and you can move the verb or subject anywhere in the sentence and it's still grammatical. In a language like English, syntax, word order, and word choice take on the same role that morphology does in those kinds of languages.

Inflected languages may indeed have more information encoded in each token. But the relative position of the tokens to each other also encodes information. And inflected languages may use this less.

Languages with richer morphology may also have smaller core vocabularies. To be fair, this is a contested conjecture too. But the theory is that languages like Ojibwe or Sanskrit with rich derivational morphologies and grammatical inflections simply don't need things like a dozen words for different types of snow, or to describe thinking. A single root with an almost infinite number of inflected forms can carry all the shades of meaning, where different roots might be selected amongst in a less inflected language.

Also in principle a predictive and falsifiable hypothesis but incredibly hard to pin down in practice due to definitional issues. (Is something like "put up with" an inflected form of "put" or three words?)

pessimizer

22 days ago

1 reply

You saved me from posting this. Strict word order makes a lot of things easier that have to be done through morphology in the vulgar Latins.

> Languages with richer morphology may also have smaller vocabularies. To be fair, this is a contested conjecture too.

I agree with the criticism of this to an extent. A lot of it has seemed to me like it relies on thinking of English as a sort of normal, baseline language when it is actually very odd. It has so many vowels, and it also isn't open, so it has all of these weird little distinguishing consonant clusters at the end of syllables. And when you compare it to a language conjugated with a bunch of suffixes, those suffixes gradually both make the words very long, and add a bunch of sounds that can't be duplicated very often at the end of roots without causing confusion.

All of that together means that there's a lot more bandwidth for more words. English, even though it has a lot more words than other languages, doesn't have more precise words. Most of them are vague duplications, including duplicating most of Norman French just to have special, fancy versions of words that already existed. The strong emphasis on position in the grammar and the vast number of vowels also allows it to easily borrow words from other languages without a compelling reason.

I think all of that is enough to explain why English is such an outlier on vocabulary size, and I think you see similar in other languages that share a subset of these features.

retrac

22 days ago

> All of that together means that there's a lot more bandwidth for more words.

Consider Chinese. Unlike English in most ways, except that they are both mostly isolating non-inflected languages in their grammar.

The sound system is very inflexible. One could say Chinese is almost hostile to borrowing words. And it has not borrowed extensively from a superstrate culture. Most Chinese words are from ancient Chinese roots.

Standard Chinese has a vocabulary of similar breadth to English. There are sets of morphemes, mostly pairs, that come in the tens of thousands, where the second morpheme subtly (or not so subtly) alters the first in meaning. Think-feel, think-think, think-consider, think-understand, think-see, think-bright.

This is usually analyzed as a type of compound word. But is also similar to prepositional verbs in English. Which might also be a type of compound word.

I'm just an amateur with an interest in linguistics so I'm finding it hard to articulate the point here. But something like: whatever process underlies those word formation patterns, and which gives rise to collocations, is, I think, closely related to traditional morphology of the adjective agreement and noun declension type.

Though if we're looking for reasons to explain vocabulary size, now that I think about it, the sheer size of both cultures (arguably the first and second largest in the world) might explain why there's so many words for so many things.

adamzwasserman

22 days ago

These are good points that sharpen the hypothesis. The word order question is interesting — positional encoding vs morphological encoding might have different computational properties for a parser.

One difference I'm betting on: morphological agreement is redundant (same information marked multiple times), while word order encodes information once. Redundancy aids error correction and may lower pattern extraction thresholds. But I'm genuinely uncertain whether that outweighs the structural information carried by strict word order.
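
The "redundancy aids error correction" point can be made concrete with a toy repetition code. This is a minimal sketch under strong assumptions that are not from the thread: one binary feature (say, a gender mark) is repeated `k` times, each copy is corrupted independently with probability `p_flip`, and the listener decodes by majority vote. The function name is hypothetical.

```python
from math import comb


def p_correct_majority(k, p_flip):
    """Probability that a majority vote over k noisy copies of one binary
    feature recovers the original value.

    Assumes each copy is flipped independently with probability p_flip.
    k must be odd so that ties cannot occur.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd to avoid ties")
    if not 0.0 <= p_flip <= 1.0:
        raise ValueError("p_flip must be a probability")
    # The vote is correct when at most k // 2 of the copies are flipped.
    return sum(
        comb(k, i) * p_flip**i * (1 - p_flip) ** (k - i)
        for i in range(k // 2 + 1)
    )
```

With `p_flip` below 0.5, recovery improves as `k` grows, which is the sense in which repeated marking is more robust than marking once. Real agreement errors are unlikely to be independent, so this is at best an upper bound on the intuition.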

Do you have intuitions on which would be "easier" for a statistical learner? Or pointers to relevant literature? The vocabulary size / morpheme count tradeoff is also something I hadn't fully considered as a confound.

patcon

22 days ago

1 reply

I'm a strong believer in this sort of thing. You might be interested to look into the Leiden Theory of Language[1][2]. It's been my absolute favourite fringe theory of mind since I stumbled across the rough premise in 2018, and went looking for other angles on it.

[1] https://www.kortlandt.nl/publications/art067e.pdf

[2] https://en.wikipedia.org/wiki/Symbiosism

> Language is a mutualist symbiont and enters into a mutually beneficial relationship with its hominid host. Humans propagate language, whilst language furnishes the conceptual universe that guides and shapes the thinking of the hominid host. Language enhances the Darwinian fitness of the human species. Yet individual grammatical and lexical meanings and configurations of memes mediated by language may be either beneficial or deleterious to the biological host.

adamzwasserman

22 days ago

1 reply

Thank you for the Leiden references. I hadn't encountered this framework before. The "language symbiont" framing resonates with what I've been circling around: a system that operates with its own logic, sometimes orthogonal to conscious intention.

The mule analogy is going to stick with me. LLMs have inherited the statistical structure of the symbiont without the host: pattern without grounding. Whether that makes them useful instruments for studying the symbiont itself, or just misleading simulacra, is exactly what I'm trying to work out.

Going to dig into Kortlandt tonight.

patcon

19 days ago

Glad I shared if it serves you!

> LLMs have inherited the statistical structure of the symbiont without the host: pattern without grounding.

I like this. I think it's not too far a leap to suggest "without soul". I think there's real value in the things we've believed ourselves to be made of through deep time, though without evidence of proper provenance. I suspect we've always been grappling to find language for the unnameable things.

Some of my own [somewhat outdated] reflections on language from the time I came across it, in case you're interested :) https://nodescription.net/notes/#2019-07-13

arbot360

21 days ago

1 reply

What do you make of this article? They used an auto-regressive genomic model to perform in-context learning experiments compared to language models. This showed that ICL behavior is not exclusive to language models. https://arxiv.org/html/2511.12797v1

adamzwasserman

21 days ago

This is great, thanks for the link. IMHO it actually supports the broader claim: if ICL emerges in both language models and genomic models, it suggests the phenomenon actually is about structure in the data, not something special about neural networks or transformers per se.

Genomes have statistical regularities (motifs, codon patterns, regulatory grammar). Language has statistical regularities (morphology, syntax, collocations). Both are sequences with latent structure. Similar architectures trained on either will repeat those structures.

That's consistent with my "instrumentation" view: the transformer is revealing structure that exists in the domain, whether that domain is English, French, or DNA. The architecture is the microscope; the structure was already there.

mcswell

22 days ago

Written French does have all that inflectional morphology you talk about, but spoken French has much less--a lot of the inflectional suffixes are just not pronounced on most verbs (with the exception of a few, like être and aller--but at least 'be' in English is inflected in ways that other verbs are not). So there's not that much redundancy.

As for gender marking on adjectives--or nouns--it does almost no semantic work in French, except where you're talking about professional titles (doctor, professor...) that can be performed by men or by women.

If you want a heavily inflected language, you should look at something like Turkish, Finnish, Swahili, Quechua, Nahuatl, Inuit... Even Spanish (spoken or written) has more verbal inflection than spoken French.

griffzhowl

23 days ago

1 reply

One disanalogy between human language use and LLMs is that language evolved to fit the human brain, which was already structured by millions of years of primate social life. This is more or less the reverse situation to a neural network trained on a large text corpus.

HarHarVeryFunny

23 days ago

Yes, but animal/human brains (cortex) appear to have evolved to be prediction machines, originally mostly predicting evolving sensory inputs (how external objects behave), and predicting real-world responses to the animal's actions.

Language seems to be taking advantage of this pre-existing predictive architecture, and would have again learnt by predicting sensory inputs (heard language), which as we have seen is enough to induce ability to generate it too.

rdtsc

23 days ago

2 replies

> But what if our neurobiological reality includes a system that behaves something like an LLM?

With every technological breakthrough we always posit that the brain has to work like the newly discovered thing. At various times brains were hydraulic, mechanical, electrical, like a computer, like a network. Now, of course, the brain has to be like an LLM.

HarHarVeryFunny

23 days ago

3 replies

Yes, but at least now we're comparing artificial to real neural networks, so the way it works at least has a chance of being similar.

I do think that a transformer, a somewhat generic hierarchical/parallel predictive architecture, learning from prediction failure, has to be at least somewhat similar to how we learn language, as opposed to a specialized Chomskyan "language organ".

The main difference is perhaps that the LLM is only predicting based on the preceding sequence, while our brain is driving language generation by a combination of sequence prediction and the thoughts being expressed. You can think of the thoughts being a bias to the language generation process, a bit like language being a bias to a diffusion based image generator.
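
The "thoughts as a bias" idea above can be sketched as an additive bias on next-token logits. Everything here is a toy assumption for illustration: the dictionaries stand in for a sequence-only predictor's scores and for whatever signal "thought" contributes, and none of the names come from a real library.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probs(seq_logits, thought_bias=None):
    """Turn sequence-only logits into next-token probabilities, optionally
    shifted by an additive per-token bias.

    seq_logits: dict mapping token -> logit from a (hypothetical) predictor
                that only sees the preceding sequence.
    thought_bias: dict mapping token -> additive bias; missing tokens get 0.
    """
    bias = thought_bias or {}
    tokens = list(seq_logits)
    logits = [seq_logits[t] + bias.get(t, 0.0) for t in tokens]
    return dict(zip(tokens, softmax(logits)))
```

Calling `next_token_probs(logits)` gives the sequence-only distribution. Passing a positive bias for a token raises its probability at the expense of the others, without changing the underlying predictor.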

What would be cool would be if we could do some "mechanistic interpretability" work on the brain's language generation circuits, and perhaps discover something similar to induction heads.

aeve890

23 days ago

1 reply

>Yes, but at least now we're comparing artificial to real neural networks

Given that the only similarity between the two is just the "network" structure, I'd say that point is pretty weak. The name is just a historical artifact.

HarHarVeryFunny

22 days ago

1 reply

Sure, but ANNs are at least connectionist, learning connections/strengths and representations, etc - close enough at that level of abstraction that I think ANNs can suggest how the brain may be learning certain things.

coldtea

21 days ago

And these exist too:

https://en.wikipedia.org/wiki/Spiking_neural_network

paddleon

22 days ago

1 reply

> comparing artificial to real neural networks

I had a sad day in college when I thought I'd build my own ANN using C++.

First thing I did was create a "Neuron" class, to mimic the idea of a human neuron.

Second thing I did was realize that ANNs are actually just Wiener filters with a sigmoid on top. The base unit is not a "neuron".
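
For readers who haven't seen it written out, the base unit in question is roughly a weighted sum passed through a sigmoid. A minimal sketch follows. The function names are made up for illustration, and the weights are arbitrary rather than fitted, so this shows only the "linear filter plus squashing function" shape and not an optimal (Wiener) filter.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def unit_output(weights, inputs, bias=0.0):
    """Output of one ANN unit: a linear combination of the inputs
    (the 'filter' part) followed by a sigmoid nonlinearity."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)
```

Nothing in this unit spikes, adapts over time, or has internal state, which is the contrast with a biological neuron.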

coldtea

21 days ago

Well, these exist tho:

https://en.wikipedia.org/wiki/Spiking_neural_network

rdtsc

23 days ago

> Yes, but at least now we're comparing artificial to real neural networks, so the way it works at least has a chance of being similar.

Indeed, and I wasn't even saying it's wrong, it may be pretty close.

> What would be cool would be if we could to some "mechanistic interpretability" work on the brain's language generation circuits, and perhaps discover something similar to induction heads.

Yeah, I wouldn't be surprised. And maybe the more we find out about the brain, it could lead to some new insights about how to improve AI. So we'd sort of converge from both sides.

jimbokun

23 days ago

All of those analogies were useful in some ways, and LLMs are too.

There's also a progression in your sequence. There were rudimentary mechanical calculating devices, then electrical devices begat electrical computers, and LLMs are a particular program running on a computer. So in a way the analogies are becoming more refined as we develop systems more and more capable of mimicking human capabilities.

taeric

23 days ago

This feels too reductive to me. In particular, it makes a hard distinction between the thinking and the language. I fully accept that they are distinct, but how distinct? It is hard not to think that some thinking styles influence how something is heard.

Not just in full language, mind, but consider the last time you heard a song in a major key? Do you even know what that means? Because many of us do not.

Same goes for listening to people discuss things like sports. I'm inclined to think many people effectively run a simulation in their mind of a game as they listen to it broadcast. This almost certainly isn't inherent to the language, it is part of the learning of it, though. Think looking over lists of the moves in a chess game. Then go from that to laying out the pieces as they are after that list. Or calling what the next move can be.

Can this be a completely separate set of "circuitry" in our brains that first parses the language and then builds the simulation? I suppose. Seems more likely there is something that is active between the two that can effectively get merged in advanced practitioners.

fallingfrog

23 days ago

I've had the experience of having migraines with aphasia: this is essentially a migraine aura that affects the part of the brain that processes language. I can confirm that while this was happening, I was aware of my surroundings and able to have thoughts, but I was unable to speak and unable to understand spoken or written language. It all just looked and sounded like gibberish. I thought about whether I should go to a hospital, what was going on, wondered whether my loved ones were concerned, and so on, but was unable to communicate any of those thoughts to other people. It was a bizarre experience.

rramadass

22 days ago

Language and Reality: a Wittgensteinian Reading of Bhartrhari - https://wab.uib.no/agora/tools/alws/collection-9-issue-1-art...

New Vistas to study Bhartrhari: Cognitive NLP (Natural Language Processing) - https://arxiv.org/abs/1810.04440

ID: 46191597 | Type: story | Last synced: 12/11/2025, 7:50:29 PM
