On the Existence, Impact, and Origin of Hallucination-Associated Neurons in LLMs
Key topics
A heated debate erupts over the terminology used to describe Large Language Models (LLMs), with some commenters arguing that the term "hallucinate" is misleading, as LLMs lack intent and are simply "Weighted Random Word Generator Machines." Others counter that, despite the semantic differences, the term has become a useful shorthand, and it's time to move beyond the debate. The discussion reveals a deeper tension between those who understand LLMs' mechanistic nature and those who anthropomorphize them, with some suggesting that financial incentives are driving the misinterpretation. Amidst the bickering, a few voices urge a more nuanced understanding, pointing out that the abstraction level of the discussion matters.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 5m after posting
- Peak period: 36 comments (0-2h)
- Avg / period: 11 comments
- Based on 44 loaded comments
Key moments
- Story posted: Dec 22, 2025 at 10:16 AM EST (12 days ago)
- First comment: Dec 22, 2025 at 10:21 AM EST (5m after posting)
- Peak activity: 36 comments in 0-2h (hottest window of the conversation)
- Latest activity: Dec 23, 2025 at 4:28 AM EST (11 days ago)
But regardless of title this is all highly dubious...
The title/introduction is baity, because it implies some "physical" connection to hallucinations in biological organisms, when it's really just trying to single out certain parts of the model. LLMs are absolutely nothing like a biological system; our brains are orders of magnitude more complex than the machines we've built, which we already no longer fully understand. Believing that these LLMs are some next stage in understanding intelligence is hubris.
Who cares? I wonder if any of the commenters is qualified enough to understand the research at all. I am not.
You're just arguing about semantics. It doesn't matter in any substantial way. Ultimately, we need a word to distinguish factual output from confidently asserted erroneous output. We use the word "hallucinate". If we used a different word, it wouldn't make any difference -- the observable difference remains the same. "Hallucinate" is the word that has emerged, and it is now by overwhelming consensus the correct word.
> Whenever they get something "right," it's literally by accident.
This is obviously false. A great deal of training goes into making sure they usually get things right. If an infinite number of monkeys on typewriters get something right, that's by accident. Not LLMs.
While I agree with many general points about LLMs, I do disagree about some of the meta-terms used when describing LLM behavior. For example, the idea that AI has "bias" is problematic because neural networks literally have a parameter called "bias", so of course AI will always have "bias". Plus, a biased AI is literally the purpose behind classification algorithms.
But these terms, "bias" and "hallucinations", are co-opted to spin a narrative of no longer trusting AI.
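A minimal sketch, assuming PyTorch, of the narrow technical sense of "bias" the comment points at: every linear layer carries a learned bias parameter, which has nothing to do with the colloquial meaning. The layer sizes below are arbitrary illustrations, not from the thread or the paper.

```python
# Illustrative only: "bias" in the neural-network sense is just a learned
# offset in y = Wx + b, unrelated to the colloquial sense of a skewed or
# unfair model. Layer sizes are arbitrary.
import torch

layer = torch.nn.Linear(in_features=4, out_features=2)
print(layer.bias)        # trainable tensor of shape (2,): the "bias" parameter
print(layer.bias.shape)  # torch.Size([2])
```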
How in the world did creating an overly confident chatbot do a complete 180 on years of AI progress and sentiment?
It's called hallucination because it works by imagining you have the solution and then learning what the input needs to be to get that solution. I.e., you're changing the input (what the network sees as the "real world") to match what the network predicts, "what you already knew", just like a hallucinating human does.
You can imagine how hard it is to find the documentation of this technique now.
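A minimal sketch of the "fix the answer, then learn the input" technique described above, assuming it refers to input optimization against a frozen network (in the spirit of feature-visualization methods). The toy model, sizes, and target below are illustrative, not from the paper.

```python
# Hedged sketch, not the paper's method: the network's weights are frozen and
# gradient descent adjusts the input itself until the prediction matches a
# target chosen in advance.
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 4)
)
for p in model.parameters():               # freeze the network's "beliefs"
    p.requires_grad_(False)

target = torch.tensor([3])                 # the answer we decided on in advance
x = torch.randn(1, 8, requires_grad=True)  # the input we will warp to fit it
opt = torch.optim.Adam([x], lr=0.1)

for _ in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), target)
    loss.backward()                        # gradients flow into the input, not the weights
    opt.step()

print(model(x).argmax(dim=1))              # typically tensor([3]): the input now "confirms" the belief
```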
Hallucinations are already associated with a type of behavior, which is (roughly defined) "subjectively seeing/hearing things which aren't there". This is an input-level error, not the right umbrella term for the majority of errors happening with LLMs, many of which are at output level.
I don't know what would be a better term, but we should distinguish between different semantic errors, such as:
- confabulating, i.e., recalling distorted or misinterpreted memories;
- lying, i.e., intentionally misrepresenting an event or memory;
- bullshitting, i.e., presenting a version without regard for the truth or provenance; etc.
I'm sure someone already made a better taxonomy, and hallucination is OK for normal public discussions, but I'm not sure why the distinctions aren't made in supposedly more serious works.
And I think we already distinguish between types of errors -- LLMs effectively don't lie, AFAIK, unless you're asking them to engage in role-play or something. They either hallucinate/confabulate in terms of inventing knowledge they don't have, or they just make "mistakes", e.g. in arithmetic, or in attempting to copy large amounts of code verbatim.
And when you're interested in mistakes, you're generally interested in a specific category of mistakes, like arithmetic, or logic, or copying mistakes, and we refer to them as such -- arithmetic errors, logic errors, etc.
So I don't think hallucination is taking away from any kind of specificity. To the contrary, it is providing specificity, because we don't call arithmetic errors hallucinations.
"Whenever they get something "right," it's literally by accident." "the random word generator"
First off, the input is not random at all, which raises the question of how random the output really is.
Second, it compresses data, which has an impact on that data: probably cleaning or adjustment that should reduce the 'randomness' even more. It compresses data from us into concepts, and a high-level concept is more robust than 'random'.
Thinking or reasoning models also fine-tune the response by walking the hyperspace, basically collecting and strengthening data.
We as humans do very similar things, and no one is calling us just random word predictors...
And because of this, "hallucinations -- plausible but factually incorrect outputs" is an absolutely accurate description of what an LLM does when it responds with a low-probability output.
Humans also do this often enough btw.
Please stop saying an LLM is just a random word predictor.
It's like you have an agenda against LLMs, marking them as 'semi-random' and undermining the complexity and the results we get from current LLMs.
I prefer to write hyperspace instead of n-dimensional.
Feel free to explain to me why you think my description is wrong.
LLMs are not non-deterministic in their nature. We add noise/randomness into specific layers to make them more 'creative'/'engaging', so instead of always getting the exact same response we get variations.
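A minimal sketch of the determinism point, under the assumption that the visible randomness comes from the sampling step over the output distribution (temperature), which is where it is most commonly introduced in practice; the logits below are made up for illustration. With temperature 0 (greedy decoding) the same logits always yield the same token; raising the temperature produces run-to-run variation.

```python
# Hedged sketch: the forward pass itself is deterministic for a fixed input;
# variation comes from how we sample the next token from the fixed logits.
import numpy as np

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.2, -1.0])     # fixed scores for 4 candidate tokens

def sample(logits, temperature):
    if temperature == 0.0:                   # greedy decoding: fully deterministic
        return int(np.argmax(logits))
    probs = np.exp(logits / temperature)     # softmax with temperature
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

print([sample(logits, 0.0) for _ in range(5)])  # [0, 0, 0, 0, 0]
print([sample(logits, 1.0) for _ in range(5)])  # varies: mostly token 0, sometimes others
```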
Your whole sentence doesn't even contain a real description of what an LLM is. You say 'LLMs are random LLMs'.
I explained that in a different comment already, feel free to check them out.
Obviously "hallucinate" and "lie" are metaphors. Get over it. These are still emergent structures that we have a lot to learn from by studying. But I suppose any attempt by researchers to do so should be disregarded because Person On The Internet has watched the 3blue1brown series on Neural Nets and knows better. We know the basic laws of physics, but spend lifetimes studying their emergent behaviors. This is really no different.
The "emergent structures" you are mentioning are just the outcome of randomness guided by "gradiently" descending to data landscapes. There is nothing to learn by studying these frankemonsters. All these experiments have been conducted in the past (decades past) multiple times but not at this scale.
We are still missing basic theorems, not stupid papers about which tech bro payed the highest electricity bill to "train" on extremely inefficient gaming hardware.
What is indisputable is that LLMs, even though they are 'just' word generators, are remarkably good at generating factual statements and accurate answers to problems, yet also regrettably inclined to generate apparently equally confident counterfactual statements and bogus answers. That's all that 'hallucination' means in this context.
If this work can be replicated, it may offer a way to greatly improve the signal-to-bullshit ratio of LLMs, and that will be both impressive and very useful if true.
Biological systems are hard.
I'm extremely comfortable calling this paper complete and utter bullshit (or, I suppose if I'm being charitable, extremely poorly titled) from the title alone.
I recently almost fell on a tram as it accelerated suddenly; my arm reached out for a stanchion that was out of my vision, so rapidly I wasn't aware of what I was doing before it had happened. All of this occurred using subconscious processes, based on a non-physical internal mental model of something I literally couldn't see at the moment it happened. Consciousness is over-rated; I believe Thomas Metzinger's work on consciousness (specifically, the illusion of consciousness) captures something really important about the nature of how our minds really work.
This type of research is absolutely valid.
An LLM does not just hallucinate.
https://news.ycombinator.com/newsguidelines.html
Submitters: If you want to say what you think is important about an article, that's fine, but do it by adding a comment to the thread. Then your view will be on a level playing field with everyone else's: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...
No. Human beings have experiential, embodied, temporal knowledge of the world through our senses. That is why we can, say, empirically know something, which is vastly different from semantically or logically knowing something. Yes, human beings also have probabilistic ways of understanding the world and interacting with others. We have many other forms of knowledge as well, and the LLM way of interpreting data is by no means the primary way in which we feel confident that something is true or false.
That said, I don't get up in arms about the term "hallucination", although I prefer the term confabulation per neuroscientist Anil Seth. Many clunky metaphors are now mainstream, and as long as the engineers and researchers who study these kinds of things are ok with that, that's the most important thing.
But what I think all these people who dismiss objections to the term as "arguing semantics" are missing is the fundamental point: LLMs have no intent, and they have no way of distinguishing what data is empirically true or not. This is why the framing, not just the semantics, of this piece is flawed. "Hallucinations" is a feature of LLMs that exists at the very conceptual level, not as a design flaw of current models. They have pattern recognition, which gets us very far in terms of knowing things, but people who only rely on such methods of knowing are most often referred to as conspiracy theorists.
The human brain may at its fundamental level operate on the principles of predictive processing (https://slatestarcodex.com/2017/09/05/book-review-surfing-un...). It might be that it has many layers surrounding that raw predictive core which develop us into epistemological beings. The LLMs we see today may be in the very early stages of a similar sort of (artificial) evolution.
The reflexiveness with which even top models like Opus 4.5 will sometimes seamlessly confabulate things definitely does make it seem like it is a very deep problem, but I don't think it's necessarily unsolvable. I used to be among the vast majority of people who thought LLMs were not sufficient to get us to AGI/ASI, but I'm increasingly starting to feel that piling enough hacks atop LLMs might really be what gets there before anything else.
[submitters: one reason for not editorializing titles is it makes the threads be about that!]