An LLM Is a Lossy Encyclopedia
Original: An LLM is a lossy encyclopedia
Key topics
The debate rages on: is a Large Language Model (LLM) a lossy encyclopedia or a super-smart librarian? Commenters weigh in, with some reframing LLMs as "pretty good librarians" who can "think aloud" to extract buried information, while others point out that detractors expect an oracle, not a lossily compressed blob of human knowledge. As one commenter quips, "you don't zoom-enhance JPEGs for a reason," highlighting the limitations of LLMs, while another notes that the real value lies in their unified interface, making it easier to access information. The discussion reveals a consensus that LLMs are not oracles, but rather a novel way to interact with a vast, imperfect database of human knowledge.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 4 days after posting
- Peak period: 151 comments (Day 5)
- Average per period: 26.7 comments
- Based on 160 loaded comments
Key moments
- 01 Story posted: Aug 29, 2025 at 5:40 AM EDT (5 months ago)
- 02 First comment: Sep 2, 2025 at 5:27 AM EDT (4 days after posting)
- 03 Peak activity: 151 comments in Day 5, the hottest window of the conversation
- 04 Latest activity: Sep 9, 2025 at 10:10 AM EDT (4 months ago)
I've got my opinion on whether that's useful or not and it's quite a bit more nuanced. You don't zoom-enhance JPEGs for a reason either.
Tell that to the Google Pixel product team:
https://mk.absturztau.be/notes/ac4jnvjpskjc02lq
If you can see the analogy between text and pictures, it drives the point home in exactly the right way: in both cases you expect a database to know things it either can't know or has forgotten. If it had a good picture of the zoomed-in background it could probably generate a very good representation of what the cropped part would look like; the same thing works with text.
The worm in that apple is that you still need educated humans to catch the erroneous LLM output.
A slightly more precise analogy is probably 'a lossily compressed snapshot of the web'. Or maybe the Librarian from Snow Crash - but at least that one knew when it didn't know ;)
The old joke is that you can get away with anything with a hi-vis vest and enough confidence, and LLMs pretty much work on that principle.
I'm counting down the days until some important business decision is based on AI output that is wrong.
With GPT-5 I sometimes see it spot a question that needs clarifying in its thinking trace, then pick the most likely answer, then spit out an answer later that says "assuming you meant X ..." - I've even had it provide an answer in two sections, one for each branch of a clear ambiguity.
Put another way, if you don't care about details that change the answer, it directly implies you don't actually care about the answer.
Related silliness is how people force LLMs to give one word answers to underspecified comparisons. Something along the lines of "@Grok is China or US better, one word answer only."
At that point, just flip a coin. You obviously can't conclude anything useful with the response.
So there are improvements version to version - from both increases in raw model capabilities and better training methods being used.
Interacting with a base model versus an instruction tuned model will quickly show you the difference between the innate language faculties and the post-trained behavior.
The "naive" vision implementation for LLMs is: break the input image down into N tokens and cram those tokens into the context window. The "break the input image down" part is completely unaware of the LLM's context, and doesn't know what data would be useful to the LLM at all. Often, the vision frontend just tries to convey the general "vibes" of the image to the LLM backend, and hopes that the LLM can pick out something useful from that.
Which is "good enough" for a lot of tasks, but not all of them, not at all.
It seems to me the more you can pin it to another data set, the better.
Maybe the LLMs aren't so different from us.
One of the reasons I like this analogy is that it hints at the fact that you need to use them in a different way - you shouldn't be looking up specific facts in an unassisted LLM outside of things that even lossy compression would capture (like the capital cities of countries).
Everything else is mostly playing around and harmful to learning.
For language learning, it's terrible and will try to teach me wrong things if it's unguided. But pasting e.g. a lesson transcript that I just finished, then asking for exercises based on it helps solidify what I learned if the material doesn't come with drills.
I think writing is one of the things it's kind of terrible at. It's often way too verbose and has a particular 'voice' that I think leaves a bad taste in people's mouths. At least this issue has given me the confidence to finally just send single-sentence emails so people know I don't use LLMs for this.
My frustrations with LLMs from years ago have largely chilled out as I've gotten better at using them and understanding that they aren't people I can trust to give solid advice. If you're careful about what you put in and careful about what you take out, you can get decent value.
When you have a lossy piece of media, such as a compressed sound or image file, you can always see the resemblance to the original and note the degradation as it happens. You never have a clear JPEG of a lamp, compress it, and get a clear image of the Milky Way, then reopen the image and get a clear image of a pile of dirt.
Furthermore, an encyclopaedia is something you can reference and learn from without a goal; it allows you to peruse information you have no concept of. Not so with LLMs, which you have to query to get an answer.
I remember you being surprised when the term “vibe coding” deviated from its original intention (I know you didn’t come up with it). But frankly I was surprised at your surprise—it was entirely predictable and obvious how the term was going to be used. The concept I’m attempting to communicate to you is that when you make up a term you have to think not only of the thing in your head but also of the image it conjures up in other people’s minds. Communication is a two-way street.
I grew up in socialism. Since we transitioned to democracy, I've learned that I have to unlearn some things. Our encyclopedias were not inaccurate, but they were not complete. It's like lying by omission. And as the old saying goes, half-truths are worse than lies.
Whether this would be deemed as a lossy encyclopedia, I don't know. What I am certain of, however, is that it was accurate but omitted important additional facts.
And that is what I see in LLMs as well. Overall, it's accurate, except in cases where an additional fact would alter the conclusion. So, it either could not find arguments with that fact, or it chose to ignore them to give an answer and could be prompted into taking them into account or whatever.
What I do know is that the LLMs of today give me the same heebie-jeebies that rereading those encyclopedias of my youth gives me.
(but it isn't and won't ever be an oracle and apparently that's a challenge for human psychology.)
But... end users need to understand this in order to use it effectively. They need to know if the LLM system they are talking to has access to a credible search engine and is good at distinguishing reliable sources from junk.
That's advanced knowledge at the moment!
Me: How do I change the language settings on YouTube?
Claude: Scroll to the bottom of the page and click the language button on the footer.
Me: YouTube pages scroll infinitely.
Claude: Sorry! Just click on the footer without scrolling, or navigate to a page where you can scroll to the bottom like a video.
(Videos pages also scroll indefinitely through comments)
Me: There is no footer, you're just making shit up
Claude: [finally uses a search engine to find the right answer]
They can often reason themselves into some very stupid direction, burning all the tokens for no reason and failing to reply in the end.
But it falls a bit short in that encyclopedias, lossy or not, shouldn't affirmatively contain false information. The way I would picture a lossy encyclopedia is that it can misdirect by omission, but it would not change A to ¬A.
Maybe a truthy-roulette encyclopedia?
That study ended the "you can't trust Wikipedia" argument: you can't fully trust anything, but Wikipedia is about as good as a second-hand reference gets.
An encyclopedia could say "general relativity is how the universe works" or it could say "general relativity and quantum mechanics describe how we understand the universe today and scientists are still searching for universal theory".
Both are short but the first statement is omitting important facts. Lossy in the sense of not explaining details is ok, but omitting swathes of information would be wrong.
Again, never really want a confidently-wrong encyclopedia, though
Oh but it's much worse than that: because most LLMs aren't deterministic in the way they operate [1], you can get a pristine image of a different pile of dirt every single time you ask.
[1] there are models where if you have the "model + prompt + seed" you're at least guaranteed to get the same output every single time. FWIW I use LLMs but I cannot integrate them in anything I produce when what they output ain't deterministic.
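As a concrete illustration of that footnote, here is a small sketch assuming the Hugging Face `transformers` library and the GPT-2 checkpoint (both just for demonstration): with greedy decoding, the same model and prompt produce the same output on the same hardware/software stack, which is the "model + prompt + seed" situation described.

```python
# Sketch of the footnote's point: with a fixed model, prompt, and greedy
# decoding (no sampling), repeated runs give the same output on the same
# hardware/software stack. GPT-2 is used purely as a small illustrative model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.manual_seed(0)  # would only matter if sampling were enabled
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("An LLM is a lossy", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=False,                 # greedy: deterministic given model + prompt
    pad_token_id=tok.eos_token_id,
)
print(tok.decode(out[0], skip_special_tokens=True))
```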
Computers are deterministic. Most of the time. If you really don't think about all the times they aren't. But if you leave the CPU-land and go out into the real world, you don't have the privilege of working with deterministic systems at all.
Engineering with LLMs is closer to "designing a robust industrial process that's going to be performed by unskilled minimum wage workers" than it is to "writing a software algorithm". It's still an engineering problem - but of the kind that requires an entirely different frame of mind to tackle.
If everyone understood the distinction and their limitations, they wouldn’t be enjoying this level of hype, or leading to teen suicides and people giving themselves centuries-old psychiatric illnesses. If you “go out into the real world” you learn people do not understand LLMs aren’t deterministic and that they shouldn’t blindly accept their outputs.
https://archive.ph/rdL9W
https://archive.ph/20241023235325/https://www.nytimes.com/20...
https://archive.ph/20250808145022/https://www.404media.co/gu...
LLMs aren’t being sold as unreliable. On the contrary, they are being sold as the tool which will replace everyone and do a better job at a fraction of the price.
"LLM is like an overconfident human" certainly beats both "LLM is like a computer program" and "LLM is like a machine god". It's not perfect, but it's the best fit at 2 words or less.
That’s what I was trying to convey with the “then reopen the image” bit. But I chose a different image of a different thing rather than a different image of a similar thing.
My point is that I find the chosen term inadequate. The author made it up from combining two existing words, where one of them is a poor fit for what they’re aiming to convey.
E.g. a Bloom filter also doesn't "know" what it knows.
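A minimal Bloom filter makes that point concrete: it answers membership queries without being able to enumerate or verify what it actually stored, and it will sometimes say "yes" for items it never saw. A rough sketch, with parameters chosen for illustration rather than production use:

```python
# Minimal Bloom filter: it can answer "have I seen this?" without "knowing"
# what it has seen, and occasionally returns a false positive.
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 64, num_hashes: int = 3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = 0

    def _positions(self, item: str):
        # Derive k bit positions from salted hashes of the item.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all(self.bits & (1 << pos) for pos in self._positions(item))

bf = BloomFilter()
for word in ["paris", "tokyo", "lima"]:
    bf.add(word)

print(bf.might_contain("paris"))   # True
print(bf.might_contain("zagreb"))  # usually False, but can be a false positive
```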
I do understand and agree with a different point you’re making somewhere else in this thread, but it doesn’t seem related to what you’re saying here.
https://news.ycombinator.com/item?id=45101946
In compressed audio these can be things like clicks and boings and echoes and pre-echoes. In compressed images they can be ripply effects near edges, banding in smoothly varying regions, but there are also things like https://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres... where one digit is replaced with a nice clean version of a different digit, which is pretty on-the-nose for the LLM failure mode you're talking about.
Compression artefacts generally affect small parts of the image or audio or video rather than replacing the whole thing -- but in the analogy, "the whole thing" is an encyclopaedia and the artefacts are affecting little bits of that.
Of course the analogy isn't exact. That would be why S.W. opens his post by saying "Since I love collecting questionable analogies for LLMs,".
It was quickly discovered that LLMs are capable of re-checking their own solutions if prompted - and, with the right prompts, are capable of spotting and correcting their own errors at a significantly-greater-than-chance rate. They just don't do it unprompted.
Eventually, it was found that reasoning RLVR consistently gets LLMs to check themselves and backtrack. It was also confirmed that this latent "error detection and correction" capability is present even at base model level, but is almost never exposed - not in base models and not in non-reasoning instruct-tuned LLMs.
The hypothesis I subscribe to is that any LLM has a strong "character self-consistency drive". This makes it reluctant to say "wait, no, maybe I was wrong just now", even if a latent awareness that the past reasoning looks sketchy as fuck is already present within the LLM. Reasoning RLVR encourages going against that drive and utilizing those latent error-correction capabilities.
"Language, Halliday argues, "cannot be equated with 'the set of all grammatical sentences', whether that set is conceived of as finite or infinite". He rejects the use of formal logic in linguistic theories as "irrelevant to the understanding of language" and the use of such approaches as "disastrous for linguistics"."
CS never solved the incoherence of language or the conduit-metaphor paradox. It's stuck behind language's bottleneck, and it stays there willingly, blind-eyed.
You weren't talking to GPT-4o about philosophy recently, were you?
Beyond this point engineers actually have to know what signaling is, rather than 'information.'
https://www.sciencedirect.com/science/article/abs/pii/S00033...
Ultimately, engineering chose the wrong approach to automating language, and it sinks the field. It's irreversible.
If you're hitching your wagon to human linguists, you'll always find yourself in a ditch in the end.
As of today, 'bad' generations early in the sequence still do tend towards responses that are distant from the ideal response. This is testable/verifiable by pre-filling responses, which I'd advise you to experiment with for yourself.
'Bad' generations early in the output sequence are somewhat mitigatable by injecting self-reflection tokens like 'wait', or with more sophisticated test-time compute techniques. However, those remedies can simultaneously turn 'good' generations into bad ones; they are post-hoc heuristics which treat symptoms, not causes.
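For anyone who wants to try the pre-filling experiment, here is a sketch assuming Anthropic's Messages API, which lets the final message be a partial assistant turn that the model then continues; the model id is a placeholder, and the injected "Wait" opener is the kind of self-reflection nudge described above.

```python
# Sketch of the pre-filling experiment: supply the start of the assistant's
# reply and let the model continue from it. Model id is a placeholder.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",   # placeholder; use whatever is current
    max_tokens=300,
    messages=[
        {"role": "user", "content": "Is 3,599 a prime number? Think it through."},
        # Prefill: force the opening of the assistant turn, then let it continue.
        {"role": "assistant", "content": "Wait, let me double-check my first instinct."},
    ],
)
print(response.content[0].text)
```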
In general, as the models become larger they are able to compress more of their training data. So yes, using the terminology of the commenter I was responding to, larger models should tend to have fewer 'compression artefacts' than smaller models.
OpenAI's in-house reasoning training is probably best in class, but even lesser naive implementations go a long way.
https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4a...
They attribute these 'compression artefacts' to pre-training, they also reference the original snowballing paper: How Language Model Hallucinations Can Snowball: https://arxiv.org/pdf/2305.13534
They further state that reasoning is no panacea. Whilst you did say: "the models mitigate more and more"
You were replying to my comment which said:
"'Bad' generations early in the output sequence are somewhat mitigatable by injecting self-reflection tokens like 'wait', or with more sophisticated test-time compute techniques."
So our statements there are logically compatible, i.e. you didn't make a statement that contradicts what I said.
"Our error analysis is general yet has specific implications for hallucination. It applies broadly, including to reasoning and search-and-retrieval language models, and the analysis does not rely on properties of next-word prediction or Transformer-based neural networks."
"Search (and reasoning) are not panaceas. A number of studies have shown how language models augmented with search or Retrieval-Augmented Generation (RAG) reduce hallucinations (Lewis et al., 2020; Shuster et al., 2021; Nakano et al., 2021; Zhang and Zhang, 2025). However, Observation 1 holds for arbitrary language models, including those with RAG. In particular, the binary grading system itself still rewards guessing whenever search fails to yield a confident answer. Moreover, search may not help with miscalculations such as in the letter-counting example, or other intrinsic hallucinations"
> Of course the analogy isn't exact.
And I don’t expect it to be, which is something I’ve made clear several times before, including on this very thread.
https://news.ycombinator.com/item?id=45101679
I don’t think this is a great analogy.
Lossy compression of images or signals tends to throw out information based on how humans perceive it, focusing on the most important perceptual parts and discarding the less important parts. For example, JPEG essentially removes high frequency components from an image because more information is present with the low frequency parts. Similarly, POTS phone encoding and mp3 both compress audio signals based on how humans perceive audio frequency.
The perceived degradation of most lossy compression is gradual with the amount of compression and not typically what someone means when they say “make things up.”
LLM hallucinations aren’t gradual and the compression doesn’t seem to follow human perception.
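To make the JPEG comparison concrete, here is a tiny sketch (assuming NumPy and SciPy) of the frequency-domain idea: transform an 8x8 block with a DCT, discard the high-frequency coefficients, and reconstruct. On a smooth block the error is small and spread evenly, which is exactly the gradual, perceptually tuned degradation the comment describes, and quite unlike inventing new content.

```python
# Illustration of the JPEG idea: DCT an 8x8 block, keep only low-frequency
# coefficients, invert. Degradation is gradual, not a made-up replacement.
import numpy as np
from scipy.fft import dctn, idctn

x = np.arange(8)
block = np.add.outer(x, x) * 16.0          # a smooth 8x8 gradient block (0..224)

coeffs = dctn(block, norm="ortho")

# Keep only the low-frequency corner (the top-left 4x4 coefficients).
mask = np.zeros_like(coeffs)
mask[:4, :4] = 1
reconstructed = idctn(coeffs * mask, norm="ortho")

print(np.abs(block - reconstructed).mean())  # small, smoothly distributed error
```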
Compression artifacts (which are deterministic distortions in reconstruction) are not the same as hallucinations (plausible samples from a generative model; even when greedy, this is still sampling from the conditional distribution). A better identification is with super-resolution. If we use a generative model, the result will be clearer than a normal blotchy resize but a lot of details about the image will have changed as the model provides its best guesses at what the missing information could have been. LLMs aren't meant to reconstruct a source even though we can attempt to sample their distribution for snippets that are reasonable facsimiles from the original data.
An LLM provides a way to compute the probability of given strings. Once paired with entropy coding, on-line learning on the target data allows us to arrive at the correct MDL-based lossless-compression view of LLMs.
You're saying hammers shouldn't be squishy.
Simon is saying don't use a banana as a hammer.
No, that is not what I’m saying. My point is closer to “the words chosen to describe the made up concept do not translate to the idea being conveyed”. I tried to make that fit into your idea of the banana and squishy hammer, but now we’re several levels of abstraction deep using analogies to discuss analogies so it’s getting complicated to communicate clearly.
> Simon is saying don't use a banana as a hammer.
Which I agree with.
We are all free to agree with one part of an argument while disagreeing with another. That’s what healthy discourse is; life is not black and white. By way of example, if one says “apples are tasty because they are red”, it is perfectly congruent to agree apples are tasty but disagree that their colour is the reason. And by doing so we engage in a conversation to correct a misconception.
A librarian might bring you the wrong book, that's the former. An LLM does the latter. They are not the same.
It’s a lot less visible and, I guess, less dramatic than LLMs, but it happens frequently enough that I feel like at every major event there are false conspiracies based on video "proofs" that are just encoding artifacts.
Purely based on language use, you could expect "dog bit the man" more often than "man bit the dog", which is a lossy way to represent "dogs are more likely to bite people than vice versa." And there's also the second lossy part where information not occurring frequently enough in the training data will not survive training.
Of course, other things also include inaccurate information, frequent but otherwise useless sentences (any sentence with "Alice" and "Bob"), and the heavily pruned results of the post-training RL stage. So, you can't really separate the "encyclopedia" from the rest.
Also, I'm not sure lossy always means the loss is distributed (i.e., lower resolution). Loss can also be localized / biased (i.e., losing only black pixels); it's just that useful lossy compression algorithms tend to minimize the noticeable loss. Though I could be wrong.
In fact, the best compression algorithms and LLMs have this in common: they work by predicting the next word. Compression algorithms take an extra step called entropy coding to encode the difference between the prediction and the actual data efficiently, and the better the prediction, the better the compression ratio.
What makes an LLM "lossy" is that you don't have the "encode the difference" step.
And yes, it means you can turn an LLM into a (lossless) compression algorithm, and I think a really good one in terms of compression ratio on huge data sets. You can also turn a compression algorithm like gzip into a language model! A very terrible one, but the output is better than a random stream of bytes.
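A toy sketch of that "compression is prediction" point, using a smoothed character-level bigram model in place of an LLM: an entropy coder could emit each character in roughly -log2(p) bits, so better prediction means better compression. The corpus and smoothing constants here are made up for illustration; the entropy-coding step itself is only estimated, not implemented.

```python
# Toy "compression is prediction" demo: a character-level bigram model assigns
# a probability to each next character; an entropy coder (not implemented)
# could emit each character in about -log2(p) bits. Swap the bigram model for
# an LLM's next-token probabilities and you have the lossless-compression view.
import math
from collections import Counter, defaultdict

corpus = "the dog bit the man. the man fed the dog. the dog sat."
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def prob(prev: str, nxt: str, alpha: float = 0.5, vocab: int = 64) -> float:
    """Laplace-smoothed bigram probability of `nxt` following `prev`."""
    c = counts[prev]
    return (c[nxt] + alpha) / (sum(c.values()) + alpha * vocab)

text = "the dog sat."
bits = sum(-math.log2(prob(p, n)) for p, n in zip(text, text[1:]))
print(f"{bits:.1f} bits is roughly the ideal compressed size of {len(text)-1} characters")
```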
We would have very different conversations if LLMs were things that merely exploded into a singular lossy-expanded version of Wikipedia, but where looking at the article for any topic X would give you the exact same article each time.
Furthermore, even in the absence of randomness, asking an LLM the same question in different ways can yield different, potentially contradictory answers, even when the difference in prompting is perfectly benign.
You see this with humans, who encode physical space into a spatial matrix in the brain. When asking for directions, people have to traverse this matrix until the route is memorized; after that it isn’t used any longer, and only the rote data is referenced.
In my experience as a human, the more you know about a subject, or even the more you have simply seen content about it, the easier it is to ramble on about it convincingly. It's like a mirroring skill, and it does not actually mean you understand what you're saying.
LLMs seem to do the same thing, I think. At scale this is widely useful, though; I am not discounting it. I just think it's an order of magnitude below what's possible, and all this talk of existing stream-of-consciousness-like LLMs creating AGI seems like a miss.
> The key thing is to develop an intuition for questions it can usefully answer vs questions that are at a level of detail where the lossiness matters
the problem is that in order to develop an intuition for questions that LLMs can answer, the user will at least need to know something about the topic beforehand. I believe that this lack of initial understanding of the user input is what can lead to taking LLM output as factual. If one side of the exchange knows nothing about the subject, the other side can use jargon and even present random facts or lossy facts which can almost guarantee to impress the other side.
> The way to solve this particular problem is to make a correct example available to it.
My question is how much effort would it take to make a correct example available for the LLM before it can output quality and useful data? If the effort I put in is more than what I would get in return, then I feel like it's best to write and reason it myself.
This is why simonw (the author) has his "pelican on a bike" test; it's not 100% accurate, but it is a good indicator.
I have a set of my own standard queries and problems (no counting characters or algebra crap) that I feed to new LLMs I'm testing.
None of the questions exist outside of my own Obsidian note so they can't be gamed by LLM authors. And I've tested multiple different LLMs using them so I have a "feeling" on what the answer should look like. And I personally know the correct answer so I can immediately validate them.
This is why I've said a few times here on HN and elsewhere, if you're using an LLM you need to think of yourself as an architect guiding a Junior to Mid Level developer. Juniors can do amazing things, they can also goof up hard. What's really funny is you can make them audit their own code in a new context window, and give you a detailed answer as to why that code is awful.
I use it mostly on personal projects especially since I can prototype quickly as needed.
The thing is, coding can (and should) be part of the design process. Many times I thought I had a good idea of what the solution should look like, then while coding I got exposed more to the libraries and other parts of the code, which led me to a more refined approach. This exposure is what you will miss, and it will quickly result in unfamiliar code.
I used ChatGPT 5 over the weekend to double check dosing guidelines for a specific medication. "Provide dosage guidelines for medication [insert here]"
It spit back dosing guidelines that were an order of magnitude wrong (suggested 100mcg instead of 1mg). When I saw 100mcg, I was suspicious and said "I don't think that's right" and it quickly corrected itself and provided the correct dosing guidelines.
These are the kind of innocent errors that can be dangerous if users trust it blindly.
The main challenge is that LLMs aren't able to gauge confidence in their answers, so they can't adjust how confidently they communicate information back to you. It's like compressing a photo and a photographer wrongly saying "here's the best quality image I have!" Do you trust the photographer at their word, or do you challenge him to find a better-quality image?
Regardless, its diagnostic capability is distinct from the dangers it presents, which is what the parent comment was mentioning.
I have good insurance and have a primary care doctor with whom I have good rapport. But I can’t talk to her every time I have a medical question—it can take weeks to just get a phone call! If I manage to get an appointment, it’s a 15 minute slot, and I have to try to remember all of the relevant info as we speed through possible diagnoses.
Using an llm not for diagnosis but to shape my knowledge means that my questions are better and more pointed, and I have a baseline understanding of the terminology. They’ll steer you wrong on the fine points, but they’ll also steer you _right_ on the general stuff in a way that Dr. Google doesn’t.
One other anecdote. My daughter went to the ER earlier this year with some concerning symptoms. The first panel of doctors dismissed it as normal childhood stuff and sent her home. It took 24 hours, a second visit, and an ambulance ride to a children’s hospital to get to the real cause. Meanwhile, I gave a comprehensive description of her symptoms and history to an llm to try to get a handle on what I should be asking the doctors, and it gave me some possible diagnoses—including a very rare one that turned out to be the cause. (Kid is doing great now). I’m still gonna take my kids to the doctor when they’re sick, of course, but I’m also going to use whatever tools I can to get a better sense of how to manage our health and how to interact with the medical system.
I also have good insurance and a PCP. The idea that I could call them up just to ask “should I start doing this new exercise” or “how much aspirin for this sprained ankle?” is completely divorced from reality.
And "your doctor" is actually "any doctor that is willing to write you a prescription for our medicine".
i'm not going to call my doctor to ask "is it okay if I try doing kettlebell squats?"
But also, maybe calling your doctor would be wise (eg if you have back problems) before you start doing kettlebell squats.
I'd say that the audience for a lot of health related content skews towards people who should probably be seeing a doctor anyway.
The cynic in me also thinks some of the "ask your doctor" statements are just slapped on to artificially give credence to whatever the article is talking about (eg "this is serious exercise/diet/etc).
Edit: I guess what I meant is: I don't think it's just "liability", but genuine advice/best practice/wisdom for a sizable chunk of audiences.
That's exactly what I (and most people I know) routinely do both in Italy and France. Like, "when in doubt, call the doc". I wouldn't know where to start if I had to handle this kind of stuff exclusively by myself.
> If I manage to get an appointment, it’s a 15 minute slot
I'm sorry that this is what "good insurance" gets you.
This probably varies by locale. For example my doctor responds within 1 day on MyChart for quick questions. I can set up an in person or video appointment with her within a week, easily booked on MyChart as well.
I’d encourage you to find another doctor.
E-mails and communication are completely free of charge.
We all know that Google and LLMs are not the answer for your medical questions, but that they cause fear and stress instead.
In 40 years, only one of my doctors had the decency to correct his mistake after I pointed it out.
He prescribed the wrong Antibiotics, which I only knew because I did something dumb and wondered if the prescribed antibiotics cover a specific strain, which they didn't, which I knew because I asked an LLM and then superficially double-checked via trustworthy official, government sources.
He then prescribed the correct antibiotics. In all other cases where I pointed out a mistake, back in the day researched without LLMs, doctors justified their logic, sometimes siding with a colleague or "the team" before evaluating the facts themselves, instead of having an independent opinion, which, AFAIK, especially in a field like medicine, is _absolutely_ imperative.
The ol' "What weighs more, a pound of feathers or two pounds of bricks" trick explains this perfectly to me.
Why is an LLM unable to read a table of church times across a sampling of ~5 Filipino churches?
Google LLM (Gemini??) was clearly finding the correct page. I just grabbed my mom's phone after another bad mass time and clicked on the hyperlink. The LLM was seemingly unable to parse the table at all.
I can also see them as very clever search engines, since this is one way I use them a lot: ask hard questions about a huge and legacy codebase.
These analogies do not really work for generating new code. A new metaphor I am starting to use is "translator engine": it is translating from human language to programming language. It in a way explains a lot of the stupidity I am seeing.
The models hold more information than they can immediately extract, but CoT can find a key to look it up or synthesise it by applying some learned generalisations.
Imagine a slightly lossy compression algorithm which can store 10x, 100x what the current best lossless ones can, and maintain 99.999% fidelity when recalling that information. Probably, very probably, a pipe dream. But why do large on-device models seem to be able to remember just about everything from Wikipedia and store it in a smaller format than a direct archive of the source material? (Look at the current best from diffusion models as well.)
llm is a pretty good librarian who has read a ton of books (and doesn't have perfect memory)
even more useful when allowed to think-aloud
even more useful when allowed to write stuff down and check in library db
even more useful when allowed to go browse and pick up some books
even more useful when given a budget for travel and access to other archives
even more useful when …
brrrrt