Ask HN: Do coding assistants "see" attached images?
That said, one can reasonably infer that an LLM-based system isn’t doing any form of visual processing at all… it’s just looking at your HTML and CSS and flagging where it diverges from the statistical mean of all such structures in the training data (modulo some stochastic wandering, and the possibility that it has, somehow, mixed some measure of Rick Astley or Goatse into its multidimensional lookup table).
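For what it's worth, "attaching an image" usually just means the client base64-encodes the file into a content block sitting next to the text prompt; whether the model then does any real visual processing with it is exactly the open question here. A minimal sketch of the payload shape, assuming Anthropic-style content blocks (field names vary by provider, e.g. OpenAI nests a data URL under "image_url"; no request is actually sent):

```python
# Construct (but never send) a multimodal message payload of the kind
# coding assistants typically build when you attach a screenshot.
import base64
import json

def build_image_message(image_bytes: bytes, media_type: str, question: str) -> dict:
    """Wrap raw image bytes plus a text prompt into one user message."""
    encoded = base64.standard_b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": media_type,  # e.g. "image/png"
                    "data": encoded,
                },
            },
            {"type": "text", "text": question},
        ],
    }

# A tiny 1x1 PNG stands in for a real screenshot.
fake_png = base64.b64decode(
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJ"
    "AAAADUlEQVR42mP8z8BQDwAEhQGAhKmMIQAAAABJRU5ErkJggg=="
)
msg = build_image_message(fake_png, "image/png", "Why does this layout overflow?")
print(json.dumps(msg)[:80])
```

So the image bytes do reach the API verbatim; what the model internally makes of them is the black box the rest of this thread is arguing about.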
Please explain
You know none of the variables. None of the constants. None of the exponents. No one does, really, but even if you did it wouldn’t help, because no one bothered to write down the operators and the parentheses are randomly shifted around every time the equation is resolved.
All you know is that if you ask it for tea, it will always, invariably, and forever, give you back something that is almost, but not quite entirely, unlike tea. Sometimes it might be more unlike coffee, sometimes more unlike vodka and cow urine.
What you’ll never, ever, ever reliably know is what’s in the cup.
That’s about the best way I know to explain black box abstractions. In a few decades we might have a workable theory as to why these things function, to the degree that they do, though I’ll bet a rather large amount of money that we won’t.
Claude gives what seems to be a reasonable answer.
https://www.perplexity.ai/search/how-do-ai-coding-assistants...