Ask HN: Why don't programming language foundations offer "smol" models?
Because if you want to prompt it in English, it has to be good at English as well. And it gets good at English by reading extreme quantities of it, which is incidentally written on a wide variety of topics.
I also wonder this.
My suspicion (based on my experience with local image-generation models, but otherwise poorly educated) is that they need all of the other material besides programming languages just to understand what your plain-English prompt means in the first place, and that they need to be quite bulky models to have any kind of coherence over token horizons longer than a single function.
Of interest: Apple does ship a coding LLM in Xcode that's (IIRC) about 2 GB, and it really does just feel like fancy Swift-only autocomplete.
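If you want to poke at the "smol code model" experience yourself, here's a minimal sketch of running a ~1B-parameter code model locally with Hugging Face transformers. The model id is just one example of a small code-focused checkpoint (assumption: it's still hosted under that name); the calls themselves are standard transformers API.

    # Minimal sketch: try a small code-focused model locally.
    # bigcode/starcoderbase-1b is an example ~1B code model (assumed id);
    # any similar small checkpoint would do.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "bigcode/starcoderbase-1b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Completion from a bare code prompt works; plain-English
    # instructions tend to work much worse at this scale, which
    # is the point the thread is making.
    prompt = "def fibonacci(n):"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

In my experience, what you get back is exactly the "fancy autocomplete" described above: decent at continuing code, weak at following English instructions.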