Your Intel Is Weak, Mr. Smith (My Experiences with Local Agentic Models)
In the article, Matthew Smith laments that "for the average laptop that’s over a year old, the number of useful AI models you can run locally on your PC is close to zero. This laptop might have a four- to eight-core processor (CPU), no dedicated graphics chip (GPU) or neural-processing unit (NPU), and 16 gigabytes of RAM, leaving it underpowered for LLMs."
"That's odd," I thought to myself. "It sure feels like I'm running LLMs."
I mean, most of the models I run locally are pretty much on par with the large ones for my daily needs, and there's a good variety of minimal desktop language models to choose from. Within my specs, the number of choices is even larger.
For comparison, I'm running a dual-core system with 128MB integrated Intel UHD graphics (don't think that counts as "dedicated"), definitely no NPU, and a measly 8 gigs of RAM. The machine is about 3 years old and it was already a "budget-friendly" laptop back when I got it. As a gaming machine in 2004 it would've been pretty awesome -- I can comfortably play "Psi-Ops: The Mindgate Conspiracy" on the highest quality settings while inference is running in the background.
While Matthew does mention the Small Language Models that I employ, his only criticism is that these models "either scale back these features or omit them entirely", without ever defining what "these features" are (unless the ginormous size of LLMs is itself considered a "feature"?).
I'll grant that generating responses on my hardware is noticeably slower than when using remote LLMs, but that just means that my (fully local) agentic sidekick needs to wake up a bit earlier in the morning in order to complete its assigned tasks before my first coffee of the day. After that, there are plenty of assignments it can accomplish in the background while I do other things.
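For the curious, the overnight routine is nothing fancy. Here's a minimal sketch of the idea, assuming a local Ollama server and the `ollama` Python client; the task list and model tag are placeholders, not my actual setup:

```python
# Minimal sketch of an overnight "agentic sidekick" run. Assumes a local
# Ollama server plus the `ollama` Python client (pip install ollama).
# The tasks and the model tag are illustrative placeholders.
import ollama

TASKS = [
    "Summarize yesterday's notes in notes.md into three bullet points.",
    "Draft replies to the two unanswered emails in inbox.txt.",
]

def run_overnight_tasks(model: str = "qwen2.5:3b") -> None:
    for task in TASKS:
        # Each call may take minutes on a dual-core CPU; that's fine for
        # a job that finishes before the first coffee of the day.
        response = ollama.chat(
            model=model,
            messages=[{"role": "user", "content": task}],
        )
        print(response["message"]["content"])

if __name__ == "__main__":
    run_overnight_tasks()
```

Kick it off with cron (or Task Scheduler) a few hours before you get up and the results are waiting with breakfast.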
All told, a 4-to-6-billion-parameter model is probably the upper limit for my setup, but even then I've got some great options like Google's Gemma, Microsoft's Phi, or Alibaba's Qwen. All three come in a variety of quantized flavors that include thinking/reasoning and software tool use. Plus, I can comfortably use them concurrently with other, smaller, more specialized models, all while running Visual Studio Code, looking stuff up online, and so on.
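If you'd rather skip the server and load a quantized model directly, llama-cpp-python works fine in this weight class. A rough sketch, with a hypothetical GGUF path; any 4-bit quant of Gemma, Phi, or Qwen in the 3-to-4B range should fit in 8 gigs:

```python
# Rough sketch of loading a 4-bit quantized ~3B model directly with
# llama-cpp-python (pip install llama-cpp-python). The GGUF path is a
# placeholder; substitute whatever quant you actually downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/qwen2.5-3b-instruct-q4_k_m.gguf",  # hypothetical path
    n_ctx=4096,    # a modest context window keeps memory use in check
    n_threads=2,   # match the dual-core CPU
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "One-line summary of RAII, please."}],
)
print(out["choices"][0]["message"]["content"])
```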
I can provide the necessary tools for the assigned tasks, and should I need to tighten the belt, I can hot-swap down to slimmer models like Liquid AI's LFM or IBM's Granite. Moreover, there are many derived and tweaked models available for deeply "underpowered" machines like mine.
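The "hot-swap" isn't anything exotic either: Ollama loads and unloads models on demand, so swapping down is just a matter of requesting a slimmer tag when memory gets tight. A hedged sketch, using psutil to check free memory; the tags and the threshold are illustrative, not tuned recommendations:

```python
# Hedged sketch of swapping to a slimmer model under memory pressure.
# Assumes Ollama plus the `ollama` and `psutil` packages; the model tags
# and the 4 GB threshold are illustrative, not tuned recommendations.
import ollama
import psutil

def pick_model() -> str:
    free_gb = psutil.virtual_memory().available / 2**30
    return "qwen2.5:3b" if free_gb > 4 else "granite3.1-moe:1b"

reply = ollama.chat(
    model=pick_model(),
    messages=[{"role": "user", "content": "Status check: what model are you?"}],
)
print(reply["message"]["content"])
```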
Point being, I think that Mr. Smith got it wrong on this one. Laptops like mine are more than sufficient to run modern (albeit smaller) models. Even geriatric machines and browsers can contribute to the effort -- it depends on your requirements and your ability to split up the workload.
There are certain AI tasks, like generative video creation, that my hardware can't reasonably handle, but for these outlier cases either my agentic buddy or I can farm the work out to a public API or a service like Google's Colab.
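Routing those outliers is a one-branch decision. A sketch of the idea, assuming an OpenAI-compatible remote endpoint via the `openai` client; the hosted model name and the `heavy` flag are placeholders for whatever heuristic you prefer:

```python
# Sketch of keeping work local by default and farming out the heavy
# cases. Assumes local Ollama plus an OpenAI-compatible remote API via
# the `openai` client; model names and the `heavy` flag are placeholders.
import os
import ollama
from openai import OpenAI

remote = OpenAI(api_key=os.environ["REMOTE_API_KEY"])

def answer(prompt: str, heavy: bool = False) -> str:
    if heavy:
        # Outlier case: send it to the big iron.
        resp = remote.chat.completions.create(
            model="gpt-4o-mini",  # illustrative hosted model name
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content
    # Default case: keep it local.
    resp = ollama.chat(
        model="qwen2.5:3b",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["message"]["content"]
```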
Thoughts? Critiques? Want to know the setup?