Welcome to Gas Town
Key topics
The provocative "Welcome to Gas Town" article sparked a lively debate about the future of AI-assisted development tools, with commenters weighing in on the merits and limitations of experimental projects like Gas Town. Some saw it as a fun, if flawed, exploration of new ideas, while others were skeptical of its potential, pointing out that it is dangerous, largely untested, and not production-ready. The discussion also veered into broader topics, such as the growing presence of AI in development workflows and the surprising number of projects that are contractually prohibited from using AI tools. As commenters like mccoyb and alexjurkiewicz noted, the line between playful experimentation and practical application is blurry, and the conversation highlighted the diverse perspectives on what is and isn't possible with AI-assisted development.
Snapshot generated from the HN discussion
Discussion Activity
- Activity: Very active discussion
- First comment: 48m after posting
- Peak period: 108 comments in 96-108h
- Avg / period: 13.3
- Based on 160 loaded comments
Key moments
- 01 Story posted: Jan 1, 2026 at 5:36 PM EST (10 days ago)
- 02 First comment: Jan 1, 2026 at 6:25 PM EST (48m after posting)
- 03 Peak activity: 108 comments in 96-108h (hottest window of the conversation)
- 04 Latest activity: Jan 9, 2026 at 6:53 PM EST (1d ago)
> Gas Town helps with all that yak shaving, and lets you focus on what your Claude Codes are working on.
Then:
> Working effectively in Gas Town involves committing to vibe coding. Work becomes fluid, an uncountable that you sling around freely, like slopping shiny fish into wooden barrels at the docks. Most work gets done; some work gets lost. Fish fall out of the barrel. Some escape back to sea, or get stepped on. More fish will come. The focus is throughput: creation and correction at the speed of thought.
I see -- so where exactly is my focus supposed to sit?
As someone who sits comfortably in the "Stage 8" category that this article defines, my concern has never been throughput. It has always been about retaining a high degree of quality while organizing work so that, when context switching occurs, it transitions me to near-orthogonal tasks which are easy to remember, so I can give high-quality feedback before switching again.
For instance, I know Project A -- these are the concerns of Project A. I know Project B -- these are the concerns of Project B. I have the insight to design these projects so they compose, so I don't have to keep track of a hundred parallel issues in a mono Project C.
On each of those projects, run a single agent -- with review gates for 2-3 independent agents (fresh context, different models! Codex and Gemini). Use a loop, let the agents go back and forth.
This works and actually gets shit done. I'm not convinced that 20 Claudes or massively parallel worktrees or whatever improves on quality, because, indeed, I always have to intervene at some point. The blocker for me is not throughput, it's me -- a human being -- my focus, and the random points of intervention which ... by definition ... occur stochastically (because agents).
Finally:
> Opus 4.5 can handle any reasonably sized task, so your job is to make tasks for it. That’s it.
This is laughably not true, for anyone who has used Opus 4.5 for non-trivial tasks. Claude Code constantly gives up early, corrupts itself with self-bias, the list goes on and on. It's getting better, but it's not that good.
"What if Opus wrote the code, and GPT 5~ reviewed it?" I started evaluating this question, and started to get higher quality results and better control of complexity.
I could also trust this process to a greater degree than my previous process of trying to drive Opus, look at the code myself, try and drive Opus again, etc. Codex was catching bugs I would not catch with the same amount of time, including bugs in hard math, etc -- so I started having a great degree of trust in its reasoning capabilities.
I've codified this workflow into a plugin which I've started developing recently: https://github.com/evil-mind-evil-sword/idle
It's a Claude Code plugin -- it combines the "don't let Claude stop until condition" (Stop hook) with a few CLI tools to induce (what the article calls) review gates: Claude will work indefinitely until the reviewer is satisfied.
In this case, the reviewer is a fresh Opus subagent which can invoke and discuss with Codex and Gemini.
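(For illustration only: a minimal sketch of what such a Stop-hook review gate might look like, assuming Claude Code passes hook input as JSON on stdin and keeps the session working when the hook prints a "block" decision. The `run_reviewer` command is a hypothetical stand-in for invoking the external reviewer, not part of the idle plugin.)

```python
#!/usr/bin/env python3
# Sketch of a Stop-hook review gate (see assumptions in the note above).
import json
import subprocess
import sys

hook_input = json.load(sys.stdin)  # event payload from Claude Code (assumed JSON on stdin)

# Hypothetical reviewer command: exits non-zero while it still wants changes.
review = subprocess.run(
    ["run_reviewer", "--transcript", hook_input.get("transcript_path", "")]
)

if review.returncode != 0:
    # Block the stop so the agent keeps iterating until the reviewer is satisfied.
    print(json.dumps({
        "decision": "block",
        "reason": "Reviewer requested changes; keep iterating.",
    }))
else:
    sys.exit(0)  # nothing left to fix; allow the agent to stop
```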
One perspective I have which relates to this article is that the thing one wants to optimize for is minimizing the error per unit of work. If you have a dynamic programming style orchestration pattern for agents, you want the thing that solves the small unit of work (a task) to have as low error as possible, or else I suspect the error compounds quickly with these stochastic systems.
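(To make the compounding intuition concrete, a rough back-of-the-envelope calculation, assuming for illustration that each task fails independently with probability p:)

```python
# If each delegated task fails independently with probability p, a chain of
# n tasks completes with zero errors with probability (1 - p) ** n.
# Even modest per-task error rates collapse quickly as n grows.
def chain_success(p: float, n: int) -> float:
    return (1 - p) ** n

for p in (0.01, 0.05, 0.10):
    print(f"per-task error {p:.0%}: 20-task chain succeeds with prob {chain_success(p, 20):.2f}")
# 1%  -> ~0.82
# 5%  -> ~0.36
# 10% -> ~0.12
```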
I'm trying this stuff for fairly advanced work (in a PhD), so I'm dogfooding ideas (like the ones presented in this article) in complex settings. I think there is still a lot of room to learn here.
It's cool to see others thinking the same thing!
this is the equivalent of some crazy inventor in the 19th century strapping a steam engine onto a unicycle and telling you that some day you'll be able to go 100mph on a bike. He was right in the end, but no one is actually going to build something usable with current technology.
Opus 4.5 isn't there. But will there be a model in 3-5 years that's smart enough, fast enough, and cheap enough for a refined vision of this to be possible? I'm going to bet on yes to that question.
https://www.wired.com/story/london-bitcoin-pub/
The POS software's on GitHub: https://github.com/sde1000/quicktill
> something like gas town is clearly not attempting to be a production grade tool.
Compare to the first two sentences:
> Gas Town is a new take on the IDE for 2026. Gas Town helps you with the tedium of running lots of Claude Code instances. Stuff gets lost, it’s hard to track who’s doing what, etc. Gas Town helps with all that yak shaving, and lets you focus on what your Claude Codes are working on.
Compared to your read, my read is confused: is it or is it not intending to be a useful tool (we can debate "production" quality, here I'm just thinking something I'd actually use meaningfully -- like Claude Code)?
I think the author wants us to take this post seriously, so I'm taking it seriously, and my critique in the original post was a serious reaction.
This tool is dangerous, largely untested, and yet may be of interest if you are already doing similar things in production.
Gas Town is clearly the same thing multiplied by ten thousand. The number of overlapping and adhoc concepts in this design is overwhelming. Steve is ahead of his time but we aren't going to end up using this stuff. Instead a few of the core insights will get incorporated into other agents in a simpler but no less effective way.
And anyway the big problem is accountability. The reason everyone makes a face when Steve preaches agent orchestration is that he must be in an unusual social situation. Gas Town sounds fun if you are accountable to nobody: not for code quality, design coherence or inferencing costs. The rest of us are accountable for at least the first two and even in corporate scenarios where there is a blank check for tokens, that can't last. So the bottleneck is going to be how fast humans can review code and agree to take responsibility for it. Meaning, if it's crap code with embarrassing bugs then that goes on your EOY perf review. Lots of parallel agents can't solve that fundamental bottleneck.
Yeah this describes my feeling on beads too. I actually really like the idea - a lightweight task/issue tracker integrated with a coding agent does seem more useful than a pile of markdown todos/plans/etc. But it just doesn't work that well. It's really buggy, and the bugs seem to confuse the agent, since it was given instructions to do things a certain way that don't work consistently.
And also auditable, trackable, reportable, etc..
I was sort of kidding with "JIRA for Agents", obviously using the API and existing tool you can make agents use it.
We use Github at my current job and similarly have Claude Code update issues and PRs when it does work.
I'm looking for "the Emacs" of whatever this is, and I haven't read a blog post which isolates the design yet.
Or did you find one that's good?
But yeah, I'm only running one code agent at a time, so that's not a problem I have. I should probably start with just a todo list as plain text.
It unlocks a (still) hidden multiagent orchestration function in Claude code. The person making it unminified the code and figured out how to unlock it.
I find it quite well done - I started an orchestrator project a few days ago and scrapped it because it'll be fully integrated soon, it seems.
Despite its quirks, I think beads is going to go down as one of the first pieces of software that got some adoption where the end user is an agent.
[1]: https://github.com/nikvdp/linear-beads
What do you like about Linear? Is it suitable for hobby projects?
Linear is great, it's what JIRA should've been. Basically task management for people who don't want to deal with task management. It's also full featured, fast (they were famously one of the earlier apps to use a local-first sync-engine style architecture), and keyboard-centric.
Definitely suitable for hobby projects, but can also scale to large teams and massive codebases.
There's a lot of strange things going on in that project.
try to add some common sense, and you'll get shouted out.
which is fine, I'll just make my own version without the slop.
> Course, I’ve never looked at Beads either, and it’s 225k lines of Go code that tens of thousands of people are using every day. I just created it in October. If that makes you uncomfortable, get out now.
It's 2025, accountability is a thing of the past. The future belongs to the unaccountable and their AI swarm.
Facebook burned something like $70bn on "metaverse" with seemingly zero results. There's a lot more capital (and biosphere) to burn on AI agents.
Show HN: I replaced Beads with a faster, simpler Markdown-based task tracker - https://news.ycombinator.com/item?id=46487580 - Jan 2026 (2 comments) (<-- I've put this one in the SCP - see https://news.ycombinator.com/item?id=26998308 for explanation)
Solving Agent Context Loss: A Beads and Claude Code Workflow for Large Features - https://news.ycombinator.com/item?id=46471286 - Jan 2026 (1 comment)
Beads – A memory upgrade for your coding agent - https://news.ycombinator.com/item?id=46075616 - Nov 2025 (68 comments)
Beads: A coding agent memory system - https://news.ycombinator.com/item?id=45566864 - Oct 2025 (1 comment)
This explains why some of the comments have timestamps that appear older than the post itself. I got tired of trying to make them line up, sorry!)
IMHO, it's less disorienting to have the post dated after the comments than it is to see a comment you thought you wrote a couple days ago but is dated today. So you're welcome to stop trying to line up timestamps.
Status quo sucks also, it just sucks less. Haven't yet figured out an actually good solution. Sorry!
The most I imagine most folks saying is "Didn't I see this post on the front page days ago?". For many other discussion fora, it's not uncommon for posts to be at the top of the pile for many days... so a days-old post date should be nothing unusual.
Re artificial uplifting a.k.a. re-upping, see https://news.ycombinator.com/item?id=26998308 and https://news.ycombinator.com/pool
> WARNING DANGER CAUTION GET THE F** OUT YOU WILL DIE
I have never met Steve, but this warning alone is :chefskiss:
Gas Town is from the creator of beads.
Outside of that it's trial and error, but I've learned you don't need to kick off a new chat instance very much, if at all. I also like Beads because if I have to "run" or go offline, I can tell it to pause and log where it left off / where it's at.
For some projects I tell claude not to close tickets without my direct approval because sometimes it closes them without testing, my baseline across all projects is that it compiles and runs without major errors.
But to keep things tractable, i've kept the orchestration within a collection of subagents in a single Claude code session. The orchestration system is called Pied-Piper and you can find the code here - https://github.com/sathish316/pied-piper
It is only 1.6k lines of Go code.
Think of it as an extended bipolar-optimism-fueled glimpse into the future. Steve's MO is laid out in the Medium post - but basically, it's okay to lose things, rewrite whole subsystems, whatever; this is the future. It's really fun and interesting to watch the speed of development.
I've made a few multi agent coding setups in the last year, and I think gas town has the team side about right: big boss (mayor), operations boss (deacon), relatively linear keeper of truth (witness), single point for merges (refiner), lots of coders with their code held lightly.
I love the idea of formulas - a lot of what makes gas town work and informs how well it ultimately will work is the formulas. They're close conceptually to skills.
I don't love the mad max branding, but meh, whatever, it's fun, and a perk of the brave new world where you can make stuff like this for a few hundred bucks a month sent to Anthropic - software can have personality again, yay.
Conceptually I think there is a product team element to this still missing - deploy engineers, product managers, visual testing. Everything is sort of out there, janky in parts, but workable to glue together right now, and will only improve. That said, the mad max town analogy is going to get overstretched at some point; we already have pretty good names for all the parts that are needed, and as coordination improves, we're going to want to add more stuff into the coordination. So, I'd like to see a version of this with normal names and expanded.
Upshot - worth a look - if beads is any indication, give it a month or two or four to settle down unless you like living on the bleeding bleeding edge.
I pointed it at a Postgres time series project I was working on, and it deployed a much better UI and (with some nudging) fixed docker errors on a remote server, which involved logging in to the server to check logs. It probably opened and fixed 50 or so beads in total.
I'd reach for it first to do something complicated ("convoy" or epic) over Claude Code even as is -- like, e.g. "copy this data ingestion we do for site x, and implement it for sites y,z,,a,b,c,d. start with a formal architecture that respects our current one and remains extensible for all these sites" is something I think it would do a fair job at.
As to cost - I did not run out of my claude pro max subscription poking around with it. It infers ... a lot ... though. I pulled together a PR that would let you point some or all of the agent types at local or other endpoints, but it's a little early, I think for the codebase. I'd definitely reach for some cheaper and/or faster inference for some of the use cases.
The article was pretty OK. Kubernetes has its own share of obnoxious terminology that often comes up as "we name it different so that it doesn't sound like AWS". At some point you just accept the terminology in relation to the tool you use and move on.
Assuming this isn't a parody project, maybe this just isn't for me, and that's fine. I'm struggling to understand a production use case where I'd be comfortable letting this thing loose.
Who is the intended audience for this design?
I promptly gave Claude the text to the articles and had him rewrite using idiomatic distributed systems naming.
Fun times!
Update: I was hoping it'd at least be smart enough to automatically test that the project still builds, but it did not. It also didn't commit its work.
P.s. the choice of nomenclature is a bit odd, and makes it hard to follow what is what. Movie characters, dogs and raccoons, huh? How about striving for descriptive SWE clarity?
that's what got us CQRS, "command query responsibility segregation", which is technically correct wording but absolutely fucking meaningless to anyone that doesn't know what it means already.
It should have been called "read here, write there" but noooooooOOOOOooooo we need descriptive SWE clarity so only people with CS degrees that know all the acronyms already can understand wtf is being said.
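(For what it's worth, the "read here, write there" idea fits in a few lines; this toy Python sketch is purely illustrative and not taken from any CQRS framework.)

```python
# "Read here, write there" in miniature: commands append to a write store and
# update a projection; queries only ever read the projection.
write_store: list[dict] = []      # command/event log (the "write there")
read_model: dict[str, int] = {}   # denormalized view (the "read here")

def handle_command(event: dict) -> None:
    """Write path: record the event, then project it into the read model."""
    write_store.append(event)
    account = event["account"]
    read_model[account] = read_model.get(account, 0) + event["amount"]

def handle_query(account: str) -> int:
    """Read path: never touches the write store."""
    return read_model.get(account, 0)

handle_command({"account": "alice", "amount": 50})
print(handle_query("alice"))  # 50
```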
> Gas Town is also expensive as hell. You won't like Gas Town if you ever have to think, even for a moment, about where money comes from. I had to get my second Claude Code account, finally; they don't let you siphon unlimited dollars from a single account, so you need multiple emails and siphons, it's all very silly. My calculations show that now that Gas Town has finally achieved liftoff, I will need a third Claude Code account by the end of next week. It is a cash guzzler.
Since I am quite capable of shitting my own code up for free, and I've got zero interest in this stupid AI shit anyway, I'm vanishingly unlikely to actually use this. But, still: I like to keep half an eye on what is going on, even if I hate it. And I am more than somewhat intrigued about what the numbers actually look like.
We're trying to orchestrate a horde of agents. The workers (polecats?) are the main problem solvers. Now you need a top level agent (mayor) to break down the problem and delegate work, and then a merger to resolve conflicts in the resulting code (refinery). Sometimes agents get stuck and need encouragement.
The molecules stuff confused me, but I think they're just "policy docs," checklists to do common tasks.
But this is baby stuff. Only one level of hierarchy? Show me a design for your VP agent and I'll be impressed for real.
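(A toy mapping of the roles described above onto plain names, purely illustrative and not how Gas Town actually implements them:)

```python
# Illustrative role mapping only -- not Gas Town's actual code.
from dataclasses import dataclass, field

@dataclass
class Worker:                      # "polecat": solves one delegated task
    name: str
    def solve(self, task: str) -> str:
        return f"{self.name}: patch for '{task}'"

@dataclass
class Merger:                      # "refinery": single point where patches meet
    def merge(self, patches: list[str]) -> str:
        return " + ".join(patches)

@dataclass
class Orchestrator:                # "mayor": breaks the problem down, delegates
    workers: list[Worker] = field(default_factory=list)
    merger: Merger = field(default_factory=Merger)

    def run(self, problem: str) -> str:
        tasks = [f"{problem} (part {i + 1})" for i in range(len(self.workers))]
        patches = [w.solve(t) for w, t in zip(self.workers, tasks)]
        return self.merger.merge(patches)

print(Orchestrator(workers=[Worker("w1"), Worker("w2")]).run("add ingestion for site y"))
```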
Has to be close for the shortest time from first commit to HN front page.
...no, I haven't lost the plot. I'm seeing another fad of the intoxicated parting with their money bending a useful tool into a golden hammer of a caricature. I dread seeing the eventual wreckage and self-realization from the inevitable hangover.
I've never understood this argument. Do you ever work with other humans? They are very much not deterministic, yet they can often produce useful code that helps you achieve more than you could by yourself.
I'll add a personal anecdote - 2 years ago, I wrote a SwiftUI app by myself (mind you, I'm mostly an infrastructure/backend guy with some expertise in front end, where I get the general stuff but never really made anything big out of it other than stuff on LAMPP back in the 2000s) and it took me a few weeks to get it to do what I wanted, with the bare minimum of features. As I was playtesting my app, I kept writing a wishlist of features for myself, and later when I put it on the App Store, people around the world would email me asking for other features. But life, work, etc. would get in the way, and I would have no time to actually do them, as some of the features would take me days or weeks.
Fast forward to 2 weeks ago: at this point I'm very familiar with Claude Code, how to steer multiple agents at a time, quickly review their outputs, stitch things together in my head, and ask for the right things. I've completed almost all of the features, rewrote the app, and it's already been submitted to the App Store. The code isn't perfect, but it's also not that bad. Honestly, it's probably better than what I would've written myself. It's an app that can be memory intensive in some parts, and it's been doing well in my testing. On top of that, since I've been steering 2-3 agents actively myself, I have the entire codebase in my mind. I also have an overwhelming amount of notes on what I would do better, etc.
My point is, if you have enough expertise and experience, you'll be able to "stitch things together" cleaner than others with no expertise. This also means user acquisition, marketing and data will be more valuable than the product itself, since it'll be easier to develop competing products. Finding users for your product will be the hard part. Which kinda sucks, if I'm honest, but it is what it is.
I've had the same experience as you. I've applied it to old projects which I have some frame of reference for and it's like a 200x speed boost. Just absolutely insane - that sort of speed can overcome a lot of other shortcomings.
I'm a full stack dev, and solo, so I write data schema, backends and frontends at the same time, usually flipping between them to test parts of new features. As far as AI use, I'm really just at the level of using a single Claude agent in an IDE - and only occasionally, because it writes a lot of nonsense. So maybe I'm missing out on the benefits of multiple agents. But where I currently see value in it is in writing (a) boilerplate and (b) sugar - where it has full access to a large and stable codebase. Where I think it fails is in writing overarching logical structures, especially early on in a project. It isn't good at writing elegant code with a clear view of how data, back and front should work together. When I've tried to start projects from scratch with Claude, it feels like I'm fighting against its micro-view of each piece of code, where it's unable to gain a macro-view of how to orchestrate the whole system.
So like, maybe a bottomless wallet and a dozen agents would help with that, but there isn't so much room for errors or bugs in my work code as there is in my fun/play/casual game code. As a result I'm not really seeing that much value in it for paid work.
If your end goal is to produce some usable product, then the implementation details matter less. Does it work? Yes? OK, then maybe don't wrestle with the agent over specific libraries or coding patterns.
I don’t see how we get there, though, at least in the short term. We’re still living in the heavily-corporate-subsidized AI world with usage-based pricing shenanigans abound. Even if frontier models providers find a path to profitability (which is a big “if”), there’s no way the price is gonna go anywhere but up. It’s moviepass on steroids.
Consumer hardware capable of running open models that compete with frontier models is still a long ways away.
Plus, and maybe it’s just my personal cynicism showing, but when did tech ever reduce pricing while maintaining quality on a provided service in the long run? In an industry laser focused on profit, I just don’t see how something so many believe to be a revolutionary force in the market will be given away for less than it is today.
Billions are being invested with the expectation that it will fetch much more revenue than it’s generating today.
There's little evidence this is true. Even OpenAI who is spending more than anyone is only losing money because of the free version of ChatGPT. Anthropic says they will be profitable next year.
> Plus, and maybe it’s just my personal cynicism showing, but when did tech ever reduce pricing while maintaining quality on a provided service in the long run? In an industry laser focused on profit, I just don’t see how something so many believe to be a revolutionary force in the market will be given away for less than it is today.
Really?
I mean, I guess I'm showing my age, but the idea that I can get a VM for a couple of dollars a month and expect it to be reliable makes me love the world I live in. But I guess when I started working there was no cloud, and getting root on a server meant investing thousands of dollars.
According to Ed Zitron, Anthropic spent more than its total revenue in the first 9 months of 2025 on AWS alone: $2.66 billion on AWS compute against an estimated $2.55 billion in revenue. That's just AWS, not payroll, not other software or hardware spend. He's regularly reporting concrete numbers that look horrible for the industry, while hyperscalers and foundation model companies continue to make general statements and refuse to get specific or release real revenue figures. If you only listen to what the CEOs are saying, then sure, it sounds great.
Anthropic also said that AI would be writing 95% of code in 3 months or something, however many months ago that was.
Yes, but it's unclear how much of that is training costs vs operational costs. They are very different things.
But how many of those providers are too subsidizing their offering through investment capital? I don't know offhand of anyone in this space that is running at or close to breakeven.
It feels very much like the early days of streaming when you could watch everything with a single Netflix account. Those days are long gone and never coming back.
We're also seeing significant price reductions every year for LLMs. Not for frontier models, but you can get the equivalent of last year's model for cheaper. Hard to tell from the outside, but I don't think it's all subsidized?
I think maybe people over-updated on Bitcoin mining. Most tech is not inherently expensive.
That's an old world that we experienced in the 2000s, and maybe the early 2010s, where we cared about the quality of a provided service in the long run. For general web-app stuff, that's long gone, as (mostly) everyone has a very short attention span, and what matters is "can the thing I desire be done right now". In the long run? Who cares. I keep seeing this in everyday life, at work, in discussions with my previous clients, etc.
Once again, I wish it weren't true, but nothing suggests that it isn't.
Or, if it does _now_, how long will it be before it works well using downloadable models that'll run on, say, a new car's worth of Mac Studios with a bunch of RAM in them, allowing a small fleet of 70B and 120B models (or larger) to run locally? Perhaps even specialised models for each of the roles this uses?
If training of new models ceased, and hardware was just dedicated to inference, what would that do to prices and speed? It's not clear to me how much inference is actually being subsidized over the actual cost to run the hardware to do it. If there's good data on that I'd love to learn more though.
Since we have version control, you can restart anywhere if you think it's a good place to fork from. I like greenfield development, but I suspect that there are going to be a lot more forks from now on, much like the game modding scene.
Companies with money-making businesses are gonna find themselves in an interesting spot when the "vibe juniors" are the vast majority of the people they can find to hire. New ways will be needed to reduce the risk.
...go to jail?
I have enjoyed Steve's rants since "Execution in the Kingdom of Nouns" and the Google "Platform rant", but he may need someone to talk to him about bamboo and what a terrible life choice it is. Unless you can keep it the hell away from you and your neighbours it is bad, very bad. I'm talking about clumping varieties, the runners are a whole other level.
There is a repo and I am not sure; the only way to resolve it probably is to spend some of that money he’s talking about.
In the past, a large codebase indicated that maybe you might take the project seriously, as some human effort was expended in its creation. There were still some outliers like Urbit and its 144 KLOC of Hoon code, perverse loobeans and all.
Now if I get so much as a whiff of AI scent on a project, I lose all interest. It indicates that the author didn't invest even a modicum of their own time in the project, so why should I waste mine on it?
(I use LLM-based coding tools in some of my projects, but I have the self-respect to review the generated code before publishing it.)
Of course as a developer you still have to take responsibility for your code, minimally including a disclaimer, and not dumping this code in to someone else’s code base. For example at work when submitting MRs I do generally read the code and keep MRs concise.
I’ve found that there is a certain kind of coder that hears of someone not reading the code and this sounds like some kind of moral violation to them. It’s not. It’s some weird new kind of coding where I’m more creating a detailed description of the functionality I want and incrementally refining it and iterating on it by describing in text how I want it to change. For example I use it to write GUI programs for Ubuntu using GTK and python. I’m not familiar with python-gtk library syntax or GTK GUI methods so there’s not really much of a point in reading the code - I ask the machine to write that precisely because I’m unfamiliar with it. When I need to verify things I have to come up with ways for the machine to test the code on its own.
Point is I think it’s honestly one new legitimate way of using these tools, with a lot of caveats around how such generated code can be responsibly used. If someone vibe coded something and didn’t read it and I’m worried it contains something dangerous, I can ask Claude to analyze it and then run it in a docker container. I treat the code the same way the author does - as a slightly unknown pile of functions which seem to perform a function but may need further verification.
I’m not sure what this means for the software world. On the face of it it seems like it’s probably some kind of problem, but I think at the same time we will find durable use cases for this new mode of interacting with code. Much the same as when compilers abstracted away the assembly code.
This is not exactly that, but it is one step up. Having agents output code that then gets compiled/interpreted/whatever, based upon contextual instruction, feels very, very familiar to engineers who have ever worked close to the metal.
"Old fashioned", in this aspect, would be putting guardrails in place so that you knew that what the agent/compiler was creating was what you wanted. Many years ago, that was binaries or bytecode packaged with lots of symbols for debugging. Today, that's more automated testing.
I started "fully vibecoding" 6 months ago, on a side-project, just to see if it was possible.
It was painful. The models kept breaking existing functionality, overcomplicating things, and generally just making spaghetti ("You're absolutely right! There are 4 helpers across 3 files that have overlapping logic").
A combination of adjusting my process (read: context management) and the models getting better, has led me to prefer "fully vibecoding" for all new side-projects.
Note: I still read the code that gets merged for my "real" work, but it's no longer difficult for me to imagine a future where that's not the case.
[0] https://github.com/kucherenko/jscpd
https://github.com/shepherdjerred/scout-for-lol/blob/main/es...
2 years sounds more likely than 2 months since the established norms and practices need to mature a lot more than this to be worthy of the serious consideration of the considerably serious.
On my personal project I do sometimes chat with ChatGPT and it works as a rubber duck. I explain, put my thoughts into words and typically I already solve it when I'm thinking it through while expressing it in words. But I must also admit that ChatGPT is very good at producing prose and I often use it for recommending names of modules, functions, enums etc. So there's some value there.
But when it comes to code, I want to understand everything that goes into my project. So at the end of the day I'm always going to be the "bottleneck", whether I think through the problem myself and write the code, or I review and try to understand the AI-generated code slop.
It seems to me that the AI slop generation workflow is a great fit for the industry though: more quantity rather than quality, and continuous churn. Make it cheaper to replace code so that the replacement can be replaced a week later with more vibe-coded slop. Quality might drop, bugs might proliferate, but who cares?
LLMs are far from being as trustworthy as compilers.
Now I've got tools and functionality that I would have paid for before as separate apps that are running "for free" locally.
I can't help but think this is the way forward and we'll just have to deal with the landmine as/when it comes, or hope that the tooling gets drastically better so the landmine isn't as powerful as we fear.
Most likely, tens of other bugs are being introduced at each step, etc etc, right?
I don't know the details, but I was wondering why people aren't "just" writing chat venues and comms protocols for the chats. So the fundamental unit is a chat that humans and agents can be members of.
You can also have DMs etc to avoid chattiness.
But fundamentally, if you start with this kind of madness you don't have a strict hierarchy, and it might also be fun to see how it goes.
I briefly started building this but just spun out and am stuck using PAL MCP for now and some dumb scripts. Not super content with any of it yet.
He as a dev should know that adding a layer of names on top of already named entities is not a good practice. But he just had fun and this came up. Which is fantastic. But I don't want to have to translate names in my head all the time.
Just not useful. Beads also... really sorry to say this, but it is a task runner with labels that has zero awareness of the actual tasks.
I don't know, maybe I am wrong, but this just doesn't seem like a thing that will work. Which is why I think it will be popular, nobody will be able to make it work, but they will not want to look dumb and will say it is awesome and amazing. Like another AI thingy I could name but will not that everyone is using.
But I love Yegge and hope he does well. Amp, for the little bit that I used it, is a really solid agent and delivered much better results than many others.
5 more comments available on Hacker News