Beliefs That Are True for Regular Software but False When Applied to AI
Posted 3 months ago · Active 3 months ago
boydkane.com · Tech · story · High profile
calm / mixed
Debate: 80/100
Key topics: AI, Software Development, LLMs
The article discusses how common assumptions about software development don't apply to AI systems, sparking a discussion on the limitations and potential risks of AI.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion · First comment: 1h · Peak period: 120 comments in 0-12h · Avg / period: 22.9
Comment distribution: 160 data points (based on 160 loaded comments)
Key moments
- Story posted: Oct 14, 2025 at 2:26 PM EDT (3 months ago)
- First comment: Oct 14, 2025 at 3:33 PM EDT (1h after posting)
- Peak activity: 120 comments in 0-12h, the hottest window of the conversation
- Latest activity: Oct 21, 2025 at 1:50 PM EDT (3 months ago)
ID: 45583180 · Type: story · Last synced: 11/22/2025, 11:00:32 PM
:)
Was that a human Freudian slip, or an artificial one?
Yes, old software is often more reliable than new.
I much prefer the alternative, where it's written in a manner such that you can almost prove it's bug-free by comprehensively unit testing the parts.
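A minimal sketch of what that can look like (the function and tests here are my own illustration, not from the thread): if a part is small and pure, you can test it against its entire meaningful input domain, which is about as close to a proof as a test suite gets.

    # Exhaustively test a tiny pure function over a small input domain.
    import itertools
    import unittest

    def clamp(value: int, low: int, high: int) -> int:
        """Clamp value into the inclusive range [low, high]."""
        return max(low, min(value, high))

    class ClampExhaustiveTest(unittest.TestCase):
        def test_small_domain_exhaustively(self):
            # combinations_with_replacement on a sorted range guarantees low <= high.
            for low, high in itertools.combinations_with_replacement(range(-5, 6), 2):
                for value in range(-10, 11):
                    result = clamp(value, low, high)
                    self.assertTrue(low <= result <= high)
                    if low <= value <= high:
                        self.assertEqual(result, value)

    if __name__ == "__main__":
        unittest.main()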
Old software is typically more reliable, not because the developers were better or the software engineering targeted a higher reliability metric, but because it's been tested in the real world for years. Even more so if you consider a known bug to be "reliable" behavior: "Sure, it crashes when you enter an apostrophe in the name field, but everyone knows that, there's a sticky note taped to the receptionist's monitor so the new girl doesn't forget."
Maybe the new software has a more comprehensive automated testing framework - maybe it simply has tests, where the old software had none - but regardless of how accurate you make your mock objects, decades of end-to-end testing in the real world is hard to replace.
As an industrial controls engineer, when I walk up to a machine that's 30 years old but isn't working anymore, I'm looking for failed mechanical components. Some switch is worn out, a cable got crushed, a bearing is failing...it's not the code's fault. It's not even the CMOS battery failing and dropping memory this time, because we've had that problem 4 times already, we recognize it and have a procedure to prevent it happening again. The code didn't change spontaneously, it's solved the business problem for decades... Conversely, when I walk up to a newly commissioned machine that's only been on the floor for a month, the problem is probably something that hasn't ever been tried before and was missed in the test procedure.
And more often than not the issue is a local configuration issue, bad test data, a misunderstanding of what the code is supposed to do, not being aware of some alternate execution path or other pre/post processing that is running, some known issue that we've decided not to fix for some reason, etc. (And of course sometimes we do actually discover a completely new bug, but it's rare).
To be clear, there are certainly code quality issues present that make modifications to the code costly and risky. But the code itself is quite reliable, as most bugs have been found and fixed over the years. And a lot of the messy bits in the code are actually important usability enhancements that get bolted on after the fact in response to real-world user feedback.
The reality is that management has been misaligned with proper software engineering craftsmanship at every org I've worked at except one, and that was because the top director who oversaw all of us was also a developer, and he let our team lead direct us however he saw fit.
The author is talking about the maturity of a project. Likewise, as AI technologies become more mature we will have more tools to use them in a safer and more reliable way.
E.g. you might fix a bug by adding a hacky workaround in the code; better product, worse code.
New code is the source of new bugs. Whether that's an entirely new product, a new feature on an existing project, or refactoring.
Yes, I did that.
If you think modern software is unreliable, let me introduce you to our friend, Rational Rose.
Or debuggers that would take out the entire OS.
Or a bad driver crashing everything multiple times a week.
Or a misbehaving process not handing control back to the OS.
I grew up in the era of 8- and 16-bit micros and early PCs; they were hilariously less stable than modern machines while doing far less. There wasn't some halcyon age of near-perfect software; it's always been a case of things being good enough to be good enough, but at least operating systems did improve.
The fact you continued to have BSOD issues after a full reinstall is pretty strong evidence you probably had some kind of hardware failure.
My point is that if you are using the same "old" modern hardware, a BSOD is very rare.
It's why I don't play the new trackmania.
Windows is only stabilizing because it's basically dead. All the activity is in the higher layers, where they are racking their brains on how to enshittify the experience, and extract value out of the remaining users.
There were plenty of other issues, including the fact that you had to adjust the right IRQ and DMA for your Sound Blaster manually, both physically and in each game, or that you needed to "optimize" memory usage, enable XMS or EMS or whatever it was at the time, or that you spent hours looking at the nice defrag/diskopt playing with your files, etc.
More generally, as you hint to, desktop operating systems were crap, but the software on top of it was much more comprehensively debugged. This was presumably a combination of two factors: you couldn't ship patches, so you had a strong incentive to debug it if you wanted to sell it, and software had way fewer features.
Come to think about it, early browsers kept crashing and taking down the entire OS, so maybe I'm looking at it with rosy glasses.
Last year I assembled a retro PC (Pentium 2, Riva TNT 2 Ultra, Sound Blaster AWE64 Gold) running Windows 98 to remember my childhood, and it is more stable than I remembered, but still way worse than modern systems. There are plenty of games that will refuse to work for whatever reason, or that will crash the whole OS, especially when exiting, and require a hard reboot.
Oh and at least in the '90s you could already ship patches, we used to get them with the floppies and later CDs provided by magazines.
As I mess around with these old machines for fun in my free time, I encounter these kinds of crashes pretty dang often. It's hard to tell whether it's just that the old hardware is broken in odd ways, so I can't fully blame the old software, but things are definitely pretty unreliable on old desktop Windows running old desktop Windows apps.
But I was thinking of the (not particularly) golden days of MS-DOS/DR-DOS/Amiga/Atari applications.
Remember when debuggers were young?
Remember when OSes were young?
Remember when multi-tasking CPUs were young?
Etc...
They're NOT saying all software in the past was better.
ISTR someone else round here observing how much more effective it is to ask these things to write short scripts that perform a task than to have them do the task themselves, and this is my experience as well.
If/when AI actually gets much better it will be the boss that has the problem. This is one of the things that baffles me about the managerial globalists - they don't seem to appreciate that a suitably advanced AI will point the finger at them for inefficiency much more so than at the plebs, for which it will have a use for quite a while.
It's no different from those on HN who yell loudly that unions for programmers are the worst idea ever... "it will never be me" is all they can think; then they'll be the ones protesting in the streets when it is them, but only after the hypocrisy of mocking those protesting in the streets today.
Unionized software engineers would solve a lot of the "we always work 80 hour weeks for 2 months at the end of a release cycle" problems, the "you're too old, you're fired" issues, the "new hires seems to always make more than the 5/10+ year veterans", etc. Sure, you wouldn't have a few getting super rich, but it would also make it a lot easier for "unionized" action against companies like Meta, Google, Oracle, etc. Right now, the employers hold like 100x the power of the employees in tech. Just look at how much any kind of resistance to fascism has dwindled after FAANG had another round of layoffs..
I guess if managers get canned, it'll be just marketing types left?
But don't count on it.
I mean, apart from anything else, that's still a bad outcome.
[1] https://fortune.com/article/jamie-dimon-jpmorgan-chase-ceo-a...
(Has anyone tried an LLM on an in-basket test? [1] That's a basic test for managers.)
[1] https://en.wikipedia.org/wiki/In-basket_test
On the other hand, trying to do something "new" is lots of headaches, so emotions are not always a plus. I could draw a parallel to doctors: you don't want a doctor to start crying in the middle of an operation because he feels bad for you, but you can't let doctors do everything they want - there need to be some checks on them.
Now most of the photos online are just AI generated.
Just as human navigators can find the smallest islands out in the open ocean, human curators can find the best information sources without getting overwhelmed by generated trash. Of course, fully manual curation is always going to struggle to deal with the volumes of information out there. However, I think there is a middle ground for assisted or augmented curation which exploits the idea that a high quality site tends to link to other high quality sites.
One thing I'd love is to be able to easily search all the sites in a folder full of bookmarks I've made. I've looked into it and it's a pretty dire situation. I'm not interested in uploading my bookmarks to a service. Why can't my own computer crawl those sites and index them for me? It's not exactly a huge list.
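This is straightforward to sketch with nothing but the standard library; the structure below is my own assumption of what a minimal version might look like (it needs an SQLite build with FTS5, which most Python installs ship), not an existing tool.

    # Hypothetical sketch: crawl a short list of bookmarked URLs and index the
    # text locally with SQLite FTS5 so it can be searched offline.
    import re
    import sqlite3
    import urllib.request

    BOOKMARKS = ["https://example.com/", "https://example.org/"]  # your exported bookmarks

    conn = sqlite3.connect("bookmarks.db")
    conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(url, body)")

    for url in BOOKMARKS:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip dead links instead of failing the whole crawl
        text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping, fine for a sketch
        conn.execute("INSERT INTO pages (url, body) VALUES (?, ?)", (url, text))
    conn.commit()

    # Query the local index, ranked by FTS5's built-in relevance score.
    for (url,) in conn.execute(
        "SELECT url FROM pages WHERE pages MATCH ? ORDER BY rank", ("software",)
    ):
        print(url)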
Perhaps because most of the smartest people I know are regularly irrational or impulsive :)
* https://tvtropes.org/pmwiki/pmwiki.php/Main/StrawVulcan
I hear that. Then I try to use AI for a simple code task: writing unit tests for a class, very similar to other unit tests. It fails miserably. It forgets to add an annotation and enters a death loop of bullshit code generation. It generates test classes that test failed test classes that test failed test classes, and so on. Fascinating to watch. I wonder how much CO2 it generated while frying some Nvidia GPU in an overpriced data center.
AI singularity may happen, but the Mother Brain will be a complete moron anyway.
I don't know how long that exponential will continue for, and I have my suspicions that it stops before week-long tasks, but that's the trend-line we're on.
Or watch the Computerphile video summary/author interview, if you prefer: https://m.youtube.com/watch?v=evSFeqTZdqs
The cases I'm thinking about are things that could be solved in a few minutes by someone who knows what the issue is and how to use the tools involved. I spent around two days trying to debug one recent issue. A coworker who was a bit more familiar with the library involved figured it out in an hour or two. But in parallel with that, we also asked the library's author, who immediately identified the issue.
I'm not sure how to fit a problem like that into this "duration of human time needed to complete a task" framework.
While I think they're trying to cover that by getting experts to solve problems, it is definitely the case that humans learn much faster than current ML approaches, so "expert in one specific library" != "expert in writing software".
It's well worth looking at https://progress.openai.com/, here's a snippet:
> human: Are you actually conscious under anesthesia?
> GPT-1 (2018): i did n't . " you 're awake .
> GPT-3 (2021): There is no single answer to this question since anesthesia can be administered [...]
Yes.
LLM is a very interesting technology for machines to understand and generate natural language. It is a difficult problem that it sort of solves.
It does not understand things beyond that. Developing software is not simply a natural language problem.
And also where people believe that others believe it resides. Etc...
If we can find new ways to collectively renegotiate where we think power should reside we can break the cycle.
But we only have time to do this until people aren't a significant power factor anymore. But that's still quite some time away.
Our best technology currently requires teams of people to operate and entire legions to maintain. This leads to a sort of balance: one single person can never go too far down any path on their own unless they convince others to join/follow them. That doesn't make this a perfect guard, we've seen it go horribly wrong in the past, but, at least in theory, this provides a dampening factor. It requires a relatively large group to go far along any path, towards good or evil.
AI reduces this. How greatly it reduces this, if it reduces it to only a handful, to a single person, or even to 0 people (putting itself in charge), seems to not change the danger of this reduction.
Hm, I'm listening, let's see.
> Software vulnerabilities are caused by mistakes in the code
That's not exactly true. In regular software, the code can be fine and you can still end up with vulnerabilities. The platform on which the code is deployed could be vulnerable, the way it is installed could make it vulnerable, and so on.
> Bugs in the code can be found by carefully analysing the code
Once again, not exactly true. Have you ever tried understanding concurrent code just by reading it? Some bugs in regular software hide in places that human minds cannot probe.
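A minimal illustration of that point (my own example, not from the thread): the loop below looks obviously correct when read line by line, yet count += 1 is a read-modify-write, so two threads can interleave and lose updates. Whether any given run shows a wrong total depends on scheduling, which is exactly why reading the code isn't enough.

    # Looks correct on paper, but `count += 1` is not atomic: a thread can be
    # preempted between reading `count` and writing it back, losing increments.
    import threading

    count = 0

    def worker(iterations: int) -> None:
        global count
        for _ in range(iterations):
            count += 1  # read, add, write -- another thread can slip in between

    threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Expected 800000; depending on the interpreter and scheduling this may
    # print exactly that or something smaller -- the bug only exists in some
    # interleavings, which is what makes it so hard to see by reading.
    print(count)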
> Once a bug is fixed, it won’t come back again
Ok, I'm starting to feel this is a troll post. This guy can't be serious.
> If you give specifications beforehand, you can get software that meets those specifications
Have you read The Mythical Man-Month?
> these claims mostly hold, but they break down when applied to distributed systems, parallel code, or complex interactions between software systems and human processes
The claims the GP quoted DON’T mostly hold, they’re just plain wrong. At least the last two, anyway.
> Once a bug is fixed, it won’t come back again
Regressions are extremely common, which is why regression tests are so important. It is definitely not uncommon for bugs that were fixed once to come back again. This statement doesn't "mostly hold".
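A minimal sketch of the mechanism (the bug and names below are invented for illustration): once a bug is fixed, a test that reproduces the original failing input is committed alongside the fix, so a later refactor can't silently bring it back.

    # Hypothetical example: a fixed bug gets a permanent regression test.
    import unittest

    def paginate(items: list, page_size: int) -> list:
        """Split items into pages of at most page_size elements."""
        return [items[i : i + page_size] for i in range(0, len(items), page_size)]

    class PaginateRegressionTest(unittest.TestCase):
        def test_last_partial_page_is_kept(self):
            # Regression test for a (hypothetical) bug where a trailing partial
            # page was dropped when len(items) wasn't a multiple of page_size.
            self.assertEqual(paginate([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]])

    if __name__ == "__main__":
        unittest.main()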
> If you give specifications beforehand, you can get software that meets those specifications
In theory, maybe, but in practice it's messy. It depends on your acceptance testing. It depends on whether stakeholders change their mind during implementation (not uncommon). It depends on whether the specification was complete and nothing new is learned during development (almost never the case). Providing a specification in advance does not necessarily mean what you get out the other end meets that specification, unless it's a relatively small, trivial, non-changing piece and the stakeholders have the discipline not to try to change things part-way through. I mean, sure, it may be true that if you throw a spec over the wall, then you will get something thrown back that meets the spec, but the real world isn't so simple.
> I’m also going to be making some sweeping statements about “how software works”, these claims mostly hold, but they break down when applied to distributed systems, parallel code, or complex interactions between software systems and human processes.
I'd argue that this describes most software written since, uh, I hesitate to even commit to a decade here.
If you include analog computers, then there are some WWII targeting computers that definitely qualify (e.g., on aircraft carriers).
He is trying to relax the general public's perception of AI's shortcomings. He's giving AI a break, at the expense of regular developers.
This is wrong on two fronts:
First, because many people foresaw the AI shortcomings and warned about them. This "we can't fix a bug like in regular software" theatre hides the fact that we can design better benchmarks, or accountability frameworks. Again, lots of people foresaw this, and they were ignored.
Second, because it puts the strain on non-AI developers. It tars the whole industry, putting AI and non-AI together in the same bucket, as if AI companies stumbled on this new thing and were not prepared for its problems, when the reality is that many people were anxious about the AI companies' practices not being up to standard.
I think it's a disgraceful take, that only serves to sweep things under a carpet.
The fact is, we kind of know how to prevent problems in AI systems:
- Good benchmarks. People said several times that LLMs display erratic behavior that could be prevented. Instead of adjusting the benchmarks (which would slow down development), they ignored the issues.
- Accountability frameworks. Who is responsible when an AI fails? How is the company responsible for the model going to make up for it? That was a demand from the very beginning. There are no such accountability systems in place. It's a clown fiesta.
- Slowing down. If you have a buggy product, you don't scale it. First, you try to understand the problem. This was the opposite of what happened, and at the time, they lied that scaling would solve the issues (when in fact many people knew for a fact that scaling wouldn't solve shit).
Yes, it's kind of different. But it's a different we already know. Stop pushing this idea that this stuff is completely new.
'we' is the operative word here. 'We', meaning technical people who have followed this stuff for years. The target audience of this article are not part of this 'we' and this stuff IS completely new _for them_. The target audience are people who, when confronted with a problem with an LLM, think it is perfectly reasonable to just tell someone to 'look at the code' and 'fix the bug'. You are not the target audience and you are arguing something entirely different.
What should I say now? "AI works in mysterious ways"? Doesn't sound very useful.
Also, should I start parroting inaccurate, outdated generalizations about regular software?
The post doesn't teach anything useful for a beginner audience. It's bamboozling them. I am amazed that you used the audience perspective as a defense of some kind. It only made it worse.
Please, please, take a moment to digest my critique properly. Think about what you just said and what that implies. Re-read the thread if needed.
This is not at all what I'm trying to do. This same essay is cross-posted on LessWrong[1] because I think ASI is the most dangerous problem of our time.
> This "we can't fix a bug like in regular software" theatre hides the fact that we can design better benchmarks, or accountability frameworks
I'm not sure how I can say "your intuitions are wrong and you should be careful" and have that be misinterpreted as "ignore the problems around AI".
[1]: https://www.lesswrong.com/posts/ZFsMtjsa6GjeE22zX/why-your-b...
There are people (definitely not me) who buy 100% of anything AI they read. To that beginner enthusiastic audience, your text looks like "regular software is old and unintuitive, AI is better because you can just vibe", and the text reinforces that sentiment.
Did you read the footnote about writing regression tests to catch bugs before they come back in production?
https://news.ycombinator.com/item?id=45583970
Thought I might just skip the repetition. You can continue the conversation within that thread.
This also is a misunderstanding.
The LLM can be fine, and the training and data can be fine, but the LLMs we use are non-deterministic: randomness is injected into sampling on purpose, partly to avoid always failing the same scenarios. So current systems are, by design, not going to answer every question correctly that they could have answered had the sampled values happened to land on the right tokens for that scenario. You roll the dice on every answer.
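A toy sketch of that dice roll (an invented distribution, not any particular model): with the sampling temperature at 0 the same token always wins; above 0, softmax sampling can pick a lower-probability token on any given run.

    # Toy illustration of temperature sampling: the model's scores never change,
    # but the sampled answer can differ from run to run once temperature > 0.
    import math
    import random

    logits = {"1,2,3": 3.0, "3,2,1": 1.0, "I don't know": 0.5}  # made-up scores

    def sample(logits: dict, temperature: float) -> str:
        if temperature == 0:
            return max(logits, key=logits.get)  # greedy decoding: deterministic
        weights = [math.exp(v / temperature) for v in logits.values()]
        return random.choices(list(logits), weights=weights, k=1)[0]

    print(sample(logits, 0.0))                      # always "1,2,3"
    print([sample(logits, 1.0) for _ in range(5)])  # usually "1,2,3", sometimes not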
Huh? If I need to sort the list of integers 3,1,2 in ascending order, the only correct answer is 1,2,3. And there are multiple programming and mathematical questions with only one correct answer.
If you want to say "some programming and mathematical questions have several correct answers" that might hold.
1, 2, 3
1,2,3
[1,2,3]
1 2 3
etc.
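The point of the list above is that one logical answer has many surface forms, so anything checking an answer has to normalize before comparing. A minimal sketch of that (my own, for illustration):

    # All of these are the same answer once the formatting is stripped away.
    import re

    def normalize(answer: str) -> list:
        """Extract the integers from a free-form answer, ignoring formatting."""
        return [int(tok) for tok in re.findall(r"-?\d+", answer)]

    for form in ["1, 2, 3", "1,2,3", "[1,2,3]", "1 2 3"]:
        assert normalize(form) == [1, 2, 3]
    print("all forms match")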
292 more comments available on Hacker News