Beliefs That Are True for Regular Software but False When Applied to AI
Posted 3 months ago · Active 3 months ago
boydkane.com · Tech · story · High profile
calm / mixed
Debate: 80/100
Key topics: AI, Software Development, LLMs
The article discusses how common assumptions about software development don't apply to AI systems, sparking a discussion on the limitations and potential risks of AI.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion · First comment: 1h · Peak period: 120 comments in 0-12h · Avg / period: 22.9
Comment distribution: 160 data points (based on 160 loaded comments)
Key moments
- Story posted: Oct 14, 2025 at 2:26 PM EDT (3 months ago)
- First comment: Oct 14, 2025 at 3:33 PM EDT (1h after posting)
- Peak activity: 120 comments in 0-12h, the hottest window of the conversation
- Latest activity: Oct 21, 2025 at 1:50 PM EDT (3 months ago)
ID: 45583180 · Type: story · Last synced: 11/22/2025, 11:00:32 PM
:)
Was that a human Freudian slip, or an artificial one?
Yes, old software is often more reliable than new.
I much prefer the alternative, where it's written in a manner such that you can almost prove it's bug-free by comprehensively unit testing the parts.
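A minimal sketch of what that can look like (the function and tests here are my own illustration, not from the thread): if a part is small and pure, you can test it against its entire meaningful input domain, which is about as close to a proof as a test suite gets.

    # Exhaustively test a tiny pure function over a small input domain.
    import itertools
    import unittest

    def clamp(value: int, low: int, high: int) -> int:
        """Clamp value into the inclusive range [low, high]."""
        return max(low, min(value, high))

    class ClampExhaustiveTest(unittest.TestCase):
        def test_small_domain_exhaustively(self):
            # combinations_with_replacement on a sorted range guarantees low <= high.
            for low, high in itertools.combinations_with_replacement(range(-5, 6), 2):
                for value in range(-10, 11):
                    result = clamp(value, low, high)
                    self.assertTrue(low <= result <= high)
                    if low <= value <= high:
                        self.assertEqual(result, value)

    if __name__ == "__main__":
        unittest.main()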
Old software is typically more reliable, not because the developers were better or the software engineering targeted a higher reliability metric, but because it's been tested in the real world for years. Even more so if you consider a known bug to be "reliable" behavior: "Sure, it crashes when you enter an apostrophe in the name field, but everyone knows that, there's a sticky note taped to the receptionist's monitor so the new girl doesn't forget."
Maybe the new software has a more comprehensive automated testing framework - maybe it simply has tests, where the old software had none - but regardless of how accurate you make your mock objects, decades of end-to-end testing in the real world is hard to replace.
As an industrial controls engineer, when I walk up to a machine that's 30 years old but isn't working anymore, I'm looking for failed mechanical components. Some switch is worn out, a cable got crushed, a bearing is failing...it's not the code's fault. It's not even the CMOS battery failing and dropping memory this time, because we've had that problem 4 times already, we recognize it and have a procedure to prevent it happening again. The code didn't change spontaneously, it's solved the business problem for decades... Conversely, when I walk up to a newly commissioned machine that's only been on the floor for a month, the problem is probably something that hasn't ever been tried before and was missed in the test procedure.
And more often than not the issue is a local configuration issue, bad test data, a misunderstanding of what the code is supposed to do, not being aware of some alternate execution path or other pre/post processing that is running, some known issue that we've decided not to fix for some reason, etc. (And of course sometimes we do actually discover a completely new bug, but it's rare).
To be clear, there are certainly code quality issues present that make modifications to the code costly and risky. But the code itself is quite reliable, as most bugs have been found and fixed over the years. And a lot of the messy bits in the code are actually important usability enhancements that get bolted on after the fact in response to real-world user feedback.
The reality is that management has been misaligned with proper software engineering craftsmanship at every org I've worked at except one, and that was because the top director who oversaw all of us was also a developer, and he let our team lead direct us however he saw fit.
The author is talking about the maturity of a project. Likewise, as AI technologies become more mature we will have more tools to use them in a safer and more reliable way.
E.g. you might fix a bug by adding a hacky workaround in the code; better product, worse code.
New code is the source of new bugs. Whether that's an entirely new product, a new feature on an existing project, or refactoring.
Yes, I did that.
If you think modern software is unreliable, let me introduce you to our friend, Rational Rose.
Or debuggers that would take out the entire OS.
Or a bad driver crashing everything multiple times a week.
Or a misbehaving process not handing control back to the OS.
I grew up in the era of 8- and 16-bit micros and early PCs; they were hilariously less stable than modern machines while doing far less. There wasn't some halcyon age of near-perfect software; it's always been a case of things being good enough to be good enough, but at least operating systems did improve.
The fact you continued to have BSOD issues after a full reinstall is pretty strong evidence you probably had some kind of hardware failure.
My point is that if you are using the same "old" modern hardware, a BSOD is very rare.
It's why I don't play the new trackmania.
Windows is only stabilizing because it's basically dead. All the activity is in the higher layers, where they are racking their brains on how to enshittify the experience, and extract value out of the remaining users.
There were plenty of other issues, including the fact that you had to adjust the right IRQ and DMA for your Sound Blaster manually, both physically and in each game, or that you needed to "optimize" memory usage, enable XMS or EMS or whatever it was at the time, or that you spent hours looking at the nice defrag/diskopt playing with your files, etc.
More generally, as you hint to, desktop operating systems were crap, but the software on top of it was much more comprehensively debugged. This was presumably a combination of two factors: you couldn't ship patches, so you had a strong incentive to debug it if you wanted to sell it, and software had way fewer features.
Come to think about it, early browsers kept crashing and taking down the entire OS, so maybe I'm looking at it with rosy glasses.
Last year I assembled a retro PC (Pentium 2, Riva TNT 2 Ultra, Sound Blaster AWE64 Gold) running Windows 98 to remember my childhood, and it is more stable than I remembered, but still way worse than modern systems. There are plenty of games that will refuse to work for whatever reason, or that will crash the whole OS, especially when exiting, and require a hard reboot.
Oh and at least in the '90s you could already ship patches, we used to get them with the floppies and later CDs provided by magazines.
As I mess around with these old machines for fun in my free time, I encounter these kinds of crashes pretty dang often. It's hard to tell whether it's just that the old hardware is broken in odd ways, so I can't fully blame the old software, but things are definitely pretty unreliable on old desktop Windows running old desktop Windows apps.
But I was thinking of the (not particularly) golden days of MS-DOS/DR-DOS/Amiga/Atari applications.
Remember when debuggers were young?
Remember when OSes were young?
Remember when multi-tasking CPUs were young?
Etc...
They're NOT saying all software in the past was better.
ISTR someone else round here observing how much more effective it is to ask these things to write short scripts that perform a task than to have them do the task themselves, and this is my experience as well.
If/when AI actually gets much better it will be the boss that has the problem. This is one of the things that baffles me about the managerial globalists - they don't seem to appreciate that a suitably advanced AI will point the finger at them for inefficiency much more so than at the plebs, for which it will have a use for quite a while.
It's no different from those on HN who yell loudly that unions for programmers are the worst idea ever... "it will never be me" is all they can think; then they'll be the ones protesting in the streets when it is them, but only after the hypocrisy of mocking those protesting in the streets today.
Unionized software engineers would solve a lot of the "we always work 80 hour weeks for 2 months at the end of a release cycle" problems, the "you're too old, you're fired" issues, the "new hires seems to always make more than the 5/10+ year veterans", etc. Sure, you wouldn't have a few getting super rich, but it would also make it a lot easier for "unionized" action against companies like Meta, Google, Oracle, etc. Right now, the employers hold like 100x the power of the employees in tech. Just look at how much any kind of resistance to fascism has dwindled after FAANG had another round of layoffs..
I guess if managers get canned, it'll be just marketing types left?
But don't count on it.
I mean, apart from anything else, that's still a bad outcome.
[1] https://fortune.com/article/jamie-dimon-jpmorgan-chase-ceo-a...
(Has anyone tried an LLM on an in-basket test? [1] That's a basic test for managers.)
[1] https://en.wikipedia.org/wiki/In-basket_test
On the other hand, trying to do something "new" is lots of headaches, so emotions are not always a plus. I could draw a parallel to doctors: you don't want a doctor to start crying in the middle of an operation because he feels bad for you, but you can't let doctors do everything they want - there need to be some checks on them.
Now most of the photos online are just AI generated.
Just as human navigators can find the smallest islands out in the open ocean, human curators can find the best information sources without getting overwhelmed by generated trash. Of course, fully manual curation is always going to struggle to deal with the volumes of information out there. However, I think there is a middle ground for assisted or augmented curation which exploits the idea that a high quality site tends to link to other high quality sites.
One thing I'd love is to be able to easily search all the sites in a folder full of bookmarks I've made. I've looked into it and it's a pretty dire situation. I'm not interested in uploading my bookmarks to a service. Why can't my own computer crawl those sites and index them for me? It's not exactly a huge list.
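This is straightforward to sketch with nothing but the standard library; the structure below is my own assumption of what a minimal version might look like (it needs an SQLite build with FTS5, which most Python installs ship), not an existing tool.

    # Hypothetical sketch: crawl a short list of bookmarked URLs and index the
    # text locally with SQLite FTS5 so it can be searched offline.
    import re
    import sqlite3
    import urllib.request

    BOOKMARKS = ["https://example.com/", "https://example.org/"]  # your exported bookmarks

    conn = sqlite3.connect("bookmarks.db")
    conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(url, body)")

    for url in BOOKMARKS:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip dead links instead of failing the whole crawl
        text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping, fine for a sketch
        conn.execute("INSERT INTO pages (url, body) VALUES (?, ?)", (url, text))
    conn.commit()

    # Query the local index, ranked by FTS5's built-in relevance score.
    for (url,) in conn.execute(
        "SELECT url FROM pages WHERE pages MATCH ? ORDER BY rank", ("software",)
    ):
        print(url)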
Perhaps because most of the smartest people I know are regularly irrational or impulsive :)
* https://tvtropes.org/pmwiki/pmwiki.php/Main/StrawVulcan
I hear that. Then I try to use AI for a simple code task: writing unit tests for a class, very similar to other unit tests. It fails miserably. It forgets to add an annotation and enters a death loop of bullshit code generation. It generates test classes that test failed test classes that test failed test classes, and so on. Fascinating to watch. I wonder how much CO2 it generated while frying some Nvidia GPU in an overpriced data center.
AI singularity may happen, but the Mother Brain will be a complete moron anyway.
I don't know how long that exponential will continue for, and I have my suspicions that it stops before week-long tasks, but that's the trend-line we're on.
Or watch the Computerphile video summary/author interview, if you prefer: https://m.youtube.com/watch?v=evSFeqTZdqs
The cases I'm thinking about are things that could be solved in a few minutes by someone who knows what the issue is and how to use the tools involved. I spent around two days trying to debug one recent issue. A coworker who was a bit more familiar with the library involved figured it out in an hour or two. But in parallel with that, we also asked the library's author, who immediately identified the issue.
I'm not sure how to fit a problem like that into this "duration of human time needed to complete a task" framework.
While I think they're trying to cover that by getting experts to solve problems, it is definitely the case that humans learn much faster than current ML approaches, so "expert in one specific library" != "expert in writing software".
It's well worth looking at https://progress.openai.com/, here's a snippet:
> human: Are you actually conscious under anesthesia?
> GPT-1 (2018): i did n't . " you 're awake .
> GPT-3 (2021): There is no single answer to this question since anesthesia can be administered [...]
Yes.
LLM is a very interesting technology for machines to understand and generate natural language. It is a difficult problem that it sort of solves.
It does not understand things beyond that. Developing software is not simply a natural language problem.
And also where people believe that others believe it resides. Etc...
If we can find new ways to collectively renegotiate where we think power should reside we can break the cycle.
But we only have time to do this until people aren't a significant power factor anymore. But that's still quite some time away.
Our best technology currently requires teams of people to operate and entire legions to maintain. This leads to a sort of balance: one single person can never go too far down any path on their own unless they convince others to join/follow them. That doesn't make this a perfect guard, we've seen it go horribly wrong in the past, but, at least in theory, this provides a dampening factor. It requires a relatively large group to go far along any path, towards good or evil.
AI reduces this. How greatly it reduces this, if it reduces it to only a handful, to a single person, or even to 0 people (putting itself in charge), seems to not change the danger of this reduction.
Hm, I'm listening, let's see.
> Software vulnerabilities are caused by mistakes in the code
That's not exactly true. In regular software, the code can be fine and you can still end up with vulnerabilities. The platform on which the code is deployed could be vulnerable, the way it is installed could make it vulnerable, and so on.
> Bugs in the code can be found by carefully analysing the code
Once again, not exactly true. Have you ever tried understanding concurrent code just by reading it? Some bugs in regular software hide in places that human minds cannot probe.
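A minimal illustration of that point (my own example, not from the thread): the loop below looks obviously correct when read line by line, yet count += 1 is a read-modify-write, so two threads can interleave and lose updates. Whether any given run shows a wrong total depends on scheduling, which is exactly why reading the code isn't enough.

    # Looks correct on paper, but `count += 1` is not atomic: a thread can be
    # preempted between reading `count` and writing it back, losing increments.
    import threading

    count = 0

    def worker(iterations: int) -> None:
        global count
        for _ in range(iterations):
            count += 1  # read, add, write -- another thread can slip in between

    threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Expected 800000; depending on the interpreter and scheduling this may
    # print exactly that or something smaller -- the bug only exists in some
    # interleavings, which is what makes it so hard to see by reading.
    print(count)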
> Once a bug is fixed, it won’t come back again
Ok, I'm starting to feel this is a troll post. This guy can't be serious.
> If you give specifications beforehand, you can get software that meets those specifications
Have you read The Mythical Man-Month?
> these claims mostly hold, but they break down when applied to distributed systems, parallel code, or complex interactions between software systems and human processes
The claims the GP quoted DON’T mostly hold, they’re just plain wrong. At least the last two, anyway.
> Once a bug is fixed, it won’t come back again
Regressions are extremely common, which is why regression tests are so important. It is definitely not uncommon for bugs that were fixed once to come back again. This statement doesn't "mostly hold".
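A minimal sketch of the mechanism (the bug and names below are invented for illustration): once a bug is fixed, a test that reproduces the original failing input is committed alongside the fix, so a later refactor can't silently bring it back.

    # Hypothetical example: a fixed bug gets a permanent regression test.
    import unittest

    def paginate(items: list, page_size: int) -> list:
        """Split items into pages of at most page_size elements."""
        return [items[i : i + page_size] for i in range(0, len(items), page_size)]

    class PaginateRegressionTest(unittest.TestCase):
        def test_last_partial_page_is_kept(self):
            # Regression test for a (hypothetical) bug where a trailing partial
            # page was dropped when len(items) wasn't a multiple of page_size.
            self.assertEqual(paginate([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]])

    if __name__ == "__main__":
        unittest.main()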
> If you give specifications beforehand, you can get software that meets those specifications
In theory, maybe, but in practice it's messy. It depends on your acceptance testing. It depends on whether stakeholders change their mind during implementation (not uncommon). It depends on whether the specification was complete and nothing new is learned during development (almost never the case). Providing a specification in advance does not necessarily mean what you get out the other end meets that specification, unless it's a relatively small, trivial, non-changing piece and the stakeholders have the discipline not to try to change things part-way through. I mean, sure, it may be true that if you throw a spec over the wall, then you will get something thrown back that meets the spec, but the real world isn't so simple.
> I’m also going to be making some sweeping statements about “how software works”, these claims mostly hold, but they break down when applied to distributed systems, parallel code, or complex interactions between software systems and human processes.
I'd argue that this describes most software written since, uh, I hesitate to even commit to a decade here.
If you include analog computers, then there are some WWII targeting computers that definitely qualify (e.g., on aircraft carriers).
He is trying to relax the general public's perception of AI's shortcomings. He's giving AI a break, at the expense of regular developers.
This is wrong on two fronts:
First, because many people foresaw the AI shortcomings and warned about them. This "we can't fix a bug like in regular software" theatre hides the fact that we can design better benchmarks, or accountability frameworks. Again, lots of people foresaw this, and they were ignored.
Second, because it puts the strain on non-AI developers. It tars the whole industry, putting AI and non-AI together in the same bucket, as if AI companies stumbled on this new thing and were not prepared for its problems, when the reality is that many people were anxious about the AI companies' practices not being up to standard.
I think it's a disgraceful take, that only serves to sweep things under a carpet.
The fact is, we kind of know how to prevent problems in AI systems:
- Good benchmarks. People said several times that LLMs display erratic behavior that could be prevented. Instead of adjusting the benchmarks (which would slow down development), they ignored the issues.
- Accountability frameworks. Who is responsible when an AI fails? How is the company responsible for the model going to make up for it? That was a demand from the very beginning. There are no such accountability systems in place. It's a clown fiesta.
- Slowing down. If you have a buggy product, you don't scale it. First, you try to understand the problem. This was the opposite of what happened, and at the time, they lied that scaling would solve the issues (when in fact many people knew for a fact that scaling wouldn't solve shit).
Yes, it's kind of different. But it's a different we already know. Stop pushing this idea that this stuff is completely new.
'we' is the operative word here. 'We', meaning technical people who have followed this stuff for years. The target audience of this article are not part of this 'we' and this stuff IS completely new _for them_. The target audience are people who, when confronted with a problem with an LLM, think it is perfectly reasonable to just tell someone to 'look at the code' and 'fix the bug'. You are not the target audience and you are arguing something entirely different.
What should I say now? "AI works in mysterious ways"? Doesn't sound very useful.
Also, should I start parroting inaccurate, outdated generalizations about regular software?
The post doesn't teach anything useful for a beginner audience. It's bamboozling them. I am amazed that you used the audience perspective as a defense of some kind. It only made it worse.
Please, please, take a moment to digest my critique properly. Think about what you just said and what that implies. Re-read the thread if needed.
This is not at all what I'm trying to do. This same essay is cross-posted on LessWrong[1] because I think ASI is the most dangerous problem of our time.
> This "we can't fix a bug like in regular software" theatre hides the fact that we can design better benchmarks, or accountability frameworks
I'm not sure how I can say "your intuitions are wrong and you should be careful" and have that be misinterpreted as "ignore the problems around AI".
[1]: https://www.lesswrong.com/posts/ZFsMtjsa6GjeE22zX/why-your-b...
There are people (definitely not me) who buy 100% of anything AI they read. To that beginner enthusiastic audience, your text looks like "regular software is old and unintuitive, AI is better because you can just vibe", and the text reinforces that sentiment.
Did you read the footnote about writing regression tests to catch bugs before they come back in production?
https://news.ycombinator.com/item?id=45583970
Thought I might just skip the repetition. You can continue the conversation within that thread.
This also is a misunderstanding.
The LLM can be fine, and the training and data can be fine, but the LLMs we use are non-deterministic: randomness is injected into sampling on purpose, partly to avoid always failing the same scenarios. So current systems are, by design, not going to answer every question correctly that they could have answered had the sampled values happened to land on the right tokens for that scenario. You roll the dice on every answer.
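A toy sketch of that dice roll (an invented distribution, not any particular model): with the sampling temperature at 0 the same token always wins; above 0, softmax sampling can pick a lower-probability token on any given run.

    # Toy illustration of temperature sampling: the model's scores never change,
    # but the sampled answer can differ from run to run once temperature > 0.
    import math
    import random

    logits = {"1,2,3": 3.0, "3,2,1": 1.0, "I don't know": 0.5}  # made-up scores

    def sample(logits: dict, temperature: float) -> str:
        if temperature == 0:
            return max(logits, key=logits.get)  # greedy decoding: deterministic
        weights = [math.exp(v / temperature) for v in logits.values()]
        return random.choices(list(logits), weights=weights, k=1)[0]

    print(sample(logits, 0.0))                      # always "1,2,3"
    print([sample(logits, 1.0) for _ in range(5)])  # usually "1,2,3", sometimes not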
Huh? If I need to sort the list of integers 3,1,2 in ascending order, the only correct answer is 1,2,3. And there are multiple programming and mathematical questions with only one correct answer.
If you want to say "some programming and mathematical questions have several correct answers" that might hold.
1, 2, 3
1,2,3
[1,2,3]
1 2 3
etc.
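The point of the list above is that one logical answer has many surface forms, so anything checking an answer has to normalize before comparing. A minimal sketch of that (my own, for illustration):

    # All of these are the same answer once the formatting is stripped away.
    import re

    def normalize(answer: str) -> list:
        """Extract the integers from a free-form answer, ignoring formatting."""
        return [int(tok) for tok in re.findall(r"-?\d+", answer)]

    for form in ["1, 2, 3", "1,2,3", "[1,2,3]", "1 2 3"]:
        assert normalize(form) == [1, 2, 3]
    print("all forms match")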
292 more comments available on Hacker News