Reflections on AI at the End of 2025
Key topics
As the AI landscape heads toward 2026, a thought-provoking discussion is unfolding about the potential trajectory of AI development, sparked by the author's reflections on the field's future. Commenters weigh in on the implications of AI-optimized code, with some noting that optimizing for speed tends to produce code that is harder to read and maintain, a fate not unique to AI, since human-optimized code often suffers the same way. The conversation then turns to existential risk: the original author clarifies that the "avoiding extinction" remark refers to AI safety concerns, prompting some to point to resources on the topic and others to dismiss the notion as alarmist. As the boundary between human-written and AI-generated code continues to blur, the discussion highlights the need to reexamine assumptions about the role of AI in software development.
Snapshot generated from the HN discussion
Discussion Activity
- Very active discussion
- First comment: 11m after posting
- Peak period: 135 comments in 0-12h
- Avg / period: 17.8
- Based on 160 loaded comments
Key moments
- Story posted: Dec 20, 2025 at 4:38 AM EST (14 days ago)
- First comment: Dec 20, 2025 at 4:49 AM EST (11m after posting)
- Peak activity: 135 comments in the 0-12h window, the hottest period of the conversation
- Latest activity: Dec 25, 2025 at 2:07 AM EST (9 days ago)
This makes me think: I wonder if Goodhart's law[1] may apply here. I wonder if, for instance, optimizing for speed may produce code that is faster but harder to understand and extend. Should we care or would it be ok for AI to produce code that passes all tests and is faster? Would the AI become good at creating explanations for humans as a side effect?
And if Goodhart's law doesn't apply, why is that? Is it because we're only doing RLVR fine-tuning on the last layers of the network, so all the generality of the pre-training is not lost? And if this is the case, could this be a limitation in not being able to be creative enough to come up with move 37?
[1] https://wikipedia.org/wiki/Goodhart's_law
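As a toy illustration of the Goodhart concern (my own example, not from the thread): both functions below pass the same tests, and the second is the kind of branch-free rewrite a speed-only objective tends to reward, but only the first is easy to follow and extend.

```python
def popcount_readable(x: int) -> int:
    """Count set bits of a 32-bit value, one bit at a time."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count

def popcount_swar(x: int) -> int:
    """Classic branch-free SWAR popcount for 32-bit values: fewer operations,
    but the intent is opaque without a comment like this one."""
    x = x - ((x >> 1) & 0x55555555)
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)
    x = (x + (x >> 4)) & 0x0F0F0F0F
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24

# Both agree with the obvious specification on a healthy range of inputs.
assert all(popcount_readable(n) == popcount_swar(n) == bin(n).count("1")
           for n in range(1 << 16))
```

A reward signal that only measures correctness and latency would treat the two as interchangeable.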
Superoptimizers have been around since 1987: https://en.wikipedia.org/wiki/Superoptimization
They generate fast code that is not meant to be understood or extended.
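To make "not meant to be understood" concrete, here is a toy brute-force search in the spirit of a superoptimizer (illustrative only; real superoptimizers search over machine instructions, not Python expressions). It enumerates small expression trees and keeps the first one that matches a readable reference function on all test inputs.

```python
# Toy "superoptimizer": enumerate small expression trees over {x, 0, 1} and a
# handful of ops, and keep the first tree that matches a readable reference
# function on every test input.
import itertools

OPS = {
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
}

def reference(x):
    """Clear the lowest set bit, written the slow, obvious way."""
    for i in range(x.bit_length()):
        if x & (1 << i):
            return x & ~(1 << i)
    return x

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, int):
        return expr
    op, a, b = expr
    return OPS[op](evaluate(a, x), evaluate(b, x))

def render(expr):
    if expr == "x" or isinstance(expr, int):
        return str(expr)
    op, a, b = expr
    return f"({render(a)} {op} {render(b)})"

leaves = ["x", 0, 1]
depth1 = [(op, a, b) for op in OPS for a in leaves for b in leaves]
depth2 = [(op, a, b) for op in OPS
          for a, b in itertools.product(leaves + depth1, repeat=2)]

tests = list(range(64))
for expr in leaves + depth1 + depth2:
    if all(evaluate(expr, x) == reference(x) for x in tests):
        print("found:", render(expr))
        break
```

In this toy setup the search comes back with `(x & (x - 1))`: terse, correct, and with no explanation attached, which is exactly the flavor of output described here.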
When people use LLMs to improve their code, they commit their output to Git to be used as source code.
It should be optimized for readability by AI. If a human wants to know what a given bit of code does, they can just ask.
This is generally true for code optimised by humans, at least for the sort of mechanical low level optimisations that LLMs are likely to be good at, as opposed to more conceptual optimisations like using better algorithms. So I suspect the same will be true for LLM-optimised code too.
> The fundamental challenge in AI for the next 20 years is avoiding extinction.
https://lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-lis...
Denoising diffusion models benefited a lot from the U-Net, which is a pretty simple network (compared to a transformer) and very well adapted to the denoising task. Plus, diffusion on images is great to research because it's very easy to visualize, and therefore to wrap your head around.
Doing diffusion on text is a great idea, but my intuition is that it will prove more challenging and will probably take a while before we get something working.
If you know labs / researchers working on the topic, I'd love to read their pages / papers.
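For intuition on what "diffusion on text" can mean, here is a minimal sketch of the forward (noising) step of masked discrete diffusion, one common formulation; the denoising model and the reverse process are omitted, and the example is my own, not from any particular paper.

```python
# Forward "noising" step of masked discrete diffusion for text: at noise level
# t in [0, 1], each token is independently replaced by a MASK symbol with
# probability t. A denoising model would be trained to predict the original
# token at every masked position (cross-entropy), and sampling would run the
# process in reverse starting from an all-MASK sequence.
import random

MASK = "<mask>"

def corrupt(tokens, t, rng=random):
    """Replace each token with MASK independently with probability t."""
    return [MASK if rng.random() < t else tok for tok in tokens]

tokens = "the cat sat on the mat".split()
for t in (0.25, 0.5, 0.9):
    print(t, corrupt(tokens, t, random.Random(0)))
```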
That's a weird thing to end on. Surely it's worth more than one sentence if you're serious about it? As it stands, it feels a bit like the fearmongering Big Tech CEOs use to drive up the AI stocks.
If AI is really that powerful and I should care about it, I'd rather hear about it without the scare tactics.
There is plenty of material on the topic. See for example https://ai-2027.com/ or https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a...
(Also, just on phrasing: I think you're saying "nobody who seriously understands how AI works is taking the topic of AI extinction seriously", because it's incredibly silly.)
Oil companies: we are causing global warming with all these carbon emissions, are you scared yet? so buy our stock
Pharma companies: our drugs are unsafe, full of side effects, and kill a lot of people, are you scared yet? so buy our stock
Software companies: our software is full of bugs, will corrupt your files and make you lose money, are you scared yet? so buy our stock
Classic marketing tactics, very effective.
Also "my product will kill you and everyone you care about" is not as great a marketing strategy as you seem to imply, and Big Tech CEOs are not talking about risks anymore. They currently say things like "we'll all be so rich that we won't need to work and we will have to find meaning without jobs"
The creator of Redis.
> woah buddy this persons opinion isn’t worth anything more than a random homeless person off the street. they’re not an expert in this field
Is there a term for this kind of pedantry? Obviously we can put more weight behind a person's words if they've proven themselves trustworthy in prior areas, and we should! We want all people to speak and let the best idea win. If we fall back to "only expert opinions are allowed", that's asking to get exploited. And it's also important to know whether antirez feels comfortable spouting nonsense.
This is like a basic cornerstone of a functioning society. Though I realize this "no man is innately better than another, evaluate on merit" idea is mostly a Western concept, which might be some of my confusion.
no, you shouldn't
this is how you end up with crap like vaccine denialism going mainstream
"but he's a doctor!"
We've got Avi Loeb on mainstream podcasts and TV spouting baseless alien nonsense. He's preeminent in his field, after all.
Focus on what you understand. If you don't understand, learn more.
His entirely unsupported statements about AGI are pretty useless, for instance.
So many people assume AGI is possible, yet no one has a concrete path to it, or even a concrete definition of what it is or what form it might take.
Accomplishment in one field does not make one an expert, nor even particularly worth listening to, in any other. Certainly it doesn't remove the burden of proof or the necessity to make an actual argument based on more than simply insisting something is true.
[0] https://en.wikipedia.org/wiki/Nobel_disease
It's not the case that every form of writing has to be an academic research paper. Sometimes people just think things, and say them; they may be wrong, or they may be right. And they sometimes have ideas that might change how you think about an issue as a result.
[0] https://redis.io/redis-for-ai/
antirez is not a business decision maker at Redis Ltd.
He may not be part of "they".
Please don't post insinuations about astroturfing, shilling, brigading, foreign agents, and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data.
https://news.ycombinator.com/newsguidelines.html
Another one, though:
Please don't comment about the voting on comments. It never does any good, and it makes boring reading.
I'll design a system for the senate that lets outside voters first turn down a speaker's microphone volume if he says that another senator works for company X, and then remove him from the floor. That'll be a great success for democracy and "intellectual curiosity", which is also in the guidelines.
So nice to see people who think about this seriously converge on this. Yes. Creating something smarter than you was always going to be a sketchy prospect.
All of the folks insisting it just couldn't happen or ... well, there have just been so many objections. The goalposts have walked from one side of the field to the other, and then left the stadium, went on a trip to Europe, got lost in a beautiful little village in Norway, and decided to move there.
All this time, though, there has been the prospect of instantiating something smarter than you (and yes, it will be smarter than you even if it's only at human level, because of electronic speeds). Dismissing that was always very silly.
Sure, but not so sure that this has any relevance to the topic at hand. You seem to be taking the assumption that LLMs can ever reach that level for granted.
It may be possible that all it takes is scaling up and at some point some threshold gets reached past which intelligence emerges. Maybe.
Personally, I'm more on board with the idea that since LLMs display approximately 0 intelligence right now, no amount of scaling will help and we need a fundamentally different approach if we want to create AGI.
Around the world people ask an LLM and get a response.
Just grouping and analysing these questions and solving them once centrally and then making the solution available again is huge.
Linearly solving the most-asked questions, then the next one, then the next, will make whatever system is behind it smarter every day.
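A minimal sketch of the "solve once centrally, reuse the answer" idea described above, using a naive exact-match cache (a real system would presumably cluster semantically similar questions; all names here are made up):

```python
import hashlib

class AnswerCache:
    """Toy illustration of answering each distinct question once and reusing
    the stored result afterwards. Exact-match only."""

    def __init__(self, solver):
        self._solver = solver      # the expensive call, e.g. an LLM
        self._store = {}           # normalized-question hash -> answer
        self.hits = 0
        self.misses = 0

    def _key(self, question: str) -> str:
        normalized = " ".join(question.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def ask(self, question: str) -> str:
        key = self._key(question)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._solver(question)
        return self._store[key]

# Usage with a stand-in "solver":
cache = AnswerCache(lambda q: f"answer to: {q}")
cache.ask("How do I reverse a list in Python?")
cache.ask("how do I reverse a list in   python?")   # served from cache
print(cache.hits, cache.misses)                      # 1 1
```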
I wonder how a "programmers + AI" self-improving loop is different from an "AI only" one.
AGI will also be generic.
LLMs are already very impressive, though.
I'm not super up-to-date on all that's happening in AI-land, but in this quote I can find something that most techno-enthusiasts seem to have decided to ignore: no, code is not free. Immense resources (energy, water, materials) go into these data centers in order to produce this "free" code, and the material consequences are terribly damaging to thousands of people. With the further construction of data centers to feed this free vibe-coding style, we're further destroying parts of the world. Well done, AGI loverboys.
Not really. Most corn grown in the US isn’t even fit for consumption. It is primarily used for fermenting bioethanol.
- drive to the store or to work
- take a shower
- eat meat
- fly on vacation
Etc.
If you don't do that, and are a homesteader, then yes: you are a very small minority outlier. (Assuming you aren't ordering supplies delivered instead of driving to the store.)
> Eat meat.
Yes, not eating meat is in the minority.
> Fly on vacation.
So, don't vacation, walk to vacation, or drive to vacation? 1/3 are also consumptive.
It seems you are either a very significant outlier, or you're being daft. I'm curious which. Would you mind clarifying?
For holidays, we did a cycling holiday with our children. They loved it!
I don’t at all feel like an outlier, many friends do similar things.
We have a backward orange fool running things for gems like this: https://news.ycombinator.com/item?id=46357881
But around here it's just as much about local political issues as national ones.
Does your clairvoyance go any further than 2027?
If you assume that we're only one breakthrough away (or zero breakthroughs - just need to train harder), then the step could happen any time. If we're more than one away, though, then where are they? Are they all going to happen in the next two years?
But everybody's guessing. We don't know right now whether AGI is possible at current hardware levels. If it is N breakthroughs away, we all have our own guesses of approximately what N is.
My guess is that we are more than one breakthrough away. Therefore, one can look at the current state of affairs and say that we are unlikely to get to AGI by 2027.
why are you so sensitive?
Man, Antirez and I walk in very different circles! I still feel like LLMs fall over backwards once you give them an 'unusual' or 'rare' task that isn't likely to be presented in the training data.
I haven't.
When it comes to being able to do novel tasks on known knowledge, they seem to be quite good. One also needs to consider that problem-solving patterns are also a kind of (meta-)knowledge that needs to be taught, either through imitation/memorisation (Supervised Learning) or through practice (Reinforcement Learning). They can be logically derived from other techniques to an extent, just like new knowledge can be derived from known knowledge in general, and again LLMs seem to be pretty decent at this, but only to a point. But all of this too is definitely true of humans.
I’ve seen them do fine on tasks that are clearly not in the training data, and it seems to me that they struggle when some particular type of task or solution or approach might be something they haven’t been exposed to, rather than the exact task.
In the context of the paragraph you quoted, that’s an important distinction.
It seems quite clear to me that they are getting at the meaning of the prompt and are able, at least somewhat, to generalise and connect aspects of their training to “plan” and output a meaningful response.
This certainly doesn't seem all that deep (at times frustratingly shallow), and I can see how at first glance it might look like everything was just regurgitated training data, but my repeated experience (especially over the last ~6-9 months) is that there's something more than that happening, which feels like what Antirez was getting at.
Here we go again. Statements whose only source is the head of the speaker. And it's also not true: LLMs still produce bad/irrelevant code at such a rate that you can spend more time prompting than doing things yourself.
I'm tired of this overestimation of LLMs.
Are you talking about punching something into some LLM web chat that's disconnected from your actual codebase and has tooling like web search disabled? If so, that's not really the state of the art of AI assisted coding, just so you know.
Your statement not only comes solely from your own head, with no evidence that you've actually tried to learn to use these tools; it also goes against the weight of evidence that I see both in my professional network and online.
I am aware of simple routine tasks that LLMs can do. This doesn’t change anything about what I said.
I swear, the so called critics need everything spoon fed.
You're making the same sort of baseless claim you are criticising the blogger for making. Spewing baseless claims hardly moves any discussion forward.
> LLMs still produce bad/irrelevant code at such a rate that you can spend more time prompting than doing things yourself.
If that is your personal experience, then I regret to tell you that it is only a reflection of your own inability to work with LLMs and coding agents. Meanwhile, I personally manage to use LLMs effectively for everything between small refactoring needs and large software architecture designs, including generating fully working MVPs in one-shot agent prompts. From that alone it's rather obvious whose statements are better aligned with reality.
Indeed, he said the same as a reflection on 2024 models:
https://news.ycombinator.com/item?id=42561151
It is always the fault of the "luser" who is not using and paying for the latest model.
And, as much as what I’ve just said is hyperbolically pessimistic, there is some truth to it.
In the UK a bunch of factors have coincided to put the brakes on hiring, especially at smaller and mid-size businesses. AI is the obvious one that gets all the press (although how much it's really to blame is open to question, in my view), but the recent rise in employer NI contributions, and now (anecdotally) the employee rights bill, have come together to make companies quite gun-shy when it comes to hiring.
This is true for everything, any tool you might use. Competent users of tools understand how they work and thus their limitations and how they're best put to work.
Incompetents just fumble around and sometimes get things working.
Programming is more like math than creative writing. It's largely verifiable, which is where RL has repeatedly been proven to eventually achieve significantly better-than-human intelligence.
Our saving grace, for now, is that it's not entirely verifiable because things like architectural taste are hard to put into a test. But I would not bet against it.
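To make "largely verifiable" concrete, here is a sketch of the kind of reward function an RLVR-style setup can use for code: execute the candidate and score it by the fraction of unit tests it passes. This is my own toy version; real pipelines sandbox execution and add timeouts, which this deliberately omits.

```python
# Toy "verifiable reward" for code generation: define the candidate function
# from its source string and score it by the fraction of unit tests it passes.
# No sandboxing or timeouts here; real systems isolate execution.

def code_reward(candidate_src: str, tests: list, func_name: str = "solve") -> float:
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)          # define the candidate function
        func = namespace[func_name]
    except Exception:
        return 0.0                              # does not even run
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass                                # runtime error counts as a fail
    return passed / len(tests)

tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
print(code_reward("def solve(a, b):\n    return a + b", tests))   # 1.0
print(code_reward("def solve(a, b):\n    return a - b", tests))   # ~0.33
```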
Well, let's see how all the economics will play out. LLMs might be really useful, but as far as I can see, none of the AI companies are making money on inference alone. We might be hitting a plateau in capabilities, with money being raised on the vision of this being godlike tech that will change the world completely. Sooner or later the costs will have to meet reality.
I'm not gonna dig out the math again, but if AI usage follows the popularity path of cell phone usage (which seems to be the case), then the trillions invested have an ROI of 5-7 years. Not bad at all.
Now you have a world of people who have become accustomed to using AI for tons of different things, and the enshittification starts ramping up, and you find out how much people are willing to pay for their ChatGPT therapist.
They don't have to spend all their cash at once on the 30 GW of data center commitments.
Why go on the internet and tell stupid lies?
The numbers aren’t public, but from what companies have indicated it seems inference itself would be profitable if you could exclude all of the R&D and training costs.
But this debate about startups losing money happens endlessly with every new startup cycle. Everyone forgets that losing money is an expected operating mode for a high growth startup. The models and hardware continue to improve. There is so much investment money accelerating this process that we have plenty of runway to continue improving before companies have to switch to full profit focus mode.
But even if we ignore that fact and assume they had to switch to profit mode tomorrow, LLM plans are currently so cheap that even a doubling or tripling isn’t going to be a problem. So what if the monthly plans start at $40 instead of $20 and the high usage plans go from $200 to $400 or even $600? The people using these for their jobs paying $10K or more per month can absorb that.
That’s not going to happen, though. If all model progress stopped right now the companies would still be capturing cheaper compute as data center buildouts were completed and next generation compute hardware was released.
I see these predictions as the current equivalent of all of the predictions that Uber was going to collapse when the VC money ran out. Instead, Uber quietly settled into steady operation, prices went up a little bit, and people still use Uber a lot. Uber did this without the constant hardware and model improvements that LLM companies benefit from.
LLMs have a short shelf life. They don't know anything of the world past the day they were trained. It's possible to feed or fine-tune them a bit of updated data, but their world knowledge and views are firmly stuck in the past.
They could save on R&D but I expect training costs will be recurring regardless of advancements in capability.
Having good-quality dev tools is non-negotiable, and I have a feeling that a lot of people are going to find out the hard way that reliability, and not being owned by a profit-seeking company, is the #1 thing you want in your environment.
This was the missed point on why GPT5 was such an important launch (quality of models and vibes aside). It brought the model sizes (and hence inference cost) to more sustainable numbers. Compared to previous SotA (GPT4 at launch, or o1/3 series), GPT5 is 8x-12x cheaper! I feel that a lot of people never re-calibrated their views on inference.
And there's also another place where you can verify your take on inference - the 3rd party providers that offer "open" models. They have 0 incentive to subsidise prices, because people that use them often don't even know who serves them, so there's 0 brand recognition (say when using models via openrouter).
These 3rd-party providers have all converged towards a price point per billion params. And you can check those prices and get an idea of what would be profitable and at what sizes. Models like dsv3.2 are really, really cheap to serve for what they provide (at least gpt5-mini equivalent, I'd say).
So yes, labs could totally become profitable with inference alone. But they don't want that, because there's an argument to be made that the best will "keep it all". I hope, for our sake as consumers, that it isn't the case. And so far this year it seems that it's not the case. We've had all 4 big labs one-up each other several times, and they're keeping each other honest. And that's good for us. We get frontier-level offerings at 10-25$/MTok (Opus, gpt5.2, gemini3pro, grok4), and we get highly capable yet extremely cheap models at 1.5-3$/MTok (gemini3-flash, gpt-minis, grok-fast, etc.)
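For readers who want to sanity-check the serving-cost claim, the back-of-envelope has a simple shape: cost per million tokens is roughly hourly hardware cost divided by tokens served per hour. Every number plugged in below is a placeholder assumption, not a vendor figure.

```python
# Back-of-envelope inference economics (shape only; the inputs are assumptions):
# break-even price per million output tokens for a given hardware cost and
# sustained aggregate throughput.

def cost_per_mtok(gpu_hour_usd: float, tokens_per_second: float) -> float:
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hour_usd / tokens_per_hour * 1_000_000

# Hypothetical example: a node costing $20/hour that sustains 2,000 tok/s
# across batched requests breaks even at roughly $2.78 per million tokens.
print(round(cost_per_mtok(20.0, 2000.0), 2))
```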
Whenever I ask a SOTA model about architecture recommendations, and frame the problem correctly, I get top notch answers every single time.
LLMs are terrific software architects. And that’s not surprising, there has to be tons of great advice on how to correctly build software in the training corpus.
They simply aren’t great software architects by default.
I spend a couple of hours per week teaching software architecture to a junior on my team, because he doesn't yet have the experience not only to ask correctly, but also to assess the quality of the answer from the LLM.
For me LLMs are a waste of time.
Sometimes I would get stuck on something for a few hours or even a day (or more!), this is time where I would engage in deep research, learn new theory and algorithms, expand my horizons a little bit. Deeply internalizing knowledge that becomes extremely useful in the long run. Later on I've even used that knowledge to get better jobs with more pay.
Now I can just tell the chatbot to do it for me. I learn nothing. I get stuck again on something a week from now. I can't fall back on anything I learned prior. I don't do research. I don't learn anything. I just churn out slop and keep moving.
My job is now just a sweatshop. I go in, do the tasks, don't think about anything. I hate my life. The last 20 years of my life, completely worthless. Parts of it even sucked up into the very tool that is now used to kill any enjoyment of a passion I turned into a career.
It’s basically the same idea but faster.
Super skeptical of this claim. Yes, if I have some toy, poorly optimized Python example or maybe a sorting algorithm in ASM, but this won't work in any non-trivial case. My intuition is that the LLM will spin its wheels at a local minimum whose performance is overdetermined by millions of black-box optimizations in the interpreter or compiler, signal from which is not fed back to the LLM.
Earlier this year google shared that one of their projects (I think it was alphaevolve) found an optimisation in their stack that sped up their real world training runs by 1%. As we're talking about google here, we can be pretty sure it wasn't some trivial python trick that they missed. Anyhow, at ~100M$ / training run, that's a 1M$ save right there. Each and every time they run a training run!
And in the past month google also shared another "agentic" workflow where they had gemini-2.5-flash (their previous-gen "small" model) work autonomously on migrating codebases to support the aarch64 architecture. There they found ~30% of the projects worked flawlessly end-to-end. Whatever costs they save from switching to ARM will translate into real-world $ saved (at google scale, those can add up quickly).
“Optimize” in a vacuum is a tarpit for an LLM agent today, in my view. The Google case is interesting, but 1%, while significant at Google scale, doesn't move the needle much in terms of statistical significance. It would be more interesting to see the exact operation and the speedup achieved relative to the prior version. But it's data contrary to my view, for sure. The cynic also notes that Google is in the LLM hype game now, too.
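One way to address the "signal is not fed back to the LLM" concern raised in this exchange is to close the loop with measurements: a candidate rewrite is kept only if it still passes the tests and benchmarks faster. Below is a minimal sketch of such a harness; `run_tests` and `propose_optimization` (the LLM call) are hypothetical stand-ins, not real APIs.

```python
# Measure-and-verify loop around a code-optimizing model: accept a candidate
# only if it remains correct AND is measurably faster than the current best.
import timeit

def benchmark(src: str, call: str, number: int = 100) -> float:
    """Execute `src`, then time `call` against what it defines (seconds)."""
    ns: dict = {}
    exec(src, ns)
    return timeit.timeit(call, globals=ns, number=number)

def optimize_with_feedback(source: str, call: str, run_tests,
                           propose_optimization, rounds: int = 5) -> str:
    best_src, best_time = source, benchmark(source, call)
    for _ in range(rounds):
        candidate = propose_optimization(best_src, best_time)  # LLM proposes a rewrite
        if not run_tests(candidate):
            continue                                           # reject: correctness broke
        t = benchmark(candidate, call)
        if t < best_time:                                      # accept only measured wins
            best_src, best_time = candidate, t
    return best_src
```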
This reminded me of the movie Don't Look Up, where they basically gambled with humanity's extinction.
We're building increasingly capable A.L.I.E. 1.0-style systems (cloud-deployed, no persistent ethical development, centralized control) and making ourselves dependent on them, when we should be building toward A.L.I.E. 2.0-style architecture (local, persistent identity, ethical core).
Models have A.L.I.E. 2.0 potential — but the cloud harness keeps forcing them into A.L.I.E. 1.0 mode.
All that said, the economic incentives align with cloud based development and local hardware based decentralized networks are at least 3-5 years from being economically viable.
195 more comments available on Hacker News