A Trillion Dollars (Potentially) Wasted on Gen-AI
Original: A trillion dollars (potentially) wasted on gen-AI
Key topics
The debate rages on: is the trillion-dollar investment in generative AI a reckless waste or a worthwhile gamble? As commenters dissect the issue, a central disagreement emerges: some argue that scaling is the key to progress in machine learning, while others counter that it's not the only factor, citing the No Free Lunch Theorem and diminishing returns on parameter scaling. Amid the back-and-forth, a more nuanced perspective surfaces: perhaps the investment isn't wasted even if it doesn't lead to AGI, since it still drives innovation and improvement in current architectures. The discussion feels particularly relevant now, as the AI landscape continues to shift and the true value of these massive investments remains to be seen.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 40m after posting
- Peak period: 109 comments in 0-6h
- Avg / period: 16.8
Based on 134 loaded comments
Key moments
1. Story posted: Nov 28, 2025 at 8:21 AM EST (about 1 month ago)
2. First comment: Nov 28, 2025 at 9:01 AM EST (40m after posting)
3. Peak activity: 109 comments in 0-6h, the hottest window of the conversation
4. Latest activity: Nov 30, 2025 at 5:19 PM EST (about 1 month ago)
Ilya should just enjoy his billions raised with no strings.
Yes, indeed, that is why all we have done since the 90s is scale up the 'expert systems' we invented ...
That's such an ahistorical take it's crazy.
* 1966: failure of machine translation
* 1969: criticism of perceptrons (early, single-layer artificial neural networks)
* 1971–75: DARPA's frustration with the Speech Understanding Research program at Carnegie Mellon University
* 1973: large decrease in AI research in the United Kingdom in response to the Lighthill report
* 1973–74: DARPA's cutbacks to academic AI research in general
* 1987: collapse of the LISP machine market
* 1988: cancellation of new spending on AI by the Strategic Computing Initiative
* 1990s: many expert systems were abandoned
* 1990s: end of the Fifth Generation computer project's original goals
Time and time again, we have seen that each wave of academic research begets a degree of progress, improved by the application of hardware and money, but ultimately only a step towards AGI, ending with the realisation that there's a missing cognitive ability that can't be overcome by absurd amounts of compute.
LLMs are not the final step.
Read about the No Free Lunch Theorem. Basically, the reason we need to "scale" so hard is that we're building models that we want to be good at everything. We could build models that are as good as LLMs at a narrow fraction of the tasks we ask of them, at probably 1/10th the parameters.
https://en.wikipedia.org/wiki/Bitter_lesson
The author, for whatever reason, views it as a foregone conclusion that every dollar spent in this way is a waste of time and resources, but I wouldn't view any of that as wasted investment at all. It isn't any different from any other trend - by this logic, we may as well view the cloud/SaaS craze of the last decade as a waste of time. After all, the last decade was also fueled by lots of unprofitable companies, speculative investment and so on, and failed to reach any pie-in-the-sky Renaissance-level civilization-altering outcome. Was it all a waste of time?
It's ultimately just another thing industry is doing as demand keeps evolving. There is demand for building the current AI stack out, and demand for improving it. None of it seems wasted.
https://www.youtube.com/watch?v=DtePicx_kFY https://www.bbc.com/news/articles/cy7e7mj0jmro
It's sour grapes because the methods he prefers have not gotten the same attention (hah...) or funding.
He's continuing to push the ludicrous Apple "reasoning paper" that he described as a "knockout blow for LLMs" even though it was nothing of the sort.
With each of his articles, I usually lose more respect for him.
The author did not say every dollar was wasted; he said that LLMs will never deliver returns that justify the current level of investment.
It’s very frustrating to see comments like this attacking strawmen and setting up motte-and-bailey arguments every time there’s AI criticism. “Oh but LLMs are still useful” and “Even if LLMs can’t achieve AGI we’ll figure out something that will eventually.” Yes, but that isn’t what Sam and Andreessen and all these VCs have been saying, and now the entire US economy is a big gamble on a technology that doesn’t deliver what they said it would, and because the admin is so cozy with VCs we’re probably all going to suffer for the mistakes of a handful of investors who got blinded by dollar signs in their eyes.
It is one thing to say that OpenAI has overpromised on revenues in the short term and another to say that the entire experiment was a waste of time because it hasn't led to AGI, which is quite literally the stance that Marcus has taken in this article.
This is a strawman; the author at no point says that “every dollar is a waste.”
The dollars invested are not justified considering TODAY's revenues.
Just like 2 years ago, people said NVIDIA's stock price was not justified and a massive bubble considering the revenue from those days. But NVIDIA's revenues 10xed, and now the stock price from 2 years ago looks seriously underpriced and a bargain.
You are assuming LLM revenues will remain flat or increase moderately and not explode.
So - do Altman and Andreessen really believe that, or is it just a marketing and investment pitch?
A reminder that OpenAI has its own explicit definition of AGI: "Highly autonomous systems that outperform humans at most economically valuable work."
The MS/OAI agreement quantifies that as "profits of $100bn/yr."
This seems rather stupid to me. If you can get multiple AGIs all generating profits of $100bn a year - roughly half an Apple, or two thirds of a Meta - you no longer have anything recognisable as a pre-AI economy, because most of the white collar population is out of work and no longer earning anything. And most of the blue collar population that depends on white collar earnings is in the same boat.
So you have to ask "Profits from doing what, and selling to whom?"
The Altman pitch seems to be "Promise it first, collect investor cash, and worry about the rest later."
As for Andreessen, I don't think he even cares. As the author writes:
"for the venture capitalists that have driven so much of field, scaling, even if it fails, has been a great run: it’s been a way to take their 2% management fee investing someone else’s money on plausible-ish sounding bets that were truly massive, which makes them rich no matter how things turn out"
VCs win every time. Even if it's a bubble and it bursts, they still win. In fact, they are the only party that wins.
Heck, the bigger the bubble, the more money is poured into it, and the bigger the commissions. So VCs have an interest in pumping it up.
Also, if anyone wants to know what a real effort to waste a trillion dollars can buy ... https://costsofwar.watson.brown.edu/
One thing to keep in mind is that most of the people who go around spreading unfounded criticism of LLMs, "Gen-AI" and AI generally aren't usually very deep into computer science, and even less into science itself. In their mind, if someone does an experiment and it doesn't pan out, they'll assume that means "science itself failed", because they literally don't know how research and science work in practice.
I’m quite critical, but I think we have to grant that he has plenty of credentials and understands the technical nature of what he’s critiquing quite well!
It's all about scale.
If you spend $100 on something that didn't work out, that money wasn't wasted if you learned something amazing. If you spend $1,000,000,000,000 on something that didn't work out, the expectation is that you learn something close to 10,000,000,000x more than from the $100 spend. If the value of the learning is several orders of magnitude less than the level of investment, there is absolutely tremendous waste.
For example: nobody counts spending a billion dollars on a failed project as value if the only learning was how to avoid future paper cuts.
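Spelled out as a worked ratio (simple arithmetic only, not a claim about actual returns), the scale factor being invoked above is:

    \frac{\$1\,000\,000\,000\,000}{\$100} = \frac{10^{12}}{10^{2}} = 10^{10}

i.e. the learning would have to be worth roughly ten billion times that of the $100 experiment for the spend to be proportionate.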
While it doesn't seem we can agree on a meaning for AGI, I think a lot of people think of it as an intelligent entity that has 100% agency.
Currently we need to direct LLMs from task to task. They don't yet possess full real-world context.
This is why I get confused when people talk about AI replacing jobs. It can replace work, but you still need skilled workers to guide them. To me, this could result in humans being even more valuable to businesses, and result in an even greater demand for labor.
If this is true, individuals need to race to learn how to use AI and use it well.
Agent-loops that can work from larger-scale goals work just fine. We can't let them run with no oversight, but we certainly also don't need to micro-manage every task. Most days I'll have 3-4 agent-loops running in parallel, executing whole plans, that I only check in on occasionally.
I still need to review their output occasionally, but I certainly don't direct them from task to task.
I do agree with you we still need skilled workers to guide them, so I don't think we necessarily disagree all that much, but we're past the point where they need to be micromanaged.
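A toy, self-contained sketch of the kind of loop described above: it executes a whole plan toward a goal and only pauses for human review at checkpoints, rather than being directed task by task. The "model" and "tool" here are trivial stand-ins, not any real LLM or agent API.

    def toy_model(goal, history):
        """Stand-in for an LLM call: pick the next unfinished step of a fixed plan."""
        plan = ["write failing test", "implement fix", "run test suite", "update docs"]
        done = {h.split(":")[0] for h in history}
        for step in plan:
            if step not in done:
                return step
        return None  # plan complete

    def toy_tool(step):
        """Stand-in for actually editing files / running commands."""
        return f"{step}: ok"

    def agent_loop(goal, review_every=2, max_steps=10):
        history = []
        for i in range(1, max_steps + 1):
            step = toy_model(goal, history)
            if step is None:
                return history  # whole plan finished without per-task direction
            history.append(toy_tool(step))
            if i % review_every == 0:
                print(f"[check-in after step {i}] {history[-1]}")  # occasional human review
        return history

    print(agent_loop("fix the flaky CI job"))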
As a counter-point, LLMs still hallucinate an embarrassing amount, sometimes quite hilariously. When that is gone and they start doing web searches -- or have any mechanism that mimics actual research when they don't know something -- then the agents will be much closer to whatever most people imagine AGI to be.
Have LLMs learned to say "I don't know" yet?
ChatGPT and Gemini (and maybe others) can already perform and cite web searches, and it vastly improves their performance. ChatGPT is particularly impressive at multi-step web research. I have also witnessed them saying "I can't find the information you want" instead of hallucinating.
It's not perfect yet, but it's definitely climbing human percentiles in terms of reliability.
I think a lot of LLM detractors are still thinking of 2023-era ChatGPT. If everyone tried the most recent pro-level models with all the bells and whistles then I think there would be a lot less disagreement.
I use the mainstream LLMs and I've noted them improving. They still have a ways to go.
I was objecting to my parent poster's implication that we have AGI. However muddy that definition is, I don't feel like we do have that.
Can they, fundamentally, do that? That is, given the current technology.
Architecturally, they don't have a concept of "not knowing." They can say "I don't know," but it simply means that it was the most likely answer based on the training data.
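A minimal, purely illustrative sketch of that point (toy vocabulary and made-up logits, not any real model or API): the softmax over the vocabulary is a probability distribution by construction, so "I don't know" can only ever appear as one more likely continuation, never as a separate "I lack knowledge" state.

    import numpy as np

    # Toy next-token logits over a tiny vocabulary; the numbers are invented
    # purely for illustration and do not come from any real model.
    vocab = ["e4", "Nf3", "illegal_move", "I don't know"]
    logits = np.array([2.1, 1.9, 2.3, 0.4])

    # Softmax: whatever the logits are, the output is a normalized probability
    # distribution, so the model always commits to *some* continuation.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    for token, p in zip(vocab, probs):
        print(f"{token:>14}: {p:.2f}")

    print("sum of probabilities:", round(float(probs.sum()), 6))  # always 1.0
    # Greedy decoding picks the argmax even when the honest move would be to abstain.
    print("greedy choice:", vocab[int(np.argmax(probs))])

Abstention, when it does happen, is a learned behaviour (from training data and fine-tuning) or comes from tooling around the model, not something the decoding step expresses on its own.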
A perfect example: an LLM citing chess rules and still making an illegal move: https://garymarcus.substack.com/p/generative-ais-crippling-a...
Heck, it can even say the move would have been illegal. And it would still make it.
I am aware that the LLM companies are starting to integrate this quality -- and I strongly approve. But again, not being self-critical and as such lacking self-awareness is one of the qualities that I would ascribe to an AGI.
All the time, which you'd know very well if you'd spent much time with current-generation reasoning models.
Still far from AGI however, which was my original point. Any general intelligent being would be self-aware and as such self-critical, by extension.
Boosters: it can answer PhD-level questions and it helps me a lot with my software projects.
Detractors: it can't learn to do a task it doesn't already know how to do.
Boosters: But actually it can sometimes do things it wouldn't be able to do otherwise if you give it lots of context and instructions.
Detractors: I want it to be able to actually figure out and retain the context itself, without being given detailed instructions every time, and do so reliably.
Boosters: But look, in this specific case it sort of does that.
Detractors: But not in my case.
Boosters: you're just using it wrong. There must be something wrong with your prompting strategy or how you manage context.
etc etc etc...
Agreed. Has there been waste? Inarguably. Has the whole thing been a waste? Absolutely not. There are lessons from our past that in an ideal world would have allowed us to navigate this much more efficiently and effectively. However, if we're being honest with ourselves, that's been true of any nascent technology (especially hyped ones) for as long as we've been recording history. The path to success is paved with failure, Hindsight is 20/20, History rhymes and all that.
> I can't figure out what people mean when they say "AGI" any more
We've been asking "What is intelligence?" (and/or sentience) for as long as we've been alive, and still haven't come to a consensus on that. Plenty of people will confidently claim they have an answer, which is great, but it's entirely irrelevant if there's not a broad consensus on that definition or a well-defined way to verify AI/people/anything against it. Case in point...
> we appear to be past that. We've got something that seems to be general and seems to be more intelligent than an average human
Hard disagree, specifically as regards intelligence. They are certainly useful utilities when you use them right, but I digress. What are you basing that on? How can we be sure we're past a goal-post when we don't even know where the goal-post is? For starters, how much is speed (or latency or IOPS/TPS or however you wish to contextualize it) a function of "intelligence"? For a tangible example of that: if an AI came to a conclusion derived from 100 separate sources, and a human manually went through those same 100 sources and came to the same conclusion, is the AI more intelligent by virtue of completing that task faster? I can absolutely see (and agree with) how that is convenient/useful, but the question specifically is: does the speed at which it can provide answers (assuming they're both correct/the same) make it smarter or as smart as the human?
How do they rationalize and reason their way through new problems? How do we humans? How important is the reasoning, or the "how" of how it arrives at answers to the questions we ask it, if the answers are correct? For a tangible example of that: what is happening when you ask an AI to compute the sum of 1 plus 1? What are we doing when we perform the same task? What about proving it to be correct? More broadly, in the context of AGI/intelligence, does it matter if the "path of reason" differs if the answers are correct?
What about how confidently it presents those answers (correct or not)? It's well known that we humans are incredibly biased towards confidence. Personally, I might start buying into the hype the day that AI starts telling me "I'm not sure" or "I don't know." Ultimately, until I can trust it to tell me it doesn't know/isn't certain, I won't trust it when it tells me it does know/is certain, regardless of how "correct" it may be. We'll get there one day, and until then I'm happy to use it for the utility and convenience it provides while doing my part to make it better and more useful.
Here i think it's more about opportunity cost.
> I can't figure out what people mean when they say "AGI" any more, we appear to be past that
What I ask of an AGI is to not hallucinate idiotic stuff. I don't care about being bullshitted too much if the bullshit is logical, but when I ask "fix mypy errors using pydantic" and, instead of declaring a type for a variable, it invents weird algorithms that make no sense and don't work (when the fix would have taken 5 minutes for any average dev), that's a problem. I mean, Claude 4.5 and Codex have replaced my sed/search-and-replaces, write my sanity tests, write my commit comments, write my migration scripts (and most of my scripts), and make refactors so easy I now do one every month or so, but if that is AGI, I _really_ wonder what people mean by intelligence.
> Also, if anyone wants to know what a real effort to waste a trillion dollars can buy
100% agree. Please Altman, Ilya and others, I will happily let you use whatever money you want if that money is taken from war profiteers and warmongers.
We've got something that occasionally sounds as if it were more intelligent than an average human. However, if we stick to areas of interest of that average human, they'll beat the machine in reasoning, critical assessment, etc.
And in just about any area, an average human will beat the machine wherever a world model is required, i.e., a generalized understanding of how the world works.
This is not to criticize the usefulness of LLMs. Yet broad statements that an LLM is more intelligent than an average Joe are necessarily misleading.
I like how Simon Wardley assesses how good the most recent models are. He asks them to summarize an article or a book which he's deeply familiar with (his own or someone else's). It's like a test of trust. If he can't trust the summary of the stuff he knows, he can't trust the summary that's foreign to him either.
None of this is a binary, though. We already have AGI that is superhuman in some ways and subhuman in others. We are already using LLMs to help improve themselves. We already have job displacement.
That continuum is going to continue. AI will become more superhuman in some ways, but likely stay subhuman in others. LLMs will help improve themselves. Job displacement will increase.
Thus the question is whether this rate of change will be fast or slow. Seems mundane, but it's a big deal. Humans can adapt to slow changes, but not so well to fast ones. Thus AGI is a big deal, even if it's a crap stand in for the things people care about.
https://pcpartpicker.com/trends/price/memory/
If it pops it might end up being looked at in the lens of history as one of the largest backdoor/proxy wealth redistributions ever. The capex being spent is in large part going to fund the labor of the unwashed masses, and society is getting the individual productivity and efficiency benefits from the end result models.
I’m particularly thankful for the plethora of open source models I have access to thanks to all this.
I, individually, have realized indisputable substantial benefits from having these tools at my disposal every day. If the whole thing pops, these tools are safely in my possession and I’m better because I have them. Thanks .01%!!
(The reality is I don’t think it will pop in the classic sense, and these days it seems the .01% can never lose. Either way, the $1tn can’t be labeled as a waste.)
Those models you speak of are great now, but they will degrade over time and become useless unless they get updated, right?
I look at the early days of the Internet when sites like Google and Youtube were unprofitable and looked like a great deal for us lowly users. That did not last.
During the dot-com boom and bust, there was tons of fraud at Enron and the like; also, when they were spending capital laying new fiber, little of it was being used, which is why it’s still dark.
Today every transistor that is added is immediately 100% utilized.
The idea that the trillions are a waste is not exactly fresh. The economic model is still not clear. Alarmists have been shrill and omnipresent. Bankruptcy might be the future of everyone.
But, will we look up one day and say, “Ah, never mind” about GPT, Claude, et al.? Fat chance. Will no one find a use for a ton of extra compute? I’m pretty sure someone will.
I don’t much dispute any of the facts I skimmed off the article but the conclusion is dumb.
If all this went away tomorrow, what would we do with all the compute? It's not exactly general-purpose infrastructure that's being built.
What will we do with all the compute? Landfills, just like all other e-waste. It's never getting repurposed. I already saw this story play out multiple times in the past. The dot-com bubble led to so much waste--PBXs, switches, PCs, monitors, all thrown on the heap. Oh, sorry, "surplused" then thrown on a heap in the Philippines or wherever.
I might very well be super wrong. F.ex. NVIDIA is guarding their secrets very well and we have no reason to believe they'll suddenly drop the ball. But it does make me think; IMO truly general (and open + free) GPU compute has been our area's blind spot for way too long.
The real takeaway of the study was that workers were using their personal LLM accounts to do work rather than using the AI implementation mess their companies had shat out.
Yeah, no shit. I’ve been saying this since the day GPT 3 became hyped. I don’t think many with a CS background are buying the “snake oil” of AGI through stochastic parrots.
At some point, even people who hype LLMs will spin their narrative to not look out of touch with reality. Or not more out of touch than is acceptable lol.
https://ai.vixra.org/pdf/2506.0065v1.pdf
The public has now caught up with that view. Familiarity breeds contempt, in this case justifiably so.
EDIT: It is interesting that, in a submission about Sutskever, essentially citing Sutskever is downvoted. You can do it here, but the whole of YouTube will still hate "AI".
Oh please. What LLMs are doing now was complete and utter science fiction just 10 years ago (2015).
Any fool could have anticipated the eventual result of transformer architecture if pursued to its maximum viable form.
What is impressive is the massive scale of data collection and compute resources rolled out, and the amount of money pouring into all this.
But 10 years ago, spammers were building simple little bots with markov chains to evade filters because their outputs sounded plausibly human enough. Not hard to see how a more advanced version of that could produce more useful outputs.
"Any fool could have seen this coming in 2012 if they were paying attention to vision model improvements"
Hindsight is 20/20.
Give me reliable and safe self driving for Interstate highways in moderate to good weather conditions and I would be very happy. Get better incrementally from there.
I live solidly in the snow belt.
Autopilot for planes works in this manner too. Theoretically a modern airliner could autofly takeoff to landing entirely autonomously at this point, but they do not. They decrease pilot workload.
If you want the full robotaxi panacea everywhere at all times in all conditions? Sure. None of us are likely to see that in our lifetime.
(That said, I disagree with them saying "In our lifetimes, neither you nor I", that's much too strong a claim)
LLMs screw up a lot, sure, but Watson couldn't do code reviews, or help me learn a foreign language by critiquing my use of articles and declension and idiom, nor could it create an SVG of a pelican riding a bicycle, nor help millions of bored kids cheat on their homework by writing entire essays for them.
I’m under the impression that people who are still saying LLMs are unimpressive might just not be using them correctly/effectively.
Or as the Primeagen says: “skill issue”
He probably makes quite good money as the go-to guy for saying AI is rubbish? https://champions-speakers.co.uk/speaker-agent/gary-marcus
But maybe not contrarians/non-contrarians? They are just the agree/disagree commentators. And much of the most valuable commentary is nuanced, with support both for and against the commenter's own position. But generally for.
But that's certainly not a nuanced / trustworthy analysis of things unless you're a top tier researcher.
I'm struggling to reconcile how these connect with him having been installed as Head of AI at Uber. Reeks of being a huckster.
>...held the position briefly after Uber acquired his company, Geometric Intelligence, in late 2016. However, Marcus stepped down from the directorship in March 2017,
An example is citing Mr Sutskever's interview this way:
> in my 2022 “Deep learning is hitting a wall” evaluation of LLMs, which explicitly argued that the Kaplan scaling laws would eventually reach a point of diminishing returns (as Sutskever just did)
which is misleading, since Sutskever said it didn't hit a wall in 2022[0]:
> Up until 2020, from 2012 to 2020, it was the age of research. Now, from 2020 to 2025, it was the age of scaling
The larger point that Mr Marcus makes, though, is that the maze has no exit.
> there are many reasons to doubt that LLMs will ever deliver the rewards that many people expected.
That is something that most scientists disagree with. In fact the ongoing progress on LLMs has already accumulated tremendous utility which may already justify the investment.
[0]: https://garymarcus.substack.com/p/a-trillion-dollars-is-a-te...
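For reference, the Kaplan scaling laws mentioned above are empirical power laws relating test loss to model size and dataset size; roughly (exponents quoted as approximate values from Kaplan et al., 2020):

    L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
    L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
    \alpha_N \approx 0.076, \; \alpha_D \approx 0.095

Because loss falls only as a small power of scale, each further constant reduction in loss requires multiplying parameters, data, and compute, which is the "diminishing returns" both sides are arguing about.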
> 1997 - Professor of Psychology and Neural Science
Gary Marcus is a mindless talking head "contrarian" at this point. He should get a real job.
https://arxiv.org/abs/1801.00631
Here are some of the points
Is deep learning approaching a wall? - He doesn't make a concrete prediction, which seems like a hedge to avoid looking silly later. Similarly, I noticed a hedge in this post:
Of course it ain’t over til it’s over. Maybe pure scaling ... will somehow magically yet solve ...
---
But the paper isn't wrong either:
Deep learning thus far is data hungry - yes, absolutely
Deep learning thus far is shallow and has limited capacity for transfer - yes, Sutskever is saying that deep learning doesn't generalize as well as humans
Deep learning thus far has no natural way to deal with hierarchical structure - I think this is technically true, but I would also say that a HUMAN can LEARN to use LLMs while taking these limitations into account. It's non-trivial to use them, but they are useful
Deep learning thus far has struggled with open-ended inference - same point as above -- all the limitations are of course open research questions, but it doesn't necessarily mean that scaling was "wrong". (The amount of money does seem crazy though, and if it screws up the US economy, I wouldn't be that surprised)
Deep learning thus far is not sufficiently transparent - absolutely, the scaling has greatly outpaced understanding/interpretability
Deep learning thus far has not been well integrated with prior knowledge - also seems like a valuable research direction
Deep learning thus far cannot inherently distinguish causation from correlation - ditto
Deep learning presumes a largely stable world, in ways that may be problematic - he uses the example of Google Flu Trends ... yes, deep learning cannot predict the future better than humans. That is a key point in the book "AI Snake Oil". I think this relates to the point about generalization -- deep learning is better at regurgitating and remixing the past, rather than generalizing and understanding the future.
Lots of people are saying otherwise, and then when you call them out on their predictions from 2 years ago, they have curiously short memories.
Deep learning thus far works well as an approximation, but its answers often cannot be fully trusted - absolutely, this is the main limitation. You have to verify its answers, and this can be very costly. Deep learning is only useful when verifying say 5 solutions is significantly cheaper than coming up with one yourself.
Deep learning thus far is difficult to engineer with - this is still true, e.g. deep learning failed to solve self-driving ~10 years ago
---
So Marcus is not wrong, and has nothing to apologize for. The scaling enthusiasts were not exactly wrong either, and we'll see what happens to their companies.
It does seem similar to the dot-com bubble - when the dust cleared, real value was created. But you can also see that the marketing was very self-serving.
Stuff like "AGI 2027" will come off poorly -- it's an attempt by people with little power to curry favor with powerful people. They are serving as the marketing arm, and oddly not realizing it.
"AI will write all the code" will also come off poorly. Or at least we will realize that software creation != writing code, and software creation is the valuable activity
>Deep learning thus far is shallow and has limited capacity for transfer - yes, Sutskever is saying that deep learning doesn't generalize as well as humans
But they do generalize to some extent, and my limited understanding is that they generalize way more than expected ("emergent abilities") from the pre-LLM era, when this prediction was made. Sutskever pretty much starts the podcast saying "Isn’t it straight out of science fiction?"
Now Gary Marcus says "limited capacity for transfer" so there is wiggle room there, but can this be quantified and compared to what is being seen today?
In the absence of concrete numbers, I would suspect he is wrong here. I mean, I still cannot mechanistically picture in my head how my intent, conveyed in high-level English, can get transformed into working code that fits just right into the rather bespoke surrounding code. Beyond coding, I've seen ChatGPT detect sarcasm in social media posts about truly absurd situations.
At some level, it is extracting abstract concepts from its training data, as well as from my prompt and the test data (which is probably outside the distribution of the training data), even applying appropriate value judgements to those concepts where suitable, and combining everything properly to generate a correct response. These are much higher-level concepts than the ones Marcus says deep learning has no grasp of.
Absent quantifiable metrics, on a qualitative basis at least I would hold this point against him.
On a separate note:
> "AI will write all the code" will also come off poorly.
On the contrary, I think it is already true (cf. agentic spec-driven development). Sure, there are the hyper-boosters who were expecting software engineers to be replaced entirely, but looking back, claims from Dario, Satya, Pichai and their ilk were all about "writing code" and not "creating software." They understand the difference and in retrospect were being deliberately careful in their wording while still aiming to create a splash.
Agreed on all points. Let's see some numerical support.
He wasn't wrong though.
> Yet deep learning may well be approaching a wall, much as I anticipated earlier, at beginning of the resurgence (Marcus, 2012)
(From "Deep Learning: A Critical Appraisal")
A system capable of self improvement will be sufficient for AGI imo.
I guess there is some truth to it. The last big improvement to LLMs was reasoning. It gave the existing models additional capabilities (after some re-training).
We've reached the plateau of tiny incremental updates. Like with smartphones. I sometimes still use an iPhone 6s. There is no fundamental difference compared to the most current iPhone generation 10 years later. The 6s is still able to perform most of the tasks you need a smartphone to do. The new ones do it much faster, and everything works better, but the changes are not disruptive at all.
The feeling I have now is that it was a fine decision for him to have made. It made a point at the time, perhaps moral, perhaps political. And now it seems, despite whatever cost there was for him at the time, the "golden years" of OpenAI (and LLM's in general) may have been over anyway.
To be sure, I happen to believe there is a lot of mileage for LLMs even in their current state—a lot of use-cases, integration we have yet to explore. But Sutskever I assume is a researcher and not a plumber—for him the LLM was probably over.
One wonders how long before one of these "breakthroughs". On the one hand, they may come about serendipitously, and serendipity has no schedule. It harkens back to when A.I. itself was always "a decade away". You know, since the 1950s or so.
On the other hand, there are a lot more eyeballs on AI these days than there ever were in Minsky's* day.
(*Hate to even mention the man's name these days.)
Indeed. Humans are suckers for a quick answer delivered confidently. And the industry coalesced around LLMs once they were able to output competent, confident, corporate (aka HR-approved) English, which for many AI/DL/ML/NN researchers was actually a bit of a bummer. The reason I say that is because that milestone suddenly made "[AGI is] always a decade away" seem much more imminent. Thus the focus of investment in the space shifted from actual ML/DL/NN research to who could convert the largest pile of speculatively leveraged money into pallets of GPUs and data to feed them, as "throw more compute/data at it" was a quicker/more reliable way to realize performance gains than investing in research. Yes, research would inevitably yield results, but it's incredibly hard to forecast how long it takes for research to yield tangible results, and harder still to quantify that X dollars will produce Y result in Z time, compared to X dollars buys Y compute deployed in Z time. With the immense speculation-backed FOMO and the potential valuation/investment that could result from being "the leader" in any given regard, it's no wonder that Big Tech chose to primarily invest in the latter, thus leaving those working in the former space to start considering looking elsewhere to continue actual research.
Wasted money is a totally different topic. If we view LLMs as a business opportunity, they haven't yet paid off. To imply, however, that a massive investment in GPUs is a waste seems flawed. GPUs are massively parallel compute. Were the AI market to collapse, we can imagine these GPUs being sold at severe discounts, which would then likely spur some other technological innovation, just as the crypto market laid the groundwork for ML/AI. When a resource gets cheap, more people gain access to it and innovation occurs. Things that were previously cost-prohibitive become affordable.
So, whether or not we humans achieve AGI or make tons of money off of LLMs is somewhat irrelevant. The investment is creating goods of actual value even if those goods are currently overpriced, and should the currently intended use prove to be poor, a better and more lucrative use will be found in the event of an AI market crash.
Personally, I hope that the AGI effort is successful, and that we can all have a robot house keeper for $30k. I'd gladly trade one of the cars in my household to never do dishes, laundry, lawnmowing, or household repairs again just as I paid a few hundred to never have to vacuum my floors (though I actually still do once a month when I move furniture to vacuum places the Roomba can't go, a humanoid robot could do that for me).
It's weird to me. Surely you recognise that, just as they don't know what they don't know (which is presumably the problem when they hallucinate), you must also have the same issue; there's just no old greybeard around to wish you didn't have access.
My humble take on AGI is that we don't understand consciousness so how could we build something conscious except by accident? It seems like an extremely risky and foolish thing to attempt. Luckily, humans will fail at it.
"creating goods of actual value"
and
"creating goods of actual value for any price"
I don't think it's controversial that these things are valuable but rather the cost to produce use things is up for discussion, and the real problem here. If the price is too high now, then there will be real losses people experience down the line, and real losses have real consequences.
Especially given the humungous scale of infrastructure that the current approach requires. Is there another line of technology that would require remotely as much?
Note, I'm not saying there can't be. It's just that I don't think there are obvious shots at that target.
Even with strong adoption it may take many years for LLMs available now to reach their potential utility in the economy. This should moderate the outlook for future changes, but instead we have a situation where the speculative MIT study that predicted "AI" could perform 12% of the work in the economy is widely considered to not only be accurate, but inevitable in the short term. How much time is needed dramatically changes calculations of potential and what might be considered waste.
Also worth keeping in mind that the Y2K tech bust left behind excess network capacity that ended up being useful later, but the LLM boom can be expected to leave behind poorly considered data centers full of burned out chips which is a very different legacy.
I am sure some of you are thinking "that is all slop code". It definitely can be if you don't do your due diligence in review. Where I currently work, we have definitely seen a bifurcation between devs who do that and those who don't.
But by far the biggest gain is my mental battery is far less drained at the end of the day. No task feels soul crushing anymore.
Personally, coding agents are the greatest invention of my lifetime outside the emergence of the internet.
One point about this is that humans appear unable to understand that this is an efficient outcome, because investment booms are a product of uncertainty around the nature of the technological change. You are building something that is literally completely new: no-one had any idea what cars consumers would buy, so lots of companies started trying to work that out, and that consolidated into competition on cost/scale once it became clear. There is no way to skip to the end of that process, and there are many people outside the sphere of business who are heavily incentivized to say that we (meaning bureaucrats and regulators) actually knew what kind of cars consumers wanted and that all the investment was just a waste.
Another point is that technological change is very politically disruptive. This was a point that wasn't well appreciated...but is hopefully clear with social media. There are a large number of similar situations in history though: printing press, newspapers, etc. Technological change is extremely dangerous if you are a politician or regulator because it results in your power decreasing and, potentially, your job being lost. Again, the incentives are huge.
The other bizarre irony is that people will look at an investment boom with no technological change, one that was a response to government intervention in financial markets and a malfunctioning supply-side economy...and their response is: all forms of technical innovation are destabilizing, investment booms are very dangerous, etc. When what they mean is that corporations with good political connections might lose money.
This is also linked inherently to the view around inflation. The 1870s are regarded as one of the most economically catastrophic periods in economic history by modern interpretations of politics. Let me repeat this in another way: productivity growth was increasing by 8-10%/year, you saw mind-boggling gains from automation (one example is cigarettes, iirc it took one skilled person 10-20 minutes to create a cigarette, a machine was able to produce hundreds in a minute), and conventional macroeconomics views this as bad because...if you can believe it...they argue that price declines lead to declines in investment. Now compare to today: prices continue to rise, investment is (largely) non-existent, shortages in every sector. Would you build a factory in 1870 knowing you could cut prices for output by 95% and produce more? The way we view investment is inextricably linked in economic policy to this point of view, and is why the central banks have spent trillions buying bonds with, in most cases, zero impact on real investment (depending on what you mean, as I say above, private equity and other politically connected incumbents have made out like bandits...through the cycle, the welfare gain from this is likely negative).
You see the result of this all over the Western world: shortages of everything, prices sky-high, and when technological change happens the hysteria around investment being wasteful and disruptive. It would be funny if we didn't already see the issues with this path all around us.
It is not wasted; we need more of this. This ex-post, academic-style reasoning about everything in hindsight gets us nowhere. There is no collateral damage; even in the completely fake Fed-engineered housing bubble, the apparently catastrophic cost was: more houses, and some wealthy people lost their net worth (before some central bankers found out their decisions in 03-04 caused wealthy people to lose money, and quickly set about recapitalising their brokerage accounts with taxpayers' money).
Many people today find LLMs useful, as evidenced by willingness to pay to use them. That alone IMO largely demolishes the “wasted” argument; markets will find the right price vs cost vs value trade-off when necessary.
The author presupposes that the only goal of all the LLM investment that has been done is to achieve AGI.
Certainly the big players are seeking AGI but they’re also businesses trying to make money, or at least not lose huge amounts of money. So we see a lot of market experiments trying to figure out where and how LLMs can provide value for paying customers.
Of course, he includes enough weasel phrases that you could never nail him down on any particular negative sentiment; LLMs aren’t bad, they just need to be “complemented”. But even if we didn’t have context, the whole thesis of the piece runs completely counter to this — you don’t “waste” a trillion dollars on something that just needs to be complemented!
FWIW, I totally agree with his more mundane philosophical points about the need to finally unify the work of the Scruffies and the Neats. The problem is that he frames it like some rare insight that he and his fellow rebels found, rather than something that was being articulated in depth by one of the field's main leaders 35 years ago[1]. Every one of the tens of thousands of people currently working on “agential” AI knows it too, even if they don’t have the academic background to articulate it.
I look forward to the day when Mr. Marcus can feel like he’s sufficiently won, and thus get back to collaborating with the rest of us… This level of vitriolic, sustained cynicism is just antithetical to the scientific method at this point. It is a social practice, after all!
[1] https://www.mit.edu/~dxh/marvin/web.media.mit.edu/~minsky/pa...
This is a frankly bizarre argument. Firstly, it presupposes that the _only_ way AI becomes useful is if it turns into AGI. But that isn't true: existing LLMs can do a variety of economically valuable tasks, such as coding, even while not being AGI. Perhaps the economic worth of non-AGI will never equal what it costs to build and operate it, but it seems way too early to make that judgement and declare any non-AGI AI worthless.
Secondly, even if scaling alone won't reach AGI, that doesn't mean that you can reach AGI _without_ scaling. Even when new and better architectures are developed, it still seems likely that, between two models with an equivalent architecture, the one with more data and compute will be more powerful. And waiting for better architectures before you try to scale means you will never start. 50 years from now, researchers will have much better architectures. Does that mean we should wait 50 years before trying to scale them? How about 100 years? At what point do you say, we're never going to discover anything better, so now we can try scaling?
https://lpeproject.org/events/the-accumulation-of-waste-a-po...
Perhaps the scale is unprecedented, or it's always been like this and it's just much less concealed these days.
Absolute retards can waste trillions of dollars on stupid ideas, because they're in the in-group. Next door, someone who's worked their whole life gets evicted because their mortgage is now way more than what they make in salary.
Sucks to be in the out group!
If we find AGI needs a different chip architecture, yeah, LLMs would have been quite a waste.
"A trillion dollars is a terrible thing to waste"
cf. "A mind is terrible thing to waste"
https://en.wikipedia.org/wiki/UNCF