Claude for Excel
Mood
heated
Sentiment
mixed
Category
other
Key topics
Anthropic's 'Claude for Excel' aims to integrate AI into Excel for tasks like debugging and formula creation, sparking debate among commenters about its potential benefits and risks.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 23m after posting
Peak period: 154 comments (Day 1)
Avg / period: 32 comments
Based on 160 loaded comments
Key moments
- Story posted: Oct 27, 2025 at 12:09 PM EDT (about 1 month ago)
- First comment: Oct 27, 2025 at 12:32 PM EDT (23m after posting)
- Peak activity: 154 comments in Day 1 (hottest window of the conversation)
- Latest activity: Nov 1, 2025 at 1:03 PM EDT (25 days ago)
LLMs are not deterministic.
I'd argue over the short term humans are more deterministic. I ask a human the same question multiple times and I get the same answer. I ask an LLM and each answer could be very different depending on its "temperature".
But I agree with the sentiment. It seems it is more important than ever to agree on what it means to understand something.
Only a matter of time before someone does it though.
AFAIK there is no 'git for Excel to diff and undo', especially not built-in (aka 'for free' both cost-wise and add-ons/macros not allowed security-wise).
My limited experience has been that it is difficult to keep LLMs from changing random things besides what they're asked to change, which could cause big problems if it goes undetected in Excel.
Unlike code, where everything is on display, all these formulas are hidden in their cells: you won't see the problem unless you click on the cell, so you'll have a hard time finding the cause.
Little stuff like splitting text more intelligently or following the formatting seen elsewhere would be very satisfying.
Same deal there -- the original author was a genius and was the only person who knew how it was set up or how it worked.
What I’m saying is that if you really believed we were 2, maybe 3 years tops from AGI or the singularity or whatever, you would spend 0 effort serving a domain that is already served by 3rd parties that are already using your models! An Excel wrapper for an LLM isn’t exactly cutting-edge AI research.
They’re desperate to find something that someone will pay a meaningful amount of money for that even remotely justifies their valuation and continued investment.
Being able to select a few rows and then use plain language to describe what I want done is a time saver, even though I could probably muddle through the formulas if I needed to.
- stop using the free plan
- don't use Gemini Flash for these tasks
- learn how to do things over time, and know that all AI models have improved significantly every few months
What's the month-over-month improvement if the current state is "creates entirely fake data that looks convincing"? As a user it's hard to tell when we've hit the point of this being a useful feature. The old-timey metric would normally be that when a company rolls out a new feature it's usually mostly functional; that doesn't appear to be the case here at all, so what's the sign?
It is an entire agent loop. You can ask it to build a multi sheet analysis of your favorite stock and it will. We are seeing a lot of early adopters use it for financial modeling, research automation, and internal reporting tasks that used to take hours.
To see something much more powerful on Google Sheets than Gemini for free, you can add "try@tabtabtab.ai" to your sheet, and make a comment tagging "try@tabtabtab.ai" and see it in action.
If that is too much just go to ttt.new!
I would’ve expected “make a vlookup or pivot table that tells me x” or “make this data look good for a slide deck” to be easier problems to solve.
For easy spreadsheet stuff (which is what 80% of average white-collar workers are doing when using Excel) I’d imagine the same approach. Try to do what I want, and even if you’re half wrong, the good 50% is still worth it and a better starting point.
Vibe coding an app is like vibe coding a “model in excel”. Sure you could try, but most people just need to vibe code a pivot table
Thousands of unreported COVID cases: https://news.ycombinator.com/item?id=24689247
Thousands of errors in genetics research papers: https://news.ycombinator.com/item?id=41540950
Wrong winner announced in national election: https://news.ycombinator.com/item?id=36197280
Countries across the world implement counter-productive economic austerity programs: https://en.wikipedia.org/wiki/Growth_in_a_Time_of_Debt#Metho...
https://docs.claude.com/en/docs/about-claude/models/overview
Spend a few years in an insurance company, a manufacturing plant, or a hospital, and then the assertion that the frontier labs will figure it out appears patently absurd. (After all, it takes humans years to understand just a part of these institutions, and they have good-functioning memory.)
This belief that tier 5 is useless is itself a tell of a vulnerability: the LLMs are advancing fastest in domain-expertise-free generalized technical knowledge; if you have no domain expertise outside of tech, you are most vulnerable to their march of capability, and it is those with domain expertise who will rely increasingly less on those who have nothing to offer but generalized technical knowledge.
That OpenAI is now apparently striving to become the next big app-layer company could hint at George Hotz being right, but only if the bets work out. I‘m glad that there is competition at the frontier-lab tier.
I don’t think the frontier labs have the bandwidth or domain knowledge (or dare I say skills) to do tier 5 tasks well. Even their chat UIs leave a lot to be desired and that should be their core competency.
However I would think more of elite data centers rather than commodity data centers. That's because I see Tier 4 being deeply involved in their data centers and thinking of buying the chips to feed their data centers. I wouldn't be so inclined to throw in my opinion immediately if I found an article showing this ordering of the tiers, but being a tweet of a podcast it might have just been a rough draft.
My wife works in insurance operations - everyone she manages from the top down lives in Excel. For line employees a large percentage of their job is something like "Look at this internal system, export the data to excel, combine it with some other internal system, do some basic interpretation, verify it, make a recommendation". Computer Use + Excel Use isn't there yet...but these jobs are going to be the first on the chopping block as these integrations mature. No offense to these people but Sonnet 4.5 is already at the level where it would be able to replicate or beat the level of analysis they typically provide.
It's one thing to fudge the language in a report summary, it can be subjective, however numbers are not subjective. It's widely known LLMs are terrible at even basic maths.
Even Google's own AI summary admits it which I was surprised at, marketing won't be happy.
Yes, it is true that LLMs are often bad at math because they don't "understand" it as a logical system but rather process it as text, relying on pattern recognition from their training data.
- Log in to the internal system that handles customer policies
- Find all policies that were bound in the last 30 days
- Log in to the internal system that manages customer payments
- Verify that for all policies bound, there exists a corresponding payment that roughly matches the premium.
- Flag any divergences above X% for accounting/finance to follow up on.
Practically this involves munging a few CSVs, maybe typing in a few things, setting up some XLOOKUPs, IF formulas, conditional formatting, etc.
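The reconciliation the commenter describes can be sketched in plain Python; the field names and the 5% divergence threshold here are illustrative assumptions, not anything from an actual insurance system:

```python
# Sketch of the policy/payment reconciliation described above.
# Record shapes, field names, and the 5% tolerance are assumptions.

def reconcile(policies, payments, tolerance=0.05):
    """Flag policies whose payment diverges from the premium by more than `tolerance`."""
    paid = {p["policy_id"]: p["amount"] for p in payments}
    flagged = []
    for pol in policies:
        amount = paid.get(pol["policy_id"])
        if amount is None:
            flagged.append((pol["policy_id"], "no payment found"))
            continue
        divergence = abs(amount - pol["premium"]) / pol["premium"]
        if divergence > tolerance:
            flagged.append((pol["policy_id"], f"diverges {divergence:.1%}"))
    return flagged

policies = [{"policy_id": "A1", "premium": 1000.0},
            {"policy_id": "B2", "premium": 500.0}]
payments = [{"policy_id": "A1", "amount": 990.0}]  # B2 was never paid

print(reconcile(policies, payments))  # → [('B2', 'no payment found')]
```

This is essentially what the XLOOKUP-plus-IF-formula workflow computes; the point is that the matching logic is small and checkable even though assembling it by hand in Excel takes real time.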
Will AI replace the entire job? No...but that's not the goal. Does it have to be perfect? Also no...the existing employees performing this work are also not perfect, and in fact sometimes their accuracy is quite poor.
The one thing LLMs should consistently do is ensure that formatting is correct, which will help greatly in the checking process. But no, I generally don't trust them to do sensible things with basic formulas. Not a week ago, GPT-5 got confused about whether a plus or a minus was necessary in the basic question "I'm 323 days old, when is my birthday?"
My concern would be more with how to check the work (ie, make sure that the formulas are correct and no columns are missed) because Excel hides all that. Unlike code, there's no easy way to generate the diff of a spreadsheet or rely on Git history. But that's different from the concerns that you have.
The UX of spreadsheet diffs is a hard one to solve because of how weird the calculation loops are and how complicated the relationship between fields might be.
I've never tried to solve this for a real end user before in a generic way - all my past work here was for internal ability to audit changes and rollback catastrophes. I took a lot of shortcuts by knowing which cells are input data vs various steps of calculations -- maybe part of your ux is being able to define that on a sheet by sheet basis? Then you could show how different data (same formulas) changed outputs or how different formulas (same data) did differently?
Spreadsheets are basically weird app platforms at this point so you might not be able to create a single experience that is both deep and generic. On the other hand maybe treating it as an app is the unlock? Get your AI to noodle on what the whole thing is for, then show diff between before and after stable states (after all calculation loops stabilize or are killed) side by side with actual diffs of actual formulas? I feel like Id want to see a diff as a live final spreadsheet and be able to click on changed cells and see up the chain of their calculations to the ancestors that were modified.
Fun problem that sounds extremely complicated. Good luck distilling it!
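A bare-bones version of the formula diff discussed above, assuming the two workbook states have already been read into `{cell: formula}` mappings (e.g. via a library like openpyxl), might look like this sketch:

```python
def diff_formulas(before, after):
    """Compare two {cell: formula} snapshots; report changed, added, and removed cells
    as cell -> (old_formula_or_None, new_formula_or_None)."""
    changes = {}
    for cell in before.keys() | after.keys():
        old, new = before.get(cell), after.get(cell)
        if old != new:
            changes[cell] = (old, new)
    return changes

# Hypothetical before/after snapshots of a sheet's formulas.
before = {"B2": "=SUM(A1:A10)", "C2": "=B2*0.05"}
after  = {"B2": "=SUM(A1:A12)", "C2": "=B2*0.05", "D2": "=C2+B2"}

for cell, (old, new) in sorted(diff_formulas(before, after).items()):
    print(f"{cell}: {old!r} -> {new!r}")
```

This only covers the easy half of the problem; the hard UX questions raised above (separating input cells from derived cells, tracing a changed output back through its ancestor calculations) sit on top of a cell-level diff like this.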
> Most Excel work is similar to basic coding so I think this is an area where they might actually be pretty well suited.
This is a hot take. One I'm not sure many would agree with.
Excel is similar to coding in BASIC, a giant hairy ball of tangled wool.
The model ought to be calling out to some sort of tool to do the math—effectively writing code, which it can do. I'm surprised the major LLM frontends aren't always doing this by now.
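One common shape for such a tool call — a sketch only, not any vendor's actual implementation — is to have the model emit an arithmetic expression as text and evaluate it with a real engine, so the digits come from computation rather than token sampling:

```python
import ast
import operator

# Minimal safe arithmetic evaluator: the kind of "calculator tool" an LLM
# could call instead of doing digit-by-digit math in generated text.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.Pow: operator.pow, ast.USub: operator.neg}

def calc(expr: str) -> float:
    """Evaluate a plain arithmetic expression, rejecting anything else."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval").body)

print(calc("1234.5 * 0.0725"))  # deterministic, unlike sampled tokens
```

The `ast`-based whitelist matters: evaluating model-produced text with a bare `eval` would be an obvious security hole.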
In JavaScript (and I assume most other programming languages) this is the job of static analysis tools (like eslint, prettier, typescript, etc.). I’m not aware of any LLM-based tool that performs static analysis with results as good as the traditional tools. Is static analysis not a thing in the spreadsheet world? Are the tools that do static analysis on spreadsheets subpar, or do they have some disadvantage not seen in other programming languages? And if so, are LLMs any better?
LLMs are a lossy validation, and while they work sometimes, when they fail they usually do so 'silently'.
The more complicated the spreadsheet and the more dependencies it has, the greater the room for error. These are probabilistic machines. You can use them, I use them all the time for different things, but you need to treat them like employees you can't even trust to copy a bank account number correctly.
Besides, using AI is an exercise in a "trust but verify" approach to getting work done. If you asked a junior to do the task you'd check their output. Same goes for AI.
I hate smartsheet…
Excel or R. (Or more often, regex followed by pen and paper followed by more regex.)
Handing them regex would be like giving a monkey a bazooka
Actually, yes. This kind of management reporting is either (1) going to end up in the books and records of the company - big trouble if things have to be restated in the future or (2) support important decisions by leadership — who will be very much less than happy if analysis turns out to have been wrong.
A lot of what ties up the time of business analysts is ticking and tying everything to ensure that mistakes are not made and that analytics and interpretations are consistent from one period to the next. The math and queries are simple - the details and correctness are hard.
Sometimes there can be an advantage in leading or lagging some aspects of internal accounting data for a time period. Basically sitting on credits or debits to some accounts for a period of weeks. The tacit knowledge to know when to sit on a transaction and when to action it is generally not written down in formal terms.
I'm not sure how these shenanigans will translate into an ai driven system.
This worked famously well for Enron.
Take your own advice.
This is basic business and engineering 101.
Well said. Concise and essentially inarguable, at least to the extent it means LLMs are here to stay in the business world whether anyone likes it or not (barring the unforeseen, e.g. regulation or another pressure).
For example, if I ask you to tabulate orders via a query but you forgot to include an entire table, this is a major error of process but the query itself actually is consistently error-free.
Reducing error and mistakes is very much modeling where error can happen. I never trust an LLM to interpret data from a spreadsheet because I cannot verify every individual result, but I am willing to ask an LLM to write a macro that tabulates the data because I can verify the algorithm and the macro result will always be consistent.
Using Claude to interpret the data directly for me is scary because those kinds of errors are neither verifiable nor consistent. At least with the “missing table” example, that error may make the analysis completely bunk but once it is corrected, it is always correct.
Yeah, but it could be perfect, why are there humans in the loop at all? That is all just math!
I have personally worked with spreadsheet based financial models that use 100k+ rows x dozens of columns and involve 1000s of formulas that transform those data into the desired outputs. There was very little tolerance for mistakes.
That said, humans, working in these use cases, make mistakes >0% of the time. The question I often have with the incorporation of AI into human workflows is, will we eventually come to accept a certain level of error from them in the way we do for humans?
For cases where that is not available, we should use a human and never an LLM.
I had a big backlog of "nice to have scripts" I wanted to write for years, but couldn't find the time and energy for. A couple of months after I started using Claude Code, most of them exist.
Just a suspicion.
Not just in a spreadsheet — any kind of deterministic work at all.
Find me a reliable way around this. I don't think there is one. MCP/functions are a band-aid and not consistent enough when precision is important.
After almost three years of using LLMs, I have not found a single case where I didn't have to review the output, which takes as long or longer than doing it by hand.
ML/AI is not my domain, so my knowledge is not deep nor technical; this is just my experience. Do we need a new architecture to solve these problems?
In Excel, it's possible to just ad hoc adjust things and make it up as you go. It's not clean but very adaptable and flexible.
This is talking about applying LLMs to formula creation and references, which they are actually pretty good at. Definitely not about replacing the spreadsheet's calculation engine.
Why are we suddenly ok with giving every underpaid and exploited employee a foot gun and expect them to be responsible with it???
If your experience with the lowest, most-abused employees is better than mine, I envy you.
Rightly so! But LLMs can still make you faster. Just don't expect too much from it.
Spreadsheets work because the user sees the results of complex interconnected values and calculations. For the user, that complexity is hidden away and left in the background. The user just sees the results.
This would be a nightmare for most users to validate what changes an LLM made to a spreadsheet. There could be fundamental changes to a formula that could easily be hidden.
For me, that's the concern with spreadsheets and LLMs, and it's just as much a concern with spreadsheets themselves. Try collaborating with someone on a spreadsheet for modeling and you’ll know how frustrating it can be to figure out what changes were made.
High precision is possible because they can achieve it through multiple cross-validations.
Claude for Excel isn't doing maths. It's doing Excel. If the llm is bad at maths then teaching it to use a tool that's good at maths seems sensible.
I was thinking along the same lines, but I could not articulate as well as you did.
Spreadsheet work is deterministic; LLM output is probabilistic. The two should be distinguished.
Still, it's a productivity boost, which is always good.
Now, granted, that can also happen because Alex fat-fingered something in a cell, but that's something that's much easier to track down and reverse.
Privatized insurance will always find a way to pay out less if it can get away with it. It is just the nature of having the trifecta of profit motive, socialized risk, and light regulation.
It's the nature of everything. They agree to pay you for something. It's nothing specific to "profit motive" in the sense you mean it.
There are many other entity types — unions[1], cooperatives, public-sector companies, quasi-governmental entities, PBCs, nonprofits — that all offer insurance and can occasionally do it well.
We even have some in the US, and we don't consider them communism: the FDIC, or things like Social Security and unemployment insurance.
At some level, isn't government and taxation itself nothing but insurance? We agree to pay taxes to mitigate a variety of risks, including foreign invasion, or smaller things like getting robbed on the street.
[1] Historically worker collectives or unions self-organized to socialize the risks of both major work ending injuries or death.
Armies from ancient to modern times have operated because of this insurance. The two ingredients that made them not mercenaries: a form of long-term insurance benefit (education, pension, land, etc.) for themselves or their family members in the event of death, and sovereign immunity for their actions.
Source?
That's a feature, not a bug.
We also have to remember all claims aren't equal, i.e. some claims end up being way costlier than others. You can achieve similar % margin outcomes by adding a ton of friction: preconditions, multiple appeals processes, prior authorization for prior authorization, reviews by administrative doctors who have no expertise in the field being reviewed and don't have to disclose their identity, and so on.
While the U.S. system is the most extreme, or most evolved, it is not unique; it is what you get when you privatize insurance. Any country with private insurance has some lighter version of this and is on the same journey.
Not that public health systems or insurance a la the NHS in the UK, or Germany's, work well either: they are underfunded and mismanaged, with wait times of months to see a specialist, and so on.
We have to choose our poison. Unless you are rich, of course; then the U.S. system is by far the best, and people travel to the U.S. to get the kind of care that is not possible anywhere else.
I disagree with the statement that healthcare insurance is predominantly privatized in the US: Medicare and Medicaid, at least in 2023, outspent private plans for healthcare spending by about ~10% [1]; this is before accounting for government subsidies for private plans. And boy, does America have a very unique relationship with these programs.
https://www.healthsystemtracker.org/chart-collection/u-s-spe...
John Oliver had an excellent segment coincidentally yesterday on this topic.
While the government pays for it, it is not managed or run by the government, so how should we classify the program: public or private?
My takeaway is that as public health costs overtake private insurance, while doing a better job of controlling costs per enrollee, it makes more and more sense just to have the government insure everyone.
I can't see what argument the private insurers have in their favor.
So obviously the company that prioritizes accuracy of coverage decisions by spending money on extra labor to audit itself is wasting money. Which means insureds have to waste more time getting the payment for healthcare they need.
More compliance or reporting requirements usually tend to favor the larger existing players who can afford to do it and that is also used to make the life difficult and reject more claims for the end user.
It is the kind of thing that keeps you and me busy; major investors don't care about it at all. The cost of the compliance, or of the lack of it, is no more than a rounding error on the balance sheet, and the fines or penalties are puny and laughable.
The enormous profits year on year for decades now, the amount of consolidation allowed in the industry show that the industry is able to do mostly what they want pretty much, that is what I meant by light regulation.
https://riskandinsurance.com/us-pc-insurance-industry-posts-...
Meta alone made $62bln in 2024: https://investor.atmeta.com/investor-news/press-release-deta...
So it's weird to see folks on a tech site talking about how enormous all the profits are in health insurance, and citations with numbers would be helpful to the discussion.
I worked in insurance-related tech for some time, and the providers (hospitals, large physician groups) and employers who actually pay for insurance have signficant market power in most regions, limiting what insurers can charge.
Some people - normal people - understand the difference between the holistic experience of a mathematically informed opinion and an actual model.
It's just that normal people always wanted the holistic experience of an answer. Hardly anyone wants a right answer. They have an answer in their heads, and they want a defensible journey to that answer. That is the purpose of Excel in 95% of places it is used.
Lately people have been calling this "sycophancy." This was always the problem. Sycophancy is the product.
Claude Excel is leaning deeply into this garbage.
299 more comments available on Hacker News