Zed's Pricing Has Changed: LLM Usage Is Now Token-Based
Posted 3 months ago · Active 3 months ago
zed.dev · Tech story · High profile
Tone: heated, mixed · Debate score: 80/100
Key topics
AI-Assisted Coding Tools
Pricing Models
Token-Based Pricing
Zed, a coding editor, has changed its pricing model to token-based for LLM usage, sparking debate among users about the fairness and implications of this change.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion · First comment: 18m
Peak period: 66 comments in 0-2h · Avg per period: 11.4
Comment distribution: 160 data points (based on 160 loaded comments)
Key moments
1. Story posted: Sep 24, 2025 at 12:13 PM EDT (3 months ago)
2. First comment: Sep 24, 2025 at 12:31 PM EDT (18m after posting)
3. Peak activity: 66 comments in 0-2h (the hottest window of the conversation)
4. Latest activity: Sep 25, 2025 at 2:38 PM EDT (3 months ago)
ID: 45362425 · Type: story · Last synced: 11/20/2025, 8:56:45 PM
I'm a big fan of Zed but tbf I'm just using Claude Code + Nvim nowadays. Zed's problem with their Claude integration is that it will never be as good as just using the latest from Claude Code.
The integration in Zed is limited by what the Claude Code SDK exposes. Since about half of the /commands are missing from the SDK, they don’t show up in Zed.
I think ACP was a good strategic move by Zed, but all I personally really need is Claude Code in a terminal pane with diffs of proposed edits in the absolutely wonderful multibuffer view
What is the core business plan then?
I understand that there's nothing you could do to protect me if I make a prompt that ends up using >$5 of usage but after that I would like Zed to reject anything except my personal API keys.
https://zed.dev/docs/ai/plans-and-usage#usage-spend-limits
Though from everything I've read online, Zed's edit prediction model is far, _far_ behind that of Cursor.
Usage pricing on something like aws is pretty easy to figure out. You know what you're going to use, so you just do some simple arithmetic and you've got a pretty accurate idea. Even with serverless it's pretty easy. Tokens are so much harder, especially when using it in a development setting. It's so hard to have any reasonable forecast about how a team will use it, and how many tokens will be consumed.
I'm starting to track my usage with a bit of a breakdown in the hope that I'll find a somewhat reliable trend.
I suspect this is going to be one of the next big areas in cloud FinOps.
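The forecasting problem described above can be made concrete with a little arithmetic. This is a minimal sketch with entirely hypothetical per-token prices and usage numbers, just to show how the estimate is built up:

```python
# Rough monthly-cost forecast for a team's LLM usage.
# All prices and usage figures below are hypothetical placeholders.

def monthly_cost(devs, sessions_per_day, in_tok, out_tok,
                 price_in_per_m=3.00, price_out_per_m=15.00, workdays=22):
    """Estimate monthly spend in dollars for a team."""
    daily_in = devs * sessions_per_day * in_tok
    daily_out = devs * sessions_per_day * out_tok
    cost_per_day = (daily_in / 1e6) * price_in_per_m \
                 + (daily_out / 1e6) * price_out_per_m
    return cost_per_day * workdays

# A 5-person team, 20 agent sessions a day each,
# ~50k input and ~5k output tokens per session:
print(round(monthly_cost(5, 20, 50_000, 5_000), 2))  # 495.0
```

The arithmetic itself is trivial; the hard part the commenter identifies is that `sessions_per_day` and the token counts per session are exactly the inputs nobody can predict for a development team.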
I don't see how it matters to you that you aren't saturating your $200 plan. You have it because you hit the limits of the $100/mo plan.
There is probably a lot of value in predictability. Meaning it might be viable, on a $200 plan, to offer more than $200 worth of tokens.
https://github.com/anthropics/claude-code/issues/1109
This is only true if you can find someone else selling them at cost.
If a company has a product that cost them $150, but they would ordinarily sell piecemeal for a total of $250, getting a stable recurring purchase at $200 might be worthwhile to them while still being a good deal for the customer.
When you get Claude Code's $20 plan, you get "around 45 messages every 5 hours". I don't really know what that means. Does that mean I get 45 total conversations? Do minor followups count against a message just as much as a long initial prompt? Likewise, I don't know how many messages I'll use in a 5 hour period. However, I do understand when I start bumping up against limits. If I'm using it and start getting limited, I understand that pretty quickly - in the same way that I might understand a processor being slower and having to wait for things.
With tokens, I might blow through a month's worth of tokens in an afternoon. On one hand, it makes more sense to be flexible for users. If I don't use tokens for the first 10 days, they aren't lost. If I don't use Claude for the first 10 days, I don't get 2,160 message credits banked up. Likewise, if I know I'm going on vacation later, I can't use my Claude messages in advance. But it's just a lot easier for humans to understand bumping up against rate limits over a more finite period of time and get an intuition for what they need to budget for.
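The 2,160 figure above checks out; here is the arithmetic spelled out:

```python
# Sanity-check the "2,160 banked messages" figure from the comment:
# 45 messages per 5-hour window, windows running continuously for 10 days.
messages_per_window = 45
window_hours = 5
days = 10

windows = days * 24 / window_hours        # 48 five-hour windows in 10 days
banked = windows * messages_per_window    # messages that would have "accrued"
print(int(banked))  # 2160
```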
My mental model is they’re assigning some amount of API credits to the account and billing the same way as if you were using tokens, shutting off at an arbitrary point. The point also appears to change based on load / time of day.
Edit: To be clear, I'm not talking about Zed. I'm talking about the companies that make the models.
I unfortunately have seen many AI-based tools being demoed with this approach. The goal is clearly to monetize every user action while piggybacking off of models provided by a third-party. The gross thing is that leadership from the director level up LOVES these demos, even when the models very clearly fuck up in the demo.
AI: "I have cleaned the formatting for all 4,650 records in your sample XML files. Let me know if there's anything else I can do to help!"
Me: "There are over 25,000 records in that data..."
AI: "You're absolutely right!"
https://forstarters.substack.com/p/for-starters-59-on-credit...
It already is. There’s been a lot of talk and development around FinOps for AI and the challenges that come with that. For companies, forecasting token usage and AI costs is non-trivial for internal purposes. For external products, what’s the right unit economic? $/token, $/agentic execution, etc? The former is detached from customer value, the latter is hard to track and will have lots of variance.
With how variable output size can be (and input), it’s a tricky space to really get a grasp on at this point in time. It’ll become a solved problem, but right now, it’s the Wild West.
The hard work is the high level stuff like deciding on the scope of the project, how it should fit in to the project, what kind of extensibility the feature might need to be built with, what kind of other components can be extended to support it, (and more), and then reviewing all the work that was done.
Agree with the sentiment, but I do think there are edge cases.
e.g. I could see a place like openrouter getting away with a tiny fractional markup based on the value they provide in the form of having all providers in one place
At scale, OpenRouter will instead get you the lower high-volume fees they themselves get from their different providers.
Why does that not justify charging a fraction of your spend on the LLM platform? This is pretty much how every service business operates.
This is not a new concern. And is not unique to Zed.
AI, just like cloud was before this, is being treated as "infrastructure". The reason people invest in roads is not to make money from the road itself, but rather to recoup all the losses from the new extremely profitable businesses that will come on top of this infrastructure. To the stock ticker, this looks like a boom+bust cycle. But really, it's a bust+boom cycle as far as the investors are concerned. Saudi Arabia doesn't give a hoot about transformers, but they know if they invest their large amounts of capital in the new infrastructure - huge models in expensive datacenters using cities' worth of power - they can invest in the many new hyper-profitable businesses on top. Also helps that it locks in demand for oil, I guess :P
A similar example for cloud would be the gajillion cloud SaaS, DevOps, cloud security whatever businesses that only exist today because the whole cloud infra segment ran unprofitable for a long time and created the infrastructure for all this.
These new businesses will not be multi hundred-billion dollar businesses, no. They will all be million dollar businesses. But you'll have a million such businesses. Everyone that's stuffing money into AI today is hoping that there will be a huge product layer [1] on top of this new infrastructure once it's all consolidated at a few major players allowing "infra" costs to drop considerably, which they can milk.
The stock of all these companies going high is just wall st making it easy for people to buy debt from this infrastructure. If they're indisciplined, they will co-mingle this debt with normal people's daily lives debt and sell it as one, and everything gets affected badly. If they're not, then the stock market will complain, but normal folks will be kind of OK.
[1] well, everyone has this same idea, meaning there are also a lot of short-term investors trying to make gains while openai/nvidia/etc are on their way up, sort of like greater-fool investing, but let's ignore them for the purposes of this argument.
[Note] of course, whether cloud/ai/whatever are actually useful infra that deserves to be force-created is up for debate.. many disagree, not me though.
You're making a huge upfront unprofitable investment into something, so that a lot of insanely profitable investments can be made 10 years in the future, that uses the result of todays investment as infrastructure.
Whether it's as important as roads, all of that is not relevant from an economic standpoint -- I'm just explaining the rationale behind the investments today.
> These companies
It's useful to mentally think of it as an implicit "team effort" among all relevant, rich companies. Openai (just an example) themselves are being "allowed" by everyone else in the "team" to not make money, because everyone knows that at the end there will be a gajillion new product companies they can all make their money back on. Openai/stargate is in that sense just a "front" for this massive infrastructure investment. Sama/Microsoft themselves will make money that way, either by building those products (just think of the ai integrations in MS enterprise in 10 years...) or by investing in others building those products. Defense folks have investments into this for the same reason.
That's why I said it's similar to roads, no one expects to make money from roads. But on those nice roads amazon will create a 1-day delivery product, charge you $10 a month for it, and investors will make money from that [1].
[1] yes yes this specific example is bunk and historically wrong, but wanted to drive home the parallel with a small example
I've been exclusively using the Claude Sonnet 4 model in VSCode, and so far I've used 90% of the premium quota by the end of the month. I can always use GPT4.1 or GPT5-mini for free if need be.
I'm curious if they have plans to improve edit prediction though. It's honestly kind of garbage compared to Cursor, and I don't think I'm being hyperbolic by calling it garbage. Most of the time its suggestions aren't helpful, but the 10-20% of the time it is helpful is worth the cost of the subscription for me.
It obsoleted making Vim macros and multiline editing for example. Now you just make one change and the LLM can derive the rest; you just press tab.
It's interesting that the Cursor team's first iteration is still better than anything I've seen in their competitors. It's been an amazing moat for a year(?) now.
Copilot is the best value by far
It's based on qwen 2.5 coder 7B and you should be able to run it locally quite easily since it would only require about 8 GB of VRAM for the 8-bit version. Not sure if Zed supports this though, I'm not a Zed user myself.
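The 8 GB claim is easy to check with back-of-envelope arithmetic (the 1 GB overhead allowance below is my own rough assumption for KV cache and activations, not a figure from the comment):

```python
# Back-of-envelope VRAM estimate for an 8-bit quantized 7B model.
params = 7e9
bytes_per_param = 1                           # 8-bit quantization: 1 byte/weight
weights_gb = params * bytes_per_param / 1e9   # ~7 GB just for the weights
overhead_gb = 1.0                             # rough allowance: KV cache, activations
print(round(weights_gb + overhead_gb, 1))     # 8.0
```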
The wording may sound AI generated but the gist of the comment is my true opinion
Why: LLMs are increasingly becoming multimodal, so an image "token" or video "token" is not as simple as a text token. Also, it's difficult to compare across competitors because tokenization is different.
Eventually prices will just be in $/Mb of data processed. Just like bandwidth. I'm surprised this hasn't already happened.
For autoregressive token-based multimodal models, image tokens are as straightforward as text tokens, and there is no reason video tokens wouldn't also be. (If models switch architectures and, say, multimodal diffusion models become more common, then, sure, a different pricing model tied more closely to that architecture's actual compute cost drivers is likely, but... even that isn't likely to be bytes.)
> Also, it's difficult to compare across competitors because tokenization is different.
That’s a reason for incumbents to prefer not to switch, though, not a reason for them to switch.
> Eventually prices will just be in $/Mb of data processed.
More likely they would be in floating-point operations expended processing them, but using tokens (which are the primary drivers for the current LLM architectures) will probably continue as long as the architecture itself is dominant.
In classical computing, there is a clear hierarchy: text < images <<< video.
Is there a reason why video computing using LLMs shouldn't be much more intensive and therefore costly than text or image output?
For text I just assume them to be word stems or, more like, word-family members (cat-feline-etc).
For images and videos I guess each character, creature, idea in it is a token? Blue sky, cat walking around, gentleman with a top hat, multiplied by the number of frames?
No, for images, tokens would, I expect, usually be asymptotically proportional to the area of the image (this is certainly the case with input tokens for OpenAI's models that take image inputs; outputs are more opaque); you probably won't have a neat one-to-one intuition for what one token represents, but you don't need that for it to be useful and straightforward for understanding pricing, since the mathematical relationship of tokens to size can be published and the size of the image is a known quantity. (And videos conceptually could be like images with an additional dimension.)
For images, you take patches of the image (say 16x16 patches[1]) and then directly pass them into the FFN+transformer machinery[2]. As such, there is no vocabulary of tokens for images[3]. So the billing happens per image patch, i.e., for large images, your cost will go up[2], since they have more px*py patches.
[1] x 3, due to RGB
[2] Up to a point; beyond a certain resolution it gets downsampled to lower quality. The downsampling happens in many ways... Qwen-VL uses a CNN, GPT iirc stuffs a downsampler after the embedding layer... as well as before. Anyway, they usually just take the average reduction from that downsampler and cut your billed tokens by that much in all these cases. OpenAI's bin-based billing is like this.
[3] Dall-E from way back when did have a discrete set of tokens and it mapped all patches of all images in the world to one from that, IIRC.
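The patch arithmetic described above is simple to sketch. This assumes one token per 16x16 patch with partial patches rounded up, which is an illustrative simplification; as noted, real providers downsample large images past some resolution cap:

```python
# Hypothetical image-token count: one token per 16x16 pixel patch,
# rounding partial patches up (ceiling division). Ignores the
# downsampling that real providers apply above a resolution cap.
import math

def image_tokens(width, height, patch=16):
    return math.ceil(width / patch) * math.ceil(height / patch)

print(image_tokens(512, 512))    # 1024 patches
print(image_tokens(1920, 1080))  # cost grows with area, not byte size
```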
There is very little value that a company that has to support multiple different providers, such as Cursor, can offer on top of tailored agents (and "unlimited" subscription models) by LLM providers.
A tool that is good for everyone is great for no one.
Also, I think we're seeing the limits on "value" of a chat interface already. Now they're all chasing developers since there's a real potential to improve productivity (or sadly cut-costs) there. But even that is proving difficult.
I am really tempted to buy ChatGPT Pro, and probably would have if I lived in a richer country (unfortunately purchasing power parity doesn't equalize for tech products). The problem with Windsurf (and presumably Cursor and others) is that you buy the IDE subscription and then still have to worry about usage costs. With Codex/Claude Code etc., yeah, it's expensive, but, as long as you're within the usage limits, which are hopefully reasonable for the most expensive plans, you don't have to worry about it. AND you get the web and phone apps with GPT 5 Pro, etc.
I get that the pro plan has $5 of tokens and the pricing page says that a token is roughly 3-4 characters. However, it is not clear:
- Are tokens input characters, output characters, or both?
- What does a token cost? I get that the pricing page says it varies by model and is "API list price +10%", but nowhere does it say what these API list prices are. Am I meant to go to the OpenAI, Anthropic, and other websites to get that pricing information? Shouldn't that be in a table on that page with each hosted model listed?
—
I’m only a very casual user of AI tools so maybe this is clear to people deep in this world, but it’s not clear to me just based on Zed's pricing page exactly how far $5 per month will get me.
It’s hard for me to conceptualise what a million tokens actually looks like, but I don’t think there’s a way around that aside from providing some concrete examples of inputs, outputs, and the number of tokens that actually is. I guess it would become clearer after using it a bit.
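One way to get a feel for what $5 buys: plug in a hypothetical API list price and the 10% markup the pricing page mentions. Every number here except the markup and the ~4-characters-per-token rule of thumb is an assumption, not Zed's actual pricing:

```python
# Purely illustrative: assume a hypothetical API list price of
# $3 per million input tokens and $15 per million output tokens,
# the stated 10% markup, and ~4 characters per token.
budget = 5.00
markup = 1.10
blended_per_m = (3.00 + 15.00) / 2 * markup   # crude 50/50 input/output blend

tokens = budget / blended_per_m * 1e6
chars = tokens * 4
print(f"~{tokens / 1e6:.2f}M tokens, ~{chars / 1e6:.1f}M characters")
```

Under these made-up numbers, $5 comes out to roughly half a million tokens, which is exactly the kind of concrete worked example the commenter is asking Zed's pricing page to provide.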
I think Zed had a lot of good concepts where they could make paid AI benefits optional longer term. I like that you can join your devs to look at different code files and discuss them. I might still pay for Zed's subscription in order to support them long term regardless.
I'm still upset so many hosted models don't just let you use your subscription in things like Zed or JetBrains AI. What's the point of a monthly subscription if I can only use your LLM in a browser?
This is yet another reason why CLI-based coding agents will win. Every editor out there trying to be the middleman between you and an AI provider is nuts.
https://agentclientprotocol.com/overview/introduction
I'm pretty sure that's only while it's in preview, just like they were giving away model access before that was formally launched. Get it while it's hot.
As a corporate purchaser, "bring your own key" is just about the only way we can allow our employees to stay close to the latest happenings in a rapidly moving corner of the industry.
We need to have a decent amount of trust in the model execution environment and we don't like having tons of variable-cost subscriptions. We have that trust in our corporate-managed OpenAI tenant and have good governance and budget controls there, so BYOK lets us have flexibility to put different frontends in front of our trusted execution environment for different use cases.
Just the other day I tried using it for something it sort of advertises itself as being superior at: loading a giant text file instantly and letting me work on it.
I then tried opening this 1 GB text file to do a simple find/replace on it, only to find macOS ran out of system memory, with Zed quickly using 20 GB of memory for that search operation.
I then switched to vscode, which, granted, opened it in a buffered sort of way with limited capability, but got the job done.
Maybe that was a me issue I don’t know, but aside from this one-off, it doesn’t have a good extensions support in the community for my needs yet. I hope it gets there!
TextEdit may be worth looking into as well? Haven’t tested it for large files before.
In my experience, BBEdit will open files that kill other editors: "Handling large files presents no intrinsic problems for BBEdit, though some specific operations may be limited when dealing with files over 2GB in size."
But, you can go faster depending on your usecase:
- If you're trying to manually look through the file, use `less`. You can scroll up and down, go quickly to the top and bottom of the file, and also search the file for strings quickly
- If you already know the string in the file that you're looking for, use ripgrep
- If you're trying to do a search and replace, and you already know what the strings are, use sed. (macOS's built-in sed isn't good, so install GNU sed through Homebrew; it's available as `gsed`)
In fact, I believe that is what vscode uses.
Apparently there’s been a neglected issue about bringing it to Zed. https://github.com/zed-industries/zed/issues/4560
Should be noted that the linked post is almost 15 years old at this point too, so perhaps not the most up to date either.
You can actually disable it in the settings if you want it to try and render the entire thing at once
My JetBrains IDEs (RustRover, Goland) probably would have choked out too.
You can open the kernel in CLion. Don't expect the advanced refactoring features to work, but it can deal with a ~40-million-line project folder, for example.
They'll index for a long time on huge codebases, but I only go through that like once a month max, I just have the editors as always open
I sometimes worry if we are moving too fast for no reason. Some things are becoming standards in an organic way but they feel suboptimal in my own little bias bubble corner.
Maybe I am getting old and struggling to adapt to the new generation way of getting work done, but I have a gut feeling that we need to revisit some of this stuff more deliberately.
I still see Agents as something that will be more like a background thread that yields rather than a first class citizen inside the Editor you observe as it goes.
I don't know about you, but I feel an existential dread whenever I prompt an Agent and turn into a vegetable watching it breathe. — am I using it wrong? Should I be leaving and coming back later? Should I pick a different file and task while it's doing its thing?
My complaints:
No Claude Code hooks support as of this writing. As someone who leverages hooks somewhat heavily, this is why I don't really use it all the time. Though I actually find it to be somewhat of a feature at times, because I can simply run the thing through Zed if I want to temporarily run with no hooks.
Performance is noticeably degraded, presumably because of the “ACP” protocol they invented. I usually work either directly in the Claude Code terminal using its edit tools, or using repoprompt’s MCP editor tools, and both are noticeably faster than running in Zed.
What seems to be a memory leak in the agent window causes sluggish performance, especially when scrolling. It’s not bad enough to make it unusable, but for an editor whose main advertisement is speed it feels particularly painful.
https://github.com/zed-industries/zed/issues/11676
https://github.com/zed-industries/zed/issues/7992
https://github.com/zed-industries/zed/issues/4334
Font rendering should be the most important feature of a text editor.
https://zed.dev/about
A downside that comes with the territory of building the rendering pipeline from scratch is needing to work through the long tail of complex and tradeoff-heavy font rendering issues on different displays, operating systems, drivers, etc.
I know it's taking a while to get through, but I agree it's important!
Every other piece of software in my toolbox is open-source. The scenarios I've described happened to some of those tools, and I maintain my own forks. Currently, Sublime is the single point of failure in my toolbox.
I would buy a source code license if I could.
Which ones?
https://zed.dev/releases/stable
(And that's even with almost none of the work on the massive Windows project being included in the changelog!)
The way I see it, we’re sort of living in a world where UX is king. (Looking at you Cursor)
I feel like there’s a general sentiment where folks just want a sense of home with their tools more than anything. Yes they need to work, but they also need to work for you in your way. Cursor reinvented autocomplete with AI and that felt like home for most, what’s next? I see so much focus on Agents but to me personally that feels more like it should live on the CI/CD layer of things. Editors are built for humans, something isn’t quite there yet, excited to see how it unfolds.
Saying that, token-based pricing has misaligned incentives as well: as the editor developer (charging a margin over the number of tokens) or AI provider, you benefit from more verbose input fed to the LLMs and of course more verbose output from the LLMs.
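The misaligned incentive above fits in a few lines of arithmetic. The API price and markup here are hypothetical placeholders, not Zed's actual figures:

```python
# With a fixed percentage markup over API cost, the middleman's margin
# scales directly with token volume, so verbosity is revenue.
def middleman_margin(tokens_millions, api_price_per_m=10.00, markup=0.10):
    """Dollars the intermediary keeps on top of the pass-through API cost."""
    return tokens_millions * api_price_per_m * markup

print(middleman_margin(1))   # margin on 1M tokens
print(middleman_margin(3))   # 3x the tokens, 3x the margin
```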
Not that I'm really surprised by the announcement though, it was somewhat obviously unsustainable
Presumably everyone is just aiming or hoping for inference costs to go down so much that they can do an unlimited-with-ToS model, like most home Internet access etc., because this intermediate phase of having to count your pennies to ask the matrix multiplier questions isn't going to be very enjoyable or stable, or encourage good companies to succeed.
https://news.ycombinator.com/item?id=45333425
I suspect I’m not alone on this. Zed is not the editor for hardcore agentic editing and that’s fine. I will probably save money on this transition while continuing to support this great editor for what it truly shines at: editing source code.
But as a mostly claude max + zed user happy to see my costs go down.
37 more comments available on Hacker News