Claude Code on the Web
Posted 3 months ago · Active 3 months ago
anthropic.com · Tech · story · High profile
Sentiment: excited / mixed
Debate: 60/100
Key topics
AI Coding Assistants
Claude Code
Software Development
Anthropic released Claude Code on the web, a web-based interface for their AI coding assistant, sparking discussion among developers about its features, limitations, and potential impact on their workflow.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 23m
Peak period: 119 comments (0-6h)
Avg / period: 22.9
Comment distribution: 160 data points
Based on 160 loaded comments
Key moments
- 01 Story posted: Oct 20, 2025 at 2:12 PM EDT (3 months ago)
- 02 First comment: Oct 20, 2025 at 2:35 PM EDT (23m after posting)
- 03 Peak activity: 119 comments in 0-6h (hottest window of the conversation)
- 04 Latest activity: Oct 22, 2025 at 11:53 PM EDT (3 months ago)
ID: 45647166 · Type: story · Last synced: 11/22/2025, 11:00:32 PM
Want the full context?
Jump to the original sources
Read the primary article or dive into the live Hacker News thread when you're ready.
I wonder if a shared Claude Code instance has the same effect?
Doesn't CC sometimes take twenty, thirty minutes to return an attempt? I wouldn't know, because I'm not rich and my employer has decided CC is too expensive, but I wonder what you would do with your pair programming partner while you wait.
The bosses would like to think we'd start working on something else, maybe start up a different Claude instance, but can you really change contexts and back before the first one is done? You AND your partner?
Nah, just go play air hockey until your boss realizes Claude is what they need, not you.
This is a depressing comment.
I am apprehensive about the future of software development in this milieu. I've pumped out a ~15,000 line application heavily utilizing Claude Code over a few days that seems to work, but I don't know how much to trust it.
Certainly part of the fun of building something was missing during that project, but it was still fun to see something new come to life.
Maybe I should say I am cautiously optimistic but also concerned: I don't feel confident in the best ways to use these tools to build good software, and I'm not sure exactly what skills are useful in order to get them there.
Can I ask what you built?
There was a post recently where someone linked to: https://simplyexplained.com/blog/how-i-built-an-nfc-movie-li...
and I thought the project was amazing, but I didn't like how the IDs were managed in yml, so I built this to make it more dynamic. I plan to add support for other smart home automations with it as well as more streaming services.
One of the features I really like about it is it makes it easy to print and cut out stickers to slap on the NFC cards for playing media.
My toddler loves it so far, and one of his friends has asked me to make one for him as well.
In fact, Apple has historically made it harder for apps to take payments from their users than other platforms have.
https://gs.statcounter.com/os-market-share/mobile/worldwide
So maybe they'd rather please the home market, I guess.
[1]: https://www.techtarget.com/searchcio/definition/Instagram
Android is simply a much worse platform to make money on. Users spend <25% as much as iOS users. Why would they prioritize that?
It is like trying to make a living selling games to macOS users.
Why would they care about prioritizing users who spend much less? Android pays <25% per user. You need a LOT more than 70% to make that worth prioritizing. Those users are just going to eat up free tier resources without paying. It's borderline parasitic from a business perspective.
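For what it's worth, the arithmetic behind that claim holds up with invented round numbers: if Android users spend 25% as much per head, a 70% share of users yields well under half the revenue. A quick sketch:

```python
# Back-of-envelope check with invented round numbers (not sourced figures):
# assume an Android user spends 25% as much as an iOS user.
android_share, ios_share = 0.70, 0.30
android_spend_per_user, ios_spend_per_user = 0.25, 1.00

android_revenue = android_share * android_spend_per_user  # 0.175
ios_revenue = ios_share * ios_spend_per_user              # 0.300

# 70% of the users, but only ~37% of the revenue.
print(android_revenue / (android_revenue + ios_revenue))  # ~0.368
```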
Android users are more likely to be useful for spreading word-of-mouth reputation to Apple platform users, than they are as direct spenders. Just another reason to ensure Apple platform features don't trail Android.
Globally in dollars spent, not human heads. iOS is over 2x larger than Android globally, and the gap is widening year over year.
iOS spending growth outpaces Android's, which even shrank during COVID while iOS spending continued to grow.
https://api.backlinko.com/app/uploads/2024/03/iphone-vs-andr...
Anthropic makes money off product sales, not ad revenue, so wallets count more than eyes for this. Free users who are less than 25% as likely to spend are a burden not to be prioritized for a product business with free tier access. They need to spend much more to get a paying user on Android.
If Android were the bigger market, they'd prioritize it
https://www.reddit.com/r/applesucks/comments/1k6m2fi/why_do_...
- Claude Code has been added to iOS
- Claude Code on the Web allows for seamless switching to Claude Code CLI
- They have open sourced an OS-native sandboxing system which limits file system and network access _without_ needing containers
However, I find the emphasis on limiting outbound network access somewhat puzzling, because the allowlists invariably include domains like gist.github.com and dozens of others which effectively act as public CMSes and would still permit exfiltration with just a bit of extra effort.
Using a proxy through the `HTTP_PROXY` or `HTTPS_PROXY` environment variables has its own issues. It relies on the application respecting those variables; if it doesn't, the connection will simply fail. Sure, in this case you are somewhat protected, since all other network connection requests are dropped, but then an application that doesn't respect them just won't work.
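To make the failure mode concrete, here is a minimal Python sketch of the env-var model; the proxy address is a made-up placeholder, not a documented endpoint. A client that honors the variables is funneled through the allow-listing proxy, while one that ignores them simply can't connect, since all other egress is dropped:

```python
import os
import urllib.request

# Made-up local proxy address for illustration; not a documented endpoint.
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:15001"

# urllib honors HTTP_PROXY/HTTPS_PROXY by default, so this request is
# routed through the proxy, which can then enforce a domain allow-list.
req = urllib.request.Request("https://api.anthropic.com/", method="HEAD")
try:
    urllib.request.urlopen(req, timeout=10)
    print("reached via proxy")
except OSError as exc:
    # A client that ignores the env vars (or a denied domain) lands here,
    # because direct egress is blocked at the firewall.
    print(f"blocked or unreachable: {exc}")
```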
You can also have some fun with `DYLD_INSERT_LIBRARIES`, but that often requires creating shims to make it work with codesigned binaries
[0] https://github.com/slopus/happy/
Just to be clear, I'm excited for the capability to use Claude Code entirely within the browser. However, I've heard reports of Max users experiencing throttled usage limits in recent months, and am concerned as to whether this will exacerbate that issue or not.
EDIT: I had meant "defer", which is the first time I've made a /r/boneappletea in a while.
I'm using Claude Code locally a lot, occasionally with a couple of parallel sessions.
I was very happy when they made the GitHub Action - I used it quite a bit, but in practice I got frustrated that I effectively only get a single back-and-forth out of it, I can't really "continue the conversation without losing context" - Sure, I can respond to it in the PR it makes, but that will be a fresh session with a fresh empty context.
So, as much as I don't like moving out of my standard development workflow with my tools, I think this could be quite useful. The ability to interrupt and/or continue a conversation should be very nice.
My main worry is - usually my unit tests and integration tests rely on a postgres database running on the machine, and it's not obvious to me if I can spin that up here?
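For what it's worth, if the sandbox image ships the Postgres binaries, a throwaway cluster can be started without containers; whether those binaries are actually available there is an assumption on my part, not a documented feature. A minimal sketch:

```python
import os
import subprocess
import tempfile

def start_throwaway_postgres(port: int = 54329) -> str:
    """Spin up a disposable Postgres cluster for tests, no Docker needed.
    Assumes initdb/pg_ctl are on PATH, which may not hold in the sandbox."""
    datadir = tempfile.mkdtemp(prefix="pgtest-")
    # Fresh cluster with trust auth: acceptable for throwaway test data only.
    subprocess.run(["initdb", "-D", datadir, "-A", "trust"], check=True)
    # Non-default port; keep the unix socket inside the data directory.
    subprocess.run(
        ["pg_ctl", "-D", datadir, "-l", os.path.join(datadir, "log"),
         "-o", f"-p {port} -k {datadir}", "start"],
        check=True,
    )
    return f"postgresql://localhost:{port}/postgres"
```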
With Happy, I managed to turn one of these Claude Code instances into a replacement for Claude that has all the MCP goodness I could ever want and more.
[0]: https://happy.engineering/
Without a 200% price increase within 3 years, there's no way any of these AI companies will survive.
It's really solid. It's effectively a web (and native mobile) UI over Claude Code CLI, more specifically "claude --dangerously-skip-permissions".
Anthropic have recognized that Claude Code where you don't have to approve every step is massively more productive and interesting than the default, so it's worth investing a lot of resources in sandboxing.
Here's an example from this morning, getting CUDA working on a NVIDIA Spark: https://simonwillison.net/2025/Oct/20/deepseek-ocr-claude-co...
I have a few more in https://github.com/simonw/research
I write code on my phone a lot using ChatGPT voice mode though!
I predict individual offices[1] will be more popular as a choice for startups.
[0]: https://karabiner-elements.pqrs.org
[1]: https://queue.acm.org/detail.cfm?id=1281887
So increasingly I let it run until it stops, then give it a proper review, and let it run until it stops again. It wastes far less of my time and finishes new code much faster, at least for the things I've made it do.
Their "restricted network access" setting looks questionable to me - it allow-lists a LOT of stuff: https://docs.claude.com/en/docs/claude-code/claude-code-on-t...
If you configure your own allow-list you can restrict to just domains that you trust - which is enforced by a separate HTTP/HTTPS proxy, described here: https://docs.claude.com/en/docs/claude-code/claude-code-on-t...
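The matching such a proxy performs is conceptually simple. A toy sketch, with allow-list entries invented for the example:

```python
# Toy domain allow-list check; the entries are invented for illustration.
ALLOWED = {"api.anthropic.com", "github.com", "pypi.org"}

def is_allowed(host: str) -> bool:
    host = host.lower().rstrip(".")
    # Allow exact matches and subdomains of allow-listed domains.
    return any(host == d or host.endswith("." + d) for d in ALLOWED)

assert is_allowed("pypi.org")
assert is_allowed("files.pypi.org")
assert not is_allowed("evilpypi.org")  # suffix tricks don't match
```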
You use firewalls to prevent code running inside the container from opening network connections to anywhere else. The harness that surrounds it can still be made accessible via the network.
I tend to agree. There’s an opportunity to make it easy to have Claude be able to test out workflows/software within Debian, RPM, Windows, etc… container and VM sandboxes. This could be helpful for users that want to release code on multiple platforms and help their own training and testing, which they seem to be heavily invested in given all the “How Am I doing?” prompts we’re getting.
I don't have any relationship with any AI company, and honestly I was rooting for Anthropic, but Codex CLI is just way way better.
Also Codex CLI is cheaper than Claude Code.
I think Anthropic are going to have to somehow leapfrog OpenAI to regain the position they were in around June of this year. But right now they're being handed their hat.
Kinda sick of Codex asking for approval to run tests for each test instance
https://zed.dev/blog/codex-is-live-in-zed
https://xenodium.com/introducing-acpel
Also while not quite as smart, it's a better pair programmer. If I'm feeling out a new feature and am not sure how exactly it should work yet, I prefer to work with Sonnet 4.5 on it. It typically gives me more practical and realistic suggestions for my codebase. I've noticed that GPT-5 can jump right into very sophisticated solutions that, while correct, are probably not appropriate.
Sonnet 4.5: "Why don't we just poll at an interval with exponential backoff?"
GPT-5: "The correct solution is to include the data in the event stream...let us begin by refactoring the event system to support this..."
That said, if I do want to refactor the event system, I definitely want to use Codex for that.
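For context, the "just poll" approach Sonnet suggests really is about ten lines; a minimal sketch, where fetch is any hypothetical status check:

```python
import random
import time

def poll_with_backoff(fetch, initial=1.0, factor=2.0, cap=60.0):
    """Call fetch() until it returns a non-None result, waiting
    exponentially longer (with jitter) between empty polls."""
    delay = initial
    while True:
        result = fetch()
        if result is not None:
            return result
        # Jitter spreads out pollers so they don't hit the server in lockstep.
        time.sleep(delay * (0.5 + random.random()))
        delay = min(delay * factor, cap)
```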
Moving a bunch of verbose templated HTML around while watching results on a devserver? Haiku all day. It's a bonus that it's cheaper, but the real treat is its speed.
Adding a feature whose planning will involve intake of several files? Sonnet.
Working specifically on 'copy' or taste issues? Still I tend to prefer Opus here.
Individual experiences may vary!
Sonnet 4 was a coding companion: I could see what it was doing, and it did what I asked.
Sonnet 4.5 is like Opus: it generates massive amounts of "helper scripts" and "bootstrap scripts" and all kinds of useless markdown documentation files, even for the tiniest PoC scripts.
The generation of helper, markdown, and bootstrap scripts is very dependent on your harness.
It's as if there are two vendors saying they can give you incredible superpowers for an affordable price, and only one of them actually delivers the full package. The other vendor's powers only work on Tuesdays, and when you're lucky. In that situation, in an environment as competitive as things currently stand, and given the trajectory we're on, Claude is an absolute non-starter for me. Without question.
Codex says “This is a lot of work, let me plan really well.”
Claude says “This is a lot of work, let me step back and do something completely different that you didn’t ask for.”
Edit: I'd like to reply to this comment in particular but can't in a threaded reply, so will do that here: "Ah, super secret problem domains that have been thoroughly represented in the LLM training data. Nice."
This exhibits a fundamental misunderstanding of why coding agents powered by LLMs are such a game changer.
The assumption this poster is making is that LLMs are regurgitating whole cloth after being trained on whole cloth.
This is a common mistake among lay people and non-practitioners. The reality is that LLMs have gained the ability to program, by learning from the code of others. Much like a human would learn from the code of others, and then be able to create a completely novel application.
The difference between a human programmer and an agentic coder is that the agent has much broader and deeper expertise across more programming languages, and understands more design patterns, more operating systems, more about programming history, etc., and it uses all this knowledge to fulfill the task you've set it. That's not possible for any single human.
It's important for the poster to take two realities on board. First, agentic coding agents are not regurgitating whole cloth from whole cloth; they are weaving new creations because they have learned how to program. Second, agentic coding agents have broader and deeper knowledge than any human who will ever exist, they never tire, and their mood and energy level never change; in fact, they improve continuously as the months go by and progress continues. This means we can, as individual practitioners or fast-moving teams, create things that were never before possible for us without raising huge amounts of money, hiring large, very expensive teams, and then carrying the overhead of lining everyone up behind a goal and dealing with the human issues that arise, including communication overhead.
This is a very exciting time. Especially if you're curious, energetic, and are willing to suspend disbelief to go and take a look.
The easier threading-focused approach to the conversation might be to add the additional comment as an edit at the end of the original and reply to the child https://news.ycombinator.com/item?id=45649068 directly. Of course, I've broken the ability to do that by responding to you now about it ;).
Talk is cheap, and we're tired of hearing people tell us how it's enabling them to make incredible software without actually demonstrating it. Your words might be true, or they might be just another over-exaggeration to throw on the pile. Without details we have no way of knowing, and so many make the empirically supported choice.
I recently vibe coded a video analysis pipeline with some related arduino-driven machine control. It was work to prototype an experience on some 3D printed hardware I’ve been skunking out.
By describing the pipeline and filters clearly, I had the analysis system generating useful JSON in an hour or so, including machine control simulation, all while watching TV and answering emails/Slacks. Notable misses were that the JSON fields were inconsistent, and the Python venvs were inconsistent with the piped way I wanted the system to operate.
Small fixes.
Then I wired up the hardware, and the thing absolutely crapped itself: swapping libraries, trying major structural changes, and creating two whole new copies of the machine control host code (asking me each time along the way). This went on for more than three hours, with me debugging the mess for about 20 minutes before resorting to 1) ChatGPT, which didn't help, followed by 2) a few minutes of good old-fashioned googling on serial port behavior on Mac, which, with an old Uno R3 sitting on the shelf, meant that I needed to use the cu.* ports instead of tty.*, something that Claude Code had buried deeply in a tangle of files.
Curious about the failure, I told Claude Code to stop being an idiot and use a web browser to go research the problem of specifically locking up on the open operation. 30 seconds later, and with some reflective swearing from Opus 4.1, which I appreciate, I had the code I should have had 3 hours prior (along with other garbage code to clean up).
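For anyone hitting the same wall: the fix boils down to opening the callout (cu.*) device instead of the dial-in (tty.*) one. A minimal pyserial sketch, with the device-name pattern assumed for a typical Uno R3 on macOS:

```python
import glob
import serial  # pyserial: pip install pyserial

# On macOS, open the callout (cu.*) device: the dial-in tty.* device can
# block in open() waiting for carrier detect, which looks like a lockup.
ports = glob.glob("/dev/cu.usbmodem*")  # pattern assumed for an Uno R3
with serial.Serial(ports[0], baudrate=115200, timeout=2) as conn:
    print(conn.readline())
```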
For my areas of sensing, computer vision, machine learning, etc., these systems are amazingly helpful if the algorithms can be completely and clearly described (e.g., Kalman filter to IoU, box blur followed by subsampling followed by split exponential filtering, etc.).
Attempts to let the robots work complex pipelines out for themselves haven’t gone as well for me.
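To give a sense of what "completely and clearly described" means here, one of the named stages, a box blur followed by subsampling, pins down to a few lines of NumPy. An illustrative sketch, not the commenter's code:

```python
import numpy as np

def box_blur_then_subsample(img: np.ndarray, k: int = 3, step: int = 2) -> np.ndarray:
    """k-by-k box blur (mean filter) on a grayscale image, then keep
    every step-th pixel in each dimension."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    # Sum the k*k shifted views, then normalize to a mean.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (out / (k * k))[::step, ::step]
```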
Well, that, and it’s just a bit annoying to claim that you’ve found some amazing new secret but that you refuse to share what the secret is. It doesn’t contribute to an interesting discussion whatsoever.
What honesty? We're not at the point of "the Godfather was a good/bad movie", we're at "no, trust, there's a really good movie called the Godfather".
Your honesty means nothing for an issue that isn't about taste or mostly subjectiveness. How useful AI is, and in what way, is a technical discussion, and that's where the meat of the subject matter is. You've shared nothing on that front. I'm not saying you have to, but obviously people are going to downvote you: not because they agree or disagree, but because it contributes nothing different from every other AI hype man selling a course or something.
The original post states "I am seeing Codex do much better than Claude Code", and when asked for examples, you have replied with "I don't have time to give you examples, go do it yourself, its obvious."
That is clearly going to rub folks (anyone) the wrong way. This refrain ("Where's the data?") pops up frequently on HN; if it's so obvious, giving one prompt where Codex does much better than Claude doesn't seem like a heavy lift.
In absence of such an example, or any data, folks have nothing to go on but skepticism. Replying with such a polarizing comment is bound to set folks off further.
Claude struggled for a long time and still didn't find it.
They can save you some time by doing fairly complex but routine tasks that you can describe in plain language instead of coding. To get good results you really need a lot of underlying knowledge yourself; essentially, I think of it as a translator. I can specify a program in very good detail using normal language, and then the LLM can convert it to code with reasonable accuracy.
I haven't been able to depend on it to do anything remotely advanced. They all make up API endpoints or methods or fill in data with things that simply don't exist, but that's the nature of the model.
What I was saying was a comparison of my experience with Claude Code vs. Codex with GPT-5. CC is better than Codex in my experience, contrary to GP's comment.
Claude, especially Sonnet 4.5, is a lot nicer to interact with, so it may be a better choice in cases where you are co-working with the agent. Its output is nicer, and it "improvises" really well even if you give it only vague prompts. That's valuable for interactive use.
But for delegating complete tasks, Codex is far better. The benchmarks indicate that, as do most practitioners I talk to (and it is indeed my own experience).
In my own work, I use Codex for complete end-to-end tasks, and Claude Sonnet for interactive sessions. They're actually quite different.
The output of Codex is also not as great. Codex is great at the planning and investigation portion but sucks at execution and code quality.
Then I do a double take and re-read the summary message and realize that it pulled a "and then draw the rest of the owl", seemingly arbitrarily picking and choosing what it felt like doing in that session and what it punted over to "next steps to actually get it running".
Claude is more prone to occasional "cheating" with mocked data or "tbd: make this an actual conditional instead of hardcoded If True" stuff when it gets overwhelmed which is annoying and bad. But it at least has strong task adherence for the user's prompt and doesn't make me write a lawyer-esque contract to avoid any loopholes Codex will use to avoid doing work.
Sonnet 4.5 tends to randomly hallucinate odd/inappropriate decisions and goes on to make stupid changes that have to be patched up manually.
I find that Codex generally requires me to remove code to get to what I want, whereas with Claude I tend to use what it gives me and add to it. Whether this is from additional prompting or from manual typing, I just find that Codex requires removal to get to the desired state, and Claude requires adding to get to the desired state. I prefer adding incrementally to removing.
100% of the time, Codex has done a far better job according to both Codex and Claude Code when reviewing, meeting all the requirements where Claude would leave things out, do them lazily or badly, and lose track overall.
Codex high just feels much smarter and more capable than Claude currently and even though it's quite a bit slower, it's work that I don't have to go over again and again to get it to the standards I want.
Now, I will concede that for non-coding long-horizon tasks, GPT-5 is marginally worse than Sonnet 4.5 in my own scaffolds. But GPT-5 is cheaper, and Sonnet 4.5 is about 2 months newer. However, for coding in a CLI context, GPT-5-Codex is night-and-day better. I don't know how they did it.
Also CC tool usage is so much better! Many, many times I’ve seen Codex writing a python script to edit a file which seems to bypass the diff view so you don’t really know what’s going on.
Pretty cool though! Will need to use it for some more isolated work/code edits. Claude Code is now my workhorse for a ton of stuff including non-coding work (esp. with the right MCPs)
230 more comments available on Hacker News