Key Takeaways
1. Prompt the agent
2. The agent gets to work
3. Review the changes
4. Repeat
This can speed up your process significantly, and the UI clearly shows the changes, plus it has some other cool features.
I built my own coding tool in Rust because of the Electron-app flaws of the VSCode-based forks. You would think they could build a memory-efficient application with the resources they have, but it literally needs a top-of-the-line MacBook to run smoothly.
I'd heard of Cline and Aider, but didn't try either.
The days of copying and pasting from Stack Overflow will not be forgotten — they will be honored by our laid off forefathers.
2) Feed it to Opus 4.5 in planning mode, ask it to follow up with any questions or gaps, and have it produce a final implementation plan.md. Review this, tweak it, remove any fluff, and get it down to the bare bones.
3) Run the plan.md through a fresh agentic session and see what the output is like. Where it's not quite correct, add those clarifications and guardrails into the original plan.md and go again with step 3.
What I absolutely would NOT do is ask for fixes or changes if it does not one-shot it after the first go. I would revise plan.md to get it into a state where it gets you 99% of the way there in the first go and just do final cleanup by hand. You will bang your head against the wall attempting to guide it like you would a junior developer (at least for something like this).
I very often, when reviewing code, think of better abstractions or enhancements and just continue asking for refactors inline. Very very rarely does the model fall off the rails.
I suppose if your unit of work were very large you might have more issues? Generally, though, large units of work have other issues as well.
I tend to create a very high level plan, then code systems, then document the resulting structure if I need documentation.
This works well for very iterative development where I'm changing contracts as I realize the weak point of the current setup.
For example, I was using inheritance for specialized payloads in a pipeline, then realized that if I wanted to attach policies/behaviours to them as they flow through the pipeline, I was better off changing the whole thing to a payload with a bag of attached aspects.
Often those designs are not obvious when making the initial architectural plan. So I approach development using AI in much the same way: Generate code, review, think, request revision, repeat.
This really only applies when establishing the architecture, though, which is generally the hardest part. Once you have an example, you can mostly one-shot new instances or minor enhancements.
Things I personally find work well:
1. Chat through the feature you want to build with the AI first. In Codex using VS Code I always switch to chat mode, talk through what I am trying to achieve, and then, once the AI and I are in "agreement", switch to agent mode. Google's Antigravity sort of does this by default, and I think it's probably the correct paradigm to use.
2. Get the basics right first. It's easy for the AI to produce a load of slop, but using my experience of development I feel I am (sort of) able to guide the AI in advance in a similar way to how I would coach junior developers.
3. Get the AI to write tests first. BDD seems to work really well for AI. The multiplayer game I was building seemed to regress frequently with unit tests alone, but when I threw Cucumber into the mix things suddenly got a lot more stable (see the sketch after this list).
4. Practice: the more I use AI, the more I believe prompting is a skill in itself. It takes time to learn how to get the best out of an agent.
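To make the Cucumber/BDD point concrete, here is a minimal sketch of what such a test could look like in Python with pytest-bdd; the game, scenario, and file names are hypothetical illustrations, not the commenter's actual setup. The idea is to have the agent write (and keep passing) scenarios like this before it touches the implementation.

```python
# tests/test_scoring.py -- hypothetical pytest-bdd sketch.
# The matching feature file, tests/features/scoring.feature, would contain:
#
#   Feature: Scoring
#     Scenario: Player collects a coin
#       Given a player with score 0
#       When the player collects a coin
#       Then the player's score is 10
#
from pytest_bdd import scenario, given, when, then, parsers


@scenario("features/scoring.feature", "Player collects a coin")
def test_collect_coin():
    """Binds the Gherkin scenario above to the step definitions below."""


@given("a player with score 0", target_fixture="player")
def player():
    return {"score": 0}


@when("the player collects a coin")
def collect_coin(player):
    player["score"] += 10


@then(parsers.parse("the player's score is {expected:d}"))
def check_score(player, expected):
    assert player["score"] == expected
```

Because the scenarios read as plain English, they double as a spec the agent can be held to across sessions, which may be part of why the regressions stopped.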
What I love about AI is the time it gives me to create these things. I'd never been able to do this before and I find it very rewarding seeing my "work" being used by my kids and fellow nostalgia driven gamers.
This would have been my tip, as well.
Talk to others who are good with these tools to learn from what they're doing and read blogs/docs/HN for ideas, but most importantly, make time for yourself on a daily/weekly/monthly/whatever basis to practice with the tool.
It's taken me about a year of consistent practice to feel comfortable with LLM coding. It just takes time, like learning any other technology.
I program mostly in VBA these days (a little problematic, as it has been a dead language since 2006 and even then it was niche), and I have never received a correct high-level "main" sub, but the AIs are pretty good at doing small subs that I then organize.
And yes, they are pretty good at telling me where I make errors.
At the end of the day I want reliability, and there is no way I can get that without full review.
The funny thing is that they try to use the "best practices" of coding where you would reasonably want NOT to have them.
The more specific and concise you are, the easier it will be for the searcher. Also, the less modification, the better, because the more you try to move away from the data in the training set, the higher the probability of errors.
I would do it like this:
1. Open the project in Zed
2. Add the Gemini CLI, Qwen Code, or Claude to the agent system (use Gemini or Qwen if you want to do it for free, or Claude if you want to pay for it)
3. Ask it to correct a file (if the files are huge, it might be better to split them first)
4. Test whether it works
5. If not, try feeding the file and the request to Grok or Gemini 3 chat
6. If nothing works, do it manually
If instead you want to start something new, one-shot prompting can work pretty well, even for large tasks, if the data is in the training set. Ultimately, I see LLMs as a way to legally copy the code of other coders more than anything else
You can also ask it, "do you have any questions?" I find that saying "if you have any questions, ask me, otherwise go ahead and build this" rarely produces questions for me. However, if I say "Make a plan and ask me any questions you may have" then it usually has a few questions
I've also found a lot of success when I tell Claude Code to emulate some specific piece of code I've previously written, either within the same project or something I've pasted in.
I use a keyboard shortcut to start and stop recording and it will put the transcription into the clipboard so I can paste into any app.
It's a huge productivity boost - OP is correct about not overthinking trying to be that coherent - the models are very good at knowing what you mean (Opus 4.5 with Claude Code in my case)
I am using Whisper Medium. The only problem I see is that at the end of the message it sometimes puts a bye or a thank you which is kind of annoying.
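For anyone who wants to wire this up themselves, here is a rough sketch of the record, transcribe, and copy-to-clipboard loop in Python, assuming openai-whisper, sounddevice, and pyperclip. The fixed recording duration and all names are illustrative, not either commenter's actual setup; a real start/stop hotkey needs a bit more plumbing.

```python
# Minimal record -> transcribe -> clipboard sketch (illustrative only).
# Assumes: pip install openai-whisper sounddevice pyperclip
import sounddevice as sd
import whisper
import pyperclip

SAMPLE_RATE = 16000  # Whisper expects 16 kHz mono audio


def record(seconds: float):
    """Record from the default microphone for a fixed duration."""
    audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="float32")
    sd.wait()  # block until the recording finishes
    return audio.flatten()


def transcribe_to_clipboard(seconds: float = 15.0) -> str:
    model = whisper.load_model("medium")  # the model size mentioned above
    text = model.transcribe(record(seconds), fp16=False)["text"].strip()
    pyperclip.copy(text)  # ready to paste into any app
    return text


if __name__ == "__main__":
    print(transcribe_to_clipboard())
```

The stray "thank you" or "bye" at the end is a known Whisper quirk on near-silent audio; trimming trailing silence before transcribing, or stripping a short list of known filler phrases from the end of the text, usually deals with it.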
Also, I haven't tried it, but on the latest macOS 26 Apple updated their STT models, so the built-in voice dictation may be good enough.
If you want local transcription, locally running models aren't quite good enough yet.
Superwhisper offers some AI post-processing of the text (e.g., making nice bullets or fixing grammar), but this doesn't seem necessary and just makes things a bit slower.
It's incredibly cheap and works reliably for me.
I have got it to paste my voice transcriptions into Chrome (Gemini, Claude, ChatGPT) as well as Cursor.
My main gripe is when the recording window loses focus, I haven't found a way to bring it back and continue the recorded session. So occasionally I have to start from scratch, which is particularly annoying if it happens during a long-winded brain dump.
Claude on macOS and iOS has native voice-to-text transcription. I haven't tried it, but since you can access Claude Code from the apps now, I wonder if you can use the Claude app's transcription for input into Claude Code.
Yeah, Claude/ChatGPT/Gemini all offer this, although Gemini's is basically unusable because it will immediately send the message if you stop talking for a few seconds
I imagine you totally could use the app transcript and paste it in, but keeping the friction to an absolute minimum (e.g., just needing to press one hotkey) feels nice
This doesn't feel relatable at all to me. If my writing speed is bottlenecked by thinking about what I'm writing, and my talking speed is significantly faster, that just means I've removed the bottleneck by not thinking about what I'm saying.
GRRM: How do you write so many books?... Don't you ever spend hours staring at the page, agonizing over which of two words to use, and asking 'am I actually any good at this?'
SK: Of course! But not when I'm writing.
And your 5-minute prompt just turned into half an hour of typing.
With voice you get on with it, and then start iterating, getting Claude to plan with you.
Not been impressed with agentic coding myself so far, but I did notice that using voice works a lot better imo, keeping me focused on getting on with letting the agent do the work.
I've also found it good for stopping me doing the same thing in Slack messages. I ramble my general essay to ChatGPT/Claude and get them to summarize and rewrite it as a few lines in my own voice. It stops me spending an hour crafting a Slack message and tends to soften it.
In principle I don't see why they should have different amounts of thought. That'd be bounded by how much time it takes to produce the message, I think. Typing permits backtracking via editing, but speaking permits 'semantic backtracking' which isn't equivalent but definitely can do similar things. Language is powerful.
And importantly, to backtrack in visual media I tend to need to re-saccade through the text with physical eye motions, whereas with audio my brain just has an internal buffer I can work through at the speed of thought.
Typed messages might have higher _density_ of thought per token, though how valuable is that really, in LLM contexts? There are diminishing returns on how perfect you can get a prompt.
Also, audio permits a higher bandwidth mode: one can scan and speak at the same time.
In either case, different strokes for different folks, and what ultimately matters is whether you get good results. I think the upside is high, so I broadly suggest people try it out
My regular workflow is to talk (I use VoiceInk for transcription) and then say “tell me what you understood” — this puts your words into a well structured format, and you can also make sure the cli-agent got it, and expressing it explicitly likely also helps it stay on track.
My go-to prompt finisher, which I have mapped to a hotkey due to frequent use, is "Before writing any code, first analyze the problem and requirements and identify any ambiguities, contradictions, or issues. Ask me to clarify any questions you have, and then we'll proceed to writing the code"
It's like a reasoning model. Don't ask, prompt 'and here is where you come up with apropos questions' and you shall have them, possibly even in a useful way.
I would open a chat and refactor the template together with cursor: I would tell it what I want and if I don’t like something, I would help it to understand what I like and why. Do this for one route and when you are ready, ask cursor to write a rules file based on the current chat that includes the examples that you wanted to change and some rationale as to why you wanted it that way.
Then in the next route, you can basically just say refactor and that’s it.
IMO, I found those specific example tasks to be better handled by my IDE's refactoring features, though support for that is going to vary by project/language/IDE. I'm still more of a Luddite when it comes to LLM-based development tools, but the best case I've seen thus far is small first bites out of a big task. Working on an older no-tests code base recently, it's been things like setting up 4-5 tests that I'll expand into a full test suite. You can't take more than a few "big" bites out of a task before you have zero context as to what direction the vector soup sloshed in.
So, in terms of carpentry, I don't want an LLM framer whose work I need to build off of, but an LLM millworker handing me the lumber is pretty useful.
In terms of AI-assisted programming, I micromanage my AI. I give it specific instructions with single steps. I don't really let it build whole files by itself, as it usually makes a mess of things, but it's useful when doing predictable changes and marginally faster than doing it manually.
Open up cursor-agent to make the repo scaffolding in an empty dir (build system, test harness, etc.).
Open up cursor or Claude code or whatever and just go nuts with it. Remember to follow software engineering best practices (one good change with tests per commit)
It's important (though often surprisingly hard!) to remember it's just a tool, so if it's not doing things the way you want, start over with something else. Don't spend too much time on a lost cause.
1. If there is anything Claude tends to repeatedly get wrong, not understand, or spend lots of tokens on, put it in your CLAUDE.md. Claude automatically reads this file and it’s a great way to avoid repeating yourself. I add to my team’s CLAUDE.md multiple times a week.
2. Use Plan mode (press shift-tab 2x). Go back and forth with Claude until you like the plan before you let Claude execute. This easily 2-3x’s results for harder tasks.
3. Give the model a way to check its work. For svelte, consider using the Puppeteer MCP server and tell Claude to check its work in the browser. This is another 2-3x.
4. Use Opus 4.5. It’s a step change from Sonnet 4.5 and earlier models.
Hope that helps!
> I add to my team’s CLAUDE.md multiple times a week.
How big is that file now? How big is too big?
You can also break up your CLAUDE.md into smaller files, link CLAUDE.mds, or lazy-load them only when Claude works in nested dirs.
This is the meat of it:
## Code Style (See JULIA_STYLE.md for details)
- Always use explicit `return` statements
- Use Float32 for all numeric computations
- Annotate function return types with `::`
- All `using` statements go in Main.jl only
- Use `error()` not empty returns on failure
- Functions >20 lines need docstrings
## Do's and Don'ts
- Check for existing implementations first
- Prefer editing existing files
- Don't add comments unless requested
- Don't add imports outside Main.jl
- Don't create documentation unless requested
Since Opus 4.0 this has been enough to get it to write code that generally follows our style, even in Julia, which is a fairly niche language.
And thank you for your work!! I focus all of my energy on helping families stay safe online, I make educational content and educational products (including software). Claude Code has helped me amplify my efforts and I'm able to help many more families and children as a result. The downstream effects of your work on Claude Code are awesome! I've been in IT since 1995 and your tools are the most powerful tools I've ever used, by far.
I am currently working on a new slash command, /investigate <service>, that runs triage for an active or past incident. I've had Claude write tools to interact with all of our partner services (AWS, JIRA, CI/CD pipelines, GitLab, Datadog), and now when an incident occurs it can quickly put together an early analysis, finding the right people to involve (not just owners but the people who last touched the service) and potential root causes, including service-dependency investigations.
I am putting this through its paces now, but early results are VERY good!
This concerns me because fighting tooling is not a positive thing. It’s very negative and indicates how immature everything is.
Often the challenge is that users aren't interacting with Claude Code about their rules file. If Claude Code doesn't seem to be working with you, ask it why it ignored a rule. Oftentimes it provides very useful feedback on how to adjust the rules so it no longer violates them.
Another piece of advice I can give is to clear your context window often! Early on I was letting the context window auto-compact, but this is bad! Your model is at its freshest and "smartest" when it has a fresh context window.
@AGENTS.md
I feel like when I do plan mode (for CC and competing products), it seems good, but when I tell it to execute, the output is not what we planned. I feel like I get slightly better results executing from a document in chunks (which of course necessitates building the iterative chunks into the plan).
Proliferating .md files need some attention, though.
You can also use it in conjunction with planning mode—use the documents to pin everything down at a high-to-medium level, then break off chunks and pass those into planning mode for fine-grained code-level planning and a final check over.
Yes, the executor only needs the next piece of the plan.
I tend to plan in an entirely different environment, which fits my workflow and has the added benefit of providing a clear boundary between the roles. I aim to spend far more time planning than executing. If I notice I'm getting more caught up in execution than I expected, that's a signal to revise the plan.
This drives up price faster than quality though. Also increases latency.
I used to spend $200+ an hour on a single developer. I'm quite sure that benevolence was a factor when they submitted me an invoice, since there was no real transparency about whether I was being overbilled or whether the developer acted in my best interest rather than their own.
Do you ever feel bad for basically robbing these poor people blind? They're clearly losing so much money by giving you $1800 in FREE tokens every month. Their business can't be profitable like this, but thankfully they're doing it out of the goodness of their hearts.
My current understanding is that it’s for demos and toy projects
You don't just YOLO it. You do extensive planning when features are complex, and you review output carefully.
The thing is, if the agent isn't getting it and you feel like you need to drop down and edit manually, agents are now good enough to do that with nearly 100% reliability if you are specific enough about what you want to do. So the question isn't so much whether to use an agent or edit code manually—it's what level of detail you are working at with the agent. There are still times where it's easier to edit things manually, but you never really need to.
And it makes sense. For most coding problems the challenge isn't writing the code; once you know what to write, typing it is a drop in the bucket. AI is still very useful, but if you really wanna go fast you have to give up on your understanding. I've yet to see this work well outside of blog posts, tweets, board-room discussions, etc.
Understanding of the code in these situations is more important than the code/feature existing.
Agents make mistakes which need to be corrected, but they also point out edge cases you haven’t thought of.
I think the reality is a lot of code out there doesn’t need to be good, so many people benefit from agents etc.
The few times I've done that, the agent eventually faced a problem/bug it couldn't solve and I had to go and read the entire codebase myself.
Then, found several subtle bugs (like loading a 2GB file in memory as opposed to streaming). Eventually ended up refactoring most of it.
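For illustration, the 2 GB case above is the classic read-everything-at-once pattern; here is a generic sketch of the difference (the checksum task and names are hypothetical):

```python
# Loads the entire file into memory at once -- fine for small files,
# painful for a 2 GB artifact.
def checksum_all_at_once(path: str) -> int:
    with open(path, "rb") as f:
        data = f.read()  # whole file materialized in RAM
    return sum(data) & 0xFFFFFFFF


# Streams the file in fixed-size chunks, keeping memory usage flat.
def checksum_streaming(path: str, chunk_size: int = 1 << 20) -> int:
    total = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):  # 1 MiB at a time
            total = (total + sum(chunk)) & 0xFFFFFFFF
    return total
```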
Which might be fine if you're doing proof-of-concept or low-risk code, but it can also bite you hard when there is a bug actively bleeding money and not a single person or AI agent in the house knows how anything works.
One other feature of CLAUDE.md I've found useful is imports: prepending @ to a file name will force it to be imported into context. Otherwise, whether a file is read and loaded into context depends on tool use and planning by the agent (even with explicit instructions like "read file.txt"). Of course this means you have to be judicious with imports.
2. Tell it you want to refactor the code to achieve goal Z. Tell it to take a look and tell you how it will approach this. Consider showing it one example refactor you've already done (before and after).
3. Ask it to refactor one thing (only) and let you look at what it did.
4. Course correct if it didn't do the right thing.
5. Repeat.
Basically a good multiplier, not a replacement. Still requires the user to have good understanding about the codebase.
Use mind altering drugs. Give yourself arbitrary artificial constraints.
Try using it in as many different ridiculous ways as you can. I am getting the feeling you are only trying one method.
> I've had a fairly steady process for doing this: look at each route defined in Django, build out my `+page.server.ts`, and then split each major section of the page into a Svelte component with a matching Storybook story. It takes a lot of time to do this, since I have to ensure I'm not just copying the template but rather recreating it in a more idiomatic style.
Relinquish control.
Also, if you have very particular ways of doing things, give it samples of before and after (your fixed output) and why. You can use multishot prompting to train it to get the output you want. Have it machine check the generated output.
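As a rough illustration of the multishot idea, here is one way such a prompt could be assembled in Python; the example pair and all names are made up, and in practice the before/after snippets would be real files you have already fixed by hand (e.g. an original template and your hand-converted version).

```python
# Hypothetical multishot prompt builder: pair each "before" snippet with your
# hand-fixed "after" version (plus the rationale) so the model can infer the
# style you want instead of guessing at it.
EXAMPLES = [
    # (before, after, why) -- in practice, real before/after files from your repo
    ("def f(x):return x*2",
     "def double(value: int) -> int:\n    return value * 2",
     "Descriptive name, explicit types, PEP 8 spacing."),
]


def build_prompt(task: str, examples=EXAMPLES) -> str:
    shots = "\n\n".join(
        f"### Before\n{before}\n\n### After\n{after}\n\n### Why\n{why}"
        for before, after, why in examples
    )
    return (
        f"{shots}\n\n"
        "Apply the same conventions to the following, and flag anything the "
        f"examples don't cover:\n{task}"
    )
```

Machine-checking the generated output (linters, type checkers, tests) then closes the loop described above.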
> Simple prompting just isn't able to get AI's code quality within 90%
Would simple instructions to a person work? Esp a person trained on everything in the universe? LLMs are clay, you have to mold them into something useful before you can use them.
Put an example in the prompt: this was the original Django file and this is the version rewritten in SvelteKit.
Then ask it to convert another file, using the examples as a template.
You will need to add additional rules for stuff not covered.
Or maybe fix a bad attempt by the agent and add it as a second example.
I've had very good results with Claude Code using this workflow.