Using LLMs at Oxide
Key topics
The Oxide company's exploration of using Large Language Models (LLMs) in their workflow has sparked a lively debate about the benefits and drawbacks of AI-assisted coding. While some commenters, like gghffguhvc, argue that LLMs are a rational choice if they reduce overall costs, others, such as monkaiju, are puzzled by Oxide's encouragement of LLM use despite acknowledging significant caveats. The discussion highlights the importance of human oversight, with devmor noting that the onus is on the user to ensure LLMs perform correctly, and zihotki pointing out that seniority and experience play a crucial role in effectively utilizing LLMs. As ahepp and sunshowers share their personal experiences with AI-assisted coding, the conversation turns to the need for research on the impact of LLMs on code quality, with Yeask provocatively asking why large companies aren't already investigating this.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 22m after posting
Peak period: 56 comments in 0-3h
Avg / period: 14.5
Based on 160 loaded comments
Key moments
- Story posted: Dec 6, 2025 at 8:17 PM EST (about 1 month ago)
- First comment: Dec 6, 2025 at 8:39 PM EST (22m after posting)
- Peak activity: 56 comments in 0-3h (hottest window of the conversation)
- Latest activity: Dec 8, 2025 at 9:45 AM EST (about 1 month ago)
Naturally this doesn’t factor in things like human obsolescence, motivation and self-worth.
I've been thinking about this as I do AoC with Copilot enabled. It's been nice for those "hmm how do I do that in $LANGUAGE again?" moments, but it's also written some nice-looking snippets that don't quite do what I want. And many cases of "hmmm... that would work, but it would read the entire file twice for no reason".
My guess, however, is that it's a net gain for quality and productivity. Humans make bugs too and there need to be processes in place to discover and remediate those regardless.
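As a purely hypothetical illustration of the "reads the entire file twice" pattern mentioned above (a sketch only, not code from the thread; the AoC-style names and logic are invented):

```cpp
#include <algorithm>
#include <fstream>
#include <string>
#include <vector>

// Parse the puzzle input into lines.
std::vector<std::string> read_lines(const std::string& path) {
    std::ifstream in(path);
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); ) {
        lines.push_back(line);
    }
    return lines;
}

// Each part re-opens and re-parses the same file: the kind of needless
// second read described above. Parsing once and passing the lines to both
// parts would avoid the repeated I/O.
int part_one(const std::string& path) {
    return static_cast<int>(read_lines(path).size());
}

int part_two(const std::string& path) {
    int longest = 0;
    for (const auto& line : read_lines(path)) {  // reads the whole file again
        longest = std::max(longest, static_cast<int>(line.size()));
    }
    return longest;
}
```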
I'm currently trying out using Opus 4.5 to take care of a gnarly code reorganization that would take a human most of a week to do -- I spent a day writing a spec (by hand, with some editing advice from Claude Code), having it reviewed as a document for humans by humans, and feeding it into Opus 4.5 on some test cases. It seems to work well. The spec is, of course, in the form of an RFD, which I hope to make public soon.
I like to think of the spec as basically an extremely advanced sed script described in ~1000 English words.
I don’t think it is easy to create a concise set of rules to apply in this gap for something as general as LLM use, but I do think such a ruleset is noticeably absent here.
The document includes statements like "LLMs are superlative at reading comprehension", "LLMs can be excellent editors", "LLMs are amazingly good at writing code".
The caveats are really useful: if you've anchored your expectations on "these tools are amazing", the caveats bring you closer to what they've observed.
Or, if you're anchored on "the tools aren't to be used", the caveats give credibility to the document's suggestions of what LLMs are useful for.
There are things in life that have high risks of harm if misused yet people still use them because there are great benefits when carefully used. Being aware of the risks is the key to using something that can be harmful, safely.
> it’s just embarrassing — it’s as if the writer is walking around with their intellectual fly open.
I think Oxide didn't include this in the RFD because they exclusively hire senior engineers, but in an organization that contains junior engineers I'd add something specific to help junior engineers understand how they should approach LLM use.
Bryan has 30+ years of challenging software (and now hardware) engineering experience. He memorably said that he's worked on and completed a "hard program" (an OS), which he defines as a program you doubt you can actually get working.
The way Bryan approaches an LLM is super different to how a 2025 junior engineer does so. That junior engineer possibly hasn't programmed without the tantalizing, even desperately tempting option to be assisted by an LLM.
Years ago I had to spend many months building nothing but Models (as in MVC) for a huge data import / ingest the company I worked at was rewriting. It was just messy enough that it couldn't be automated. I almost lost my mind from the dull monotony and even started having attendance issues. I know today that could have been done with an LLM in minutes. It's almost crazy how much time I put into that project compared to if I did it today.
But a junior engineer would never find/anticipate those issues.
I am a bit concerned, because the kind of software I am making, an LLM would never prompt on its own. A junior cannot make it; it requires research and programming experience that they do not have. But I know that if I were a junior today, I would probably try to use LLMs as much as possible and would probably know less programming over time.
So it seems to me that we are likely to have worse software over time. Perhaps a boon for senior engineers, but how do we train junior devs in that environment? Force them to build slowly, without LLMs? Is it aligned with business incentives?
Do we create APIs expecting the code to be generated by LLMs or written by hand? Because the impact of verbosity is not necessarily the same. LLMs don't get tired as fast as humans.
Obviously if it's anything even minorly complex you can't trust the LLM hasn't found a new way to fool you.
For a junior in the learning phase, that can be useful time spent. Then again, I agree that at times certain menial code tasks are not worth doing and LLMs are helpful.
It's a bit like a kid not spending time memorizing their times tables since they can use a calculator. They are less likely to become a great mathematician.
So of course it’s going to generate code that has non-obvious bugs in it.
Ever play the Undefined Behaviour Game? Humans are bad at being compilers and catching mistakes.
I’d hoped… maybe still do, that the future of programming isn’t a shrug and, “good enough.” I hope we’ll keep developing languages and tools that let us better specify programs and optimize them.
IMO, it's already happening. I had to change some personal information on a bunch of online services recently, and two out of seven of them were down. One of them is still down, a week later. This is the website of a major utilities company. When I call them, they acknowledge that it's down, but say my timing is just bad. That combined with all the recent outages has left me with the impression that software has been getting (even more) unreliable, recently.
> just messy enough that it couldn't be automated.
> I know today that could have been done with an LLM in minutes.
LLMs are amazing technology, but this is a terrible task for them.
This is a task where exactness is the whole effort, even though it's mind-numbingly boring, and LLMs are the worst of all computational tools you could leverage against "exacting but exhausting".
You'd think there's some technology that could have helped you. There probably is. LLMs, almost by definition, are not that technology.
This is very close to "count the r's in strawberry" and is nearly the worst thing you could task one to do.
It was only then that she introduced us to the glory that was Adobe Dreamweaver, which (obviously) increased our productivity tenfold.
My first PHP scripts and games were written using nothing more than Notepad too, funnily enough.
I'm not saying they don't have their place, but without us they would still be making the world go round. Only backwards.
It’s lovely to have the time to do that. This time comes once the other type of engineer has shipped the product and turned the money flow on. Both types have their place.
I think what craftsmen miss is the difference in goals. Projects fall on a spectrum from a long-lived app that constantly evolves with a huge team working on it, to something never opened again after release. In the latter, like movie or music production (or most video games), only the end result matters; the how is not part of the final product. Working for years with designers and artists really gave me perspective on process vs end result and what matters.
That doesn't mean the end result is messy or lacks craftsmanship. If you call a general contractor or carpenter for a specific job, you care that the end result is well made, but if they tell you that they built a whole factory for your little custom-made project (the equivalent of a nice codebase), not only does it not matter to you, it'll also be wildly overpriced and delayed. In my agency that means the website is good-looking and bug-free after being built, no matter how messy the temporary construction site is.
In contrast if you work on a SaaS or a long lived project (e.g. an OS) the factory (the code) is the product.
So to me, when people say they are into code craftsmanship, I think what they really mean is that they are more interested in factory building than in end-product crafting.
Say it takes 2 hours to implement a feature, and another hour making it logically/architecturally correct. You bill $600 and eat $200 for goodwill and your own personal/organizational development. You're still making $200/hr and you never find yourself in meetings with normie clients about why refactoring, cohesiveness, or quality was necessary.
I think the sweet spot is to strive for code that is easy to read and understand, easy to change, and easy to eventually replace or throw out. Obviously performant enough but yadda yadda premature optimization, depends on the domain and so on...
I don’t particularly remember why, but “hand writing” fancy HTML and CSS used to be a flex in some circles in the 90s. A bunch of junk and stuff like fixed positioning in the source was the telltale sign they “cheated” with FrontPage or Dreamweaver lol
The _vti_cnf dir left /etc/passwd downloadable, so I grabbed it from my school website. One John the Ripper run later and the password was found.
I told the teacher responsible for IT that it was insecure, and that ended up getting me some work experience. I ended up working the summer (waiting for my GCSE results) for ICL, which immeasurably helped me when it was time to properly start working.
I did think about defacing it; I often wonder whether things could have turned out very differently!
Dreamweaver was to web development what ...
I just sat here for 5 minutes and I wasn't able to finish that sentence. So I think that's a statement in itself.
People with very little competence could and did get things done, but it was a mess underneath.
https://developer.adobe.com/dreamweaver/
And yes, as you can imagine given the kind of comments I make about high-level productive tooling and languages, I was a big Dreamweaver fan back in the 2000s.
This gives me somewhat of a knee jerk reaction.
When I started programming professionally in the 90s, the internet came of age, and I remember being told "in my days, we had books and we remembered things", which of course is hilarious because today you can't possibly retain ALL the knowledge needed to be a software engineer, given the sheer amount of knowledge required to produce a meaningful product. It's too big and it moves too fast.
There was this long argument that you should know things and not have to look it up all the time. Altavista was a joke, and Google was cheating.
Then syntax highlighting came around, and there'd always be a guy going "yeah nah, you shouldn't need syntax highlighting to program, your screen looks like a Christmas tree".
Then we got stuff like auto-complete, and it was amazing, the amount of keystrokes we saved. That too, was seen as heresy by the purists (followed later by LSP - which many today call heresy).
That reminds me also, back in the day, people would have entire encyclopaedia collections on DVDs. Did they use them? No. But they criticised Wikipedia for being inferior. Look at today, though.
Same thing with LLMs. Whether you use them as a powerful context based auto-complete, as a research tool faster than wikipedia and google, as rubber-duck debugger, or as a text generator -- who cares: this is today, stop talking like a fossil.
It's 2025 and junior developers can't work without LSP and LLMs? It's fine. They're not in front of a 386 DX33 with one book of K&R C and a blue EDIT screen. They have massive challenges ahead of them, the IT world is in complete shambles, and it's impossible to decipher how anything is made, even open source.
Today is today. Use all the tools at hand. Don't shame kids for using the best tools.
We should be talking about sustainability of such tools rather than what it means to use them (cf. enshittification, open source models etc.)
The Internet itself is full of distractions. My younger self spent a crazy amount of time on IRC. So it's not different than spending time on say, Discord today.
LLMs have pretty much a direct relationship with Google. The quality of the response has much to do with the quality of the prompt. If anything, it's the overwhelming nature of LLMs that might be the problem. Back in the day, if you had, say a library access, the problem was knowing what to look for. Discoverability with LLMs is exponential.
As for LLM as auto-complete, there is an argument to be made that typing a lot reinforces knowledge in the human brain like writing. This is getting lost, but with productivity gains.
Tools like Claude code with ask/plan mode seem to be better in my experience, though I absolutely do wonder about the lack of typing causing a lack of memory formation
A rule I set myself a long time ago was to never copy paste code from stack overflow or similar websites. I always typed it out again. Slower, but I swear it built the comprehension I have today.
For interns/junior engineers, the choice is: comprehension VS career.
And I won't be surprised if most of them will go with career now, and comprehension.. well thanks maybe tomorrow (or never).
It shouldn't be, but it is.
That's not an LLM problem, they'd do the same thing 10 years ago with stack overflow: argue about which answer is best, or trust the answer blindly.
Normal auto complete plus a code tool like Claude Code or similar seem far more useful to me.
I have the same policy. I do the same thing for example code in the official documentation. I also put in a comment linking to the source if I end up using it. For me, it’s like the RFD says, it’s about taking responsibility for your output. Whether you originated it or not, you’re the reason it’s in the codebase now.
Nowadays I'm back to a text editor rather than an IDE, though fortunately one with much more creature comforts than n++ at least.
I'm glad I went down that path, though I can't say I'd really recommend as things felt a bit simpler back then.
That comparison undermines the integrity of the argument you are trying to make.
But I mean, you can get by without memorizing stuff sure, but memorizing stuff does work out your brain and does help out in the long run? Isn't it possible we've reached the cliff of "helpful" tools to the point we are atrophying enough to be worse at our jobs?
Like, reading is surely better for the brain than watching TV. But constant cable TV wasn't enough to ruin our brains. What if we've got to the point it finally is enough?
It isn't hilarious, it's true. My father (now in his 60s), who came from a blue-collar background with very little education, taught himself programming by manually copying and editing software out of magazines, like a lot of people his age.
I teach students now who have access to all the information in the world but a lot of them are quite literally so scatterbrained and heedless anything that isn't catered to them they can't process. Not having working focus and memory is like having muscle atrophy of the mind, you just turn into a vegetable. Professors across disciplines have seen decline in student abilities, and for several decades now, not just due to LLMs.
Reading books was never about knowledge. It was about knowhow. You didn't need to read all the books. Just some. I don't know how many developers I met who would keep asking questions that would be obvious to anyone who had read the book. They never got the big picture and just wasted everyone's time, including their own.
"To know everything, you must first know one thing."
He surely has his fly closed when cutting through the hype with reflection and pragmatism (without the extreme positions on both sides often seen).
I wonder which of these camps is right.
For novices, LLMs are infinitely patient rubber ducks. They unstick the stuck; helping people past the coding and system management hurdles that once required deep dives through Stack Overflow and esoteric blog posts. When an explanation doesn’t land, they’ll reframe until one does. And because they’re confidently wrong often enough, learning to spot their errors becomes part of the curriculum.
For experienced engineers, they’re tireless boilerplate generators, dynamic linters, and a fresh set of eyes at 2am when no one else is around to ask. They handle the mechanical work so you can focus on the interesting problems.
The caveat for both: intentionality matters. They reward users who know what they’re looking for and punish those who outsource judgment entirely.
Where it struggles: problems requiring taste or judgment without clear right answers. The LLM wants to satisfy you, which works great for 'make this exploit work' but less great for 'is this the right architectural approach?'
The craftsman answer might be: use LLMs for the systematic/tedious parts (code generation, pattern matching, boilerplate) while keeping human judgment for the parts that matter. Let the tool handle what it's good at, you handle what requires actual thinking.
It's a brilliant skewering of the 'em dash means LLM' heuristic as a broken trick.
1. https://www.scottsmitelli.com/articles/em-dash-tool/
This is a key difference. I've been writing software professionally for over two decades. It took me quite a long time to overcome certain invisible (to me) hesitations and objections to using LLMs in sdev workflows. At some point the realization came to me that this is simply the new way of doing things, and from this point onward, these tools will be deeply embedded in and synonymous with programming work. Recognizing this phenomenon for what it is somehow made me feel young again -- perhaps that's just the crust breaking around a calcified grump, but I do appreciate being able to tap into that all the same.
I was hoping he'd make the leaderboard, but perhaps the addiction took proper hold in more recent years:
https://www.gally.net/miscellaneous/hn-em-dash-user-leaderbo...
https://news.ycombinator.com/user?id=bcantrill
No doubt his em dashes are legit, of course.
"lack of conviction" would be a useful LLM metric.
> This can be extraordinarily powerful for summarizing documents — or of answering more specific questions of a large document like a datasheet or specification.
That dash shouldn't be there. That's not a parenthetical clause, that's an element in a list separated by "or." You can just remove the dash and the sentence becomes more correct.
British users regularly use that sort of construct with "-" hyphens, simply because they're pretty much the same and a whole lot easier to type on a keyboard.
However, I was surprised to see that when someone (not me) accused him of using an LLM to write his comment, he flatly denied it: https://news.ycombinator.com/item?id=46011964
Which I guess means (assuming he isn't lying) if you spend too much time interacting with LLMs, you eventually resemble one.
Pretty much. I think people who care about reducing their children's exposure to screen time should probably take care to do the same for themselves wrt LLMs.
> First, to those who can recognize an LLM’s reveals (an expanding demographic!), it’s just embarrassing — it’s as if the writer is walking around with their intellectual fly open. But there are deeper problems: LLM-generated writing undermines the authenticity of not just one’s writing but of the thinking behind it as well. If the prose is automatically generated, might the ideas be too? The reader can’t be sure — and increasingly, the hallmarks of LLM generation cause readers to turn off (or worse).
> Specifically, we must be careful to not use LLMs in such a way as to undermine the trust that we have in one another
> our writing is an important vessel for building trust — and that trust can be quickly eroded if we are not speaking with our own voice
This is a technical document that is useful in illustrating how the guy who gave a talk once that I didn’t understand but was captivated by and is well-respected in his field intends to guide his company’s use of the technology so that other companies and individual programmers may learn from it too.
I don’t think the objective was to take any outright ethical stance, but to provide guidance about something ostensibly used at an employee’s discretion.
This applies to natural language, but, interestingly, the opposite is true of code (in my experience and that of other people that I've discussed it with).
> Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?
https://www.laws-of-software.com/laws/kernighan/
Reading good code can be a better way to learn about something than reading prose. Writing code like that takes some real skill and insight, just like writing clear explanations.
My general procedure for using an LLM to write code, which is in the spirit of what is advocated here, is:
1) First, feed in the existing relevant code into an LLM. This is usually just a few source files in a larger project
2) Describe what I want to do, either giving an architecture or letting the LLM generate one. I tell it to not write code at this point.
3) Let it speak about the plan, and make sure that I like it. I will converse to address any deficiencies that I see, and I almost always do.
4) I then tell it to generate the code
5) I skim & test the code to see if it's generally correct, and have it make corrections as needed
6) Closely read the entire generated artifact at this point, and make manual corrections (occasionally automatic corrections like "replace all C style casts with the appropriate C++ style casts" then a review of the diff)
The hardest part for me is #6, where I feel a strong emotional bias towards not doing it, since I am not yet aware of any errors compelling such action.
This allows me to operate at a higher level of abstraction (architecture) and removes the drudgery of turning an architectural idea into written, precise code. But, when doing so, you are abandoning those details to a non-deterministic system. This is different from, for example, using a compiler or a higher-level VM language: with those tools, you can understand how they work, quickly develop a good idea of what you're going to get, and rely on robust assurances. Understanding LLMs helps, but not to the same degree.
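For concreteness, here is a minimal hypothetical sketch of the kind of mechanical correction mentioned in step 6, replacing C-style casts with C++-style casts (illustrative only; the names and functions are invented, not taken from the thread):

```cpp
struct Widget { int id; };

double hit_rate_before(int hits, int total, void* opaque) {
    // Before: C-style casts, as an LLM will often emit them.
    Widget* w = (Widget*)opaque;
    (void)w;
    return (double)hits / (double)total;
}

double hit_rate_after(int hits, int total, void* opaque) {
    // After: equivalent C++-style casts, applied mechanically and then
    // checked by reading the diff, as step 6 describes.
    Widget* w = static_cast<Widget*>(opaque);
    (void)w;
    return static_cast<double>(hits) / static_cast<double>(total);
}
```

Reviewing the diff after such a sweep is what catches the cases where a static_cast is not actually the right replacement (for example, where reinterpret_cast or const_cast semantics were intended).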
I always wonder about the people who say LLMs save them so much time: Do you just accept the edits they make without reviewing each and every line?
Anything that involves math or complicated conditions I take extra time on.
I feel I’m getting code written 2 to 3 times faster this way while maintaining high quality and confidence
1. Keeping the context very small
2. Keeping the scope of the output very small
With the added benefit of keeping you in the flow state (and in my experience making it more enjoyable).
To anyone who hates LLMs: even so, give autocomplete a shot (with a keybinding to toggle it if it annoys you; sometimes it's awful). It's really no different than typing it manually wrt quality etc, so the speed-up isn't huge, but it feels a lot nicer.
It's not magic though, this still takes some time to do.
I've seen LLMs write some really bad code a few times lately; it seems almost worse than what they were doing 6 or 8 months ago. Could be my imagination, but it seems that way.
If you keep all edits to be driven by the LLM, you can use that knowledge later in the session or ask your model to commit the guidelines to long term memory.
Personally, I absolutely hate instructing agents to make corrections. It's like pushing a wet noodle. If there is lots to correct, fix one or two cases manually and tell the LLM to follow that pattern.
https://www.humanlayer.dev/blog/writing-a-good-claude-md
Insert before that: have it create tasks with beads and force it to let you review before marking a task complete.
You obviously cannot emotionally identify with the code you produce this way; the ownership you might feel towards such code is nowhere near what meticulously hand-written code elicits.
By this article's own standards, now there are 2 authors who don't understand what they've produced.
I think this gets at a key point... but I'm not sure of the right way to articulate it.
A human-written comment may be worth something, but an LLM-generated one is cheap/worthless.
The nicest phrase capturing the thought I saw was: "I'd rather read the prompt".
It's probably just as good to let an LLM generate it again, as it is to publish something written by an LLM.
Text, images, art, and music are all methods of expressing our internal ideas to other human beings. Our thoughts are the source, and these methods are how they are expressed. Our true goal in any form of communication is to understand the internal ideas of others.
An LLM expresses itself in all the same ways, but the source doesn't come from an individual - it comes from a giant dataset. This could be considered an expression of the aggregate thoughts of humanity, which is fine in some contexts (like retrieval of ideas and information highly represented in the data/world), but not when presented in a context of expressing the thoughts of an individual.
LLMs express the statistical summation of everyone's thoughts. They present the mean, when what we're really interested in are the data points a couple of standard deviations away from the mean. That's where all the interesting, unique, and thought-provoking ideas are. Diversity is a core part of the human experience.
---
An interesting paradox is the use of LLMs for translation into a non-native language. LLMs are actively being used to better express an individual's ideas using words better than they can with their limited language proficiency, but for those of us on the receiving end, we interpret the expression to mirror the source and have immediate suspicions on the legitimacy of the individual's thoughts. Which is a little unfortunate for those who just want to express themselves better.
That’s what I think when I see a news headline. What are you writing? Who cares. WHY are you writing it — that is what I want to know.
They seem to be good at either spitting out something very average, or something completely insane. But something genuinely indicative of the spark of intelligence isn’t common at all. I’m happy to know that while my thoughts are likely not original, they are at least not statistically likely.
A comment is an attempt to more fully document the theory the programmer has. Not all theory can be expressed in code. Both code and comment are lossy artefacts that are "projections" of the theory into text.
LLMs currently, I believe, cannot have a theory of the program. But they can definitely perform a useful simulacrum of such. I have not yet seen an LLM generated comment that is truly valuable. Of course, lots of human generated comments are not valuable either. But the ceiling for human comments is much, much higher.
For example, I recently was perusing the /r/SaaS subreddit and could tell that most of the submissions were obviously LLM-generated, but often by telling a story that was meant to spark outrage, resonate with the “audience” (eg being doubted and later proven right), and ultimately conclude by validating them by making the kind of decision they typically would.
I also would never pass this off as anything else, but I’ve been finding it effective to have LLMs write certain kinds of documentation or benchmarks in my repos, just so that they/I/someone else have access to metrics and code snippets that I would otherwise not have time to write myself. I’ve seen non-native English speakers write pretty technically useful/interesting docs and tech articles by translating through LLMs too, though a lot more bad attempts than good (and you might not be able to tell if you can’t speak the language)…
Honestly, the lines are starting to blur ever so slightly for me. I'd still not want someone using an LLM to chat with me directly, but if someone had an LLM build a simple WASM/interesting game and then write an interesting/informative/useful article about it, or steered it into doing so… I might actually enjoy it. And not because the prompt was good: instructions telling an LLM to go make a game and do a write-up don't help me as much or in the same way as being able to quickly see how well it went and any useful takeaways/tricks/gotchas it uncovered. It would genuinely be giving me valuable information and probably wouldn't be something I'd speculatively try or run myself.
> Unlike prose, however (which really should be handed in a polished form to an LLM to maximize the LLM’s efficacy), LLMs can be quite effective writing code de novo.
Don't the same arguments against using LLMs to write one's prose also apply to code? Was this structure of the code and ideas within the engineers'? Or was it from the LLM? And so on.
Before I'm misunderstood as a LLM minimalist, I want to say that I think they're incredibly good at solving for the blank page syndrome -- just getting a starting point on the page is useful. But I think that the code you actually want to ship is so far from what LLMs write, that I think of it more as a crutch for blank page syndrome than "they're good at writing code de novo".
I'm open to being wrong and want to hear any discussion on the matter. My worry is that this is another one of the "illusion of progress" traps, similar to the one that currently fools people with the prose side of things.
“Mock the world then test your mocks”, I’m simply not convinced these have any value at all after my nearly two decades of doing this professionally
It can be addressed with prompting, but you have to fight this constantly.
This is one of the problems I feel with LLM-generated code as well. It's almost always between 5x and 20x (!) as long as it needs to be. Though in the case of code verbosity, it's usually not because of thoroughness so much as extremely bad style.
The more examples of different types of problems being solved in similar ways present in an LLM's dataset, the better it gets at solving problems. Generally speaking, if it's a solution that works well, it gets used a lot, so "good solutions" become well represented in the dataset.
Human expression, however, is diverse by definition. The expression of the human experience is the expression of a data point on a statistical field with standard deviations the size of chasms. An expression of the mean (which is what an LLM does) goes against why we care about human expression in the first place. "Interesting" is a value closely paired with "different".
We value diversity of thought in expression, but we value efficiency of problem solving for code.
There is definitely an argument to be made that LLM usage fundamentally restrains an individual from solving unsolved problems. It also doesn't consider the question of "where do we get more data from".
>the code you actually want to ship is so far from what LLMs write
I think this is a fairly common consensus, and my understanding is the reason for this issue is limited context window.
Basically if you are a software engineer you can very easily judge quality of code. But if you aren’t a writer then maybe it is hard for you to judge the quality of a piece of prose.
It depends on the LLM, I think. A lot of people have a bad impression of them as a result of using cheap or outdated LLMs.
A common prompt I use is approximately ”Write tests for file X, look at Y on how to setup mocks.”
This is probably not ”de novo” and in terms of writing is maybe closer to something like updating a case study powerpoint with the current customer’s data.
- I think the "if you use another model" rebuttal is becoming like the No True Scotsman of the LLM world. We can get concrete and discuss a specific model if need be.
- If the use case is "generate this function body for me", I agree that that's a pretty good use case. I've specifically seen problematic behavior for the other ways I'm seeing it OFTEN used, which is "write this feature for me", or trying to one shot too much functionality, where the LLM gets to touch data structures, abstractions, interface boundaries, etc.
- To analogize it to writing: They shouldn't/cannot write the whole book, they shouldn't/cannot write the table of contents, they cannot write a chapter, IMO even a paragraph is too much -- but if you write the first sentence and the last sentence of a paragraph, I think the interpolation can be a pretty reasonable starting point. Bringing it back to code for me means: function bodies are OK. Everything else gets questionable fast IME.
He is a long way from Sun.
111 more comments available on Hacker News