AI Tooling Must Be Disclosed for Contributions
Original: AI tooling must be disclosed for contributions
Key topics
Regulars are buzzing about a GitHub proposal to require disclosure of AI tooling in open-source contributions, sparking a lively debate on the merits and challenges of transparency. Commenters riff on the potential downsides of hiding AI involvement, with some pointing out that it can be hard to review PRs that heavily rely on AI-generated code. While some contributors argue that disclosure is a no-brainer, others raise concerns about the stigma surrounding AI-assisted work and the complexities of enforcing such a policy, particularly when it comes to copyright and ownership. As the discussion unfolds, insights from the US Copyright Office's report on AI and copyright are being shared, highlighting the relevance of this conversation in the rapidly evolving AI landscape.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 5 minutes after posting
- Peak period: 141 comments in the first 12 hours
- Average per period: 22.9 comments
Based on 160 loaded comments
Key moments
- Story posted: Aug 21, 2025 at 2:49 PM EDT
- First comment: Aug 21, 2025 at 2:54 PM EDT (5 minutes after posting)
- Peak activity: 141 comments in the first 12 hours, the hottest window of the conversation
- Latest activity: Aug 26, 2025 at 3:44 PM EDT
On the flip side, I’m preparing to open source a project I made for a serializable state machine with runtime hooks. But that’s blood, sweat, and tears labor. AI is writing a lot of the unit tests and the code, but it’s entirely by my architectural design.
There’s a continuum here. It’s not binary. How can we communicate what role AI played?
And does it really matter anymore?
(Disclaimer: autocorrect corrected my spelling mistakes. Sent from iPhone.)
That being said, I feel like this is an intermediate step - it's really hard to review PRs that are AI slop, because it's so easy for those who don't know how to use AI to create a multi-hundred- or thousand-line diff. But when AI is used well, it really saves time and often creates high-quality work.
Because of the perception that anything touched by AI must be uncreative slop made without effort. In the case of this article, why else are they asking for disclosure if not to filter and dismiss such contributions?
> I try to assist inexperienced contributors and coach them to the finish line, because getting a PR accepted is an achievement to be proud of. But if it's just an AI on the other side, I don't need to put in this effort, and it's rude to trick me into doing so.
Yes.
> but it's to deprioritize spending cycles debugging and/or coaching a contributor on code they don't
This is very much in line with my comment about doing it to filter and dismiss. The author didn't say "So I can reach out and see if their clear eagerness to contribute extends to learning to code in more detail".
An angle not mentioned in the OP is copyright - depending on your jurisdiction, AI-generated text can't be copyrighted, which could call into question whether you can enforce your open source license anymore if the majority of the codebase was AI-generated with little human intervention.
Well, if you had read what was linked, you would find these...
> I think the major issue is inexperienced human drivers of AI that aren't able to adequately review their generated code. As a result, they're pull requesting code that I'm sure they would be ashamed of if they knew how bad it was.
> The disclosure is to help maintainers assess how much attention to give a PR. While we aren't obligated to in any way, I try to assist inexperienced contributors and coach them to the finish line, because getting a PR accepted is an achievement to be proud of. But if it's just an AI on the other side, I don't need to put in this effort, and it's rude to trick me into doing so.
> I'm a fan of AI assistance and use AI tooling myself. But, we need to be responsible about what we're using it for and respectful to the humans on the other side that may have to review or maintain this code.
I don't know specifically what PRs this person is seeing. I do know there's been a rumble around the open source community that inexperienced devs are trying to get accepted PRs for open source projects because they look good on a resume. This predated AI, in fact, with it being a commonly cited method to get attention in a competitive recruiting market.
As always, folks trying to get work have my sympathies. However, ultimately these folks are demanding time and work from others, for free, to improve their career prospects, while putting in the absolute bare minimum of effort one could conceivably put in (having Copilot rewrite whatever part of an open source project and shove it into a PR with an explanation of what it did), and I don't blame maintainers for being annoyed at the number of low-quality submissions.
I have never once criticized a developer for being inexperienced. It is what it is, we all started somewhere. However if a dev generated shit code and shoved it into my project and demanded a headpat for it so he could get work elsewhere, I'd tell him to get bent too.
Are you kidding?
- For ages now, people have used "broad test coverage" and "CI" as excuses for superficial reviews, as excuses for negligent coding and verification.
- And now people foist even writing the test suite off on AI.
Don't you see that this way you have no reasoned examination of the code?
> ... and the code, but it’s entirely by my architectural design.
This is fucking bullshit. The devil is in the details, always. The most care and the closest supervision must be precisely where the rubber meets the road. I wouldn't want to drive a car that you "architecturally designed", and a statistical language model manufactured.
If the contributor:
1.) Didn't try to hide the fact that they used AI
2.) Tested their changes
then I would not care at all. The main issue is that this is usually not the case: most people submitting PRs that are 90% AI do not bother testing (usually they don't even bother running the automated tests).
What about just telling exactly what role AI played? You can say it generated the tests for you for instance.
But I also think that if a maintainer asks you to jump before submitting a PR, you politely ask, “how high?”
You might argue that by making rules, even futile ones, you at least establish expectations and take a moral stance. Well, you can make a statement without dressing it up as a rule. But you don't get to be sanctimonious that way I guess.
Not every time, but sometimes. The threat of being caught isn't meaningless. You can decide not to play in someone else's walled garden if you want but the least you can do is respect their rules, bare minimum of human decency.
The only legitimate reason to make a rule is to produce some outcome. If your rule does not result in that outcome, of what use is the rule?
Will this rule result in people disclosing "AI" (whatever that means) contributions? Will it mitigate some kind of risk to the project? Will it lighten maintainer load?
No. It can't. People are going to use the tools anyway. You can't tell. You can't stop them. The only outcome you'll get out of a rule like this is making people incrementally less honest.
If someone really wants to commit fraud they’re going to commit fraud. (For example, by not disclosing AI use when a repository requires it.) But if their fraud is discovered, they can still be punished for it, and mitigating actions taken. That’s not nothing, and does actually do a lot to prevent people from engaging in such fraud in the first place.
Yes that is the stated purpose, did you read the linked GitHub comment? The author lays out their points pretty well, you sound unreasonably upset about this. Are you submitting a lot of AI slop PRs or something?
P.S. Talking. Like. This. Is. Really. Ineffective. It. Makes. Me. Just. Want. To. Disregard. Your. Point. Out. Of. Hand.
If this rule discourages low quality PRs or allows reviewers to save time by prioritizing some non-AI-generated PRs, then it certainly seems useful in my opinion.
You get someone that didn't use AI getting accused of using AI and eventually telling people to screw off and contributing nothing.
Total bullshit. It's totally fine to declare intent.
You are already incapable of verifying / enforcing that a contributor is legally permitted to submit a piece of code as their own creation (Signed-off-by), and do so under the project's license. You won't embark on looking for prior art, for the "actual origin" of the code, whatever. You just make them promise, and then take their word for it.
Unreviewed generated PRs can still be helpful starting points for further LLM work if they achieve desired results. But close reading with consideration of authorial intent, giving detailed comments, and asking questions from someone who didn't write or read the code is a waste of your time.
That's why we need to know if a contribution was generated or not.
Any contributor who was shown to post provably untested patches used to lose credibility. And now we're talking about accommodating people who don't even understand how the patch is supposed to work?
An example where this kind of contribution was accepted and valuable, from inside the Ghostty project itself: https://x.com/mitchellh/status/1957930725996654718
It takes attempts, verifying the result behaves as desired, and iterative prompting to adjust. And it takes a lot of time to wait on agents in between those steps (this work isn’t a one shot response). You’re being reductive.
I have no clue about Ghostty specifically, but I've seen plenty of stuff that doesn't compile, much less pass tests. And I assert there is nothing but negative value in such "contributions".
If real effort went into it, then maybe there is value-- though it's not clear to me: when a project regular does the same work, at least they know the process. If there is some big PR moving things around, at least the author knows it's unlikely to have slipped in a backdoor. Once the change is reduced to some huge diff, it's much harder to gain this confidence.
In some projects, direct PRs for programmatic mass renames and such have been prohibited in favor of requiring submission of the script that produces the change, because it's easier to review the script carefully. The same may be necessary for AI.
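As a rough illustration of that practice, the reviewable artifact might look like the short script below rather than the thousands of mechanical hunks it produces. This is a hypothetical sketch: the identifiers, the file glob, and the project layout are all made up.

```python
#!/usr/bin/env python3
"""Hypothetical mass-rename script, submitted for review in place of its diff."""
import pathlib
import re

OLD, NEW = "old_name", "new_name"               # made-up identifiers
PATTERN = re.compile(rf"\b{re.escape(OLD)}\b")  # whole-word matches only

for path in pathlib.Path(".").rglob("*.py"):    # adjust the glob per project
    text = path.read_text(encoding="utf-8")
    updated = PATTERN.sub(NEW, text)
    if updated != text:
        path.write_text(updated, encoding="utf-8")
        print(f"rewrote {path}")
```

Reviewing a dozen lines like this is tractable in a way that reviewing the generated diff is not, which is the analogy being drawn to AI-assisted changes.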
Having the original prompts (in sequence and across potentially multiple models) can be valuable, but it is not necessarily useful in replicating the results because of the slot-machine nature of it.
Sure, though I believe few commenters care much about Ghostty specifically and are primarily discussing the policy abstractly!
> because of the slot machine nature of it
One could use deterministically sampled LLMs with exact integer arithmetic... There is nothing fundamental preventing it from being completely reproducible.
Besides, the output of an LLM is not really any more trustworthy (even if reproducible) than the contribution of an anonymous actor. Both require review of outputs. Reproducibility of output from prompt doesn't mean that the output followed a traceable logic such that you can skip a full manual code review as with your mass renaming example. LLMs produce antagonistic output from innocuous prompting from time to time, too.
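To make the reproducibility point concrete, here is a toy sketch: greedy (argmax) decoding over a pure function of the input involves no randomness, so the same prompt always produces the same output. The vocabulary and logits function are stand-ins rather than a real model, and the hard part the comment alludes to is making this hold bit-for-bit across real floating-point kernels and hardware.

```python
from typing import List

VOCAB = ["<eos>", "hello", "world", "foo", "bar"]  # toy vocabulary

def toy_logits(tokens: List[int]) -> List[float]:
    # Stand-in for a model forward pass: a pure function of its input.
    seed = sum(tokens) + 1
    return [((seed * (i + 7)) % 11) / 11.0 for i in range(len(VOCAB))]

def greedy_decode(prompt: List[int], max_len: int = 8) -> List[int]:
    out = list(prompt)
    for _ in range(max_len):
        logits = toy_logits(out)
        nxt = max(range(len(logits)), key=logits.__getitem__)  # argmax, no sampling
        if VOCAB[nxt] == "<eos>":
            break
        out.append(nxt)
    return out

# Identical on every run, and on any machine that computes the same logits.
assert greedy_decode([1, 2]) == greedy_decode([1, 2])
```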
It would be nice if they did, in fact, say they didn't know. But more often they just waste your time making their chatbot argue with you. And the chatbots are outrageous gaslighters.
All big OSS projects have had the occasional bullshitter/gaslighter show up. But LLMs have increased the incidence level of these sorts of contributors by many orders of magnitude-- I consider it an open question if open-public-contribution opensource is viable in the world post LLM.
Everyone promoting LLMs, especially on HN, claims that they're expertly using them, with artisanal prompts and careful examination of the output, but... I'm honestly skeptical. Sure, some people are doing that (I do it from time to time). But I've seen enough slop to think that more people are throwing around code that they barely understand than these advocates care to admit.
Those same people will swear that they did due diligence, but why would they admit otherwise? And do they even know what proper due diligence is? And would they still be getting their mythical 30%-50% productivity boost if they were actually doing what they claimed they were doing?
And that is a problem. I cannot have a productive code review with someone that does not even understand what their code is actually doing, much less the trade-offs that were made in an implementation (because they did not consider any trade-offs at all and just took what the LLM produced). If they can't have a conversation about the code at all because they didn't bother to read or understand anything about it, then there's nothing I can do except close the PR and tell them to actually do the work this time.
If trust didn't matter, there wouldn't have been a need for the Linux Kernel team to ban the University of Minnesota for attempting to intentionally smuggle bugs through the PR process as part of an unauthorized social experiment. As it stands, if you / your PRs can't be trusted, they should not even be admitted to the review process.
No you don’t. You can’t outsource trust determinations. Especially to the people you claim not to trust!
You make the judgement call by looking at the code and your known history of the contributor.
Nobody cares if contributors use an LLM or a magnetic needle to generate code. They care if bad code gets introduced or bad patches waste reviewers’ time.
That’s exactly the opposite of what the author is saying. He mentions that [if the code is not good, or you are a beginner] he will help you get to the finish line, but if it’s LLM code, he shouldn’t have to put in the effort because there’s no human on the other side.
It makes sense to me.
That's the false equivalence right there
I think you just haven't gotten the hang of it yet, which is fine... the tooling is very immature and hard to get consistent results with. But this isn't a given. Some people do get good, steerable LLM coding setups.
LLMs are trained to be steerable at inference time via context/prompting. Fine tuning is also possible and often used. Both count as "feedback" in my book, and my point is that both can be effective at "changing the LLM" in terms of its behavior at inference time.
The PR effectively ends up being an extremely high-latency conversation with an LLM, via another human who doesn't have the full context/understanding of the problem.
Stop trying to equate LLM-generated code with indexing-based autocomplete. They’re not the same thing at all: LLM-generated code is equivalent to code copied off Stack Overflow, which is also something you’d better not be attempting to fraudulently pass off as your own work.
For example, you either make your contributors attest that their changes are original or that they have the right to contribute their changes—or you assume this of them and consider it implicit in their submission.
What you (probably) don’t do is welcome contributions that the contributors do not have the right to make.
Assuring you didn’t include any AGPLv3 code in your contribution is exactly the same kind of assurance. It also doesn’t provide any provenance.
Conflating assurance with provenance is bogus because the former is about making a representation that, if false, exposes the person making it to liability. For most situations that’s sufficient that provenance isn’t needed.
That's a pretty nice offer from one of the most famous and accomplished free software maintainers in the world. He's promising not to take a short-cut reviewing your PR, in exchange for you not taking a short-cut writing it in the first place.
This “short cut” language suggests that the quality of the submission is going to be objectively worse by way of its provenance.
Yet, can one reliably distinguish working and tested code generated by a person vs a machine? We’re well past passing Turing tests at this point.
IMO when people declare that LLMs "pass" at a particular skill, it's a sign that they don't have the taste or experience to judge that skill themselves. Or - when it's CEOs - they have an interest in devaluing it.
So yes if you're trying to fool an experienced open source maintainer with unrefined LLM-generated code, good luck (especially one who's said he doesn't want that).
Would you like to take the Pepsi challenge? Happy to put random code snippets in front of you and see whether you can accurately determine whether it was written by a human or an LLM.
In an open source project I think you have to start with a baseline assumption of "trust nobody." Exceptions possibly if you know the contributors personally, or have built up trust over years of collaboration.
I wouldn't reject or decline to review a PR just because I don't trust the contributor.
Presumably if a contributor repeatedly made bad PRs that didn't do what they said, introduced bugs, scribbled pointlessly on the codebase, and when you tried to coach or clarify at best they later forgot everything you said and at worst outright gaslit and lied to you about their PRs... you would reject or decline to review their PRs, right? You'd presumably ban them outright.
Well that's exactly what commercial LLM products, with the aid of less sophisticated users, have already done to the maintainers of many large open source projects. It's not that they're not trusted-- they should be distrusted with ample cause.
So what if the above banned contributor kept getting other people to mindlessly submit their work and even proxy communication through -- evading your well earned distrust and bans? Asking people to at least disclose that they were acting on behalf of the distrusted contributor would be the least you would do, I hope? Or even asking them to disclose if and to what extent their work was a collaboration with a distrusted contributor?
- People use AI to write cover letters. If the companies don't filter them out automatically, they're screwed.
- Companies use AI to interview candidates. No one wants to spend their personal time talking to a robot. So the candidates start using AI to take interviews for them.
etc.
If you don't at least tell yourself that you don't allow AI PRs (even just as a white lie) you'll one day use AI to review PRs.
Imagine living before the invention of the printing press, and then lamenting that we should ban them because it makes it "too easy" to distribute information and will enable "low quality" publications to have more reach. Actually, this exact thing happened, but the end result was it massively disrupted the world and economy in extremely positive ways.
Citation needed, I don’t think the printing press and gpt are in any way comparable.
Imagine seeing “rm -rf / is a function that returns ‘Hello World!’” and thinking “this is the same thing as the printing press”.
https://bsky.app/profile/lookitup.baby/post/3lu2bpbupqc2f
In some cases, sure, but it can also create the situation where people just waste time for nothing (think AI interviewing other AIs - this might generate GDP from people purchasing those services, but I think we can all agree that this scenario is just wasting time and resources without improving society).
I can generate 1,000 PRs today against an open source project using AI. I think you do care; you are only thinking about the happy path where someone uses a little AI to draft a well-constructed PR.
There are a lot of ways AI can be used to quickly overwhelm a project maintainer.
Then perhaps the way you contribute, review, and accept code is fundamentally wrong and needs to change with the times.
It may be that technologies like Github PRs and other VCS patterns are literally obsolete. We've done this before throughout many cycles of technology, and these are the questions we need to ask ourselves as engineers, not stick our heads in the sand and pretend it's 2019.
Before PR's existed we passed around code changes via email. Before containers we installed software on bare metal servers. And before search engines we used message boards. It's not unfathomable that the whole idea of how we contribute and collaborate changes as well. Actually that is likely going to be the /least/ shocking thing in the next few years if acceleration happens (i.e. The entire OS is an LLM that renders pixels, for example)
If it's exactly the same as what you'd have written manually, and you are confident it works, then what's the point of disclosure?
An LLM is regurgitating things from outside that space, where you have no idea of the provenance of what it’s putting into your code.
It doesn’t just matter that the code you’re contributing to a project is correct, it matters quite a lot if it’s actually something you’re allowed to contribute.
- You can’t contribute code that your employer owns to a project if they don’t want you to.
- You can’t contribute code under a license that the project doesn’t want you to use.
- And you can’t contribute code written by someone else and claim it’s your intellectual property without some sort of contract in place to grant that.
If you use an LLM to generate code that you’re contributing, you have both of the latter two problems. And all of those apply *even if* the code you’re contributing is identical to what you’d have written by hand off the top of your head.
When you contribute to a project, you’re not just sending that project a set of bits, you’re making attestations about how those bits were created.
Why does this seem so difficult for some supposed tech professionals to understand? The entire industry is intellectual property, and this is basic “IP 101” stuff.
Maybe because 99% of people that complain about this complain about problems that never occur in 99% of the cases they cite. My employer isn’t going to give a shit that code I’ve written for their internal CRUD app gets more or less directly copied into my own projects. There’s only one way to do that; it was already in my head before I wrote it for them, and it’ll still be there after. As long as I’m not directly competing with their interests, what the hell do they care.
> When you contribute to a project, you’re not just sending that project a set of bits, you’re making attestations about how those bits were created.
You are really not. You are only doing that if the project requires some attestation of provenance. I can tell you that none of mine do.
If you want me to put in the effort, you have to put it in first.
Especially considering in 99% of cases even the one who generated it didn’t fully read/understand it.
If you don't have to disclose:
- books
- search engines
- stack overflow
- talking to a coworker
then it's not clear why you would have to disclose talking to an AI.
Generally speaking, when someone uses the word "slop" when talking about AI it's a signal to me that they've been sucked into a culture war and to discount what they say about AI.
It's of course the maintainer's right to take part in a culture war, but it's a useful way to filter out who's paying attention vs who's playing for a team. Like when you meet someone at a party and they bring up some politician you've barely heard of but who their team has vilified.
It’s explained right there in the PR:
> The disclosure is to help maintainers assess how much attention to give a PR. While we aren't obligated to in any way, I try to assist inexperienced contributors and coach them to the finish line, because getting a PR accepted is an achievement to be proud of. But if it's just an AI on the other side, I don't need to put in this effort, and it's rude to trick me into doing so.
That is not true of books, search engines, stack overflow, or talking to a coworker, because in all those cases you still had to do the work of comprehending, preparing, and submitting the patch yourself. This is also why they ask for a disclosure of “the extent to which AI assistance was used”. What about that isn’t clear to you?
So fail to disclose at your own peril.
Whether it's prose or code, when informed something is entirely or partially AI generated, it completely changes the way I read it. I have to question every part of it now, no matter how intuitive or "no one could get this wrong"ish it might seem. And when I do, I usually find a multitude of minor or major problems. Doesn't matter how "state of the art" the LLM that shat it out was. They're still there. The only thing that ever changed in my experience is that problems become trickier to spot. Because these things are bullshit generators. All they're getting better at is disguising the bullshit.
I'm sure I'll get lots of responses trying to nitpick my comment apart. "You're holding it wrong", bla bla bla. I really don't care anymore. Don't waste your time. I won't engage with any of it.
I used to think it was undeserved that we programmers called ourselves "engineers" and "architects" even before LLMs. At this point, it's completely farcical.
"Gee, why would I volunteer that my work came from a bullshit generator? How is that relevant to anything?" What a world.
Programming languages were a nice abstraction to accommodate our inability to comprehend complexity - current day LLMs do not have the same limitations as us.
The uncomfortable part will be what happens to PRs and other human-in-the-loop checks. It’s worthwhile to consider that not too far into the future, we might not be debugging code anymore - we’ll be debugging the AI itself. That’s a whole different problem space that will need an entirely new class of solutions and tools.
Natural language can be specific, but it requires far too many words. `map (+ 1) xs` is far shorter to write than "return a list of elements by applying a function that adds one to its argument to each element of xs and collecting the results in a separate list", or similar.
I believe it won’t be long before we have exceptional “programmers” who have mastered the art of vibe coding. If that does become the de facto standard for 80% of programming, then it’s not a long stretch from there to skipping programming languages altogether. I’m simply suggesting that if you’re not going to examine the code, perhaps someone will eliminate that additional layer or step altogether, and we might be pleasantly surprised by the final result.
--
[1] https://www.copyright.gov/ai/
[2] https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...
> • Copyright protects the original expression in a work created by a human author, even if the work also includes AI-generated material
> • Human authors are entitled to copyright in their works of authorship that are perceptible in AI-generated outputs, as well as the creative selection, coordination, or arrangement of material in the outputs, or creative modifications of the outputs.
"In the Office’s view, it is well-established that copyright can protect only material that is the product of human creativity. Most fundamentally, the term “author,” which is used in both the Constitution and the Copyright Act, excludes non-humans." "In the case of works containing AI-generated material, the Office will consider whether the AI contributions are the result of “mechanical reproduction” or instead of an author’s “own original mental conception, to which [the author] gave visible form.” 24 The answer will depend on the circumstances, particularly how the AI tool operates and how it was used to create the final work.25 This is necessarily a case-by-case inquiry." "If a work’s traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it."
The office has been quite consistent that works containing both human-made and AI-made elements will be registerable only to the extent that they contain human-made elements.
The source you linked says the opposite of that: "the inclusion of elements of AI-generated content in a larger human-authored work does not affect the copyrightability of the larger human-authored work as a whole"
That is, it suggests that even if there are elements of human-generated content in a larger machine-generated work, the combined work as a whole is not eligible for copyright protection. Printed page iii of that PDF talks a bit more about that:
Just to be sure that I wasn't misremembering, I went through part 2 of the report and back to the original memorandum[1] that was sent out before the full report was issued. I've included a few choice quotes to illustrate my point:
"These are no longer hypothetical questions, as the Office is already receiving and examining applications for registration that claim copyright in AI-generated material. For example, in 2018 the Office received an application for a visual work that the applicant described as “autonomously created by a computer algorithm running on a machine.” 7 The application was denied because, based on the applicant’s representations in the application, the examiner found that the work contained no human authorship. After a series of administrative appeals, the Office’s Review Board issued a final determination affirming that the work could not be registered because it was made “without any creative contribution from a human actor.”"
"More recently, the Office reviewed a registration for a work containing human-authored elements combined with AI-generated images. In February 2023, the Office concluded that a graphic novel comprised of human-authored text combined with images generated by the AI service Midjourney constituted a copyrightable work, but that the individual images themselves could not be protected by copyright. "
"In the Office’s view, it is well-established that copyright can protect only material that is the product of human creativity. Most fundamentally, the term “author,” which is used in both the Constitution and the Copyright Act, excludes non-humans."
"In the case of works containing AI-generated material, the Office will consider whether the AI contributions are the result of “mechanical reproduction” or instead of an author’s “own original mental conception, to which [the author] gave visible form.” The answer will depend on the circumstances, particularly how the AI tool operates and how it was used to create the final work. This is necessarily a case-by-case inquiry."
"If a work’s traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it."[1], pgs 2-4
---
On the odd chance that somehow the Copyright Office had reversed itself I then went back to part 2 of the report:
"As the Office affirmed in the Guidance, copyright protection in the United States requires human authorship. This foundational principle is based on the Copyright Clause in the Constitution and the language of the Copyright Act as interpreted by the courts. The Copyright Clause grants Congress the authority to “secur[e] for limited times to authors . . . the exclusive right to their . . . writings.” As the Supreme Court has explained, “the author [of a copyrighted work] is . . . the person who translates an idea into a fixed, tangible expression entitled to copyright protection.”
"No court has recognized copyright in material created by non-humans, and those that have spoken on this issue have rejected the possibility. "
"In most cases, however, humans will be involved in the creation process, and the work will be copyrightable to the extent that their contributions qualify as authorship." -- [2], pgs 15-16
---
TL;DR If you make something with the assistance of AI, you still have to be personally involved and contribute more than just a prompt in order to receive copyright, and then you will receive protection only over such elements of originality and authorship that you are responsible for, not those elements which the AI is responsible for.
---
[1] https://copyright.gov/ai/ai_policy_guidance.pdf
[2] https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...
or say "fork you."
If, in the dystopian future, a court you're subject to decides that Claude was trained on Oracle's code and all Claude users are possibly in breach of copyright, it's easier to nuke from orbit all disclosed AI contributions.
On the one hand, it's lowered the barrier to entry for certain types of contributions. But on the other hand getting a vibe-coded 1k LOC diff from someone that has absolutely no idea how the project even works is a serious problem because the iteration cycle of getting feedback + correctly implementing it is far worse in this case.
Also, the types of errors introduced tend to be quite different between humans and AI tools.
It's a small ask but a useful one to disclose how AI was used.
That said, requiring adequate disclosure of AI is just fair. It also suggests that the other side is willing to accept AI-supported contributions (without being willing to review endless AI slop that they could have generated themselves if they had the time to read it).
I would expect such a maintainer to respond fairly to "I first vibecoded it. I then made manual changes, vibecoded a test, cursorily reviewed the code, checked that the tests provide good coverage, ran both existing and new tests, and manually tested the code."
That fair response might be a thorough review, or a request that I do the thorough review before they put in the time, but I'd expect it to be more than a blatant "nope, AI touched this, go away".
https://www.jetbrains.com/help/idea/full-line-code-completio...
Do I need to disclose that I wrote a script to generate some annoying boilerplate? Or that my IDE automatically templates for loops?
Edit: Also, it's always good to provide maximal context to reviewers. For example, when I use code from StackOverflow I link the relevant answer in a comment so the reviewer doesn't have to re-tread the same ground I covered looking for that solution. It also gives reviewers some clues about my understanding of the problem. How is AI different in this regard?
Yes, you have to disclose it.
> Do I need to disclose that I wrote a script to generate some annoying boilerplate?
You absolutely need to disclose it.
> Or that my IDE automatically templates for loops?
That's probably worth disclosing too.
Fraud and misrepresentation are always options for contributors, at some point one needs to trust that they’re adhering to the rules that they agreed to adhere to.
What you’re saying is essentially the code equivalent of “I found this image via Google search so of course it’s OK to put into a presentation, it’s on the web so that means I can use it.” This may not be looked at too hard for an investor presentation, but if you’re doing a high profile event like Apple’s WWDC you’ll learn quickly that all assets require clearance and “I found it on the web” won’t cut it—you’ll be made to use a different image or, if you actually present with the unlicensed image, you could be disciplined or outright fired for causing the company liability.
It’s amazing how many people in this industry think it’s OK to just wing this shit and even commit outright fraud just because it’s convenient.
You can talk about how we should act and be all high and mighty all you like, but it’s just burying your head in the sand about the reality of how code is written.
Also, technically, I never said this made it perfectly ok. It’s just that it’s the reality we live in and if we got rid of everyone doing it we’d have to fire 99% of programmers.
Look around. Do you see the majority of programmers getting fired for copying a line from stackoverflow or using AI?
You must either work in an ultra high security area or are so removed from the groundwork of most programming jobs that you don’t know how people do anything anymore. I’m not surprised you mentioned 30+ years, because that likely puts you squarely out of the trenches where the development is actually done.
Outside of, like, the military or airplane software, companies really don’t care about provenance most of the time; their lack of processes for looking into any of that is absolute PROOF of that. It’s don’t-ask-don’t-tell out there.
You can be delusional all you like, it doesn’t change the reality of how most development is done.
Again, I didn’t say it’s a good thing, it’s just that it is reality.
Make a knowledgeable reply and give no reference to the AI you used - the comment is celebrated.
We are already barreling full speed down the "hide your AI use" path.
If the PR has issues and requires more than superficial re-work to be acceptable, the authors don't want to spend time debugging code spit out by an AI tool. They're more willing to spend a cycle or two if the benefit is you learning (either generally as a dev or becoming more familiar with the project). If you can make clear that you created or understand the code end to end, then they're more likely to be willing to take these extra steps.
Seems pretty straightforward to me and thoughtful by the maintainers here.
If that were the case, why would this rule be necessary, if it indeed is the substance that matters? AI generated anything has a heavy slop stigma right now, even if the content is solid.
It would make for an interesting experiment to submit a PR that was absolute gold but with the disclaimer that it was generated with the help of ChatGPT. I would almost guarantee it would be received with skepticism and dismissals.
If you make a PR where you just used AI, it seems to work, but didn't go further then the maintainers can go "well I had a look, it looks bad, you didn't put effort in, I'm not going to coach you through this". But if you make a PR where you go "I used AI to learn about X then tried to implement X myself with AI writing some of it" then the maintainers can go "well this PR doesn't look good quality but looks like you tried, we can give some good feedback but still reject it".
In a world without AI, if they were getting a lot of PRs from people who obviously didn't spend any time on their PRs then maybe they would have a "tell us how long this change took you" disclosure as well.
> While we aren't obligated to in any way, I try to assist inexperienced contributors and coach them to the finish line, because getting a PR accepted is an achievement to be proud of. But if it's just an AI on the other side, I don't need to put in this effort, and it's rude to trick me into doing so.
If it's bad code from a person he'll help them get it fixed. If it's bad code from an AI why bother?
Why are you surprised? Do companies want to hire "honest" people whose CVs were written by some LLM?
> Do companies want to hire "honest" people whose CVs were written by some LLM?
Unfortunately yes, they very much seem to. Since many are using LLMs to assess CVs, those which use LLMs to help write their CV have a measured advantage.
Yes, some companies do want to hire such people, the justification given is something along the lines of "we need devs who are using the latest tools/up to date on the latest trends! They will help bring in those techniques and make all of our current devs more productive!". This isn't a bad set of motivations or assumptions IMO.
Setting aside what companies _want_, they almost certainly are already hiring devs with llm-edited CVs, whether they want it or not. Such CVs/resumes are more likely to make it through HR filters.
# if this PR looks like it was produced with AI, be snotty and dismissive when you write the reply to this PR
NB: This posting was created with Natural Intelligence (not necessarily very intelligent Intelligence). ;-)
If someone came to you and said "good news: I memorized the code of all the open source projects in this space, and can regurgitate it on command", you would be smart to ban them from working on code at your company.
But with "AI", we make up a bunch of rationalizations. ("I'm doing AI agentic generative AI workflow boilerplate 10x gettin it done AI did I say AI yet!")
And we pretend the person never said that they're just loosely laundering GPL and other code in a way that rightly would be existentially toxic to an IP-based company.
Some of the AI policy statements I have seen come across more as ideology statements. This is much better, saying the reasons for the requirement and offering a path forward. I'd like to see more of this and less "No droids allowed"
301 more comments available on Hacker News