The FSF Considers Large Language Models
Mood: controversial
Sentiment: mixed
Category: other
Key topics: The FSF is considering the implications of large language models (LLMs) on free software, sparking debate about copyright, licensing, and the ethics of AI-generated code.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 1h after posting
Peak period: Day 1 (74 comments)
Avg / period: 39 comments
Based on 78 loaded comments
Key moments
- Story posted: Oct 26, 2025 at 9:38 AM EDT (about 1 month ago)
- First comment: Oct 26, 2025 at 10:50 AM EDT (1h after posting)
- Peak activity: 74 comments in Day 1, the hottest window of the conversation
- Latest activity: Oct 27, 2025 at 4:38 PM EDT (about 1 month ago)
Well yes, LLMs like Claude Code are merely a "copyright violation as a service". Everyone is so focused on the next new "AI" feature but we haven't actually resolved the issue of all model providers using stolen code to train their models and their lack of transparency on sourced training data.
The bigger issue (spiritually, anyway) seems to be the need to develop free software LLM tools the same way the FSF needed to develop free compilers; their absence is what's going to keep users from being able to adapt and control their machines. The issue is also ecological: programmers equipped with LLMs are likely much more productive at creating and modifying code.
Some of the rest seems more like saying that anyone who studies GCC internals is forever tainted and must write copyleft code for life, which seems laughable to me. Again, this is more a matter of plagiarism than of copyright; the two are fairly similar but actually different, and not as clear cut.
You’re right, in the context of a technical legal interpretation they’re different. In the context of right or wrong, they amount to the same.
> anyone who studies GCC internals
LLMs are not a someone; they're more like … the indigo printout of some text or design that you then use to make a scrapbook to be mass-produced for profit. Very different situation.
When the AI bubble pops, I hope we will have some equalisation back to something more ethical.
This is in line with my disagreement over the fair use rulings. Most people who published works that have been used to train AI systems, created those works and published them for other people to consume and benefit from, not for proprietary software systems to consume and benefit from. The existing licenses and laws did not account for this; nobody was anticipating it.
But I also disagree in general about LLMs. LLMs are statistical text models, but the general concept, what if there were an "AI" that wasn't an LLM and was trained on open source software, is the same at the end of the day. I think whether or not LLMs are intelligent or equivalent to humans is a red herring. There's no reason not to consider the implications of machines that are indistinguishable from, or even superior to, human programmers. Particularly if we're discussing ethics, getting lost in implementation details seems like a distraction; otherwise all the derived ethics gets thrown out after the next innovation.
As a society, we don't benefit from copyright maximalism, despite how trendy it is around here all of a sudden. See also Oracle v. Google.
[1] https://www.reddit.com/r/programming/comments/oc9qj1/copilot...
Nobody benefits from a law that says that LLMs can't regurgitate the Quake sqrt() approximation. If that's what the law actually says, which it isn't.
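For anyone who hasn't seen it, the snippet in question is the famous fast inverse square root from Quake III Arena. A minimal reconstruction in C (the original punned types through a pointer cast, which is undefined behavior in modern C; a union is used here so the sketch stays well-defined):

```c
#include <stdint.h>

/* Fast inverse square root, as popularized by Quake III Arena.
   Approximates 1/sqrt(number) without calling sqrt(). */
float Q_rsqrt(float number)
{
    union { float f; uint32_t i; } conv = { .f = number };
    conv.i = 0x5f3759df - (conv.i >> 1);                /* magic-constant initial guess */
    conv.f *= 1.5f - (number * 0.5f * conv.f * conv.f); /* one Newton-Raphson refinement */
    return conv.f;
}
```

Those dozen lines have been reproduced verbatim across the internet for decades, which is what makes them such a useful test case for any theory of what LLMs may or may not regurgitate.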
I just think it's especially asinine how corporations are perfectly willing to launder copyrighted works via LLMs when it's profitable to do so. We have to perpetually pay them for their works and if we break their little software locks it's felony contempt of business model, but they get to train their AIs on our works and reproduce them infinitely and with total impunity without paying us a cent.
It's that "rules for thee but not for me" nonsense that makes me reach such extreme logical conclusions that I feel empathy for terrorists.
> copyright shouldn't even exist to begin with.
You then get trade secrets and guilds. Hardly an improvement.
Absolutely an improvement. Information wants to be free. Stop criminalizing it and people will find a way to free it. And once it's out there it's over; there is no containing it.
Even when the idea of a thing is "out there" there is a lot of grunt work and special stuff that needs to be implemented to get the best outcomes. Nobody owes you that work for free. Regardless of what GPL copers say, it is very hard to make money with software without enforcing some access restrictions and IP. Open source is great when it works, but it does not work for most things nor is it at the leading edge for most things.
Copyright is a functionally perpetual state granted monopoly on information, on numbers. A business model that depends on such a delusion should not even exist to begin with.
Saying that intellectual property is founded on a delusion because someone can infringe on it or it's mere numbers is like saying your physical property rights are a delusion because the stuff is mere matter and can be manipulated by anyone. While true, such statements add absolutely nothing to the conversation. If people are willing to copy from others rather than make their own creations, there is value in the original works that is a direct result of someone's labor, not some immutable law of the universe.
>We have to perpetually pay them for their works and if we break their little software locks it's felony contempt of business model
You don't have to pay them, or break their restrictions.
>but they get to train their AIs on our works and reproduce them infinitely and with total impunity without paying us a cent.
You don't need to allow this either. Unfortunately open-source code is necessarily public.
>It's that "rules for thee but not for me" nonsense that makes me reach such extreme logical conclusions that I feel empathy for terrorists.
The way LLMs use code is fundamentally different from wholesale copying. If someone read your code and paraphrased it and tweaked it, it would be a completely new work not subject to the original copyright. At least it would be really hard to get a court to regard it as an infringement. This is like what LLMs do.
How is it contradictory? Tell it to the corporations who defend copyright for you and public domain fair use for themselves. If they were honest, they'd abolish copyright straight up instead of creating this idiotic caste system.
> Copyright shouldn't exist, but the businesses infringing on it are the bad ones?
Yes. Copyright shouldn't exist to begin with, but since it does, one would expect the corporations to work within the legal framework they themselves created and lobbied so heavily for. One would expect them to reap the consequences of their actions and be bound by the exact same limitations they seek to impose on us.
It is absolutely asinine to watch them make trillions of dollars by breaking their own rules while simultaneously pretending that nothing is happening and insisting that you, mortal citizen, must still abide by the same rules they are breaking.
The sheer dishonesty of it makes me sick to my core.
> If someone read your code and paraphrased it and tweaked it, it would be a completely new work not subject to the original copyright.
Derivative work.
I was once told that corporate programmers are warned by legal not to even read AGPLv3 source code, lest it subconsciously infect their thought processes and the final result. This is also the reason we have clean room reverse engineering where one team produces documentation and another uses it to reimplement the thing. Isolating minds from the copyrighted inputs is the whole point of it. All of this is risk management meant to disallow even the mere possibility that a derivative work was created in the process.
There is absolutely no reason to believe LLMs are any different. They are literally trained on copyrighted inputs. Either they're violating copyrights or we're being oppressed by these copyright monopolists who say we can't do stuff we should be able to do. Both cannot be true at the same time.
> At least it would be really hard to get a court to regard it as an infringement.
It's extremely hard to get a court to do anything. As in tens of thousands if not hundreds of thousands of dollars difficult. Nothing is decided until actual judges start deciding things, and to get to that point you need to actually go through the legal system, and to do that you need to pay expensive lawyers lots of money. It's the reason people instantly fold the second legal action is threatened, doesn't matter if they're right. Corporations have money to burn, we don't.
And that's assuming that courts are presided over by honest human beings who believe in law and reason, instead of politically activist judges or straight-up corrupt judges who can be lobbied by industry.
There are different views out there about this. If you literally just copy a piece of code and make stupid changes, it might be a derivative work. But this is not guaranteed. There are times when there is one idiomatic way to do a thing, so your code will necessarily be similar to other code in the world. That type of code is not copyrightable, even if it appears in a larger work that is copyrightable. A small amount of bog standard code resembling something in another project is not in and of itself evidence of infringement.
Corporations would rather not have to deal with unnecessarily similar code or deliberate copyright or patent infringement. So they generally tell you not to look at anything else.
You should read over the criteria for something to be copyrightable: https://guides.lib.umich.edu/copyrightbasics/copyrightabilit...
The biggest issue is that any individual part of a copyrighted work may not be copyrightable. If you dissect a large copyrighted work, it probably contains many uncopyrightable structures. For example, in a book, most phrases and grammatical structures are not copyrightable. The style is not copyrightable. In code, common boilerplate is probably not copyrightable. Please don't make your stuff bizarre to add to its originality though. We need to be able to read and understand it lol.
>There is absolutely no reason to believe LLMs are any different. They are literally trained on copyrighted inputs. Either they're violating copyrights or we're being oppressed by these copyright monopolists who say we can't do stuff we should be able to do. Both cannot be true at the same time.
People who learn to program by reading open-source code are also trained on copyrighted inputs. Copyright may need some rethinking to cope with the reality of LLMs. Unless it is proven that AI is spitting out unique and copyrightable blocks of code from other projects, it really isn't infringing. People do this type of shit all the time. Have you ever looked at Stack Overflow and copied a couple of lines from it? You probably infringed on someone's copyright.
Ultimately, even if you prove copyright infringement happened, you have basically no recourse unless you also prove damages. Since open-source code is public and given away for free, the only possible damage is generally in being deprived of contributions that might have resulted from direct usage of your code. But direct integration of your entire project might have been highly unlikely anyway. Like it or not, people can be inspired by your work in a way that can't be proven without their direct confession.
Most countries don't have a concept of fair use, but they nearly all have copyright law.
That's all I've been saying from the start, and your pedantry isn't helpful. It doesn't make you look smart. You haven't added any value or illumination to the conversation.
If not, there are plenty of competitors that will.
Don't worry, the orange one is fixing that problem too.
Are there any rulings about the use of code generated by a model trained on copyrighted code?
I believe the distinction is clear.
Exactly - that is what copyright was set up for (whereas before copyright, people could copy freely), and then copyleft comes and says that everyone can use and modify the code, and the only thing that the license prohibits is the ability of authors of derived works to apply their own copyright. In this way (known as being "viral"), copyleft uses the legal mechanism of copyright to essentially bring things to how they were before copyright.
"Many years ago, he said, photographs were not generally seen as being copyrightable. That changed over time as people figured out what could be done with that technology and the creativity it enabled. Photography may be a good analogy for LLMs, he suggested."
I have zero trust in the FSF since they backstabbed Stallman.
EDIT: Criticizing anything from LWN, be it Debian, Linux or FSF related, results in instant downvotes. LWN is not a critical publication and just lionizes whoever has a title and bloviates on a mailing list or at a conference.
The controversial line might have also been that one.
Are there any protests or demands for the cancellation of Trump, Clinton, Wexner, Black, Barak?
I have not seen any. The cancel tech people only go after those who they perceive as weak.
No, the reason why this "second cancellation" is vague is because it was the typical feeding frenzy that happens after a successful cancellation, where people hop on to paint previously uninteresting slanders in a new light. Stallman, before saying something goofy about Epstein, was constantly slandered by people who hated what he stood for and by people that were jealous of him. After he said the goofy thing, they all piled in to say "you should have listened to me." The "second cancellation" is when "he asked me out once at a conference" becomes redolent of sexual assault.
None of them seem to like the politics of Free Software, either. They attempt to smear the entire philosophy with the false taint of Stallman's remark that sleeping with older teenagers who seemed to be consenting isn't the worst crime in the world. The people who attacked him for that would defend any number of intimately Epstein-related people to the death; the goal, imo, was to break Free Software, or to take it over and steer it into a perversion of itself. Every one of them was the "it's not fair to say that about Apple" type.
It was actually a few years later, prompted by Richard Stallman's reinstatement by the board. I don't know what you mean by "feeding frenzy", but I habitually ignore the unreasonable voices in such cases: it's safe to assume I'm not talking about those.
> "he asked me out once at a conference"
That wasn't the main focus of the criticism I saw. However, there is an important difference between an attendee asking someone out at a conference, and an invited speaker (or organiser) asking someone out at a conference. If you're going to be in a leadership position, you need to be aware of power dynamics.
That's a running theme throughout all of the criticism of Richard Stallman, if you choose to abstract it that way: for all he's written on the subject, he doesn't understand power dynamics in social interactions. He's fully capable of understanding it, but I think he prefers the simpler idea of (right-)libertarian freedom. (And by assuming he expects others to believe he'll behave according to his respect of the (right-)libertarian freedom of others, you can paint a very sympathetic picture of the man. That doesn't mean he should be in a leadership position for an organisation as important as the FSF, behaving as he does.)
> None of them seem to like the politics of Free Software, either.
Several of them are involved in other Free Software projects. To the extent those people have criticisms of the politics of Free Software, it's that it doesn't go far enough to protect user freedoms. (I suspect I shouldn't have got involved in this argument, since I'm clearly missing context you take for granted.)
So one side of social messaging is "Don't bother trying to look for a date if you're not a CEO, worth millions, have a home, an education, a plan, a yacht and a summer home",
and the other side is
"If you're powerful you'd better know that any kind of question needs to be re-framed with the concept of a power dynamic involvement, and that if you're sufficiently powerful there is essentially no way to pursue a relationship with a lesser mortal without essentially raping them through the power dynamics of the question itself and the un-deniability of a question asked by such a powerful God."
... and you say birth rates are declining precipitously?
Pretty ridiculous. It used to be that we used conventions as the one and only time to flatten the social hierarchy -- it was the one moment where you could talk and have a slice of pizza with a billionaire CEO or actor or whatever.
Re-substantiating the classism within conventions just pushes them further into corporate product marketing and employment fairs; in other words, it turns them into shit no one wants to attend without being paid to sit in a booth.
But all of that isn't the problem: the problem lies with personal sovereignty.
If someone doesn't want to do something, they say no. If they receive retribution because of that no, we then investigate the retribution, and as a society we turn the ne'er-do-well into a social pariah until they have better behavior.
There is a major problem when we as a society have decided "No, the problem is with the underlying pressure of what a no 'may mean' for their future." 'May' being the operative word.
We have turned this into a witch-hunt, but for maybe-witches, or those who may turn into witches, without any real evidence of witchcraft prompting the chase.
'Power dynamics' is shorthand for "I was afraid I'd be fired if I denied Stallman." Did anything resembling this ever occur?
> If someone doesn't want to do something, they say no. If they receive retribution because of that no, we then investigate the retribution, and as a society we turn the ne'er-do-well into a social pariah until they have better behavior.
This only works if we have accountability. You can't have accountability if there's no evidence that a conversation took place, and if decisions aren't made in open and transparent ways: you can't classify things as "retribution" or "not retribution" without… witch hunts. Oh. So it doesn't solve the witch-hunt problem. (Wearing a body-cam everywhere would, but that kind of mass surveillance has its own problems.)
"Turn the ne'er-do-well into a social pariah" doesn't help the victim of retribution.
If the (alleged) ne'er-do-well has a strong enough support network, no force on earth will turn them into a social pariah, so this becomes an exercise in eroding political support, and… oh. That's also a procedure decoupled from justice.
This is not a simple topic, and it does not have simple solutions. Many of the issues you've identified (such as selective enforcement) are issues, but that doesn't mean your proposed solutions actually work.
> "I was afraid i'd be fired if I denied Stallman." ; did anything resembling this ever occur?
Edit: while waiting for the rate limit to expire, I found some claims of Paul Fisher, quoted in the "Stallman Report" https://stallman-report.org/:
> RMS would often throw tantrums and threaten to fire employees for perceived infractions. FSF staff had to show up to work each day, not knowing if RMS had eliminated their position the night before.
This conflicts with my understanding of Richard Stallman's views and behaviour. I'll have to look into this further. I've left my original answer below.
---
I vaguely recall a time he tried to remove authority from someone, in favour of a packed committee, because he disagreed with a technical decision they made. (It didn't really work, because the committee either had no opinion, or agreed with the former authority figure about that technical decision.) Can't find a reference though.
But in this kind of context, I'm not aware of Richard Stallman ever personally retaliating against someone for saying no to him. I don't imagine he'd approve of such behaviour, and he's principled enough that I doubt he'd ever do it. (There are a few anecdotes set in MIT about pressures from other people, but these are not directly Richard Stallman's fault, so I think it's unfair to blame him for them.)
This isn't really the point, though. A community leader should be aware of "people stuff" like this, and act to mitigate it. If he doesn't want the responsibility, he shouldn't have the power. By all accounts, he doesn't want the responsibility.
"Power" should not be confused with "prestige". If an attendee can ensure the speaker's disinvitation from future events by their complaint, they have plenty of power themselves.
But there does seem to be a lack of uniformity about cancellation. For example, many people seem totally unconcerned by the stuff Bernie Sanders wrote about toddler sexuality, but the same writings from RMS would almost certainly be considered deeply problematic. My take is that both are problematic.
In other cases, sometimes I'll hear about problematic behavior that is on the order of "this awkward person asked me out," where the same behavior by a more attractive and charming person would often be welcome. It seems like the standard of acceptable behavior probably shouldn't depend on whether someone is attractive or charming.
So IMO, the current approach depends a lot on social dynamics and both misses problematic behavior of popular people and also overly restricts normal behavior of unpopular people.
I think we need a more rules-based approach, and my guess is that's where we'll eventually settle. Arguably, things seem to have trended that way already.
It does not. Before asking someone out, there is also nonverbal communication happening, like eye contact. Ignoring eye contact and all the little signs and then asking out of the blue can be awkward, but it is not problematic in a bad way, unless it was asked in a sexually charged way.
The standard of acceptable behavior is reading all the little signs before asking for more. But asking someone out in a non-sexually-charged way is never problematic, even though it might result in comments.
The thing that defines awkward behavior is being unable to read those signs. Conversely what makes someone charming is that they're better than average at reading those signs.
So the standard as you've stated it, if we take it literally, seems to reiterate the idea that some behavior from charming people is acceptable but the same behavior from awkward people is not. But it also sounds like that is not what you intend; is that correct?
It's also unclear how someone would learn to read those signs if the standard for acceptable behavior is already being able to read them. It seems like you'd need to build in the idea that learning a skill requires some failure trials.
No. Not at all. Looking someone in the eye first and evaluating their reaction before making closer contact has nothing to do with being charming or attractive. That is part of normal human communication.
So if someone really cannot do it because they are autistic ... then it is up to them to communicate their inability to communicate like normal persons. Those are exceptions.
But there is also a big group of people who use autism as an excuse to be an asshole, in the sense of not really caring what reaction the other person gives, and then complaining that other people feel molested (I had this exact conversation with someone who thought it was appropriate to ask unknown women directly for sex, without ever daring to look them in the eye first).
Deb Nicholson, PSF "Executive Director", won an FSF award in 2018, handed to her by Stallman himself. Note that at that time at least one of Stallman's embarrassing blog posts was absolutely already known:
https://www.fsf.org/news/openstreetmap-and-deborah-nicholson...
In 2021 Deb Nicholson then worked to cancel Stallman:
https://rms-open-letter.github.io/
In 2025, Deb Nicholson's PSF takes money from all the new Trump allies, including those, like Google and Microsoft, that finance the ballroom and the destruction of the historic East Wing. Will Deb Nicholson sign a cancellation petition for the above-named figures?
They could also have done research in 2018 before accepting the award, which is standard procedure for politicians etc. But of course they wanted the award for their career.
https://web.archive.org/web/20200522063800/https://2020.copy...
The whole software foundation industry is sketchy.
They have been entirely silent since January 2025.
I've always been in favor of the GPLs being pushed as proprietary, restrictive licenses, and being as aggressive in enforcement as any other restrictive license. GPL'd software is public property. The association with Open Source, "Creative Commons" and "Public Domain" code is nothing but a handicap; proprietary code can take advantage of all permissively licensed code without pretending that it shares anything in terms of philosophy, and without sharing back unless it finds it strategically advantageous.
> They are not working on a new license and Siewicz is already low-key pushing in favor of LLMs
I just have no idea what I would put in a new license, or what it means to be "in favor" of LLMs. Are Free Software supporters just supposed to not use them, ever? Even if they're only trained on permissively licensed code? Do you think that it means that people are pushing to allow LLMs to train on GPL-licensed software?
I just don't understand what you're trying to say. I also have zero trust in the FSF over Stallman, simply because I don't hear people who speak like Stallman at the FSF; i.e., I think his vision was pushed out along with his voice. But I do not understand what you're getting at.
I don't see any sense of urgency in the reported discussion or any will to fight against large corporations. The quoted parts in the article do not seem very prepared, there are a lot of maybes, no clear stance and no overarching vision that LLMs must be fought for software freedom.
I have a feeling the people who write these haven't really used LLMs for programming, because even just playing around with them will make it obvious that this makes no sense, especially if you use something local that lets you rewrite the discussion at will, including any code the LLM generated. For example, sometimes when trying to get Devstral to make something for me, I let it generate whatever (sometimes buggy or non-working) code it comes up with[0], and then I edit its response to fix the bug, so that further instructions proceed under the assumption it generated the correct code from the get-go, instead of trying to convince it[0] to fix the code it generated. In such a scenario there is no clear separation between LLM-generated code and manually written code, nor any specific "prompt" (unless you count every snapshot of the entire discussion, each time you hit the "submit" button, as a series of prompts; technically that is what the LLM uses as a prompt, rather than what the user types, but I doubt this is what the author had in mind).
And all that is without taking into account what someone commented in the article about code not even being done in a single session, but with plans, restarts from scratch, summarizing, etc. (and there are tools to automate these too, which can use a variety of prompts of their own that the end user isn't even aware of).
TBH I think that if the FSF wants to "consider LLMs" they should begin by gaining some real experience using them first, and by bringing people with such experience on board to explain things to them.
[0] I do not like anthropomorphizing LLMs, but I cannot think of another description for that :-P
Hal and Dave work together. Hal is going home at 6:00 PM, but before it's time to leave, Dave tells Hal to go ahead and start working on some new feature. At 5:50 PM, Hal hits Cmd+Q, saving whatever unfinished work there is, no matter what state it's in, and commits it to a new development branch with the commit message "Start on $X" followed by a copy of the explanation Dave first gave Hal about what they needed to do. Then Hal pushes that commit upstream for Dave and leaves. At 6:00 PM Dave, still at the office, runs git-pull, spends a little time fixing up several issues with the code Hal wrote, then commits the result and pushes it to the development branch of the shared repo. Dave's changes mainly focus on getting the project to build again and making sure some or all of the existing tests pass. Dave then writes an email to Hal about this progress. At 8:30 PM Hal reads Dave's email about what Dave fixed and what Hal should do now. Hal then runs git-pull and writes some more code, pushing the result to the development branch before watching a movie and going to bed. Around midnight, Dave runs git-pull, fixes some more problems with the code that Hal wrote, and then pushes that to the repo. The next day at the office, they resume their work together following this pattern, where Hal writes the bulk of the code followed by Dave fixing it up and/or providing instructions for Hal about how to proceed. When they're done, one of them switches to the main branch with `git checkout main` and runs `git merge $OUR_DEVELOPMENT_BRANCH_NAME`.
Which part of this entails "destructive modifications of branch history"?
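Reduced to commands, the workflow in the story is purely additive. A sketch, with a hypothetical branch name standing in for the story's details:

```
# Hal, 5:50 PM: commit unfinished work to a new branch and publish it
git checkout -b feature-x
git add -A
git commit -m "Start on X: <Dave's original explanation>"
git push -u origin feature-x

# Dave, 6:00 PM (and again around midnight): pull, fix up, commit on top
git pull
git add -A
git commit -m "Fix the build; make existing tests pass"
git push

# When they're done, either of them merges
git checkout main
git merge feature-x
```

Every step appends new commits on top of existing ones; nothing here rebases, amends, or force-pushes, so no branch history is rewritten, let alone destroyed.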
This is one problem with LLM-generated code: it is very greenfield. There's no correct or even good way to do it, because it's a bit unbounded in both possible approaches and quality of output.
I've tried tracking prompt history in many permutations as a means of documenting changes and making rollbacks more feasible. It hasn't felt like the right way to think about it.
2. Ban tainted code.
Consider code that (in the old days) had been copy-pasted from elsewhere. Is that any better than LLM-generated code? Why yes: to make it work, a human had to comb through it, tweaking as necessary, and if they did not, then stylistic cues make the copy pasta quite evident. LLMs effectively originate and disguise copy pasta (including mimicking house styles), making it harder or impossible to validate the code without stepping through every single statement. The process can no longer be validated, so the output has to be. Which does not scale.
There have been many occasions when working in a very verbose enterprise-y codebase where I know exactly what needs to happen, and the LLM just types it out. I carefully review all 100 lines of code and verify that it is very nearly exactly what I would have typed myself.
I'm playing a bit fast and loose here but there's a solid idea at the heart of this statement - I'm just on the cusp of going to bed so wanted to post a placeholder until tomorrow. The gist is "what do copyleft licences aim to achieve as an end goal - and what would non-copyrightable code mean in that broader context?"
This is because LLMs are a type of assistive technology, usually for those with mental disabilities. It's a shame that mental disabilities are still seen as less important than physical disabilities. If one takes them seriously, one would realize that banning LLMs is inherently ableist. Just make sure that the developer takes accountability for the submitted code.