AI Slop vs. OSS Security
Posted about 2 months ago · Active about 2 months ago
devansh.bearblog.dev · Tech · Story · High profile
Sentiment: heated, negative · Debate: 80/100
Key topics
- AI-Generated Content
- OSS Security
- Bug Bounty Programs
The article discusses the issue of AI-generated 'slop' in OSS security reports, and the HN discussion revolves around potential solutions and the implications of this problem.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 22m after posting
Peak period: 110 comments in 0-12h
Average per period: 39.3 comments
Comment distribution: 118 data points
Based on 118 loaded comments
Key moments
1. Story posted: Nov 6, 2025 at 7:05 AM EST (about 2 months ago)
2. First comment: Nov 6, 2025 at 7:27 AM EST (22m after posting)
3. Peak activity: 110 comments in the 0-12h window, the hottest stretch of the conversation
4. Latest activity: Nov 12, 2025 at 3:11 AM EST (about 2 months ago)
ID: 45834303 · Type: story · Last synced: 11/20/2025, 4:44:33 PM
Want the full context?
Jump to the original sources
Read the primary article or dive into the live Hacker News thread when you're ready.
This is such an important problem to solve, and it feels soluble. Perhaps a layer with heavily biased weights, trained on carefully curated definitional data. If we could train in a sense of truth - even a small one - many of the hallucinatory patterns would disappear.
Hats off to the curl maintainers. You are the xkcd jenga block at the base.
Even if problems feel soluble, they often aren't. You might have to invent an entirely new paradigm of text generation to solve the hallucination problem. Or it could be the Collatz Conjecture of LLMs: it "feels" so possible, but you never really get there.
- dictionary definitions - stable apis for specific versions of software - mathematical proofs - anything else that is true by definition rather than evidence-based
(I realize that some of these are not actually as stable over time as they might seem, but they ought to hold up well enough given the pace at which we train new models.)
If you even just had an MoE component whose only job was verifying validity against this dataset during chain-of-thought, I bet you'd get some mileage out of it.
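A minimal sketch of that kind of verification pass, assuming a curated, version-pinned table of API signatures; the dataset layout, the claim-extraction regex, and the `curl_easy_perform_unchecked` call are all hypothetical illustrations, not anything from the thread:

```python
import re

# Hypothetical curated "true by definition" dataset: function signatures
# pinned to a specific library version. In practice this would be generated
# from official headers/docs and be far larger.
KNOWN_APIS = {
    ("libcurl", "8.5.0"): {
        "curl_easy_init": "CURL *curl_easy_init(void)",
        "curl_easy_setopt": "CURLcode curl_easy_setopt(CURL *, CURLoption, ...)",
    },
}

def verify_api_claims(reasoning: str, library: str, version: str) -> list[str]:
    """Return identifiers the model's reasoning treats as real library
    functions but that are absent from the curated signature table."""
    known = KNOWN_APIS.get((library, version), {})
    claimed = set(re.findall(r"\bcurl_[a-z_]+\b", reasoning))
    return sorted(name for name in claimed if name not in known)

# Example: a reasoning trace that invents a function.
trace = "The overflow occurs because curl_easy_perform_unchecked skips bounds checks."
print(verify_api_claims(trace, "libcurl", "8.5.0"))  # ['curl_easy_perform_unchecked']
```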
Would this be different if the underlying code had a viral license? If google's infrastructure was built on a GPL'ed libcurl [0], would they have investment in the code/a team with resources to evaluate security reports (slop or otherwise)? Ditto for libxml.
Does GPL help the Linux kernel get investment from its corporate users?
[0] Perhaps an impossible hypothetical. Would google have skipped over the imaginary GPL'ed libcurl or libxml for a more permissively licensed library? And even if they didn't, would a big company's involvement in an openly developed ecosystem create asymmetric funding/goals, a la XMPP or Nix?
> Does GPL help the Linux kernel get investment from its corporate users?
GPL has helped "Linux kernel the project" greatly, but companies invest in it out of their own self-interest. They want to benefit from upstream improvements, and playing nicely by upstreaming changes is just much cheaper than maintaining their own kernel fork.
On the other side you have companies like Sony that have used BSD OS code in their game consoles for decades and contributed shit.
So... Two unrelated things.
Yes, they want it to be secure, but as always, nobody except a few very large orgs cares about security for real.
It certainly helped with the "under-resourced" part. Whether you consider it "exploited" is up for discussion. From the project's perspective, copyleft licensing obviously benefited the project. Linus Torvalds ended up with a good amount of publicity and is now somewhat well set up, but almost all other kernel developers live in obscurity earning fairly average salaries. I'm pretty sure we can all agree that the Linux kernel has had a massive positive impact on the whole of humanity, and compared to that, the payoff to its stakeholders is rather small IMO.
If this isn't already a requirement, I'm not sure I understand what even non-AI-generated reports look like. Isn't the bare minimum of CVE reporting a minimally reproducible example? Like, even if you find some function that, for example, doesn't do bounds checking on some array, you can trivially write some unit-test code that's able to break it.
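As a rough illustration of what "minimally reproducible" means here (the vulnerable function and the repro below are hypothetical, not taken from curl or the article):

```python
# Hypothetical vulnerable function: trusts a caller-supplied index without
# checking it against the buffer length (the Python analogue of a missing
# bounds check in C).
def read_record(buffer: bytes, index: int) -> int:
    return buffer[index]

# Minimal reproduction a maintainer can run as-is. The point is that the
# report demonstrates the failure from caller-controlled input rather than
# merely asserting that one exists.
if __name__ == "__main__":
    buf = bytes(16)
    try:
        read_record(buf, 1024)
        print("not reproduced")
    except IndexError as exc:
        print(f"reproduced: {exc}")
```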
Regex exploitation is the forever example to bring up here, as it's generally the main reason that "autofail the CI system the moment an auditing command fails" doesn't work on certain codebases. It happens because it's trivial to craft a string that wastes significant resources when matched against certain patterns, so the moment you have a function that accepts a user-supplied regex pattern, that's suddenly an exploit... which gets a CVE. A lot of projects then have CVEs filed against them because internal functions take regex patterns as arguments, even if they're in code the user is flat-out never going to be able to interact with (i.e. several dozen layers deep in the framework soup there's a regex call somewhere, in a way the user can't reach unless a developer several layers up starts breaking the framework they're using in really weird ways on purpose).
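For example, catastrophic backtracking in a backtracking engine like Python's `re` (a generic textbook pattern, not one taken from any project discussed here):

```python
import re
import time

# Nested quantifiers like (a+)+ force the backtracking engine to try an
# exponential number of ways to split the input once the final match fails.
pattern = re.compile(r"^(a+)+$")
malicious = "a" * 26 + "!"  # almost matches, then fails on the last character

start = time.perf_counter()
pattern.match(malicious)  # runtime roughly doubles with each extra 'a'
print(f"{time.perf_counter() - start:.2f}s for a {len(malicious)}-character input")
```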
The CVE system is just completely broken and barely serves as an indicator of much of anything, really. From what I can tell, the approval process favors acceptance over rejection: the people reviewing the initial CVE filing aren't the same people who actively investigate whether the CVE is bogus, and the whole incentive of the CVE system is literally to encourage companies to give a shit about software security (a fact that is also often exploited to create beg bounties). CVEs have been filed against software for what amounts to "a computer allows a user to do things on it" since before AI slop made everything worse; the system was of questionable quality at least 7 years ago, and it is even worse these days.
The only indicator it really gives is that a real security exploit can feel more legitimate if it gets a CVE assigned to it.
You sort of want to reject them all, but occasionally a gem gets submitted, which makes you reluctant.
For example, years ago I was responsible for triaging bug bounty reports at a SaaS company I worked at at the time. One of the most interesting reports was from someone who had found a way to bypass our OAuth flow using a bug in Safari that allowed them to bypass most OAuth forms. The report was barely understandable, written in broken English. The impression I got was that they had tried to send it to Apple, but Apple ignored them. We ended up rewriting the report and submitting it to Apple on their behalf (we made sure the reporter got all the credit).
If we had ignored poorly written reports, we would have missed that. Is it worth it, though? I don't know.
So Safari was not following the web browser specs, in a way that compromised OAuth in a common mode of implementation.
Referral systems are very efficient at filtering noise.
There is also the possibility that in trying to get someone to refer me I give enough details that the trusted person can submit instead of me and claim credit.
I think this is the fundamental problem of LLMs in general. Some of the time the output looks just right enough to seem legitimate. Luckily, the rest of the time it doesn't.
But all of its responses definitely seem convincing (as it has been trained to do).
I feel like I'm watching a tsunami about to hit while literally already drowning from a different tsunami.
Any large enough organization gathers them en masse to cloud real development work with "compliance."
But of course producing fake ones is far easier and cheaper.
Everything looks right but misses the underlying details that actually matter.
There is a larger problem: I think we like to pretend that everything is so simple you don't need expertise. This is especially bad in our CS communities, where there's a tendency to think intelligence in one domain cleanly transfers to others. In this respect I generally advise people not to first ask LLMs about what they don't know but about what they are experts in. That way they can properly evaluate the responses. Lest we all fall for Murray Gell-Mann amnesia lol
https://en.wikipedia.org/wiki/Cargo_cult
- Primarily relies on a single piece of evidence from the curl project, and expands it into multiple paragraphs
- "But here's the gut punch:", "You're not building ... You're addressing ...", "This is the fundamental problem:" and so many other instances of Linkedin-esque writing.
- The listicle under "What Might Actually Work"
In point of fact, I had not.
After the security reporting issue, the next problem on the list is "trust in other people's writing".
This has additional layers to it as well. For example, I actively avoid using em dash or anything that resembles it right now. If I had no exposure to the drama around AI, I wouldn't even be thinking about this. I am constraining my writing simply to avoid the implication.
You don't know whose style the LLM would pick for that particular prompt and project. You might end up with Carmack, or maybe with that buggy, test-failing piece-of-junk project on GitHub.
There's no "LLM style". There's "human style mimicked by LLMs". If they default to a specific style, then that's on the human user who chooses to go with it, or, likely, doesn't care. They could just as well make it output text in the style of Shakespeare or a pirate, eschew emojis and bulleted lists, etc.
If you're finding yourself influenced by LLMs—don't be. Here's why:
• It doesn't matter.
• Keep whatever style you had before LLMs.
:tada:
There is a "default LLM style", which is why I call it that. Or technically, one per LLM, but they seem to have converged pretty hard since they're all convergently evolving in the same environment.
It's trivial to prompt it out of that style. Word about how to do it, and that you should do it, has gotten around in the academic world, where the incentives not to get caught are high. So I don't call it "the LLM style". But if you don't prompt for anything in particular, yes, there is a very, very strong "default LLM style".
I'm still using bullet lists sometimes, as they have their place, and I'm hoping LLMs don't totally nuke them.
https://news.ycombinator.com/item?id=44072922
https://news.ycombinator.com/item?id=45766969
https://news.ycombinator.com/item?id=45073287
HN discussed it here https://news.ycombinator.com/item?id=44384610
The responses were a surprisingly mixed bag. What I thought was a very common sense observation had some heavy detractors in those threads.
It's sad because people who are OK with AI art are still enjoying human art just the same. Somehow their visceral hate of AI art has managed to ruin human art for themselves as well.
AI outputs mimicking art rob audiences of the ability to appreciate art on its own in the wild, without further markers of authenticity, which steals joy from a whole generation of digital artists who have grown up sharing their creativity with each other.
If you lack the empathy to understand why AI art-like outputs are abhorrent, I hope someone wastes a significant portion of your near future with generated meaningless material presented to you as something that is valuable and was time consuming to make, and you gain nothing from it, so that you can understand the problem for yourself first hand.
But instead we had a 'non-profit' called 'Open'AI that irresponsibly unleashed this technology on the world and lied about its capabilities with no care of how it would affect the average person.
Between this and the flip side of AI-slop it's getting really frustrating out here online.
Today there are scams that look just like real companies, trying to get you to buy from them instead. Who knows what happens if you put your money down. (Scams were of course always a problem, but there is now much less cost to creating one.)
It's better to stay neutral and say you suspect it may be AI generated.
And for everyone else, responsible disclosure of using AI tools to write stuff would be appreciated.
(this comment did not involve AI. I don't know how to write an emdash)
> Certain sections of this content were grammatically refined/updated using AI assistance
> I don't know how to write an emdash
Same here, and at this point I don’t think I will ever learn
Literally the first two sentences on the linked article:
> Disclosure: Certain sections of this content were grammatically refined/updated using AI assistance, as English is not my first language. Quite ironic, I know, given the subject being discussed.
Personally, I've read enough AI generated SEO spam that anything with the AI voice comes off as being inauthentic and spammy. I would much rather read something with the mistakes a non-native English speaker would make than something AI written/edited.
I know that this poses new problems (some people can't afford to spend this money), but it would be better than just wasting people's time.
As much as I'd like to see Russia, China and India disconnected from the wider Internet until they clean up shop with their abusive actors, the Hacktoberfest stuff you're likely referring to doesn't have anything to do with your implication - that was just a chance at a free t-shirt [1] that caused all the noise.
In ye olde times, you'd need to take care how you behaved in public because pulling off a stunt like that could reasonably lead to your company going out of business - but even a "small" company like DO is too big to fail from FAFO, much less ultra large corporations like Google that just run on sheer moat. IMHO, that is where we have to start - break up the giants, maybe that's enough of a warning signal to also alert "smaller" large companies to behave like citizens again.
[1] https://domenic.me/hacktoberfest/
fake games by fake studios played by fake players is still a thing
If I had written it, I would perhaps have mentioned that similar problems also exist in other domains, including science, the arts, and media. Maybe the solutions might be similar too? I am particularly pointing toward the following quote, which hasn't been discussed here yet:
"New reporters could be required to have established community members vouch for them, creating a web-of-trust model. This mirrors how the world worked before bug bounty platforms commodified security research. The only downside is, it risks creating an insider club."
> The downside is that it makes it harder for new researchers to enter the field, and it risks creating an insider club.
I also think this concern can be largely mitigated or reduced to a non-issue. New researchers would have a trust score of zero, for example, while people who consistently submit AI slop would end up with a very low score and could be filtered out fairly easily.
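A minimal sketch of how that kind of reputation filtering might work; the weights, threshold, and queue labels below are made-up assumptions, not anything proposed in the thread:

```python
from dataclasses import dataclass

# Assumed scoring rules: accepted reports add trust, reports rejected as
# invalid subtract it. The exact weights and threshold are arbitrary.
ACCEPTED_WEIGHT = 3
REJECTED_WEIGHT = -1
REVIEW_THRESHOLD = 0  # new reporters start here and still get a human look

@dataclass
class Reporter:
    name: str
    accepted: int = 0
    rejected: int = 0

    @property
    def trust(self) -> int:
        return self.accepted * ACCEPTED_WEIGHT + self.rejected * REJECTED_WEIGHT

def triage_priority(reporter: Reporter) -> str:
    """Route reports: trusted reporters get fast-tracked, unknown reporters
    get a normal human review, and reporters with a history of junk go last."""
    if reporter.trust > REVIEW_THRESHOLD:
        return "fast-track"
    if reporter.trust == REVIEW_THRESHOLD and reporter.rejected == 0:
        return "normal review"  # genuinely new reporter, not yet penalized
    return "low priority"

print(triage_priority(Reporter("newcomer")))          # normal review
print(triage_priority(Reporter("veteran", 5, 1)))     # fast-track
print(triage_priority(Reporter("slop-farm", 0, 20)))  # low priority
```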
Even more so when there is a bounty payout.
Refundable if the PR/report is accepted.
What do other countries do for stuff like this?
I’m not saying that AI hasn’t already given us useful things, but this is a symptom of one very negative change that’s drowned a lot of the positive out for many people: the competence gap used to be an automatic barrier for many things. For example, to get hired as a freelance developer, you had to have at least cargo-culted something together once or twice, and even if you were way overconfident in your capability, you probably knew you weren’t the real thing. However, the AI tools industry essentially markets free competence, and out of context for any given topic, that little disclaimer is meaningless. It’s essentially given people climbing the Dunning-Kruger Mt. Stupid the agency to produce garbage at a damaging volume that’s too plausible looking for them (or laypeople, for that matter) to realize it’s garbage. I also think somewhat nihilist people prone to get-rich-quick schemes (e.g. drop-shipping, NFTs) play these workflows like lottery tickets while remaining deliberately ignorant of their dubious value.
As I just commented in the other AI trust thread on the front page, this dynamic is funnily enough what any woman using online dating services has always been very familiar with. With the exact same tragedy of the commons that results. Except for the important difference that terrible profiles and intro messages have traditionally usually been very short and easily red-flagged. But that is, of course, now also changing or already changed due to LLMs.
(Someone I follow on a certain social media platform just remarked that she got no less than fifty messages within a single hour of marking herself as "single". And she's just some average person, not a "star" of any sort.)
How ironic, considering every time I've reported a complicated issue to a program on HackerOne, the triagers have completely rejected them because they do not understand the complicated codebase they are triaging for.
Also, the curl examples given in TFA completely ignore recent developments, where curl's maintainers welcomed and fixed literally hundreds of AI-found bugs: https://www.theregister.com/2025/10/02/curl_project_swamped_...
It's a cargo cult. Maybe the airplanes will land and bring the goodies!
https://en.wikipedia.org/wiki/Cargo_cult
> When you're volunteering out of love in a market society, you're setting yourself up to be exploited.
I sound like a broken record, but there are unifying causes behind most issues I observe in the world.
None of the proposed solutions address the cause (and they can't of course): public scrutiny doesn't do anything if account creation is zero-effort; monetary penalization will kill the submissions entirely.
In a perfect world OSS maintainers would get paid properly. But, we've been doing this since the 90s, and all that's happened is OSS got deployed by private companies, concentrating the wealth and the economic benefits. When every hour is paid labour, you pick the AWS Kafka over spinning up your own cluster, or you run Linux in the cloud instead of your own metal. This will always keep happening so long as the incentives are what they are and survival hinges on capital. That people still put in their free time speaks to the beautiful nature of humans, but it's in spite of the current systems.
It's good for the site collecting the fee, it's good for the projects being reported on and it doesn't negatively affect valid reports.
It does exactly what we want by disincentivizing bad reports, either AI generated or not.
> A security report lands in your inbox. It claims there's a buffer overflow in a specific function. The report is well-formatted, includes CVE-style nomenclature, and uses appropriate technical language.
Given how easy it is to generate a POC these days, I wonder if HackerOne needs to be pivoting hard into scaffolding to help bug hunters prove their vulns.
- Claude skills/MCP for OSS projects
- Attested logging/monitoring for API investigations (e.g. hosted Burp)
Most people's initial contributions are going to be more concrete exploits.
Different models perform differently when it comes to catching/fixing security vulnerabilities.
Welcome to the Internet.
It doesn't have to make the final judgement, just some sort of filter that automatically flags things like function calls that don't exist in the code.
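A rough sketch of such a pre-filter, assuming plain-text reports and a local checkout of the project; the identifier-extraction heuristic is deliberately crude and purely illustrative:

```python
import re
from pathlib import Path

def identifiers_in_report(report: str) -> set[str]:
    # Crude heuristic: anything that looks like a call, e.g. "foo_bar(".
    return set(re.findall(r"\b([A-Za-z_][A-Za-z0-9_]{3,})\s*\(", report))

def identifiers_in_tree(root: Path, suffixes=(".c", ".h")) -> set[str]:
    # Collect every identifier-like token that appears anywhere in the sources.
    tokens: set[str] = set()
    for path in root.rglob("*"):
        if path.suffix in suffixes:
            text = path.read_text(errors="ignore")
            tokens.update(re.findall(r"\b[A-Za-z_][A-Za-z0-9_]{3,}\b", text))
    return tokens

def flag_unknown_calls(report: str, source_root: str) -> set[str]:
    """Return function names the report relies on that never appear in the
    code. A non-empty result doesn't prove the report is slop; it just flags
    it for extra scrutiny before a human spends real time on it."""
    return identifiers_in_report(report) - identifiers_in_tree(Path(source_root))

# Illustrative usage (paths and report text are hypothetical):
# print(flag_unknown_calls(Path("report.txt").read_text(), "./curl/src"))
```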