CodeMender: an AI Agent for Code Security
Posted 3 months ago · Active 3 months ago
Source: deepmind.google · Tech · story
Sentiment: skeptical / mixed
Debate: 80/100
Key topics
Artificial Intelligence
Code Security
Large Language Models
DeepMind introduces CodeMender, an AI agent for code security, sparking debate about its potential impact, availability, and implications for security.
Snapshot generated from the HN discussion
Discussion Activity
- First comment: 2h after posting
- Peak period: 11 comments in 0-2h
- Average per period: 3.6 comments
- Comment distribution: 29 data points (based on 29 loaded comments)
Key moments
1. Story posted: Oct 6, 2025 at 5:28 PM EDT (3 months ago)
2. First comment: Oct 6, 2025 at 7:02 PM EDT (2h after posting)
3. Peak activity: 11 comments in 0-2h (hottest window of the conversation)
4. Latest activity: Oct 7, 2025 at 10:34 AM EDT (3 months ago)
ID: 45496533 · Type: story · Last synced: 11/20/2025, 2:52:47 PM
For full context, read the primary article or dive into the live Hacker News thread.
Projects that get a lot of attention already put up barriers to new contributions, and the ones that get less attention will continue to get less attention.
The review process cannot be left to AI, because it would introduce uncertainty that nobody wants to be held responsible for.
If anything, the people who have always seen code as a mere means to an end will finally come to a forced decision: either stop fucking around or get out of the way.
An adversarial web is ultimately good for software quality, even if it ends up less open than it used to be. I'm not even sure that's a bad thing.
And saying "ones that get less attention will continue to get less attention" is like imagining that only popular email addresses get spammed. Once malice is automated, everyone gets attention.
The economics are more about how much the defender is willing to spend on protection in advance versus the expected value of a security failure.
It's an argument about affordability and the economics behind it, which puts more burden on the (open source) supply chain, which is already stressed to its limit. Maintainers simply don't have the money to keep up with foreign state actors. Heck, they don't even have money for food at this point, and have to work another job to be able to do open source in their free time.
I know there are exceptions, but they are veeeery marginal. The norm is: open source is unpaid, tedious, and hard work. It will only get harder; just look at the sheer number of slop-code pull requests that already plague a lot of projects.
The trend will likely be toward blocking pull requests by default rather than reading and evaluating each of them.
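Put in expected-value terms, the trade-off these comments describe is roughly: spend on advance protection only when it costs less than the probability-weighted damage of a breach. Below is a minimal back-of-the-envelope sketch; every number in it is made up for illustration and none comes from the discussion.

```python
# Back-of-the-envelope defender economics; every figure here is an assumption.
defense_cost = 50_000        # yearly spend on proactive security (hypothetical)
breach_probability = 0.10    # chance of a serious incident this year (hypothetical)
breach_loss = 1_000_000      # damage if the incident happens (hypothetical)

expected_loss = breach_probability * breach_loss  # 100,000 in this example

if defense_cost < expected_loss:
    print("Advance protection pays for itself under these assumptions.")
else:
    print("Under these assumptions, the expected loss does not justify the spend.")
```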
Why is everything in "AI" shrouded in mystery, hidden behind $200 monthly payments, and wrapped in glossy announcements? Just release the damn thing and let us test it. You know, like the software we write and that you steal from us.
And the $200 payments are probably revenue-neutral against the actual cost of this stuff.
Does anyone disagree?
This is purely my intuition, but I'm interested in how others are thinking about it.
All this comes with the mega caveat that it assumes very widespread adoption of these defenses, which we know won't happen, so auto-hacking may be rampant for a while.
In proprietary/closed source, it depends on the ability to spend the money these tools would end up costing.
As there are more and more vibe-coded apps, there will be more security bugs, because app owners just don't know better or don't care to fix them.
This happened when the rise of WordPress and other CMSes with their plugin ecosystems, or languages like early PHP, or for that matter even C, opened up software development to wider communities.
On average we will see more issues, not fewer.
I somewhat share the feeling that this is where it's going, but I'm not sure fixing will be easier. In "meatbag" red vs. blue teams, red has it easier: they only have to succeed once, while blue always has to be right.
I do imagine something adversarial being the new standard, though. We'll have red vs blue agents that constantly work on owning the other side.
That being said, most modern exploits are already auto-generated through brute force, as nothing more complex is required.
>Does anyone disagree?
CVE agents already pose a serious threat vector in and of themselves.
1. Models can't currently be made inherently trustworthy, and the people claiming otherwise are selling something.
"Sleeper Agents in Large Language Models - Computerphile"
https://www.youtube.com/watch?v=wL22URoMZjo
2. LLMs can negatively impact logical function in human users. However, people feel 20% more productive, and that makes their contributed work dangerous.
3. People are already bad at reconciling their instincts and rational evaluation. Adding additional logical impairments is not wise:
https://www.youtube.com/watch?v=-Pc3IuVNuO0
4. Auto-merging vulnerabilities into open source is already a concern, as it falls into the ambiguous "Malicious sabotage" or "Incompetent noob" classifications. How do we know someone's or some model's intent? We can't, and thus the code base could turn into an incoherent mess for human readers.
Mitigating risk:
i. Offline agents should only have read-access to advise on identified problem patterns.
ii. Code should never be cut-and-pasted, but rather evaluated for its meaning.
iii. Assume a system is already compromised, and consider how to handle the situation. In this line of reasoning, the policy choices should become clear.
Best of luck, =3
We've already seen more than 100,000 fixes applied with Autofix in the last 6 months, and we're constantly improving it. It's powered by CodeQL, our deterministic and in-depth static analysis engine, which also recently gained support for Rust.
To enable it, go to your repo -> Security -> code scanning.
Read more about how autofix works here: https://docs.github.com/en/code-security/code-scanning/manag...
And stay tuned for GitHub Universe in a few weeks for other relevant announcements ;).
Disclaimer: I'm the Product lead on detection & remediation engines at GitHub
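For anyone who wants to inspect the results programmatically once scanning is enabled, here is a minimal sketch that lists open code scanning alerts through GitHub's public REST API (GET /repos/{owner}/{repo}/code-scanning/alerts). The owner/repo values are placeholders, and it assumes a token with the security_events scope is available in the GITHUB_TOKEN environment variable.

```python
import os

import requests

# Placeholder repository; substitute your own owner/repo.
OWNER, REPO = "your-org", "your-repo"
TOKEN = os.environ["GITHUB_TOKEN"]  # assumed: a token with security_events scope

# List open code scanning alerts for the repository.
resp = requests.get(
    f"https://api.github.com/repos/{OWNER}/{REPO}/code-scanning/alerts",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Accept": "application/vnd.github+json",
    },
    params={"state": "open", "per_page": 50},
)
resp.raise_for_status()

for alert in resp.json():
    rule = alert.get("rule", {})
    print(f"#{alert['number']} [{rule.get('severity')}] {rule.get('description')}")
```

This only lists what the analysis flagged; the Autofix suggestions themselves appear alongside the alerts in GitHub's UI rather than in this endpoint's response.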
I was fine before 2FA and I'm willing to pay to go without. Same username
Can't scan my code if I can't access my account
Pointless videos, without enough time to read the code.