Web Bot Auth
Key topics
The web is ablaze with debate over Cloudflare's verified bots program, with some hailing it as a game-changer for blocking malicious traffic, while others cry foul, claiming it unfairly restricts legitimate bots and concentrates too much power in Cloudflare's hands. Proponents argue that bot blocking should be the default, protecting sites from scraping and abuse, while detractors see it as discriminatory against robots and a step towards a more locked-down internet. As one commenter noted, the program's effectiveness is also being questioned, with some wondering if Cloudflare can truly stop AI-powered bots. Amidst the heated discussion, a surprising consensus emerged: one site owner reported a year of success with Cloudflare's "bot super fight mode," enjoying a significant reduction in abusive traffic without blocking legitimate users.
Snapshot generated from the HN discussion
Discussion Activity
- Very active discussion
- First comment: 49m after posting
- Peak period: 22 comments in 0-3h
- Average per period: 5.7
- Based on 74 loaded comments
Key moments
1. Story posted: Aug 28, 2025 at 2:35 PM EDT (4 months ago)
2. First comment: Aug 28, 2025 at 3:25 PM EDT (49m after posting)
3. Peak activity: 22 comments in 0-3h (hottest window of the conversation)
4. Latest activity: Aug 30, 2025 at 4:43 AM EDT (4 months ago)
In the end, only people with non-mainstream browsers (or using VPN to escape country-level blocks, or Tor, or noJS) suffer.
It's like how anti-piracy measures only affect paying customers, while pirates ironically get a better experience. The best way to get around endless CAPTCHAs is to just use LLMs instead.
You would... lead your response with that argument? This has nothing to do with DRM. When people talk about how bots suck, the focus is on billion or trillion dollar businesses making everyone on the web pay.
There's also a reason why the bot conversation flared up; we've always had bots, but before the conversation centered on Google and SEO. Now the conversation centers on companies like OpenAI.
That's the entire point.
Thing is, my browser isn’t configured that way. So it works well, I guess.
How are you so sure of that? Their marketing?
The internet was designed to work the way it does for good reasons.
You not understanding those reasons is not an excuse for allowing a giant tech company to step in and be the gatekeeper for a huge portion of the internet. Nor to monetize, enshittify, balkanize, and fragment the web with no effective recourse or oversight.
Cloudflare shouldn't be allowed to operate, in my view.
Are you somehow under the impression that Cloudflare is forcing their service on other companies? They’re not stepping in, the people who own those sites have decided paying them is a better deal than building their own alternatives.
They did exactly that, they just outsourced it to cloudflare. The problem became bad enough that a lot of other people did the same thing.
If your argument is "companies shouldn't be allowed to outsource components to other companies, or cloudflare specifically", then sure, but good luck ever enforcing that.
This press release today is a better statement of _why_ this feature exists (as opposed to the submission link, which is nuts-and-bolts of implementing): https://blog.cloudflare.com/signed-agents/
Web Bot Auth is a way for bots to self-identify cryptographically. Unlike the user agent header (which is trivially spoofed) or known IPs (painful to manage), Web Bot Auth uses HTTP Message Signatures signed with the bot's key, which should be published at some well-known location.
This is a good thing! We want bots to be able to self-identify in a way that can't be impersonated. This gives website operators the power to allow or deny well-behaved bots with precision. It doesn't change anything about bots who try to hide their identity, who are not going to self-identify anyways.
It's worth reading the proposal for the details: https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-... . Nothing about this is limited to Cloudflare.
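To make the mechanism above concrete, here is a rough sketch of the signature base a bot would sign under HTTP Message Signatures (RFC 9421). The component names and the "web-bot-auth" tag follow the draft as I read it; the key id, agent URL, and timestamps are made-up illustrations, and a real implementation would sign the resulting bytes with the bot's Ed25519 key rather than just print them.

```python
import time

def signature_base(authority, signature_agent, keyid, created=None, expires=None):
    """Serialize the covered components plus @signature-params: this is
    the exact byte string the bot's private key would sign."""
    created = created or int(time.time())
    expires = expires or created + 300
    # Signature parameters: which components are covered, validity window,
    # which published key to verify against, and the web-bot-auth tag.
    params = (
        '("@authority" "signature-agent")'
        f';created={created};expires={expires}'
        f';keyid="{keyid}";tag="web-bot-auth"'
    )
    lines = [
        f'"@authority": {authority}',             # the origin being visited
        f'"signature-agent": {signature_agent}',  # where the bot's keys live
        f'"@signature-params": {params}',
    ]
    return "\n".join(lines)

# Hypothetical bot visiting example.com, with a made-up key id.
base = signature_base("example.com", '"https://bot.example"', "bot-key-1")
print(base)
```

The site (or its CDN) rebuilds the same base from the incoming request and verifies the signature against the key the bot published, which is what makes the identity claim unspoofable.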
I'm also working on support for Web Bot Auth for our Agent Identification project at Stytch https://www.isagent.dev . Well-behaved bots benefit from this self-identification because it enables a better Agent Experience: https://stytch.com/blog/introducing-is-agent/
A way to authenticate identity for crawlers so I can allow-list ones I want to get in, exempt them from turnstile/captcha, etc -- is something I need.
I'm not following what makes this controversial. Cryptographic verification of identity for web requests, sounds right.
It seems like the complaints are about Cloudflare's anti-DoS protection services and how they have a monopoly on such; I get that.
I'm not seeing the connection to a protocol for bots/crawlers voluntarily cryptographically signing their http requests, so sites (anyone implementing the protocol not just cloudflare) can use it to authenticate known actors?
I am interested in using it to exempt bots/crawlers I trust/support/have an agreement with from the anti-bot measures I, like many, am being forced to implement to keep our sites up under an enormously increased wave of what is apparently AI-training-motivated repeat crawling. Right now these measures are keeping out bots I don't want to keep out too. I would like to be able to securely identify them to let them in.
(And then it can of course get derailed, but that's a separate story)
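The allow-listing this commenter describes could be sketched roughly as below: once a crawler's signature verifies against a key you trust, it skips the captcha; everyone else gets challenged. The agent names are hypothetical, and the HMAC verify is a stdlib stand-in for the Ed25519 verification the real protocol uses.

```python
import hmac
import hashlib

# Hypothetical directory of crawlers the operator has agreements with,
# keyed by the URL in their Signature-Agent header. The value stands in
# for that agent's published public key.
TRUSTED_AGENTS = {
    "https://crawler.partner.example": b"shared-test-key",
}

def verify(sig, base, key):
    # Stand-in for Ed25519 signature verification, so this sketch
    # runs with the stdlib only.
    return hmac.compare_digest(sig, hmac.new(key, base, hashlib.sha256).digest())

def challenge_decision(signature_agent, sig, base):
    """Return 'allow' (skip captcha/turnstile) for verified trusted
    crawlers, 'challenge' for everyone else."""
    key = TRUSTED_AGENTS.get(signature_agent)
    if key and verify(sig, base, key):
        return "allow"
    return "challenge"

base = b'"@authority": example.com'
good_sig = hmac.new(b"shared-test-key", base, hashlib.sha256).digest()
print(challenge_decision("https://crawler.partner.example", good_sig, base))  # allow
print(challenge_decision("https://unknown.example", good_sig, base))          # challenge
```

The point of the design is that the decision stays with the site operator: anyone can maintain their own trusted-agent list without going through Cloudflare.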
Obviously this technology is different but the same sort of result.
What's the end game here? All humans end up having to use a unique encryption key to prove their humanness also?
Who is we? I absolutely don't want that.
guaranteed as long as no attacker gets hold of the private key, which cannot be guaranteed
That's an argument against all authentication anywhere.
It's a problem, isn't it?
It is only useful for whitelisting bots, not for banning bad ones, as bad ones can rotate keys.
Whitelisting clients by identity is the death of the open web, and means that nobody will ever be able to compete with capital on even footing.
That said, I do think the whole procedure is more than a bit overcomplicated, to the degree where I doubt it will be widely implemented. You could likely achieve almost the full effect with request signing alone.
They can offer what they want for bots. But stop ruining the experience for humans first.
Web operators choose to use them; hell they even pay Cloudflare to be between them. Seriously I just think you don't understand how bad it is to run a site without someone in-front of it.
I let bots hit Gitea 2-3 times per second on a $10/month VPS, and the only actual problem was that it doesn't seem to ever delete zip snapshots, filling up the disk when enough snapshot links are clicked. So I disabled that feature by setting the snapshots folder read-only. There were no other problems. I mention Gitea because people complain about having to protect Gitea a lot, for some reason.
The alternative, of course, is to set up a caching system server-side (like Redis), which most people who set up their WordPress blog don't have the first idea how to do in a secure way.
I suspect I'm missing something, what am I missing?
I'm sure the next step here will be a Cloudflare product that sits in front of your website and blocks all bot traffic except for the bots that are verified to have paid for access. (Or maybe that already exists?)
That's needed because many APIs are either nonexistent or extremely marginal in design and content coverage.
1. They have already proven to be a bad-faith actor with their "DDoS protection."
2. This is pretty much the typical Cloudflare HN playbook. They release something targeted at the current wave and hide behind an ideological barrier; meanwhile, if you try to use them for anything serious they require a call with sales who jumps you with absurdly high pricing.
Do other cloud providers charge high fees for things they have no business charging for? Absolutely. But they typically tell you upfront and don't run ideological narratives.
This is not a company we should be putting much trust in, especially not with their continued plays to become the gatekeepers of the internet.
1) There is a whole segment of tech designed around helping you understand and manage cloud costs, through consultations, automations, etc. It has spawned companies and career paths!
2) Then don't use them? Either they provide enough value to pay them or they don't.
The standard looks fine as a distributed protocol until you have to register and pay rent to Cloudflare, which they say will eventually trickle down into publishers' pockets; but you know what having a middleman this powerful means for the power dynamics of the market. Publishers have a really bad hand no matter what we do to save them; content as we know it will have to adapt.
Give it a couple more iterations and some MBA will come up with the brilliant idea of introducing an internet toll to humans and selling a content bundle with unlimited access to websites.
Register with CF is the specific part I object to. Of all of the numerous hazards here centralizing the registration with CF is most clearly problematic. This part of the spec could have easily been an additional header linking to key data.
Cloudflare is doing this registration as part of their "verified" program, which gives special treatment to bots/agents who go through the process. That's a Cloudflare-specific feature, not part of the spec.
A practical flow:
1. Bot self-identifies (Web Bot Auth)
2. Fetch policy
3. Accept terms or negotiate (HTTP 402 exists)
4. Present a signed receipt proving consent/payment
5. Origin/CDN verifies receipt and grants access
That keeps things decentralized: identity is transport; policy stays with the site; receipts provide auditability, no single gatekeeper required. There’s ongoing work in this direction (e.g., PEAC using /.well-known/peac.txt) that aims to pair Web Bot Auth with site-controlled terms and verifiable receipts.
Disclosure: I work on PEAC, but the pattern applies regardless of implementation.
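The receipt step in the flow above could look something like this sketch: the site issues a signed receipt once terms are accepted (possibly after an HTTP 402 exchange), and verifies it on later requests before granting access. The receipt fields and terms string are illustrative, not PEAC's actual format.

```python
import hmac
import hashlib
import json
import time

SITE_SECRET = b"site-receipt-key"  # held by the origin/CDN issuing receipts

def issue_receipt(bot_id, terms, issued_at=None):
    """Steps 3-4: after the bot accepts the site's terms, the site
    returns a receipt plus a MAC over it that the bot presents later."""
    body = json.dumps({
        "bot": bot_id,
        "terms": terms,
        "iat": issued_at if issued_at is not None else int(time.time()),
    }).encode()
    mac = hmac.new(SITE_SECRET, body, hashlib.sha256).hexdigest()
    return body, mac

def verify_receipt(body, mac):
    """Step 5: origin/CDN re-computes the MAC and grants access only
    if the presented receipt is authentic and untampered."""
    expected = hmac.new(SITE_SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(mac, expected)

body, mac = issue_receipt("https://bot.example", "crawl-ok, attribution-required")
print(verify_receipt(body, mac))   # True
print(verify_receipt(body, "00"))  # False: forged receipt is rejected
```

An HMAC keeps the sketch self-contained; a public-key signature over the receipt would let third parties audit it without the site's secret, which fits the auditability goal better.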
The age of agents: cryptographically recognizing agent traffic
https://blog.cloudflare.com/signed-agents/
(https://news.ycombinator.com/item?id=45052276)
While it builds on standards, as the top poster notes, Cloudflare's version is a business-moat-driven central registry service and nothing like what the decentralized internet would or should look like.
I wrote a bit more about this on my blog if anyone cares to read: https://blog.agentcommunity.org/2025-08-23-web_auth_box_not_...