How to Stop Google from AI-Summarising Your Website
Key topics
The cat's out of the bag: Google's AI summaries are changing the game, and website owners are scrambling to figure out how to stop them. As one commenter astutely pointed out, these summaries might cannibalize Google's ad revenue, but others counter that high-intent searches like "buy macbook" still rake in the dough. The discussion gets even more interesting with the revelation that Google's got a tech that injects product ads into AI summaries, potentially offsetting losses. With LLM services eating into Google's lunch, it's clear that the search giant is trying to stay ahead of the curve – even if it means giving users what they want, AI summaries and all.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 38m after posting
- Peak period: 67 comments (Day 1)
- Avg / period: 24
Based on 72 loaded comments
Key moments
- Story posted: Aug 29, 2025 at 4:28 PM EDT (4 months ago)
- First comment: Aug 29, 2025 at 5:06 PM EDT, 38m after posting
- Peak activity: 67 comments in Day 1, the hottest window of the conversation
- Latest activity: Sep 9, 2025 at 8:30 AM EDT (4 months ago)
Google even put the AI snippet above their ads, so you know how bad it stings.
IMO a LLM is just a superior technology to a search engine in that it can understand vague questions, collate information and translate from other languages. In a lot of cases what I want isn't to find a particular page but to obtain information, and a LLM gets closer to that ideal.
It's nowhere near perfect yet but I won't be surprised if search engines go extinct in a decade or so.
I suspect that they're hoping to "win" the AI war, get a monopoly, and then enshittify the whole thing. Good luck with that.
People will still spend the same amount of money to purchase goods and services. Advertisers will be willing to spend money to capture that demand.
Having their own websites is an optional part. It can also happen via Google Merchant Center, APIs, AI Agents, MCP servers, or other platforms.
I believe there will be fewer clicks going to the open web. But Google can simply charge a higher CPC for each click, since the conversion rate is higher when a user clicks to buy after a 20-minute chat than when a user clicks on an ad during every second or third Google search.
They won an award for the paper, and the example they gave was a "holiday" search, where a hotel inserted its name and an airline wedged itself in as the best way to get there.
If I can find it again, I'll print and stick its link all over walls to make sure everybody knows what Google is up to.
Edit: Found it!
[0]: https://research.google/blog/mechanism-design-for-large-lang...
> and Reclaim Your Organic Traffic
Content:
> 1. Set Snippet Length to Zero with max-snippet:0
Sure, buddy, sure. Users are notorious for clicking links in search results that have no description, right?
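For reference, a minimal sketch of how that snippet-length control is expressed on an individual HTML page (max-snippet is one of Google's documented robots rules):

```html
<!-- Per-page: ask Google to show no text snippet for this page -->
<meta name="robots" content="max-snippet:0">
```

The same directive can also be sent for non-HTML responses via an X-Robots-Tag response header, as the comment below shows.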
Header set X-Robots-Tag "noindex, nofollow, noarchive, nositelinkssearchbox, nosnippet, notranslate, noimageindex"
Of course, only the beeping Internet Archive totally ignored it and scraped my site. And now, despite me trying many times, they won't remove it.
It seems to mostly work. I also have Anubis in front of it now to keep the scrapers at bay.
(It's a personal diary website, started in 2000 before the term "blog" existed [EDIT: Not true - see below comment]. I know it's public content, I just don't want it searchable public)
Look at the reason, and get mad at the correct people.
It might be the archive themselves, but just be sure.
I still don't fathom why they just _ignore_ the request not to be scraped with the above headers. It's rude.
Why would you NOT want the Internet Archive to scrape your website? (I'm clueless, thank you)
Yes I could password protect it (and any really personal content is locked behind being logged in, AI hasn't scraped that) but I _like_ being able to share links with people without having to also share passwords.
I realise the HN crowd is very much "More eyeballs are better for business" but this isn't business. This is a tiny, 5 hits a month (that's not me writing it) website.
In all honesty, if you're hosting it on the internet, why is this a problem? If you didn't want it to be backed up, why is it publicly accessible at all? I'm glad the Internet Archive will keep hosting this content even when the original is long gone.
Let's say I'd read your website and wanted to look it up one day in the far future, only to find many years later the domain had expired, I'd be damn glad at least one organization had kept it readable.
Additionally, when I die, I want my website to go dark and that's that. It's a diary, it's very very mundane. My tech blog I post to, sure, I'm 200% happy to have that scraped/archived. My diary I keep very up-to-date offline copies of that my family have access to, should I tip over tomorrow.
I realise this goes against the usual Internet wisdom, and I'm sure there's more than one Chinese AI/bot out there that's scraped it and I have zero control over. But where I allegedly do have control, I'd like to exercise it. I don't think that's an unfair/ridiculous request.
>Good! It's literally the Internet Archive and you published it on the internet. That was your choice.
>As a general rule, people shouldn't get to remove things from the historical record.
>Sometimes we make exceptions for things that were unlawful to publish in the first place -- e.g. defamation, national secrets, certain types of obscene photos -- where there's a larger harm otherwise.
>But if you make someone public, you make it public. I'm sorry you seem to at least partially regret that decision, but as a general rule, it's bad for humanity to allow people to erase things from what are now historical records we want to preserve.
But it's my content - it's not your content. I don't regret my decision, anything I really don't want public is behind a login. The website is still there, still getting crawled.
What really upsets me the MOST, though, is that the IA won't even reply to my requests to tell me "We're not going to remove it." Your reply (I am assuming from your wording that you have some relationship with them; apologies if that's not the case) is the only information I've got! (Thanks)
[Note reply was from user crazygringo but I can't find it now, almost like they... removed it? It was public though and I'm SURE they won't mind me archiving it here for them.]
So... you believe that your and IA's behavior is or is not okay? Because it's a touch odd to start playing the other side now.
Try using robots.txt to get it removed or excluded from The Internet Archive. The organization went back and forth on respecting robots.txt a couple of times, but it started respecting it (again) some years ago.
Several years ago I was also frustrated by its refusal to remove some content taken from a site I owned, but later the change to follow robots.txt was implemented (and my site was removed).
The FAQ has more information on how this works (there may be caveats). [1]
[1]: https://support.archive-it.org/hc/en-us/articles/208001096-R...
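A sketch of that robots.txt approach, assuming the user-agent tokens commonly associated with the Archive's crawlers (ia_archiver historically, archive.org_bot for its own crawls); the current token names should be verified against the Archive's documentation:

```
# robots.txt at the site root
User-agent: ia_archiver
Disallow: /

User-agent: archive.org_bot
Disallow: /
```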
Yeah, maybe some will want to only read the IMDb plot summary of Lord of the Rings. I am not sure why any author would care about those people unless they are really desperate for clicks.
It's akin to me putting up billboards and stickers around town and then demanding to decide who gets to look at them.
Same thing with online publishers. If they want to control who uses their content and how, there's a tried and true solution and it's spelled "paywall".
And no, sharing your labor for free with anyone who wants it (as long as they agree to a few simple rules) is nothing like putting up a billboard and "demanding to decide who gets to look at them".
The entire premise of billboards is to force people to look at something they had no intention or desire to look at. You weren't forced to search for, look at, or use someone's free software or other type of content. You did so willingly and intentionally.
Recipes are a good real world example of open source working properly. Anybody is free to use and improve. And anybody is free to not share their recipes or improvements with the public.
I don't think the Free Software Foundation is asking a lot when it uses the rule of law to control who uses their content and how.
Perhaps the answer for me is to put my content behind a login. A sad future for the web.
Part of the reason for writing is to cultivate an audience, to bring like-minded people together.
Letting a middleman wedge itself between you and your reader damages that ability and does NOT benefit the writer. If the writer wanted an LLM summary, they always have the option to generate it themselves. But y'know what? Most writers don't. Because they don't want LLM summaries.
---
Also, LLMs have been known to introduce biases into their output. Just yesterday somebody said they used an LLM for translation and it silently removed entire paragraphs because they triggered some filters. I for one don't want a machine which pretends to be impartial to pretend to "summarize" my opinions when in fact it's presenting a weaker version.
The best way to discredit an idea is not to argue against it, but to argue for it poorly.
Or asking if you want to pay to remove false information that they generate which makes you look bad.
Honestly, publishers should just allow it. If the concern is lost traffic, it could be worse — the “source” link in the summary is still above all the other results on the page. If the concern is misinformation, that’s another issue but could hopefully be solved by rewriting content, submitting accuracy reports, etc.
I do think Google needs to allow publishers to opt out of AI summary without also opting out of all “snippets” (although those have the same problem of cannibalizing clicks, so presumably if you’re worried about it for the AI summary then you should be worried about it for any other snippet too).
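Some finer-grained control does exist today: Google's documented data-nosnippet attribute excludes marked passages from snippets (and, per Google's robots-rules documentation, from AI-generated previews) without applying nosnippet to the whole page. A sketch:

```html
<!-- data-nosnippet: the page stays indexed, but marked passages
     should not appear in snippets or AI-generated previews -->
<p>This paragraph can appear in snippets.</p>
<div data-nosnippet>This passage is excluded from snippets.</div>
```

It does not, however, let a publisher opt out of the AI summary as a whole while keeping normal snippets, which is the gap described above.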
If Google doesn’t take you to someone else’s website or app, they can’t charge advertisers any money.
It’s kind of like how the movie industry killed their Blu-ray, DVD, and theater ticket sales in favor of streaming.
Or how digital download/streaming music took decades to match the pre-Napster revenue peak of the industry. It’s still barely ahead of that level and that’s before adjusting for inflation.
They are doing it but it is less lucrative than the non-AI search engine.
Like video streaming, they are forced into this via new competition.
E.g., ChatGPT is the marketshare leader in the new version of search engines, local AI models + ChatGPT for complex queries is the default “search engine” of Apple Intelligence, not Google on Safari.
Google’s risk here is that they’re about to lose everyone who isn’t running queries from their own platforms who still overwhelmingly use Google for their “general life queries” today (Apple users on web browsers, Windows users on web browsers).
One would think that they have a plan for why they are doing it. They are the ones seeing the numbers.
We can speculate here about their risks or the stupidity of the plan, but I wouldn't say Google Zero is some conspiracy theory; a flawed strategy, maybe. I don't think people would be surprised if google.com became a big "ask Gemini" field. Many users probably wouldn't even notice.
And technically speaking, this citation is the first link on the results page, so you “rank” higher than all the other results. But it does take two clicks to get to your page.
They should make the citations more prominent and use the page title as anchor text. And when there’s multiple citations, the side panel should be open by default or they should put all of the citations inline as prominent links with page titles as anchor text.
When you order on Amazon, you no longer deal with the merchant. When you order food, you no longer directly pay the restaurant. When you ask for information from the web, you no longer want to deal with the idiosyncrasies of content authors (page styles, navigation, fragmentation of content, ads, etc.).
Is it bad for content owners? Yes, because people won't visit your pages any longer, affecting your ad revenue. Is it compensated? Now this is where it differs from amazon and food delivery apps. There is no compensation for the lost ad revenue. If the only purpose of your content is ads, well, that is gone.
But wait, a whole lot of content on the internet is funded by ads. And Google's bread and butter lies in the ad revenues of those sites. Why would they kill their geese? Because they have no other option. They just need to push the evolution along and be there when the future arrives. They hope to be part of the future somehow.
> Is it bad for content owners? Yes, because people won't visit your pages any longer, affecting your ad revenue.
So it is actually better for both sides; one party is just getting hurt in this transition process.
This feels like the wrong solution for wanting to be compensated for information.
I don't know what the solution is, because one often doesn't know if the information is worth paying for until after viewing it.