Why We're Taking Legal Action Against SerpApi's Unlawful Scraping
Key topics
The gloves are off as Google takes legal action against SerpApi for allegedly unlawful scraping, sparking a heated debate about the search giant's own scraping practices. Commenters aren't holding back, pointing out the hypocrisy of Google's actions, with some arguing that the company's business model relies on scraping the web and subsidizing its service with ads. While some defend Google's position, noting that it only scrapes websites that haven't opted out via robots.txt, others counter that this "opt-out" model is effectively coercive, given Google's monopoly power. As the discussion unfolds, a consensus emerges around the need for clearer legal guidelines on "adversarial interoperability" to promote a more competitive market.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 2h after posting
Peak period: 28 comments in the 3-4h window
Avg / period: 7.1
Based on 50 loaded comments
Key moments
- Story posted: Jan 7, 2026 at 11:46 PM EST
- First comment: Jan 8, 2026 at 1:53 AM EST (2h after posting)
- Peak activity: 28 comments in 3-4h (hottest window of the conversation)
- Latest activity: Jan 8, 2026 at 2:23 PM EST
Data wants to be free. They knew that once.
EDIT: Also to be clear I am not saying they can't win legally. I'm sure they can do legal games and could shop around until they were successful. They are in the wrong conceptually.
The biggest joke was all the “hackers” 25 years ago shouting “Don’t be evil like Oracle, Microsoft, Apple or Adobe and charge for your software, be good like Google and just put like a banner ad or something and give it away for free”
You can search Google _for free_ (with all the caveats of that statement). Part of their grievance is that SerpApi uses the scraped data as a paid-for service.
A lot of Google's bot blocking is also circumvented, and Google seems to have put a lot of effort into that blocking over the past year:
- robots.txt directives (fwiw)
- You need JS
- If you have no cookie you'll be given a set of JS fingerprints, apparently one set for mobile and one for desktop. You may have to tweak what fingerprints you give back in order to get results custom to user agent etc.
Google was never that bothered about scraping if it was done at a reasonable volume. Once scrapers have pools of millions of IPs and a handle on how to get around the blocking, Google is at the mercy of how polite the scraping is. They're maybe also worried about people reselling data en masse to competitors, i.e. their usual "all your data belongs to us and only us" stance.
I thought the ads counted as payment? That seems to be the logic used to take technical measures against adblockers on YouTube while pushing users towards a paid ad-free subscription, at least.
If viewing ads is payment, then Google isn't a free service. If viewing ads isn't payment, then Google should have no problem with people using adblockers.
Google would like you to click through as it looks better for their stats, but they don't actually care.
Well, not through their API, which is a paid service.
SerpApi just assumes everybody wants to be scraped, and doesn't give you a choice.
(whether websites should have such a choice is a different matter entirely).
You know what getting my consent would look like? Google hosting a form where I can tell them PLEASE SCRAPE MY WEBSITE and include it in your search results. That is what consent looks like.
Google has never asked for my consent. Yet they expect others to behave by different rules.
Now where Google may have a reasonable case is that Google scrapes with the intention of offering the data “for free”. SerpApi does not.
I don't think this suit is actually about that, though. I think Google's complaint is that
> SerpApi deceptively takes content that Google licenses from others
In other words, this is just a good old-fashioned licence violation.
Is that true with how they trained Gemini? Doesn't everyone with a foundational model scrape the web relentlessly without regard for robots.txt?
Like if you give a friend a key to your house so they can check on your plants when you're out of town but they throw a rager and trash the place.
That was not a phrase I expected to read on Hacker News! Haven't heard it since I was about 13. I always assumed it was a Scottish phrase.
Being used as AI training data provides negative value for a website owner, as it takes traffic away.
It's the difference between a movie review, and a ripped torrent.
Following that same logic, may I inform you that your income going forward is different: it has to be directed to my bank account, because the account needs the money! :-)
Almost everybody wants to appear in search, so disallowing the entirety of Google is far more costly than e.g. disallowing OpenAI, which even differentiates between content scraped for training and content accessed to respond to a user request.
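To make the opt-out asymmetry concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URL are made up for illustration. `GPTBot` and `ChatGPT-User` are the user-agent tokens OpenAI publishes for training crawls and user-initiated fetches respectively; whether any given crawler actually honours these rules is a separate question that the parser cannot answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the training crawler, allow
# user-initiated fetches, and leave everyone else (including
# Googlebot) unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt, agents, url):
    """Return {agent: may_fetch} for each user agent under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    ROBOTS_TXT,
    ["Googlebot", "GPTBot", "ChatGPT-User"],
    "https://example.com/article",  # placeholder URL
)
print(result)
```

The site owner here can drop training crawls without losing anything else, whereas adding a `Disallow: /` for Googlebot would also remove the site from Google Search, which is the cost the comment above is pointing at.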
The short answer is that scraping isn't a CFAA offence but might be a terms and conditions violation, depending on the specifics of the access.
The index would just point a local crawler towards hubs of resources, links, feeds, and specialized search engines. Then fresh information would come from the crawler itself. My thinking is that reputable sites don't appear every day, if you update your local index once every few months it is sufficient.
The index could host 1..10...100M stubs, each one touching on a different topic, and concentrating the best entry points on the web for that topic. A local LLM can RAG-search it, and use an agent to crawl from there on.
If you solve search this way, without Google, and you also have local code execution sandbox, and local model, you can cut the cord.
Imagine this stack: local LLM, local search stub index, and local code execution sandbox - a sovereign stack. You can get some privacy and independence back.
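A rough sketch of what such a stub index could look like, assuming each stub is just a topic, a few keywords, and a list of entry-point URLs. The `Stub` type, the `search` function, and the example.org URLs are all hypothetical, and plain keyword overlap stands in for the RAG-style retrieval a local LLM would do.

```python
from dataclasses import dataclass


@dataclass
class Stub:
    """One topic stub: keywords for lookup, entry points for the crawler."""
    topic: str
    keywords: set[str]
    entry_points: list[str]


def search(index, query, limit=3):
    """Rank stubs by how many query terms overlap their keywords."""
    terms = set(query.lower().split())
    scored = [(len(terms & stub.keywords), stub) for stub in index]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [stub for _, stub in matches[:limit]]


# Tiny illustrative index; the URLs are placeholders, not real hubs.
INDEX = [
    Stub("rust", {"rust", "cargo", "async", "runtime"},
         ["https://example.org/rust-hub"]),
    Stub("gardening", {"soil", "compost", "tomato"},
         ["https://example.org/garden-hub"]),
    Stub("python", {"python", "pip", "async"},
         ["https://example.org/python-hub"]),
]

hits = search(INDEX, "rust async runtime")
for stub in hits:
    # A local crawler would start from these entry points to get fresh pages.
    print(stub.topic, stub.entry_points)
```

The index itself only needs refreshing when the set of good entry points changes, which is why the comment suggests updating every few months; freshness comes from the crawl step, not from the stubs.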
I imagine you'd get on just fine for short tail queries but the other cases (longer tail, recent queries, things that haven't been crawled) begin to add up.
I certainly did not, and I find Google using the content it scraped from my website for money or AI (which they also sell on a token basis) more questionable than some third party offering API access to it.
[1] https://docs.cloud.google.com/generative-ai-app-builder/docs...
https://blog.cloudflare.com/perplexity-is-using-stealth-unde...
Testimony https://medium.com/@brianwarner/celebritynetworths-statement...
CNW ended up putting up content for fake celebrities after declining Google's request for API usage, to prove that Google was scraping them.
They also started caring about this, probably because they don't want their competitors to get the same data as they have.