Why We're Taking Legal Action Against SerpApi's Unlawful Scraping
Key topics
The debate rages on as Google takes legal action against SerpApi for allegedly unlawful scraping, with commenters weighing in on the nuances of web scraping, robots.txt, and licensing agreements. Some argue that Google's own scraping practices are hypocritical, while others point out that Google respects robots.txt and doesn't obfuscate its crawlers, making it a key distinction. The discussion gets heated, with some commenters labeling SerpApi's actions as "malicious" and others suggesting that Google's monopoly on search gives it a unique privilege to respect robots.txt. Amidst the disagreement, a common thread emerges: the need for clearer guidelines on web scraping and its legality.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion, based on 58 loaded comments.
- Story posted: Dec 19, 2025 at 1:24 PM EST
- First comment: Dec 19, 2025 at 1:32 PM EST (8m after posting)
- Peak activity: 49 comments in the 0-12h window
- Average per period: 11.6 comments
- Latest activity: Dec 24, 2025 at 5:19 PM EST
They have a different definition of "licensing" than most people I guess. Aren't site operators complaining about Google using this "licensed" content in AI overviews... not to mention the scraping for AI model training.
The pot is calling the kettle black.
DDoS remains illegal regardless of robots.txt.
SerpApi doesn't have that privilege.
I'm probably just being naive though...
You can of course argue a lot of edge cases if you really want. For the most part I want to say "it isn't worth the argument". In some cases I will take your side if I really have to think about it, but in general the system google has been using mostly works and is mostly an acceptable compromise.
Also most people would agree they are fine with being indexed in general. That is different from email spam where people don't want it.
People are generally fine with indexing operations so long as you don't use too much bandwidth.
Using AI to summarize content is still an open question - I wouldn't be surprised if this develops into some form of "you can index but not summarize", but only time will tell.
Do you have an example of a court ruling that violating robots.txt violates an existing law?
In Ziff Davis v. OpenAI [1], a court with jurisdiction over the Southern District of New York found that violating robots.txt does not violate DMCA section 1201(a) (which prohibits circumvention of technological protection measures of copyrighted content).
Also, it's my understanding that robots.txt started as a voluntary rule and mostly remains voluntary.
[1] https://blog.ericgoldman.org/archives/2025/12/are-robots-txt...
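To make the "voluntary" point concrete, here is a minimal sketch of what honoring robots.txt looks like from the crawler's side, using Python's standard `urllib.robotparser`. The robots.txt content, the bot names (`ExampleBot`, `OtherBot`), and the `example.com` URLs are all made up for illustration. The check only happens because the crawler chooses to run it; nothing in the protocol stops a client from skipping it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only: one named crawler is
# allowed everywhere except /private/, every other crawler is disallowed.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules above permit this agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleBot", "https://example.com/public/page"),
        ("ExampleBot", "https://example.com/private/data"),
        ("OtherBot", "https://example.com/public/page"),
    ]:
        print(agent, url, allowed(agent, url))
```

Note that the result depends entirely on the user-agent string the crawler reports about itself, which is why the thread keeps coming back to scrapers that change or disguise their names.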
Along with all the other AI companies out there, they've committed the biggest theft in human history.
Adversarial interoperability is a digital human right. Either companies can provide it reasonably or the people will assert their rights through other means.
> Stealthy scrapers like SerpApi override those directives and give sites no choice at all. SerpApi uses shady back doors — like cloaking themselves, bombarding websites with massive networks of bots and giving their crawlers fake and constantly changing names — circumventing our security measures to take websites’ content wholesale. [...] SerpApi deceptively takes content that Google licenses from others (like images that appear in Knowledge Panels, real-time data in Search features and much more), and then resells it for a fee. In doing so, it willfully disregards the rights and directives of websites and providers whose content appears in Search.
To me this seems... interesting, for sure. I think that Google already set a bad precedent by pulling content from the web directly into its results, and an even worse one by paying websites with user-generated content for said content (while those sites didn't pay the users that actually made the user-generated content, as an additional bitchslap.)
But it seems like at the very least Google is suggesting that SerpApi is effectively trying to "steal" the work Google did, rather than do the same work themselves. Though I wonder if this is really Google pulling up the ladder behind them a bit, given how privileged of a position they are in with regards to web scraping.
It's a tough case. I think that something does need to ultimately be done about "malicious" web scraping that ignores robots.txt, but traditionally that sort of thing did not violate any laws, and I feel somewhat skeptical that it will be found to violate the law today. I mean, didn't LinkedIn try this same thing?
Like GoogleBot?
And yeah, robots.txt is not enforced by any law.
I think this is just about dragging SerpApi through a lengthy legal procedure and fees.
They abuse this power to scrape your work, summarize it and cut you out as much as possible. Pure value extraction of others' work without equal return. Now intensified with AI
But yeah, you're right. They're not deceptive
Nobody is forcing anyone. This is the same argument people made about Google search. Nobody is forcing anyone to use Google Search or Google Chrome, or even to allow Googlebot to scrape their site.
Thousands of people have switched over to ChatGPT, Brave/Firefox...
Your argument sounds like "I don't like Apple's practices, and I'm forced to buy iPhones. No buddy, if you don't like Apple, don't buy their products"
If you want people to visit your website, limiting yourself to the "thousands" of people who don't use google isn't really an option.
> Your argument sounds like "I don't like Apple's practices, and I'm forced to buy iPhones. No buddy, if you don't like Apple, don't buy their products"
Well, I don't like Apple's or Google's practices, but I basically [1] have to use either iOS or Android.
[1]: yes there are things like GrapheneOS and librem, but those aren't really practical for most people.
Then, if you do allow google to scrape, they will take your content and display it in the AI Overview with a best effort attempt to cut your click out completely.
Traffic to smaller sites is being decimated now due to AI Overview and other AI tools, including Gemini, that scrape all of this data and no longer direct traffic to the source sites.
This is certainly a case of the pot calling the kettle black.
And then pretending that they're fighting for other people's copyright is just the cherry on top of the pile of hypocrisy.
Then they bend over backwards and do the "but not like that!" crap with their legal team and swing their wealth and influence around to screw over other companies and people, and a vast majority of it just vanishes, gets memory holed, with NDAs and out of court settlements, so you never get to see the full scope of harm they inflict unless you're watching like a hawk and catch the headlines before they get disappeared.
Google needs to be broken up and we need to legislate the dismantling of the current adtech regime, with a privacy and sovereignty respecting digital bill of rights that puts the interests of individual citizens above that of giant corporate blobs and the mass surveillance data industry.
Reddit Accuses 'Data Scraper' Companies of Stealing Its Information
https://news.ycombinator.com/item?id=45695433
Our Response to Reddit, Inc. vs. SerpApi, LLC: Defending the First Amendment
https://news.ycombinator.com/item?id=45739889
> SerpApi’s answer to SearchGuard is to mask the hundreds of millions of automated queries it is sending to Google each day to make them appear as if they are coming from human users. SerpApi’s founder recently described the process as “creating fake browsers using a multitude of IP addresses that Google sees as normal users.”
* that's the sound of a ladder being yanked up
Their entire AI model was scraped.