Google Flags Immich Sites as Dangerous
Posted 3 months ago · Active 3 months ago
immich.app · Tech · Story · High profile
heated, negative · Debate · 85/100
Key topics
Google Safe Browsing
Self-Hosting
Web Security
Google flagged Immich, a self-hosted photo management platform, as a 'dangerous site', sparking a heated discussion on HN about the implications of such flagging and the potential for abuse of power by Google.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 53m after posting
Peak period: 55 comments (6-12h window)
Average per period: 17.8 comments
Comment distribution: 160 data points
Key moments
- 01 Story posted: Oct 22, 2025 at 4:53 PM EDT (3 months ago)
- 02 First comment: Oct 22, 2025 at 5:46 PM EDT (53m after posting)
- 03 Peak activity: 55 comments in the 6-12h window (hottest window of the conversation)
- 04 Latest activity: Oct 25, 2025 at 10:13 PM EDT (3 months ago)
ID: 45675015 · Type: story · Last synced: 11/27/2025, 3:36:12 PM
Want the full context? Read the primary article or dive into the live Hacker News thread.
https://old.reddit.com/r/immich/comments/1oby8fq/immich_is_a...
I had my personal domain I use for self-hosting flagged. I've had the domain for 25 years and it's never had a hint of spam, phishing, or even unintentional issues like compromised sites / services.
It's impossible to know what Google's black box is doing, but, in my case, I suspect the flagging was the result of failing to use a large email provider. I use MXRoute for locally hosted services and network devices because they do a better job of giving me simple, hard limits for sending accounts. That way, if anything I host ever gets compromised, the damage in terms of spam will be limited to (for example) 10 messages every 24h.
I invited my sister to a shared Immich album a couple days ago, so I'm guessing that GMail scanned the email notifying her, used the contents + some kind of not-google-or-microsoft sender penalty, and flagged the message as potential spam or phishing. From there, I'd assume the linked domain gets pushed into another system that eventually decides they should blacklist the whole domain.
The thing that really pisses me off is that I just received an email in reply to my request for review, and the whole thing is a gaslighting extravaganza: "Google systems indicate your domain no longer contains harmful links or downloads. Keep yourself safe in the future by blah blah blah blah."
Umm. No! It's actually Google's crappy, non-deterministic, careless detection that's flagging my legitimate resources as malicious. Then I have to spend my time running it down and double-checking everything before submitting a request to have the false positive on Google's end fixed.
Convince me that Google won't abuse this to make self hosting unbearable.
It seems like the flagging was a result of the same login-page detection that the Immich blog post is referencing? What makes you think it's tied to self-hosted email?
In my case, the Google Search Console explicitly listed the exact URL for a newly created shared album as the cause.
https://photos.example.com/albums/xxxxxxxx-xxxx-xxxx-xxxx-xx...
I wish I had taken a screenshot. That URL is not going to be guessed randomly, and it was only transmitted once, to one person, via e-mail. The sending was done via MXRoute and the recipient was using GMail (legacy Workspace).
The only possible way for Google to have gotten that URL to start the process would have been by scanning the recipient's e-mail. What I was trying to say is that the only way it makes sense to me is if Google via GMail categorized that email as phishing and that kicked off the process to add my domain to the block list.
So, if email categorization / filtering is being used as a heuristic for discovering URLs for the block list, it's possible Google is discriminating against domains that use smaller email hosts that Google doesn't trust as much as itself, Microsoft, etc.
All around it sucks and Google shouldn't be allowed to use non-deterministic guesswork to put domains on a block list that has a significant negative impact. If they want to operate a clown show like that, they should at least be liable for the outcomes IMO.
It's scary how much control Google has over which content people can access on the web - or even on their local network!
https://news.ycombinator.com/item?id=45538760
Doesn't that effectively let anyone host anything there?
It's more like sites.google.com.
SQLite used to have a limit of 999 query parameters, which was much easier to hit. It's now a roomy 32k.
We probably should have been partitioning the data instead of inserting it twice, but I never got around to fixing that.
COPY is likely a better option if you have access to the host, or provider-specific extensions like aws_s3 if you have those. I'm sure a data engineer would be able to suggest a better ETL architecture than "shove everything into postgres", too.
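For anyone curious what the COPY route looks like from application code, here is a minimal sketch assuming a Node/TypeScript stack with the pg and pg-copy-streams packages; the photos table, its columns, and the row shape are made up purely for illustration.

```ts
import { Pool } from "pg";
import { from as copyFrom } from "pg-copy-streams";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Hypothetical row shape and table, for illustration only.
interface PhotoRow {
  id: string;
  takenAt: string; // ISO timestamp
  path: string;
}

// Stream rows into Postgres with COPY instead of a huge multi-row INSERT,
// which sidesteps the bind-parameter limit entirely.
async function bulkLoad(pool: Pool, rows: PhotoRow[]): Promise<void> {
  const client = await pool.connect();
  try {
    const copyStream = client.query(
      copyFrom("COPY photos (id, taken_at, path) FROM STDIN")
    );
    // COPY's default text format is tab-separated, newline-terminated.
    // Real data containing tabs, newlines, or backslashes would need escaping.
    const source = Readable.from(
      rows.map((r) => `${r.id}\t${r.takenAt}\t${r.path}\n`)
    );
    await pipeline(source, copyStream);
  } finally {
    client.release();
  }
}
```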
is even funnier :D
>Some phones will silently strip GPS data from images when apps without location permission try to access them.
That strikes me as the right thing to do?
And wait. Uh oh. Does this mean my Syncthing-Fork app (which itself would never strike me as needing location services) might have my phone's images' location data stripped before they make their way to my backup system?
EDIT: To answer my last question: My images transferred via Syncthing-Fork on a GrapheneOS device to another PC running Fedora Atomic have persisted the GPS data as verified by exiftool. Location permissions have not been granted to Syncthing-Fork.
Happy I didn't lose that data. But it would appear that permission to your photo files may expose your GPS locations regardless of the location permission.
Looking now, I can't even find that setting anymore on my current phone. But the photos still do have the GPS data intact.
Yep, and it's there for very good reasons. However, if you don't know about it, it can be quite surprising and challenging to debug.
Also, it's annoying when your phone's permissions optimiser runs and removes the location permission from e.g. Google Photos, and you realise a few months later that your photos no longer have their location.
What happens is that when an application without location permissions tries to get photos, the corresponding OS calls strip the geo location data when passing them. The original photos still have it, but the application doesn't, because it doesn't have access to your location.
This was done because most people didn't know that photos contain their location, and people got burned by stalkers and scammers.
Every kind of permission should fail the same way, informing the user about the failure, and asking if the user wants to give the permission, deny the access, or use dummy values. If there's more than one permission needed for an operation, you should be able to deny them all, or use any combination of allowing or using dummy values.
Try to get an iPhone user to send you an original copy of a photo with all metadata. Even if they want to do it most of them don't know how.
I don't disagree that months should be 1-indexed, but I would not make that assumption solely based on days/years being 1-indexed, since 0-indexing those would be psychotic.
I don't think adding counterintuitive behavior to your data to save a "- 1" here and there is a good idea, but I guess this is just legacy from the ancient times.
Can't wait for it to be stable and widely available, it's just too good.
> month values start at 1, which is different from legacy Date where months are represented by zero-based indices (0 to 11)
[0] https://tc39.es/proposal-temporal/docs/
[1] https://tc39.es/proposal-temporal/docs/plaindate.html#month
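A quick illustration of the difference; Temporal is still a TC39 proposal, so the import below assumes the @js-temporal/polyfill package rather than native support.

```ts
import { Temporal } from "@js-temporal/polyfill"; // polyfill until Temporal ships natively

// Legacy Date: years and days are 1-based, but months are 0-based.
const legacy = new Date(2025, 0, 15); // January 15, 2025
console.log(legacy.getMonth()); // 0, even though this is January

// Temporal: months are 1-based, matching how dates are written.
const plain = Temporal.PlainDate.from({ year: 2025, month: 1, day: 15 });
console.log(plain.month);      // 1
console.log(plain.toString()); // "2025-01-15"
```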
For example, the first day of the first month of the first year is 1.1.1 AD (at least in the Gregorian calendar), so we could just go with a 0-indexed 0.0.0 AD.
I've now read the entire Cursed Knowledge list & - while I found some of them to be invaluable insights & absolutely love the idea of projects maintaining a public list of this nature to educate - there are quite a few red flags in this particular list.
Before mentioning them: some excellent & valuable, genuinely cursed items: Postgres NOTIFY (albeit adapter-specific), npm scripts, bcrypt string lengths & especially the horrifically cursed Cloudflare fetch: all great knowledge. But...
> Secure contexts are cursed
> GPS sharing on mobile is cursed
These are extremely sane security features. Do we think keeping users secure is cursed? It honestly seems crazy to me that they published these items in the list with a straight face.
> PostgreSQL parameters are cursed
Wherein their definition of "cursed" is that PG doesn't support running SQL queries with more than 65535 separate parameters! It seems to me that any sane engineer would expect the limit to be lower than that. The suggestion that making an SQL query with that many parameters is normal seems problematic.
> JavaScript Date objects are cursed
JavaScript is zero-indexed by convention. This one's not a huge red flag, but it is pretty funny for a programmer to find this problematic.
> Carriage returns in bash scripts are cursed
Non-default local git settings can break your local git repo. This has nothing to do with bash, and everyone knows git has footguns.
Also the full story here seemed to be
1. Person installs git on Windows with autocrlf enabled, automatically converting all LF to CRLF (very cursed in itself in my opinion).
2. Does their thing with git on the Windows' side (clone, checkout, whatever).
3. Then runs the checked out (and now broken due to autocrlf) code on Linux instead of Windows via WSL.
The biggest footgun here is autocrlf, but I don't see how this whole situation is the problem of any Linux tooling.
TL;DR - if your repo will contain bash scripts, use .gitattributes to make sure they have LF line endings.
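A minimal .gitattributes along those lines (the patterns are just examples; adjust them to the repo's layout):

```gitattributes
# Always check out shell scripts with LF, regardless of core.autocrlf
*.sh   text eol=lf
# Keep Windows-only scripts as CRLF if the repo has any
*.bat  text eol=crlf
*.ps1  text eol=crlf
```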
If git didn't have this setting, then after checking out a bash file with LFs in it, there are many Windows editors that would not be able to edit that file properly. That's a limitation of those editors & nobody should be using those pieces of software to edit bash files. This is a problem that is entirely out of scope for a VCS & not something Git should ever have tried to solve.
In fact, having git solve this disincentivizes Windows editors from solving it correctly.
Well, bash could also handle CRLF nicely. There's no gain from interpreting CR as a non-space character.
(The same is valid for every language out there and all the spacey things, like zero-width space, non-breaking space, and vertical tabs.)
This is just a list of things that can catch devs off guard.
> JavaScript date objects are 1 indexed for years and days, but 0 indexed for months.
This mix of 0 and 1 indexing in calendar APIs goes back a long way. I first remember it coming from Java but I dimly recall Java was copying a Taligent Calendar API.
Dark-grey text on black is cursed. (Their light theme is readable.)
Also, you can do bulk inserts in Postgres using arrays. Take a look at unnest. Standard bulk inserts are cursed in every database; I'm with the devs here that it's not worth fixing them in Postgres just for compatibility.
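A rough sketch of the unnest approach with node-postgres (table and column names are hypothetical): each column becomes a single array parameter, so a batch of any size uses three bind parameters instead of rows × columns.

```ts
import { Pool } from "pg";

// Hypothetical row shape, for illustration only.
interface PhotoRow {
  id: string;
  takenAt: string; // ISO timestamp
  path: string;
}

// One bind parameter per column (each an array) instead of rows * columns
// parameters, so the 65535 limit stops mattering for large batches.
async function bulkInsertWithUnnest(pool: Pool, rows: PhotoRow[]): Promise<void> {
  await pool.query(
    `INSERT INTO photos (id, taken_at, path)
     SELECT * FROM unnest($1::uuid[], $2::timestamptz[], $3::text[])`,
    [
      rows.map((r) => r.id),
      rows.map((r) => r.takenAt),
      rows.map((r) => r.path),
    ]
  );
}
```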
It's unclear exactly what conditions cause a site to get blocked by Safe Browsing. My nextcloud.something.tld domain has never been flagged, but I've seen support threads of other people having issues, and the domain name is the best guess.
https://photos.example.com/albums/xxxxxxxx-xxxx-xxxx-xxxx-xx...
Then suddenly the domain is banned even though there was never a way to discover that URL besides GMail scanning messages. In my case, the server is public so my siblings can access it, but there's nothing stopping Google from banning domains for internal sites that show up in emails they wrongly classify as phishing.
Think of how Google and Microsoft destroyed self hosted email with their spam filters. Now imagine that happening to all self hosted services via abuse of the safe browsing block lists.
https://photos.example.com/albums/xxxxxxxx-xxxx-xxxx-xxxx-xx...
That's not going to be gleaned from a CT log or guessed randomly. The URL was only transmitted once to one person via e-mail. The sending was done via MXRoute and the recipient was using GMail (legacy Workspace).
The only possible way for Google to have gotten that URL to start the process would have been by scanning the recipient's e-mail.
I've read almost everything linked in this post and on Reddit and, with what you pointed out considered, I'd say the most likely thing that got my domain flagged is having a redirect to a default-styled login page.
The thing that really frustrates me, if that's the case, is that it has a large impact on non-customized self-hosted services and Google makes no effort to avoid the false positives. Something as simple as guidance for self-hosted apps to use a custom login screen to differentiate themselves from each other would make a huge difference.
Of course, it's beneficial to Google if they can make self-hosting as difficult as possible, so there's no incentive to fix things like this.
Also, when you say banned, you're speaking of the "red screen of death", right? Not a broader ban from the domain using Google Workspace services, yeah?
Yes.
> I would love for someone to attempt this in as controlled of a manner as possible.
I'm pretty confident they scanned a URL in GMail to trigger the blocking of my domain. If they've done something as stupid as tying GMail phishing detection heuristics into the safe browsing block list, you might be able to generate a bunch of phishy looking emails with direct links to someone's login page to trigger the "red screen of death".
I'm guessing Google's phishing analysis must be going off the rails seeing all of these login prompts saying "immich" when there's an actual immich cloud product online.
If I were tasked with automatically finding phishing pages, I too would struggle to find a solution to differentiate open-source, self-hosted software from phishing pages.
I find it curious that this is happening to Immich so often while none of my own self-hosted services have ever had this problem, though. Maybe this is why so many self-hosted tools have you configure a name/descriptor/title/whatever for your instance, so they can say "log in to <my amazing photo site>" rather than "log in to Product"? Not that Immich doesn't offer such a setting.
Possible scenario:
- A self-hosted project has a demo instance with a default login page (demo.immich.app, demo.jellyfin.org, demo1.nextcloud.com) that is classified as "primary" by Google's algorithms
- Any self-hosted instance with the same login page (branding, title, logo, meta html) becomes a candidate for deceptive/phishing by their algorithm. And immich.cloud has a lot of preview envs falling in that category.
BUT in Immich's case, its _demo_ login page has its own big banner, so it is already quite different from the others. Maybe there's no "original" at all. The algorithm/AI just got lost among thousands of identical-looking login pages and now considers every other instance deceptive...
Normally I see the PSL in context of e.g. cookies or user-supplied forms.
Yes. For instance in circumstances exactly as described in the thread you are commenting in now and the article it refers to.
Services like Google's bad-site warning system may use it to indicate that it shouldn't consider a whole domain harmful if it considers a small number of its subdomains to be so, where otherwise it would. It is no guarantee, of course.
For example, if users are supposed to log in on the base account in order to access content on the subdomains, then using the public suffix list would be problematic.
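To make the PSL point concrete, here is a tiny sketch using the npm psl package (which bundles a snapshot of the Public Suffix List); the hostnames are just examples of how a reputation system might compute the registrable domain it keys on.

```ts
import psl from "psl"; // ships with a bundled copy of the Public Suffix List

// The "registrable" domain (eTLD+1) is what a reputation system could key on
// instead of the full hostname.
console.log(psl.get("photos.example.co.uk")); // "example.co.uk"
console.log(psl.get("foo.bar.example.com"));  // "example.com"

// github.io is on the PSL, so each user site counts as its own registrable domain.
console.log(psl.get("alice.github.io"));      // "alice.github.io"
```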
I'm not sure how people who haven't already hit this very issue are supposed to know about it beforehand, though; it's one of those things you don't really come across until you're hit by it.
It's fun learning new things so often, but I had never once heard of the public suffix list.
That said, I do know the other best practices mentioned elsewhere.
Which then links to: https://github.com/publicsuffix/list/wiki/Guidelines#submitt...
Fairly obvious and typical webpage > documentation flow I think, doesn't seem too hard to find.
Google from the '90s to 2010 was nothing like Google in 2025. There is a reason they removed "Don't be evil" ... being evil and authoritarian makes more money.
Looking at you Manifest V2 ... pour one out for your homies.
This is the first thing I disable in Chrome, Firefox and Edge. The only safe thing they do is safely send all my browsing history to Google or Microsoft.
This feature is there for my mother-in-law, who never saw a popup ad she didn't like. You might think I'm kidding; I am not. I periodically had to go into her Android device and dump twenty apps she had manually installed from the Play Store because they were in a ring of apps promoting each other.
Well, if the legal system used the same "Guilty until proven innocent" model, we would definitely "catch more bad actors than false positive good actors".
That's a tricky one, isn't it.
A better analogy, unfortunately for all the reasons it's unfortunate, is police: acting on the partial knowledge in the field to try to make the not-worst decision.
Google needs to be held liable for the damages they do in cases like this or they will continue to implement the laziest solutions as long as they can externalize the costs.
Many Google employees are in here, so I don't expect them to agree with you.
I wish this comment were top ranked so it would be clear immediately from the comments what the root issue was.
For example:
At this point, if someone else on that hosting provider gets that IP address assigned, your subdomain is now hosting their content. I had this happen to me once with PDF books being served through a subdomain on my site. Of course it's my mistake for not removing the A record (I forgot), but I'll never make that mistake again.
10 years of my domain having a good history may have been tainted in an irreparable way. I don't get warnings visiting my site, but traffic has slowly gotten worse since around that time, despite me posting more and more content. The correlation isn't guaranteed, especially with AI taking away so much traffic, but it's something I do think about.
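A small sketch of how one might audit for the dangling-record situation described above, using Node's built-in dns/promises; the hostnames and IP addresses are made up.

```ts
import { resolve4 } from "node:dns/promises";

// IPs you currently control; anything else is a candidate dangling record.
const ownedIps = new Set(["203.0.113.10", "203.0.113.11"]);

// Example subdomains to audit (hypothetical).
const subdomains = ["photos.example.com", "old-demo.example.com"];

async function auditDanglingRecords(): Promise<void> {
  for (const host of subdomains) {
    try {
      const ips = await resolve4(host);
      const strays = ips.filter((ip) => !ownedIps.has(ip));
      if (strays.length > 0) {
        console.warn(`${host} points at IPs you no longer control: ${strays.join(", ")}`);
      }
    } catch {
      // NXDOMAIN or resolution failure: nothing to clean up for this host.
    }
  }
}

void auditDanglingRecords();
```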
This is very clearly just bad code from Google.
God I hate the web. The engineering equivalent of a car made of duct tape.
Kind of. But do you have a better proposition?
End of random rant.
But then you would lose platform independence, the main selling point of this atrocity.
Having all those APIs in a sandbox that mostly just works on billions of devices is pretty powerful, and a potential successor to HTML would have to beat that to be adopted.
The best thing that could happen, as far as I can see, is that a sane subset crystallizes that people start to use predominantly, with the rest becoming legacy, maintained only to keep it working.
I have dreamed of a fresh rewrite of the web since university (and the web was way slimmer back then), but I've become a bit more pragmatic, and I think I now understand the massive problem of solving trusted human communication better. It ain't easy in the real world.
This all just drives a need to come up with ever more tacked-on protection schemes because browsers have big targets painted on them.
And that's before realizing it's already a bad idea with existing devices because they were never designed for giving untrusted actors direct access.
Anyway, in your scenario the controller would essentially be a one-off, and you'd be better off writing a native app to interface with it for the one computer this experiment will run on.
Not unlike the programming language or the app (growing until it half-implements LISP or half-implements an email client), the browser will grow until it half-implements an operating system.
For everyone else, there's already w3m.
You have sites now that let you debug microcontrollers on your browser, super cool.
Same thing but with firmware updates in the browser. Cross platform, replaced a mess of ugly broken vendor tools.
Your microcontrollers should use open standards for their debugging interface and not force people to use the vendor website.
You remove that, and videoconferencing (for business or person to person) has to rely on downloading an app, meaning whoever is behind the website has to release for 10-15 OSes now. Some already do, but not everyone has that budget so now there's a massive moat around it.
> But do we need e.g serial port or raw USB access straight from a random website
Being able to flash an IoT (e.g. ESP32) device from the browser is useful for a lot of people. For the "normies", there was also Stadia allowing you to flash their controller to be a generic Bluetooth/USB one on a website, using WebUSB. Without it, Google would have had to release an app for multiple OSes, or more likely, would have just left the devices as paperweights. Also, you can use FIDO/U2F keys directly now, which is pretty good.
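For a sense of how little ceremony the browser side involves, here is a sketch of the Web Serial handshake a browser-based flasher typically starts with (Web Serial is a sibling of WebUSB commonly used for ESP32 work). It is Chromium-only, needs HTTPS and a user gesture, the vendor-ID filter is just an example, and the cast stands in for the w3c-web-serial typings.

```ts
// Web Serial must be triggered from a user gesture on a secure (HTTPS) page.
async function connectToBoard(): Promise<void> {
  const serial = (navigator as unknown as { serial?: any }).serial;
  if (!serial) {
    console.log("Web Serial not supported in this browser");
    return;
  }

  // The browser shows a device picker; the page never sees devices the user
  // didn't explicitly select.
  const port = await serial.requestPort({
    filters: [{ usbVendorId: 0x10c4 }], // e.g. a common USB-to-UART bridge vendor ID
  });

  await port.open({ baudRate: 115200 });

  // Read one chunk of whatever the device prints on boot, then clean up.
  const reader = port.readable.getReader();
  const { value } = await reader.read();
  console.log("received", value);
  reader.releaseLock();
  await port.close();
}
```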
Browsers are the modern Excel, people complain that they do too much and you only need 20%. But it's a different 20% for everyone.
Same as your camera/microphone/location.
But do we need audio, images, Canvas, WebGL, etc.? The web could just be plain text and we'd still get most of the "useful" content; add images and you get the vast majority of it.
But the idea that the web is a rich environment that has all of these bells and whistles is a good thing imo. Yes there’s attack surface to consider, and it’s not negligible. However, the ability to connect so many different things opens up simple access to things that would otherwise require discrete apps and tooling.
One example that kind of blew my mind is that I wanted a controller overlay for my Twitch stream. After a short bit of looking, there isn’t even a plugin needed in OBS (streaming software). Instead, you add a Web View layer and point it to GamePad Viewer[1] and you’re done.
Serial and USB are possibly a boon for very specific users with very specific accessibility needs. Also, IIRC some of the early iPhone jailbreaks worked via websites on a desktop with your iPhone plugged into USB. Sure, these are niche and could probably be served just as well or better with native apps, but the web also makes the barrier to entry so much lower.
[1]: https://gamepadviewer.com/
Yes. Regards, CIA, Mossad, FSB etc.
WebUSB I don't use and wouldn't miss right now, but... the main potential use case is security, and it sounds somewhat reasonable:
"Use in multi-factor authentication
WebUSB in combination with special purpose devices and public identification registries can be used as key piece in an infrastructure scale solution to digital identity on the internet."
https://en.wikipedia.org/wiki/WebUSB
I think the giant downside is that they've written a rootkit that runs on everything, and to try to make up for that, they want to make it so that only sites they allow can run.
It's not really very powerful at all if nobody can use it, at that point you are better off just not bothering with it at all.
The Internet may remain, but the Web may really be dead.
What do you mean, you can run whatever you want on localhost, and it's quite easy to host whatever you want for whoever you want too. Maybe the biggest modern added barrier to entry is that having TLS is strongly encouraged/even needed for some things, but this is an easily solved problem.
But people do use it, like the both of us right now?
People also use maps, do online banking, play games, start complex interactive learning environments, collaborate in real time on documents etc.
All of that works right now.
I don't see how that solves the issue the PSL tries to fix. I was a script kiddie hosting Neopets phishing pages on free cPanel servers from <random>.ripway.com back in 2007. Browsers were way less capable then.
It's not even broken, as the edge cases are addressed by ad-hoc solutions.
OP is complaining about global infrastructure not having a pristine design. At best it's a complaint about a desirable trait. It's hardly a reason to pull the junior-developer card and mindlessly advocate for throwing everything out and starting over.
We live in a world where whatever FAANG adopts is the de facto standard. "Accessible" these days means Google/Gmail/Facebook/Instagram/TikTok works. Everything else is usually forced to follow along.
People will adopt whatever gives them access to their daily dose of doomscrolling and then complain about a rather crucial part of their lives, like online banking, not working.
> And of course, if the new solution completely invalidates old sites, it just won't get picked up.
Old sites don't matter, only high-traffic sites riddled with dark patterns matter. That's the reality, even if it is harsh.
Try the '90s! We had to fight off ActiveX plugins left and right in good olde Internet Explorer! Yarr! ;-)
This might be what's needed to break out of the current local optimum.
https://www.uzbl.org/
[1] https://www.uzbl.org/
which is still much too new to be able to shut down the PSL, of course. But maybe in 2050.
528 more comments available on Hacker News