Nano Banana Pro
Mood: excited
Sentiment: positive
Category: startup_launch
Key topics: Artificial Intelligence, Machine Learning
Discussion activity: very active
First comment: 2m after posting
Peak period: 160 comments (Day 1)
Avg / period: 160
Based on 160 loaded comments
Key moments
- 01 Story posted: Nov 20, 2025 at 10:04 AM EST
- 02 First comment: Nov 20, 2025 at 10:06 AM EST (2m after posting)
- 03 Peak activity: 160 comments in Day 1 (hottest window of the conversation)
- 04 Latest activity: Nov 20, 2025 at 3:50 PM EST
But of course there’s no way to enforce it on local generation.
Google doesn't claim that Gemini would call SynthID detector at this point.
Edit: well they actually do. I guess it is not rolled out yet.
> Today, we are putting a powerful verification tool directly in consumers’ hands: you can now upload an image into the Gemini app and simply ask if it was generated by Google AI, thanks to SynthID technology. We are starting with images, but will expand to audio and video soon.
Re-rolling a few times got it to mention trying SynthID, but as a false negative, assuming it actually did the check and isn't just bullshitting.
> No Digital Watermark Detected: I was unable to detect any digital watermarks (such as Google's SynthID) that would definitively label it as being generated by a specific AI tool.
This would be a lot simpler if they just exposed the detector directly, but apparently the future is coaxing an LLM into doing a tool call and then second guessing whether it actually ran the tool.
Not sure how that makes any sense
edit: apparently people have been able to remove these watermarks with a high success rate so already this feels like a DOA product
No, it's not the beginning; multiple different watermarking standards, watermark checking systems, and, of course, published countermeasures of various effectiveness for most of them, have been around for a while.
(The Gemini 3 post has a million comments, too many to ask this now)
Gemini 2 still goes "While I cannot check Google Flights directly, I can provide you with information based on current search results…" blah blah
But I wouldn't mind being easily able to make infographics like these, I'd just like to supply the textual and factual content myself.
After launch, Google's public branding for the product was "Gemini" until Google just decided to lean in and fully adopt the vastly more popular "Nano Banana" label.
The public named this product, not Google. Google's internal codename went viral and upstaged the official name.
Branding matters for distribution. When you install yourself into the public consciousness with a name, you'd better use the name. It's free distribution. You own human wetware market share for free. You're alive in the minds of the public.
Renaming things every human has brand recognition of, eg. HBO -> Max, is stupid. It doesn't matter if the name sucks. ChatGPT as a name sucks. But everyone in the world knows it.
This will forever be Nano Banana unless they deprecate the product.
Failed to generate content: permission denied. Please try again.
If you triggered the safeguard it'll give you the typical "sorry, I can't..." LLM response.
ChatGPT's imagegen has been released for half a year but there isn't anything remotely similar to it in the open weight realm.
Assuming that this new model works as advertised, it's interesting to me that it took this long to get an image generation model that can reliably generate text. Why is text generation in images so hard?
- It requires an AI that actually understands English, i.e. an LLM. Older, diffusion-only models were naturally terrible at that, because they weren’t trained on it.
- It requires the AI to make no mistakes on image rendering, and that’s a high bar. Mistakes in image generation are so common we have memes about it, and for all that hands generally work fine now, the rest of the picture is full of mistakes you can’t tell are mistakes. Entirely impossible with text.
Nano Banana Pro seems to somewhat reliably produce entire pictures without any mistakes at all.
Looks like: "When tested on images marked with Google’s SynthID, the technique used in the example images above, Kassis says that UnMarker successfully removed 79 percent of watermarks." From https://spectrum.ieee.org/ai-watermark-remover
> Rolling out globally in the Gemini app
wanna be any more vague? is it out or not? where? when?
And in AI Studio, you need to connect a paid API key to use it:
https://aistudio.google.com/prompts/new_chat?model=gemini-3-...
> Nano Banana Pro is only available for paid-tier users. Link a paid API key to access higher rate limits, advanced features, and more.
I had second thoughts about this comment, but if I stopped typing in the middle of it, I would've had to pay a cancellation fee.
Adobe, at least, makes money by selling software. Google makes money by capturing eyeballs; only incidentally does anything they do benefit the user.
The 2nd take is that AI is costing companies so much money that they need to cut workforce to pay for their AI investments.
I'm inclined to think the latter represents what's happening more than the former.
Like it would be nice if all photo and video generated by the big players would have some kind of standardized identifier on them - but now you're left with the bajillion other "grey market" models that won't give a damn about that.
I don't see how it would defeat the cat and mouse game.
For example, it's trivial to post an advertisement without disclosure. Yet it's illegal, so large players mostly comply and harm is less likely on the whole.
It still won't prevent it, but it would prevent large players from doing it.
Plus, any service good at reverse-image search (like Google) can basically apply that to determine whether they generated it.
There will always be a way to defeat anything, but I don't see why this won't work for like 90% of cases.
It may be easier if you have an oracle on your end to say "yes, this image has/does not have the watermark," which could be the case for some proposed implementations of an AI watermark. (Often the use-case for digital watermarks assumes that the watermarker keeps the evaluation tool secret - this lets them find, e.g, people who leak early screenings of movies.)
Always has been so far. You add noise until the signal gets swamped. In order to remain imperceptible it's a tiny signal, so it's easy to swamp.
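To make the "add noise until the signal gets swamped" point concrete, here is a minimal toy sketch, assuming a simple additive spread-spectrum watermark with a correlation detector. This is not how SynthID works (its design is not described in this thread); the function names, key, strength, and threshold are all made up for illustration. It only shows the direction of the effect, and says nothing about how much noise a real scheme tolerates.

```python
import math
import random

def make_pattern(n, key):
    """Secret pseudo-random +/-1 pattern derived from a key (hypothetical scheme)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(host, pattern, strength):
    """Add a faint copy of the pattern to the host signal."""
    return [h + strength * p for h, p in zip(host, pattern)]

def add_noise(signal, sigma, seed):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in signal]

def z_score(signal, pattern):
    """Correlation between signal and pattern, in units of its standard error."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    corr = sum(c * p for c, p in zip(centered, pattern)) / n
    std = math.sqrt(sum(c * c for c in centered) / n)
    return corr / (std / math.sqrt(n))

# Toy "image": 20,000 abstract coefficients, not real pixels.
N = 20_000
host_rng = random.Random(1)
host = [host_rng.gauss(128.0, 20.0) for _ in range(N)]
pattern = make_pattern(N, key=42)
marked = embed(host, pattern, strength=2.0)

THRESHOLD = 5.0  # assumed detection threshold for this toy detector

for sigma in (0.0, 50.0, 400.0):
    z = z_score(add_noise(marked, sigma, seed=7), pattern)
    print(f"noise sigma={sigma:6.1f}  z={z:6.2f}  detected={z > THRESHOLD}")
```

The detector's score falls as the noise grows because the watermark's contribution stays fixed while the variance it competes against rises. In this toy the noise needed to defeat detection is far from imperceptible, which is the trade-off the attacker faces; published removal attacks are considerably more targeted than plain noise.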
No, but model training technology is out in the open, so it will continue to be possible to train models and build model toolchains that just don't incorporate watermarking at all, which is what any motivated actor seeking to mislead will do; the only thing watermarking will do is train people to accept its absence as a sign of reliability, increasing the effectiveness of fakes by motivated bad actors.
We will always have local models. Eventually the Chinese will release a Nano Banana equivalent as open source.
https://generative-ai.review/2025/09/september-2025-image-ge... (non-pro Nano Banana)
If watermarking becomes a legal mandate, it will inevitably include a prohibition on distributing (and using and maybe even possessing, but the distribution ban is the thing that will have the most impact, since it is the part that is most policable, and most people aren't going to be training their own models, except, of course, the most motivated bad actors) open models that do not include watermarking as a baked-in model feature. So, for most users, it'll be much less accessible (and, at the same time, it won't solve the problem.)
> have some kind of standardized identifier on them
Take this a step further and it'll be a personal identifying watermark (only the company can decode). Home printers already do this to some degree.
All of this is ceremony that is trivially easy to circumvent.
Google is doing this to deflect litigation and to preserve their brand in the face of negative press.
They'll do this (1) as long as they're the market leader, (2) as long as there aren't dozens of other similar products - especially ones available as open source, (3) as long as the public is still freaked out / new to the idea anyone can make images and video of whatever, and (4) as long as the signing compute doesn't eat into the bottom line once everyone in the world has uniform access to the tech.
The idea here is that {law enforcement, lawyers, journalists} find a deep fake {illegal, porn, libelous, controversial} image and go to Google to ask who made it. That only works for so long, if at all. Once everyone can do this and the lookup hit rates (or even inquiries) are < 0.01%, it'll go away.
It's really so you can tell journalists "we did our very best" so that they shut up and stop writing bad articles about "Google causing harm" and "Google enabling the bad guys".
We're just in the awkward phase where everyone is freaking out that you can make images of Trump wearing a bikini, Tim Cook saying he hates Apple and loves Samsung, or the South Park kids deep faking each other into silly circumstances. In ten years, this will be normal for everyone.
Writing the sentence "Dr. Phil eats a bagel" is no different than writing the prompt "Dr. Phil eats a bagel". The former has been easy to do for centuries and required the brain to do some work to visualize. Now we have tools that previsualize and get those ideas as pixels into the brain a little faster than ASCII/UTF-8 graphemes. At the end of the day, it's the same thing.
And you'll recall that various forms of written text - and indeed, speech itself - have been illegal in various times, places, and jurisdictions throughout history. You didn't insult Caesar, you didn't blaspheme the medieval church, and you don't libel in America today.
How can they distinguish real people being exploited from AI models autogenerating everything?
I mean right now this is possible, largely because a lot of the AI videos have shortcomings. But imagine 5 years from now...
The people who care don't consume content which even just plausibly looks like real people exploited. They wouldn't consume the content even if you pinky promised that the exploited looking people are not real people. Even if you digitally signed that promise.
The people who don't care don't care.
Watermarking by compliant models doesn't help this much because (1) models without watermarking exist and can continue to be developed (especially if absence of a watermark is treated as a sign of authenticity), so you cannot rely on AI fakery being watermarked, and (2) AI models can be used for video-to-video generation without changing much of the source, so you can't rely on something accurately watermarked as "AI-generated" not being based in actual exploitation.
Now, if the watermarking includes provenance information, and you require certain types of content to be watermarked not just as AI using a known watermarking system, but by a registered AI provider with regulated input data safety guardrails and/or retention requirements, and be traceable to a registered user, and...
Well, then it does something when it is present, largely by creating a new content gatekeeping cartel.
So, you exploit real people, but run your images through a realtime AI video transformation model doing either a close-to-noop transformation or something like changing the background so that it can't be used to identify the actual location if people do figure out you are exploiting real people, and then you have your real exploitation watermarked as AI fakery.
I don't think this is solving a problem, unless you mean a problem for the would-be exploiter.
I generally don't find the arguments people put forward compelling -- for example, in this thread, around protecting against counterfeit.
The "force" applied to address these concerns is totally out of proportion. Whenever these discussions happen, I feel like they descend into a general viewpoint, "if we could technically solve any possible crime, we should do everything in our power to solve it."
I'm against this viewpoint, and acknowledge that that means _some crime_ occurs. That's acceptable to me. I don't feel that society is correctly structured to "treat" crime appropriately, and technology has outpaced our ability to holistically address it.
Generally, I don't see (speaking for the US) the highest incarceration rate in the world as a good thing, or as generally effective, and I don't believe that increasing that number will change outcomes.
> Were politicians 20 years ago as overreactive they'd have demanded Photoshop leave a trace on anything it edited.
I think that by now it should be crystal clear to everyone that the sheer scale a new technology permits for $nefarious_intent matters a lot.
Knives (under a certain size) are not regulated. Guns are regulated in most countries. Atomic bombs are definitely regulated. They can all kill people if used badly, though.
When a photo was faked/composed with old tech, it was relatively easy to spot. With Photoshop, it became more complicated to spot it but at the same time it wasn't easy to mass-produce altered images. Large models are changing the rules here as well.
I don’t think this is a good comparison: knives are easy to produce, guns a bit harder, atomic bombs definitely harder. You should find something that is as easy to produce as a knife, but regulated.
The DEA and ATF have entered the chat
Or, if you see the altered photo as the "product", then the "product" of the knife/gun/bomb is the damage it creates to a human body.
The story of human history is newer generations freaking out about progress and novel changes that have never been seen before. And later generations being perfectly okay with it and adapting to a new style of life.
We could use the opportunity to deploy robust systems of verification and validation to all digital works. One that allows for proving authenticity while respecting privacy if desired. For example… it’s insane in the US we revolve around a paper social security number that we know damn well isn’t unique. Or that it’s a massive pain in the ass for most people to even check the hash of a download.
Guess which we’ll do!
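On the aside above about checking the hash of a download: the check itself is only a few lines. A minimal sketch using Python's standard `hashlib`, assuming the publisher lists a SHA-256 hex digest next to the file; the function names here are illustrative, not from any particular tool.

```python
import hashlib
import hmac

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read 1 MiB at a time so large downloads don't need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path, published_hex):
    """Compare the file's digest with the one the publisher lists."""
    actual = sha256_of_file(path)
    # Normalize whitespace and case before comparing.
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

Usage would look like `matches_published_hash("installer.bin", "<digest copied from the download page>")`. A matching hash only shows the file is the one the page describes; it says nothing about whether the page itself is trustworthy.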
But people with actual nefarious intent will easily be able to remove these watermarks, however they're implemented. This is copy protection and key escrow all over again - it hurts honest people and doesn't even slow down bad people.
https://www.nbcnews.com/tech/tech-news/ai-generated-evidence...
> “My wife and I have been together for over 30 years, and she has my voice everywhere,” Schlegel said. “She could easily clone my voice on free or inexpensive software to create a threatening message that sounds like it’s from me and walk into any courthouse around the country with that recording.”
> “The judge will sign that restraining order. They will sign every single time,” said Schlegel, referring to the hypothetical recording. “So you lose your cat, dog, guns, house, you lose everything.”
At the moment, the only alternative is courts simply never accept photo/video/audio as evidence. I know if I were a juror I wouldn't.
At the same time, yeah, watermarks won't work. Sure, Google can add a watermark/fingerprint that is impossible to remove, but there will be tools that won't put such watermarks/fingerprints.
Image verification has never been easy. People have been airbrushed out of and pasted into photos for over a century; AI just makes it easier and more accessible. Expecting a “click to verify” workflow is as unreasonable as it has ever been; only media literacy and a bit of legwork can accomplish this task.
Photo-of-a-screen: https://gemini.google.com/share/ab587bdcd03e
It reported 25-50% for the image without having been through that analog hole: https://gemini.google.com/share/022e486fd6bf
I bet it will be called "Real Photos" or something like that, and the pictures will be signed by the camera hardware. Then iMessage will put a special border around it or something, so that when people share the photos with other Apple users they can prove that it was a real photo taken with their phone's camera.
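A rough sketch of what "signed by the camera hardware" could mean, under loud assumptions: a real scheme would most likely use an asymmetric signature with a key held in secure hardware, so that verifiers never see the secret. Python's standard library has no asymmetric signing, so this sketch swaps in an HMAC with a shared key purely to show the sign-then-verify shape. The key and function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical per-device secret; in a real design this would never leave
# the camera's secure hardware, and a public key would be used to verify.
DEVICE_KEY = b"hypothetical-per-device-secret"

def sign_photo(image_bytes, key=DEVICE_KEY):
    """Produce a tag bound to the exact bytes the sensor pipeline emitted."""
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()

def verify_photo(image_bytes, tag, key=DEVICE_KEY):
    """Check that the bytes are unchanged since the device signed them."""
    expected = hmac.new(key, image_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

original = b"\x89fake-image-bytes"
tag = sign_photo(original)
print(verify_photo(original, tag))            # unchanged bytes
print(verify_photo(original + b"\x00", tag))  # any edit or re-encode
```

Two limits follow directly from the shape of the scheme: any re-encode, crop, or resize changes the bytes and invalidates the tag, and a valid tag only says the device captured those bytes, not what was in front of the lens (a photo of a screen signs just as well).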
How "real" are iPhone photos? They're also computationally generated, not just the light that came through the lens.
Even without any other post-processing, iPhones generate gibberish text when attempting to sharpen blurry images, they delete actual textures and replace them with smooth, smeared surfaces that look like watercolor or oil paintings, and combine data from multiple frames to give dogs five legs.
There used to be a joke about people who did slideshows (on an actual slide projector) of their vacation photos at parties.
Hell, it might even be possible for some arbitrary photographs to come up with an AI prompt that produces them or something similar enough to be indistinguishable to the human eye, opening up the possibility of "proving" something is fake even when it was actually real.
What you want just can't work, not even from a theoretical or practical standpoint, let alone the other concerns mentioned in this thread.
You're right that there will exist generated content without these watermarks, but you can bet that all the commercial providers burning $$$$ on state of the art models will gradually coalesce around some means of widespread by-default/non-optional watermarking for content they let the public generate so that they can all avoid drowning in their own filth.
Unless the watermark randomly replaces objects in the scene with bananas, these images/videos will still spread like wildfire on platforms like TikTok, where the average netizen's idea of due diligence is checking for a six‑fingered hand... at best.
If social media platforms are required by law to categorize content as AI generated, this means they need to check with the public "AI generation" providers. And since there is no agreed-upon (public) standard for imperceptible watermark hashing, that means the content (image, video, audio) in its entirety needs to be uploaded to the various providers to check if it's AI generated.
Yes, it sounds crazy, but that's the plan; imagine every image you post on Facebook/X/Reddit/Whatsapp/whatever gets uploaded to Google / Microsoft / OpenAI / UnnamedGovernmentEntity / etc. to "check if it's AI". That's what the current law in Korea and the upcoming laws in California and EU (for August 2026) require :(
The inline verification of images following the prompt is awesome, and you can do some _amazing_ stuff with it.
It's probably not as fun anymore though (in the early access program, it doesn't have censoring!)
To me the AI revolution is making visual media (and music) catch up with the text-based revolution we've had since the dawn of computing.
Computers accelerated typing and text almost immediately, but we've had really crude tools for images, video, and 3D despite graphics and image processing algorithms.
AI really pushes the envelope here.
I think images/media alone could save AI from "the bubble" as these tools enable everyone to make incredible content if you put the work into it.
Everyone now has the ingredients of Pixar and a music production studio in their hands. You just need to learn the tools and put the hours in and you can make chart-topping songs and Hollywood grade VFX. The models won't get you there by themselves, but using them in conjunction with other tools and understanding as to what makes good art - that can and will do it.
Screw ChatGPT, Claude, Gemini, and the rest. This is the exciting part of AI.
AI for images, video, music - these tools can already make movies, games, and music today with just a little bit of effort by domain experts. They're 10,000x time and cost savers. The models and tools are continuing to get better on an obvious trend line.
For example, I'm currently vibe coding an app that will be specific to our company, that helps me run all the aspects of our business and integrates with our systems (so it'll integrate with quickbooks for invoicing, etc), and help us track whether we have the right insurance across multiple contracts, will remind me about contract deadlines coming up, etc.
It's going to combine the information that's currently in about 10 different slightly out of sync spreadsheets, about 2 dozen google docs/drive files, and multiple external systems (Gusto, Quickbooks, email, etc).
Even though I could build all this manually (as a software developer), I'd never take the time to do it, because it takes away from client work. But now I can actually do it because the pace is 100x faster, and in the background while I'm doing client work.
In the past, I've deliberately stuck a Vision-language model in a REPL with a loop running against generative models to try to have it verify/try again because of this exact issue.
EDIT: Just tested it in Gemini - it either didn't use a VLM to actually look at the finished image or the VLM itself failed.
Output:
I have finished cross-referencing the image against the user's specific requests. The primary focus was on confirming that the number of points on the star precisely matched the requested nine. I observed a clear visual representation of a gold-colored star with the exact point count that the user specified, confirming a complete and precise match.
Result: Bog standard star with *TEN POINTS*.
This has been an oddly difficult benchmark for Gemini's NB models. Google's image models have always been pretty bad at the Studio Ghibli prompt, but I'm shocked at how poorly it performs at this task still.
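The verify-and-retry loop mentioned above (a vision-language model checking a generator's output and asking for another attempt) can be sketched roughly as follows. Everything here is an assumption for illustration: `generate` and `verify` are placeholder callables standing in for whatever image model and VLM one wires up, and the stubs simply simulate the nine-point-star case.

```python
def generate_until_verified(prompt, generate, verify, max_attempts=3):
    """Call generate, check the result with an independent verifier, retry on failure.

    generate(prompt, feedback) -> image
    verify(image, prompt) -> (ok, feedback)
    Returns (image, attempts_used), or (None, max_attempts) if nothing passed.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        image = generate(prompt, feedback)
        ok, feedback = verify(image, prompt)
        if ok:
            return image, attempt
    return None, max_attempts

# Stand-in stubs: a real setup would call an image model and a separate VLM.
def fake_generate(prompt, feedback):
    # First try draws 10 points; later tries follow the verifier's feedback.
    return {"points": 9 if feedback else 10}

def fake_verify(image, prompt):
    ok = image["points"] == 9
    return ok, None if ok else f"star has {image['points']} points, expected 9"

image, attempts = generate_until_verified(
    "a gold star with exactly nine points", fake_generate, fake_verify
)
print(image, attempts)
```

The design point is that the verifier looks at the finished image rather than trusting the generator's own account of it, which is exactly where the quoted output went wrong. The loop is only as good as the verifier, and a VLM that miscounts points will pass bad images just as confidently.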
1. Trigger Circle to Search by long-pressing the home button/bar
2. Select the image
3. Navigate to About this image on the Google search top bar all the way to the right - check if it says "Made by Google AI" - which means it detected the SynthID watermark.
I had trouble reliably getting it to...
* produce just two lanes of traffic
* have all the cars facing the same way—sometimes even within one lane they'd be facing in opposite directions.
* contain the construction within the blocked-off area. I think similarly it wouldn't understand which side was supposed to be blocked off. It'd also put the lane closure sign in lanes that were supposed to be open.
* have the cars be in proportion to the lane and road instead of two side-by-side within a lane.
* have the arrows go in the correct direction instead of veering into the shoulder or U-turning back into oncoming traffic
* use each number once, much less on the correct car
This is consistent with my understanding of how LLMs work, but I don't understand how you can "visualize real-time information like weather or sports" accurately with these failings.
Below is one of the prompts I tried to go from scratch to an image:
> You are an illustrator for a drivers' education handbook. You are an expert on US road signage and traffic laws. We need to prepare a diagram of a "zipper merge". It should clearly show what drivers are expected to do, without distracting elements.
> First, draw two lanes representing a single direction of travel from the bottom to the top of the image (not an entire two-way road), with a dotted white line dividing them. Make sure there's enough space for the several car-lengths approaching a construction site. Include only the illustration; no title or legend.
> Add the construction in the right lane only near the top (far side). It should have the correct signage for lane closure and merging to the left as drivers approach a demolished section. The left lane should be clear. The sign should be in the closed lane or right shoulder.
> Add cars in the unclosed sections of the road. Each car should be almost as wide as its lane.
> Add numbered arrows #1–#5 indicating the next cars to pass to the left of the "lane closed" sign. They should be in the direction the cars will move: from the bottom of the illustration to the top. One car should proceed straight in the left lane, then one should merge from the right to the left (indicate this with a curved arrow), another should proceed straight in the left, another should merge, and so on.
I did have a bit better luck starting from a simple image and adding an element to it with each prompt. But on the other hand, when I did that it wouldn't do as well at keeping space for things. And sometimes it just didn't make any changes to the image at all. A lot of dead ends.
I also tried sketching myself and having it change the illustration style. But it didn't do it completely. It turned some of my boxes into cars but not necessarily all of them. It drew a "proper" lane divider over my thin dotted line but still kept the original line. etc.
DeepMind Page: https://deepmind.google/models/gemini-image/pro/
Model Card: https://storage.googleapis.com/deepmind-media/Model-Cards/Ge...
SynthID in Gemini: https://blog.google/technology/ai/ai-image-verification-gemi...
519 more comments available on Hacker News