FunctionGemma 270M Model
Key topics
The FunctionGemma 270M model's capabilities have sparked excitement among tech enthusiasts, with many marveling at its ability to run in the browser and respond to goal-based commands. The research lead behind the model, canyon289, chimed in to answer technical questions and share their own amazement at the advancements in the web ML community. As commenters raved about the model's potential, they also began to speculate about future developments, with some wondering when the next iteration, "Gemma4," might arrive. The conversation revealed a shared enthusiasm for smaller, more efficient models that can be run locally, with some envisioning a future where "agentic" capabilities are paired with larger models for enhanced performance.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 15m after posting
Peak period: 38 comments (0-6h)
Avg / period: 11.2
Based on 56 loaded comments
Key moments
- Story posted: Dec 18, 2025 at 1:26 PM EST (23 days ago)
- First comment: Dec 18, 2025 at 1:41 PM EST (15m after posting)
- Peak activity: 38 comments in 0-6h, the hottest window of the conversation
- Latest activity: Dec 21, 2025 at 5:37 PM EST (20 days ago)
Happy to answer whatever technical questions I can!
Personally speaking, it's really neat to see other people take these models and run with them, creating things I couldn't have imagined. I'm hoping many others in the open community do the same in the coming weeks and the new year.
But on a serious note, I'm happy to see more research going into vSLMs (very small...) My "dream" scenario is to have the "agentic" stuff run locally, and call into the "big guns" as needed. Being able to finetune these small models on consumer cards is awesome, and can open up a lot of niche stuff for local / private use.
>My "dream" scenario is to have the "agentic" stuff run locally, and call into the "big guns" as needed.
FunctionGemma 270m is your starter pack for this: train your own functions to call out to whatever larger models you choose. It's been quite effective in my testing, and the finetuning guides should show you how to add in your own capabilities.
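For illustration, a minimal sketch of that pattern in Python: the small model emits a structured function call, and one registered function simply forwards the request to a larger hosted model. The JSON call format, tool names, and the escalation function are assumptions for this sketch, not FunctionGemma's published interface.

```python
import json

def turn_flashlight_on() -> str:
    # A trivial local action the small model can handle on its own.
    return "flashlight on"

def ask_larger_model(prompt: str) -> str:
    # Placeholder: in practice this would call a hosted "big gun" model's API.
    return f"[large-model answer to: {prompt}]"

# Registry of functions the finetuned small model is allowed to emit.
TOOLS = {
    "turn_flashlight_on": turn_flashlight_on,
    "ask_larger_model": ask_larger_model,
}

def dispatch(model_output: str) -> str:
    """Parse a JSON function call emitted by the small model and execute it."""
    call = json.loads(model_output)  # e.g. '{"name": "...", "args": {...}}'
    return TOOLS[call["name"]](**call.get("args", {}))

# The small model decides this request is beyond it and escalates.
print(dispatch('{"name": "ask_larger_model", "args": {"prompt": "summarize my notes"}}'))
```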
Speaking from the research side, it's incredible how so many small models, not just Gemma, are achieving performance levels of much larger models from just a year or two ago. It's personally why I stay in this space.
Whisper is old and resource intensive for the accuracy it provides.
That being said, if someone in the community wanted to take other encoders like SigLIP and plug them into Gemma 270m to make it multimodal, that'd be a great way to have fun over break and build up an AI Engineer resume :)
I hope those questions make sense
I think you mean taking the results of one function call and putting it into another? We saw some promise but didn't heavily train for this use case in the base model. The thing we noticed with the 270m-sized models, and the performance expectations of AI models in 2025, is that models of this size perform best for _specific users_ when finetuned to that specific use case.
What I suggest is mocking some data for this kind of use case, either by hand or with an automated tool, and then finetuning with the finetuning colab setup.
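A rough sketch of mocking that kind of data by hand, assuming a simple JSONL record where the second call references the first call's output; the field names and placeholder syntax are illustrative, not the format the finetuning colab actually expects.

```python
import json

CITIES = ["Berlin", "Tokyo", "Austin"]

def make_example(city: str) -> dict:
    # One training record: a user request that requires chaining two calls,
    # with the second call consuming the (placeholder) output of the first.
    return {
        "user": f"What's the weather in {city} right now, in Fahrenheit?",
        "calls": [
            {"name": "geocode_city", "args": {"city": city}},
            {"name": "get_weather",
             "args": {"lat": "<geocode.lat>", "lon": "<geocode.lon>", "unit": "F"}},
        ],
    }

with open("chained_calls.jsonl", "w") as f:
    for city in CITIES:
        f.write(json.dumps(make_example(city)) + "\n")
```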
> is there a way to give the model ability to scope action for example if actions are related to permissions
Permissions depend on your system architecture more than the model. The model itself just takes in tokens and outputs tokens. Permissions are defined by your security/system setup in which the model itself is running.
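As a sketch of that separation, the wrapper below enforces a role-based allowlist entirely outside the model, so a generated call only runs if the current user is permitted to make it; the roles and tool names are hypothetical.

```python
# Permissions live in the surrounding system, not in the model.
ALLOWED = {
    "guest": {"get_weather"},
    "admin": {"get_weather", "delete_account"},
}

def run_call(user_role: str, name: str, args: dict, tools: dict):
    """Execute a model-emitted call only if the role's allowlist permits it."""
    if name not in ALLOWED.get(user_role, set()):
        raise PermissionError(f"{user_role!r} may not call {name!r}")
    return tools[name](**args)

tools = {"get_weather": lambda lat, lon: f"sunny at {lat}, {lon}"}
print(run_call("guest", "get_weather", {"lat": 40.4, "lon": -3.7}, tools))
```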
In my mind we want a very smart layer frontier model orchestrating, but not slowing everything down by doing every little thing; this seems like the opposite - a very fast layer that can be like "wait a minute, I'm too dumb for this, need some help".
My question is - does the Gemma team use any evaluation around this particular 'call a (wiser) friend' strategy? How are you thinking about this? Is this architecture flow more an accommodation to the product goal - fast local inference - or do you guys think it could be optimal?
The way we think about it is: what do we think developers and users need, and is there a way we can fill that gap in a useful way? With this model we had the same hypothesis you describe: there are fantastic larger models out there pushing the frontier of AI capabilities, but there's also a niche for a smaller, customizable model that's quick to run and quick to tune.
What is optimal then ultimately falls to you and your use cases (which I'm guessing at here); you now have options between Gemini and Gemma.
* I see the dataset Google published in this notebook https://github.com/google-gemini/gemma-cookbook/blob/main/Fu... -- from looking at the dataset on Hugging Face, it looks synthetically generated.
do you recommend any particular mix or focus in the dataset for finetuning this model, without losing too much generality?
Astute questions. There are sort of two ways to think about finetuning: 1. Obliterate any general functionality and train the model only on your own commands. 2. As you asked, maintain generality, trying to preserve the initial model's abilities.
For 2, a low learning rate or LoRA is typically a good strategy. We show an example in the finetuning tutorial in the blog.
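A minimal sketch of option 2 with Hugging Face transformers and peft, keeping the update footprint small via a LoRA adapter and a conservative learning rate; the model id, target modules, and hyperparameters here are placeholders rather than the tutorial's vetted values.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "google/gemma-3-270m"  # placeholder; use the checkpoint named in the tutorial

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

lora_config = LoraConfig(
    r=8,                                  # small rank keeps the adapter lightweight
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # adapt only the attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()        # only a tiny fraction of weights will train

# Train with a low learning rate (e.g. 1e-4 or lower) so the base model's
# general behavior is not obliterated by the new function-calling data.
```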
> 2. do you have any recommendations for how many examples per-tool?

This depends on the tool complexity and the variety of user inputs. So a simple tool like turn_flashlight_on(), with no args, will get taught quickly, especially if, say, you're only prompting in English.
But if you have a more complex function like get_weather(lat, lon, day, region, date) and have prompts coming in in English, Chinese, Gujarati, and Spanish, the model needs to do a lot more "heavy lifting" to both translate a request and fill out a complex query. We know as programmers that dates by themselves are insanely complex in natural language (12/18/2025 vs 18/12/2025).
To get this right, it'll help the model if it was trained on data that shows it the variations of inputs possible.
Long answer but I hope this makes sense.
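As a sketch of the variation described above, each record below phrases and formats the same request differently but maps to the same structured call; the schema and argument set are illustrative, not the published dataset format.

```python
# Same intent, different languages and date formats, identical target call.
examples = [
    {"prompt": "Weather in Madrid on 18/12/2025?",
     "call": {"name": "get_weather", "args": {"lat": 40.42, "lon": -3.70, "date": "2025-12-18"}}},
    {"prompt": "What's it like outside in Madrid on Dec 18, 2025?",
     "call": {"name": "get_weather", "args": {"lat": 40.42, "lon": -3.70, "date": "2025-12-18"}}},
    {"prompt": "¿Qué tiempo hará en Madrid el 18/12/2025?",
     "call": {"name": "get_weather", "args": {"lat": 40.42, "lon": -3.70, "date": "2025-12-18"}}},
]
```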
Might Gemini CLI offload some of its prompts to FunctionGemma?
The most generic thing I can say is I really do like working at Google because it's one of the few (maybe the only) companies that has models of all sizes and capabilities. Because of this, research and product development is insanely fun and feels "magical" when things just click together.
Keep following the Google Developer channels/blogs whatever. Google as a whole is pushing hard in this space and I personally think is building stuff that felt like science fiction just 3 years ago.
Going one level up, you as a developer have a choice about how much context you want to provide to the model. Philipp Schmid wrote a good blog post about this, calling it "context engineering". I like his idea because instead of just blindly throwing stuff into a model's context window and hoping to get good performance, it encourages folks to think more about what's going into the context in each turn.
https://www.philschmid.de/context-engineering
Similarly I think the blog post you linked has a similar sentiment. There's nuanced approaches that can yield better results if an engineering mindset is applied.
Another hard constraint is the context limit: Gemma 270m is at 32k, so if the search results returned are massive then this is not a great model. The larger 4B+ Gemma models have 128k, and Gemini's token window is in the millions.
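A small sketch of that mindset under a hard context limit: rather than dumping raw search results into the prompt, keep only what fits a token budget. The word-count proxy for tokens and the budget value are simplifications.

```python
def build_context(question: str, search_results: list[str], budget_tokens: int = 4000) -> str:
    """Pack search results into the prompt until a rough token budget is hit."""
    parts, used = [], 0
    for doc in search_results:
        cost = len(doc.split())      # crude stand-in for a real tokenizer count
        if used + cost > budget_tokens:
            break                    # drop the rest instead of blowing the 32k window
        parts.append(doc)
        used += cost
    return "\n\n".join(parts) + f"\n\nQuestion: {question}"
```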
Thank you. I felt that was a very underappreciated direction (most of the spotlight seemed to be on the 'biggest' models).
https://github.com/n8n-io/n8n
The docs talk about the model being engineered for edge devices and small enough for mobile... apparently with some fine tuning it can run locally on a phone with half a gig of RAM... some folks mention deploying to a Pixel or iPhone and getting around fifty tokens a second... that's wild...
ALSO as someone who loves the idea of offline assistants this is exciting... having function calling in your pocket without hitting the cloud would be a game changer... any insight into when or how we'll see a polished mobile release?...
On this project I was lucky enough to work with the Google AI Edge team who have deep expertise in edge deployments on device. Check out this app they built which loads in the Gemma 270m models and runs them on your phone.
https://play.google.com/store/apps/details?id=com.google.ai....
You also can finetune your own models and load them onto device with the same workflow. Scroll to the bottom to see the instructions and a screenshot example: https://ai.google.dev/gemma/docs/mobile-actions
1. Generate a potential solution
2. If the solution is complex, chunk it up into logical parts
3. Vote on each chunk and select those with more than k votes
By doing this you can filter out outliers (not always desirable) and pull the signal out of the noise.
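A minimal sketch of the voting step, assuming the candidate chunks have already been generated from several sampled solutions; the data is toy.

```python
from collections import Counter

def vote_filter(candidate_chunks: list[str], k: int = 2) -> list[str]:
    """Keep only chunks proposed by more than k independent samples."""
    counts = Counter(candidate_chunks)
    return [chunk for chunk, n in counts.items() if n > k]

samples = ["set_alarm(7am)", "set_alarm(7am)", "set_alarm(7am)", "set_alarm(7pm)"]
print(vote_filter(samples, k=2))  # -> ['set_alarm(7am)']
```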
Thinking helps performance scores but we'll leave it up to users to add additional tokens if they want. Our goal here was the leanest weight and token base for blazing fast performance for you all.
Great work from the Google ML teams, I’ll be trying this model out.