Function Gemma: Fast On-Device Function Calling
Key Points
- Function Gemma is a 270‑million‑parameter fine‑tuned version of Gemma 3 that adds reliable function‑calling capabilities while keeping its natural‑language abilities.
- Its small size enables fast, private, and cost‑effective inference on embedded and mobile hardware, especially when paired with accelerators like GPUs or NPUs.
- Developers can further fine‑tune Function Gemma on a specific set of APIs or tools, achieving accuracy on par with much larger models for those tasks.
- Demonstrations—including a “Mobile Actions” app that triggers on‑device functions (e.g., creating calendar events, adding contacts, turning on the flashlight) and a voice‑controlled game—showcase the model’s ability to translate user input into executable actions.
- A step‑by‑step fine‑tuning recipe and the demo apps are publicly available via the Google AI Edge Gallery on the Play Store.
Sections
- [00:00:00](https://www.youtube.com/watch?v=-Tgc_9uYJLI&t=0s) Function Gemma: On‑Device Function Calling - The speaker announces Function Gemma, a 270‑million‑parameter, fine‑tuned version of Gemma 3 that enables fast, private, on‑device translation of natural language into function calls and API actions, and can be further fine‑tuned for specialized function sets.
- [00:03:26](https://www.youtube.com/watch?v=-Tgc_9uYJLI&t=206s) Function Gemma: On‑Device Function Calling Model - The transcript promotes Function Gemma, a lightweight, fine‑tunable model that generates function calls from natural language, runs locally for privacy and cost benefits, and integrates with major AI frameworks and tools.
**Source:** [https://www.youtube.com/watch?v=-Tgc_9uYJLI](https://www.youtube.com/watch?v=-Tgc_9uYJLI)
**Duration:** 00:04:58
Full Transcript
[music]
When we launched Gemma 3 270M, our
smallest model to date, the community
asked for tool calling capabilities. And
today I'm incredibly excited to
introduce Function Gemma, a specialized
version of our Gemma 3 270 million
parameter model that's fine-tuned for
function calling while retaining its
natural language capabilities. Function
Gemma is designed for developers who
want to build fast, private, and
cost-effective apps that can translate
natural language into function calls and
API actions. Despite its lightweight
footprint, Function Gemma is trained to
determine the right function and is
fine-tunable to be more robust on
specific tasks. For example, if you have
an app with a known set of functions,
you can fine-tune Function Gemma to be
an expert on those functions. This
creates a specialized model that can
exhibit the same success rate as models
many times its size. Due to the small
size of the base model, only 270 million
parameters, the speed at which it can
process input and take actions is
significant even on embedded and mobile
hardware. With access to accelerators
such as GPUs and NPUs, this can be even
quicker. For mobile developers, Function
Gemma represents an opportunity beyond
chat-based interactions to translate
natural language into executable actions
which you can run entirely on device. To
showcase Function Gemma, we've built a
number of demos for mobile using Google
AI Edge. I'd like to hand it over to
Ravine to show you Function Gemma in
action.
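To make the idea concrete, here is a minimal sketch of declaring a small tool set for the model. The JSON-schema-style field names below are an assumption modeled on common function-calling APIs; Function Gemma's exact format is documented in the Gemma cookbook.

```python
import json

# Illustrative tool declarations in a JSON-schema style; treat the exact
# field names as an assumption -- the Gemma cookbook defines the real format.
TOOLS = [
    {
        "name": "create_calendar_event",
        "description": "Create a calendar event on the device.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "date": {"type": "string", "description": "ISO 8601 date"},
            },
            "required": ["title", "date"],
        },
    },
    {
        "name": "set_flashlight",
        "description": "Turn the device flashlight on or off.",
        "parameters": {
            "type": "object",
            "properties": {"on": {"type": "boolean"}},
            "required": ["on"],
        },
    },
]

def tools_block(tools: list) -> str:
    """Serialize the tool declarations for inclusion in the model's prompt."""
    return json.dumps(tools, indent=2)
```

The app passes this serialized block to the model alongside the user's request, and the model picks a function from the list.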
This is Mobile Actions, a demo app where
users can trigger actions on their
device from a voice or text input. To
power it, we fine-tune a model based on
Function Gemma on a small set of
functions that are passed to the model
as tools it can use.
Whether it's "create a calendar event
for lunch tomorrow," "add a contact,"
or simply "turn on the flashlight," the model
parses natural language and identifies
the correct on-device tool to execute the
command. You can see the sequences here.
Watch how the model interprets the
commands and then requests the
appropriate function for the app to
call.
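The app-side half of that loop can be sketched as a small dispatcher. The wire format below (a JSON object with `name` and `args`) is an assumption for illustration; the real serialization is defined in the Gemma cookbook, and the handlers here are stand-ins for actual device APIs.

```python
import json

def create_calendar_event(title: str, date: str) -> str:
    # Stand-in for the real calendar API call.
    return f"event '{title}' created for {date}"

def set_flashlight(on: bool) -> str:
    # Stand-in for the real flashlight API call.
    return "flashlight on" if on else "flashlight off"

# Map function names the model may emit to the app's handlers.
HANDLERS = {
    "create_calendar_event": create_calendar_event,
    "set_flashlight": set_flashlight,
}

def dispatch(model_output: str) -> str:
    """Parse a function call emitted by the model and execute it.

    Assumes the call is serialized as {"name": ..., "args": {...}}.
    """
    call = json.loads(model_output)
    handler = HANDLERS.get(call["name"])
    if handler is None:
        raise ValueError(f"unknown function: {call['name']}")
    return handler(**call.get("args", {}))
```

For example, `dispatch('{"name": "set_flashlight", "args": {"on": true}}')` returns `"flashlight on"`.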
While developing Function Gemma, we
noted that using the new function
calling format improved accuracy over
just prompting the base Gemma 3 270M
model alone. After further fine-tuning,
we saw even more accuracy improvements,
above even the base Function Gemma
270M model, on the tasks used in the
Mobile Actions demo.
We've published a step-by-step recipe
for fine-tuning your own model. So go
ahead and try it yourself.
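The fine-tuning recipe pairs user requests with the function call the model should emit. A minimal sketch of shaping such supervised examples into chat-format records (the field names are illustrative, chosen to match what SFT trainers like TRL's `SFTTrainer` typically consume; the published recipe defines the exact schema):

```python
import json

def make_example(user_text: str, function_name: str, args: dict) -> dict:
    """Build one supervised fine-tuning record: the user's request and
    the function call the model should learn to respond with."""
    return {
        "messages": [
            {"role": "user", "content": user_text},
            {
                "role": "assistant",
                "content": json.dumps({"name": function_name, "args": args}),
            },
        ]
    }

# A couple of records matching the Mobile Actions demo's task set.
dataset = [
    make_example("turn on the flashlight", "set_flashlight", {"on": True}),
    make_example(
        "create a calendar event for lunch tomorrow",
        "create_calendar_event",
        {"title": "lunch", "date": "tomorrow"},
    ),
]
```

A handful of such records per function is the shape of data the recipe's fine-tuning step consumes.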
This demo is available for you to try in
the Google AI Edge Gallery app that you
can find on the Google Play Store. Our
next demo shows how a fine-tuned
Function Gemma model can drive the game
mechanics of a mobile game from user
commands. In this interactive mini-game,
players use voice commands to manage a
virtual plot of land. You can say,
"plant sunflowers in the top row and
water them," and the model selects the
specific app functions, like plant_crop
or water_crop, with the specific grid
coordinates, and the game then executes
the logic for those actions.
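The game side of that flow can be sketched as a tiny state machine that applies model-generated calls to a grid. The function names `plant_crop` and `water_crop` come from the demo description; the grid size and call expansion below are assumptions for illustration.

```python
GRID_SIZE = 3

def new_grid() -> list:
    """An empty 3x3 plot: each cell tracks its crop and watered state."""
    return [[{"crop": None, "watered": False} for _ in range(GRID_SIZE)]
            for _ in range(GRID_SIZE)]

def plant_crop(grid, row: int, col: int, crop: str) -> None:
    grid[row][col]["crop"] = crop

def water_crop(grid, row: int, col: int) -> None:
    grid[row][col]["watered"] = True

def apply_calls(grid, calls: list) -> None:
    """Apply a sequence of model-generated function calls to the game state."""
    handlers = {"plant_crop": plant_crop, "water_crop": water_crop}
    for call in calls:
        handlers[call["name"]](grid, **call["args"])

# "plant sunflowers in the top row and water them" might expand to:
grid = new_grid()
calls = [{"name": "plant_crop", "args": {"row": 0, "col": c, "crop": "sunflower"}}
         for c in range(GRID_SIZE)]
calls += [{"name": "water_crop", "args": {"row": 0, "col": c}}
          for c in range(GRID_SIZE)]
apply_calls(grid, calls)
```

One free-form voice command thus fans out into several concrete, argument-bearing function calls.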
Having a model that can generate
function calls and arguments from
free-form input enables a wide range of use
cases such as looking up data from a
user query, routing queries to the
appropriate sub-agent, or providing new
modalities for interacting with games
and apps.
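The routing use case can be sketched in a few lines: the model emits a function call naming a sub-agent, and the app forwards the query. The agent names below are hypothetical.

```python
# Hypothetical sub-agents keyed by the function name a router model emits.
AGENTS = {
    "calendar_agent": lambda q: f"[calendar] {q}",
    "contacts_agent": lambda q: f"[contacts] {q}",
}

def route(call: dict) -> str:
    """Forward the user's query to the sub-agent the model selected.

    Assumes the model's call is parsed into {"name": ..., "args": {"query": ...}}.
    """
    return AGENTS[call["name"]](call["args"]["query"])
```

For instance, a router model that maps "what's on my schedule?" to `calendar_agent` lets the app hand the query to the calendar sub-agent without any hand-written keyword rules.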
Function Gemma is small enough to run
responsively on consumer hardware,
unlocking new use cases, as well as the
usual benefits of running AI on device
such as privacy, offline capabilities,
and reduced cloud costs.
These are just a few examples, but
Function Gemma is built for
fine-tuning, so you can create your own
specialized function calling model to
power your own AI workflows. The model
is available from all the usual places
such as Hugging Face, Kaggle, and
Vertex AI. As with all our Gemma
models, it works across popular tools
and frameworks such as Hugging Face
Transformers, Ollama, vLLM, llama.cpp,
LiteRT, MLX, and more. Where and how
you tune it is up to you. Whether you
prefer TRL, Unsloth, or Vertex AI,
Function Gemma is compatible with all
of them.
For the function calling format and best
practices, check out our guides and
examples as part of the Gemma cookbook
to get started. Start tuning the model
today. We cannot wait to see what you'll
be calling with Function Gemma. [music]