News
Blazing fast on-device Gen AI with LiteRT-LM
1+ hour, 50+ min ago (403+ words) One of the most significant performance milestones in the LiteRT-LM pipeline is our native support for the Multi-Token Prediction (MTP) drafters recently launched with the Gemma 4 model family. By integrating this specialized speculative decoding architecture, LiteRT-LM bypasses traditional…...
Speeding Up AI: Bringing Google Colossus to PyTorch via GCSFS and Rapid Bucket
2+ week, 6+ day ago (349+ words) Today, we are announcing a major performance boost for AI/ML workloads using the PyTorch ecosystem on Google Cloud. By integrating Rapid Storage, powered by Google's Colossus storage architecture, directly with PyTorch via the industry-standard fsspec interface, we…...
Bring state-of-the-art agentic skills to the edge with Gemma 4
1+ mon, 2+ week ago (434+ words) We are excited to announce that you can experience Gemma 4's expansive capabilities on the edge starting today! Access Android's built-in Gemma 4 model through the new AICore Developer Preview, or leverage Google AI Edge to build agentic, in-app experiences across mobile,…...
Easy FunctionGemma finetuning with Tunix on Google TPUs
3+ mon, 2+ week ago (278+ words) In this tutorial we are going to use LoRA to do supervised finetuning on FunctionGemma and run everything on free-tier Colab TPU v5e-1. We are using the same Mobile Action dataset as in the previous finetuning tutorial. First, we…...
LiteRT: The Universal Framework for On-Device AI
3+ mon, 3+ week ago (795+ words) At Google I/O '25, we shared a preview of this evolution: a high-performance runtime designed specifically for advanced hardware acceleration. Today, we are excited to announce that these advanced acceleration capabilities have fully graduated into the LiteRT production stack,…...
A Guide to Fine-Tuning FunctionGemma
4+ mon, 3+ day ago (667+ words) In the world of Agentic AI, the ability to call tools is what translates natural language into executable software actions. Last month, we released FunctionGemma, a specialized version of our Gemma 3 270M model explicitly fine-tuned for function calling. It…...
Introducing Tunix: A JAX-Native Library for LLM Post-Training
7+ mon, 2+ week ago (336+ words) For developers and researchers in the JAX ecosystem, the path from a pre-trained model to a fully aligned, production-ready LLM just got a lot simpler. Today, we're excited to introduce Tunix, a new open-source, JAX-native library built specifically for LLM…...
Build and train a recommender system in 10 minutes using Keras and JAX
1+ year, 6+ day ago (342+ words) Today, we are excited to announce the launch of Keras Recommenders, a new library that puts state-of-the-art recommendation techniques at your fingertips. To help developers create performant and accurate recommender systems, Keras Recommenders (KerasRS) contains a set of APIs…...
Introducing LangExtract: A Gemini-powered information extraction library
9+ mon, 2+ week ago (304+ words) Today, we're excited to introduce LangExtract, a new open-source Python library designed to empower developers to do just that. LangExtract provides a lightweight interface to various LLMs such as our Gemini models for processing large volumes of unstructured…...