在笔记本电脑上运行 Gemma 4 12B:借助 Google AI Edge 解锁本地智能体工作流
2026-06-03 00:00·30天前
AI 摘要
Google DeepMind 的 Gemma 4 12B 模型可在 16GB RAM 的普通笔记本上运行,支持本地数据处理与视觉洞察生成。macOS 用户可通过 Google AI Edge Gallery 执行动态 Python 代码与可视化,通过 Google AI Edge Eloquent 实现完全离线的语音听写和文本编辑。另外,LiteRT-LM CLI 新增 serve 命令,可创建行业兼容的本地端点,驱动完全本地的 AI 工具和智能体。
原文 · 未翻译
Bringing Gemma 4 12B to your Laptop: Unlocking Local, Agentic Workflows with Google AI Edge
Facebook
Twitter
LinkedIn
Mail
Google DeepMind’s latest open model, Gemma 4 12B, is designed to bring agentic, multimodal intelligence directly to your laptop. By combining the model's strengths with the Google AI Edge stack, you can immediately get hands-on to build and experiment locally, on everyday machines (see model card for spec requirement).This model-runtime combination unlocks powerful on-device capabilities, from autonomous data processing and generating rich visual insights, to building fully functional webpages and executing everyday tool use. You can start interacting with Gemma 4 12B across Google AI Edge right now:
Explore Gemma with Google AI Edge Gallery, our local AI showcase app, now available on macOS. With the 12B model you can generate and execute scripts on the fly for tasks such as data analysis.
The Google AI Edge Eloquent on-device, voice dictation app is now available on macOS. We added the ability to interactively polish and rewrite text through voice commands, entirely on-device, powered by the new Gemma 4 12B model.
LiteRT-LM can now serve local, industry compatible endpoints directly from your terminal via the new serve command in the LiteRT-LM CLI. When used with Gemma 4 12B, this is a highly capable and efficient option to power fully-local agentic tools, harnesses, and workflows.
Coding with Google AI Edge Gallery on MacOS
The Google AI Edge Gallery app, now available on macOS, showcases Gemma 4 12B’s coding capability, allowing you to extract meaningful insights from your data right on your device. Through a seamless interface, you can simply describe your analytical goals in natural language. In the example below, we asked the model to “use a python program to render a chart png to compare the top 10 girl names born in 2024 vs 2025” given two text files containing the data. In response, the model dynamically generates Python code, executes it locally, and converts raw data into beautiful, easy-to-grasp visualizations and insights.
Google DeepMind 的 Gemma 4 12B 模型可在 16GB RAM 的普通笔记本上运行,支持本地数据处理与视觉洞察生成。macOS 用户可通过 Google AI Edge Gallery 执行动态 Python 代码与可视化,通过 Google AI Edge Eloquent 实现完全离线的语音听写和文本编辑。另外,LiteRT-LM CLI 新增 serve 命令,可创建行业兼容的本地端点,驱动完全本地的 AI 工具和智能体。
原文 · 保持原样,未翻译
Bringing Gemma 4 12B to your Laptop: Unlocking Local, Agentic Workflows with Google AI Edge
Facebook
Twitter
LinkedIn
Mail
Google DeepMind’s latest open model, Gemma 4 12B, is designed to bring agentic, multimodal intelligence directly to your laptop. By combining the model's strengths with the Google AI Edge stack, you can immediately get hands-on to build and experiment locally, on everyday machines (see model card for spec requirement).This model-runtime combination unlocks powerful on-device capabilities, from autonomous data processing and generating rich visual insights, to building fully functional webpages and executing everyday tool use. You can start interacting with Gemma 4 12B across Google AI Edge right now:
When it comes to advanced coding, Gemma 4 12B doesn't just write scripts. In a complex 3D rendering task, we observed that with just one user prompt, the model can generate a rubber duck rendering with dependency specification, generate code and self correct, all in a single turn.
Download Google AI Edge Gallery on macOS today and try local coding with Gemma 4 12B.
Dictation and Voice-Driven Editing with Google AI Edge Eloquent
Google AI Edge Eloquent, our AI powered dictation and editing app, seamlessly transforms your raw unstructured thoughts into polished text. The new MacOS desktop version runs 100% on-device across the entire feature set, ensuring a powerful, fully offline experience. Using a convenient, customizable hotkey, Eloquent enables you to use voice dictation across any application on your Mac. Additionally, Eloquent supports fully local transcription of your audio or video files.
Leveraging the advanced reasoning power of Gemma 4 12B, we are introducing Voice Edit, a new feature that allows you to simply dictate voice commands to transform any piece of text in your desktop workflow. For example, you can highlight a paragraph and say, “restructure these notes into an executive summary”, or “translate this into Hindi”. With Gemma 4 12B, we see a huge step up to prior models with superior instruction following, stricter scope adherence, and a 60%+ jump in overall quality.
Download Google AI Edge Eloquent on macOS today and experience the power of Gemma 4 12B as a fully local AI dictation and editing assistant.
Build with LiteRT-LM including Drop-in Local Serving
The LiteRT-LM CLI provides a lightweight, zero-code tool for running language models locally. We are now expanding the tool with the serve command, letting the CLI act as a drop-in local LLM server. Use this functionality with Gemma 4 12B to point any standard tool, SDK, or framework (such as OpenClaw, Hermes, OpenCode, Pi, or popular extensions like Continue and Aider) directly to your local endpoint.
Import the Gemma 4 12B model as "gemma4-12b" litert-lm import --from-huggingface-repo=litert-community/gemma-4-12B-it-litert-lm gemma-4-12B-it.litertlm gemma4-12b # Start the OpenAI-compatible server litert-lm serve
Import the Gemma 4 12B model as "gemma4-12b" litert-lm import --from-huggingface-repo=litert-community/gemma-4-12B-it-litert-lm gemma-4-12B-it.litertlm gemma4-12b # Start the OpenAI-compatible server litert-lm serve
Running Gemma 4 12B makes on-device AI powered capabilities broadly available to everyday laptops. Check out the LiteRT-LM model card for performance and memory benchmarks. By pairing the powerful capabilities of this new model with the optimized performance and ease of use of Google AI Edge you can build multi-turn local agents, analyze data in Google AI Edge Gallery, or streamline your writing with Google AI Edge Eloquent. Furthermore, your data stays on your device while maintaining reliable responsiveness, utility, and cost efficiency.
Acknowledgements
We'd like to extend a special thanks to our significant contributors for their work on this project (in alphabetical order):
Advait Jain, Alice Zheng, Alex Kanaukou, Ami Kubota, Changming Sun, Cormac Brick, Denis Daletski, Fengwu Yao, Hriday Chhabria, Jingxiao Zheng, Jingtao Zhou, Jenn Lee, Jianing Wei, Jing Jin, Lin Chen, Lu Wang, Marius Kintel, Marissa Ikonomidis, Matthias Grundmann, Mogan Shieh, Mohammadreza Heydary, Matthew Soulanille, Na Li, Qidong Zhao, Queenie Zhang, Ram Iyengar, Rishika Sinha, Sachin Kotwani, Suleman Shahid, Suril Shah, Tenghui Zhu, Wai Hon Law, Weiyi Wang, Xiaoming Hu, Xinan Cheng, Yi-Chun Kuo, Yishuang Pang, Yu-hui Chen.
Mobile
Web
AI
Announcements
Learn
Explore
The latest updates to Google Pay
DiffusionGemma: The Developer Guide
Supercharge your integration workflow with the Google Pay & Wallet Developer MCP server
Explore Gemma with Google AI Edge Gallery, our local AI showcase app, now available on macOS. With the 12B model you can generate and execute scripts on the fly for tasks such as data analysis.
The Google AI Edge Eloquent on-device, voice dictation app is now available on macOS. We added the ability to interactively polish and rewrite text through voice commands, entirely on-device, powered by the new Gemma 4 12B model.
LiteRT-LM can now serve local, industry compatible endpoints directly from your terminal via the new serve command in the LiteRT-LM CLI. When used with Gemma 4 12B, this is a highly capable and efficient option to power fully-local agentic tools, harnesses, and workflows.
Coding with Google AI Edge Gallery on MacOS
The Google AI Edge Gallery app, now available on macOS, showcases Gemma 4 12B’s coding capability, allowing you to extract meaningful insights from your data right on your device. Through a seamless interface, you can simply describe your analytical goals in natural language. In the example below, we asked the model to “use a python program to render a chart png to compare the top 10 girl names born in 2024 vs 2025” given two text files containing the data. In response, the model dynamically generates Python code, executes it locally, and converts raw data into beautiful, easy-to-grasp visualizations and insights.
When it comes to advanced coding, Gemma 4 12B doesn't just write scripts. In a complex 3D rendering task, we observed that with just one user prompt, the model can generate a rubber duck rendering with dependency specification, generate code and self correct, all in a single turn.
Download Google AI Edge Gallery on macOS today and try local coding with Gemma 4 12B.
Dictation and Voice-Driven Editing with Google AI Edge Eloquent
Google AI Edge Eloquent, our AI powered dictation and editing app, seamlessly transforms your raw unstructured thoughts into polished text. The new MacOS desktop version runs 100% on-device across the entire feature set, ensuring a powerful, fully offline experience. Using a convenient, customizable hotkey, Eloquent enables you to use voice dictation across any application on your Mac. Additionally, Eloquent supports fully local transcription of your audio or video files.
Leveraging the advanced reasoning power of Gemma 4 12B, we are introducing Voice Edit, a new feature that allows you to simply dictate voice commands to transform any piece of text in your desktop workflow. For example, you can highlight a paragraph and say, “restructure these notes into an executive summary”, or “translate this into Hindi”. With Gemma 4 12B, we see a huge step up to prior models with superior instruction following, stricter scope adherence, and a 60%+ jump in overall quality.
Download Google AI Edge Eloquent on macOS today and experience the power of Gemma 4 12B as a fully local AI dictation and editing assistant.
Build with LiteRT-LM including Drop-in Local Serving
The LiteRT-LM CLI provides a lightweight, zero-code tool for running language models locally. We are now expanding the tool with the serve command, letting the CLI act as a drop-in local LLM server. Use this functionality with Gemma 4 12B to point any standard tool, SDK, or framework (such as OpenClaw, Hermes, OpenCode, Pi, or popular extensions like Continue and Aider) directly to your local endpoint.
Import the Gemma 4 12B model as "gemma4-12b" litert-lm import --from-huggingface-repo=litert-community/gemma-4-12B-it-litert-lm gemma-4-12B-it.litertlm gemma4-12b # Start the OpenAI-compatible server litert-lm serve
Import the Gemma 4 12B model as "gemma4-12b" litert-lm import --from-huggingface-repo=litert-community/gemma-4-12B-it-litert-lm gemma-4-12B-it.litertlm gemma4-12b # Start the OpenAI-compatible server litert-lm serve
Running Gemma 4 12B makes on-device AI powered capabilities broadly available to everyday laptops. Check out the LiteRT-LM model card for performance and memory benchmarks. By pairing the powerful capabilities of this new model with the optimized performance and ease of use of Google AI Edge you can build multi-turn local agents, analyze data in Google AI Edge Gallery, or streamline your writing with Google AI Edge Eloquent. Furthermore, your data stays on your device while maintaining reliable responsiveness, utility, and cost efficiency.
Acknowledgements
We'd like to extend a special thanks to our significant contributors for their work on this project (in alphabetical order):
Advait Jain, Alice Zheng, Alex Kanaukou, Ami Kubota, Changming Sun, Cormac Brick, Denis Daletski, Fengwu Yao, Hriday Chhabria, Jingxiao Zheng, Jingtao Zhou, Jenn Lee, Jianing Wei, Jing Jin, Lin Chen, Lu Wang, Marius Kintel, Marissa Ikonomidis, Matthias Grundmann, Mogan Shieh, Mohammadreza Heydary, Matthew Soulanille, Na Li, Qidong Zhao, Queenie Zhang, Ram Iyengar, Rishika Sinha, Sachin Kotwani, Suleman Shahid, Suril Shah, Tenghui Zhu, Wai Hon Law, Weiyi Wang, Xiaoming Hu, Xinan Cheng, Yi-Chun Kuo, Yishuang Pang, Yu-hui Chen.
Mobile
Web
AI
Announcements
Learn
Explore
The latest updates to Google Pay
DiffusionGemma: The Developer Guide
Supercharge your integration workflow with the Google Pay & Wallet Developer MCP server