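Gemma4 speed-up trick: one command, 23% faster! I won't keep you in suspense: use speculative decoding. The size ladder of this Gemma4 release happens to suit speculative decoding well. If you run the 31B dense model and find it too slow, add E2B (5.1B) as the draft model. In my test on an RTX5090, token generation (decode) speed rose 23%, from 61 token/s to 76 token/s. And speculative decoding itself does not make the model any dumber. Wait, you're asking what speculative decoding (推测性解码, also called 投机解码) is? Put simply: the big model is slow, so we let a small model run first, then send the small model's output to the big model in a batch and let the big model judge whether it is right. However many tokens the small model got right are kept, so even in the worst case each round still yields at least one correct token (see the diagram above). Some of you will ask: doesn't the big model still have to regenerate everything, so where does the speed-up come from? The answer is that in LLM inference today, compute is in surplus while memory (VRAM) bandwidth is the bottleneck, so processing input (prefill, which leans on floating-point throughput) is fast. When the small model emits a run of tokens and hands them to the big model to check, that check is treated like a prompt, i.e. prefill, and it is fast, far faster than the big model generating tokens itself (decoding, which leans on memory bandwidth). As long as the small model is fast enough, you come out ahead even at a low acceptance rate, and speculative decoding cleverly exploits exactly this. Finally, I put the best parameters from my testing in image 3 for reference. Also remember not to mix families: pair Gemma4 with Gemma4, not with Qwen3.5, or you will hit incompatibility problems. #gemma4 #llamacpp #qwen35 #本地大模型 #推测性解码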

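Summary: Gemma4 can get a 23% inference speed-up from speculative decoding. In a test on an RTX5090, the 31B dense main model paired with the E2B (5.1B) draft model went from 61 token/s to 76 token/s. The technique exploits the fact that large models have spare compute but limited memory bandwidth: the small model quickly generates a candidate sequence, and the large model verifies it in one prefill-style batch, avoiding the bandwidth bottleneck of token-by-token decoding. Keep the model family consistent: pair Gemma4 with a draft model from the same family, and do not mix it with Qwen3.5.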
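The draft-then-verify loop described in the post can be sketched in a few lines. This is a minimal toy under stated assumptions, not how llama.cpp implements it: both "models" are plain greedy next-token functions (`toy_target` and `toy_draft` are made up for illustration), and the verification step calls the target once per position, where a real engine would score all drafted positions in a single batched forward pass.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function (assumption)


def greedy_decode(target: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    """Baseline: the target model generates one token per step."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[Token],
                       max_new: int, k: int = 4) -> Tuple[List[Token], int]:
    """Greedy speculative decoding; returns (tokens, number of verification rounds)."""
    out = list(prompt)
    limit = len(prompt) + max_new
    rounds = 0
    while len(out) < limit:
        rounds += 1
        # 1) The draft model proposes k tokens autoregressively (cheap).
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2) The target checks each position. A real engine does this in ONE
        #    batched pass; the per-position calls here are for clarity only.
        accepted: List[Token] = []
        for i in range(k):
            expected = target(out + proposal[:i])
            accepted.append(expected)      # always keep the target's own token
            if proposal[i] != expected:    # first mismatch: stop accepting drafts
                break
        else:
            # Every draft matched, so the same pass also yields one extra token.
            accepted.append(target(out + proposal))
        out.extend(accepted)
    return out[:limit], rounds


def toy_target(ctx: List[Token]) -> Token:
    """Hypothetical 'large model': a deterministic function of the context."""
    return (sum(ctx) * 31 + 7) % 50


def toy_draft(ctx: List[Token]) -> Token:
    """Hypothetical 'small model': agrees with the target except at every third length."""
    guess = toy_target(ctx)
    return guess if len(ctx) % 3 else (guess + 1) % 50
```

Because every kept token is the target's own choice, the output matches plain greedy decoding with the target alone, which is the sense in which the post says quality is not degraded; the gain comes from needing fewer target passes than tokens generated.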

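宝玉@dotey · Apr 13

Chrome DevTools MCP has added several dedicated debugging skills: running performance audits with Lighthouse, detecting memory leaks, accessibility debugging, LCP (Largest Contentful Paint, which directly affects how fast users perceive a page to load) optimization, plus an experimental command-line tool.

Summary: Chrome DevTools MCP adds several debugging skills aimed at AI agents, supporting Lighthouse performance audits, memory-leak detection, accessibility debugging, and LCP optimization. These features are meant to give AI agents automated code-quality checks that help identify performance bottlenecks and accessibility issues. An experimental CLI tool is also introduced for invoking each debugging capability from the command line.

Nathan Lambert@natolambert · Apr 12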

a bit over 7 days out from the Gemma 4 release and its models are outpacing (slightly) the equivalent Qwen 3.5 models on downloads. Big numbers!

Summary: Just over 7 days after the Gemma 4 release, its models are slightly outpacing the equivalent Qwen 3.5 models on downloads. Strong numbers.

TestingCatalog News 🗞@testingcatalog · Apr 12

GOOGLE ⚡: Google is working on Voice Mode and new collaborative tools for its Mixboard experiment. Voice mode on Mixboard works similarly to Stitch, allowing users to operate their canvas boards with voice commands. It will be possible to generate and edit images, and potentially move them around. Imagine a team retrospective where everyone can just dump their complaints with voice commands! Voice notes will be supported there, too! 👀

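Summary: Google is working on a voice mode for its Mixboard experiment, supporting voice commands to generate, edit, and potentially move images, plus voice notes. The interaction is similar to Stitch and suits team collaboration, such as dictating feedback directly during a retrospective.

TestingCatalog News 🗞@testingcatalog · Apr 11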

Google is rolling out custom banners and custom summary support on NotebookLM! Stealth release 👀

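Summary: Google has quietly rolled out custom banners and custom summaries for NotebookLM, letting users customize the cover image and document overview. The update is a stealth release and has not been officially announced.

Google Gemini@GeminiApp · Apr 11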

Your favorite pic, but make it paper ✂️ Try it out in Gemini: 1) Open Gemini on desktop or in the mobile app 2) Select “Create image” in the tools menu 3) Upload the picture you want to transform 4) Insert the prompt from the next post 5) Share your creations in the replies ↓

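Summary: Gemini can turn an uploaded photo into a paper-cut/origami style. Users select "Create image" in the tools menu on desktop or in the app, upload a picture, and enter the given prompt to generate the result, then share their creations in the replies.

TestingCatalog News 🗞@testingcatalog · Apr 10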

Google is working on a Style Tuner for Stitch to allow picking better-suited colors for generated designs.

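Summary: Google is developing a Style Tuner for its AI design tool Stitch that lets users manually pick better-suited colors for generated designs, improving how well the color scheme of AI-generated results fits.

Google Gemini@GeminiApp · Apr 10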

Rolling out today, you can create longer tracks in Gemini for FREE! Select “Create music” in the tools menu and “Thinking” or “Pro” from the model picker. Give it a try, and share your creations in the replies. 👇

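Summary: Gemini rolls out Lyria 3 Pro today, supporting longer music tracks and complex transitions. Users can select "Create music" in the tools menu and switch to Thinking or Pro to use it for free; the feature has already rolled out to Google AI Plus/Pro/Ultra users.

AK@_akhaliq · Apr 10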

MedGemma 1.5 Technical Report paper: https://huggingface.co/papers/2604.05081

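Summary: The MedGemma 1.5 technical report has been released, detailing the medical multimodal model's architecture, training methods, and clinical evaluation results. The paper is available on Hugging Face.

TestingCatalog News 🗞@testingcatalog · Apr 10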

Gemini can now help visualize complex topics through interactive experiences directly in chat. "Show me the visualization" button will appear under certain questions, which could trigger this new experience. Testing time 👀

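Summary: Gemini can now generate interactive visualizations directly in chat. A "Show me the visualization" button appears under certain questions; after clicking it, users can adjust variables, rotate 3D models, and explore data to understand complex concepts in a more immersive way.

Haider.@haider1 · Apr 9

anthropic has internal "mythos"; openai has internal "spud"; elon says xAI is training 6T and 10T models. what does google have internally? they seem to have been pretty quiet for a while in the language model race, even though one of their latest releases was "Gemma 4"

Summary: Anthropic has "mythos" internally, OpenAI has "spud" internally, and xAI is training 6T and 10T parameter models, while Google has seemed unusually quiet in the LLM race, with Gemma 4 among its latest releases.

Jeff Dean@JeffDean · Apr 9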

Great to see the reception for the very capable Gemma 4 models!

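Summary: Gemma 4 passed 10 million downloads within a week of release, and Gemma models have now been downloaded more than 500 million times in total. Sundar Pichai shared the numbers and looks forward to seeing what developers create with the model.

Sundar Pichai@sundarpichai · Apr 9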

Lots of love for Gemma 4! Team just told me it’s already had 10M+ downloads since last week’s launch. Gemma models have now been downloaded 500M+ times! Excited to see what you all are creating 👀

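Summary: Google's open model Gemma 4 passed 10 million downloads just one week after release, and cumulative downloads of the Gemma family now exceed 500 million. The numbers reflect the developer community's enthusiastic response to the latest open model. Pichai says he is excited to see what users create with Gemma 4.

Sundar Pichai@sundarpichai · Apr 9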

Nothing quite like the blank pages of a fresh notebook! Notebooks are now rolling out in the @Geminiapp. Organize conversations, notes, and other sources related to a single project (and easily go back and forth between @NotebookLM to go deeper). Rolling out today, starting with Google AI Ultra, Pro + Plus subscribers on the web.

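Summary: The Google Gemini app is rolling out Notebooks, which let users organize conversations, notes, and other sources by project. The feature allows switching back and forth with NotebookLM for deeper research. It is rolling out starting with Google AI Ultra, Pro, and Plus subscribers, on the web first.

Demis Hassabis@demishassabis · Apr 9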

Great to chat with fellow Londoner @HarryStebbings about the path to AGI and how we’re using AI today to accelerate science & medicine. Appreciated our discussion on the incredible talent & potential for deep tech here in the UK. Thanks for the kind words and for having me on!

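Summary: Demis Hassabis appeared on the 20VC podcast with Harry Stebbings to discuss the path to AGI, AI's use in science and medicine, and the UK's deep tech potential. The host praised his historical standing as comparable to Turing, Newton, and Einstein, and shared his own story of starting the podcast 11 years ago in his bedroom with no funding.

Artificial Analysis@ArtificialAnlys · Apr 8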

Announcing APEX-Agents-AA, our latest leaderboard on Artificial Analysis, evaluating AI agents on long-horizon professional services tasks with realistic application dependencies.
This is our implementation of the APEX-Agents benchmark - an agentic work task evaluation open-sourced by @mercor_ai. It tests AI agent ability to execute realistic tasks created by investment banking analysts, management consultants, and corporate lawyers. Mercor released extensive data to enable model evaluation and training across the community, comprising 480 tasks including tool implementations, rubrics, and grading workflows. We exclude tasks with external service dependencies and run the remaining 452 tasks for APEX-Agents-AA. Models complete tasks using Stirrup, our open-source agent harness as used in GDPval-AA, and a customized tool set based on the original benchmark implementation.
Results overview:
🏅 OpenAI, Anthropic and Google are in close competition at the top of the leaderboard, with 33.3% for GPT-5.4, 33.0% for Claude Opus 4.6, and 32% for Gemini 3.1 Pro Preview
📈 The overall scores on Artificial Analysis today are similar to Mercor’s testing, but some models such as GPT-5.4 nano show improvements in score using our Stirrup test harness
↻ We’ll be updating this leaderboard with key releases for agentic work use as a metric for agent capability on well-defined, long horizon work tasks
APEX-Agents overview:
➤ Tasks span 3 professional domains: investment banking, management consulting, and corporate law
➤ The tasks are designed to require long-horizon work with a large number of tools, which are provided through MCP servers as would be used in many real-world deployments (including calendar, chat, spreadsheet and presentation operations, etc.)
➤ Required outputs include direct message responses (87%) and creating or modifying spreadsheets (6.6%), documents (4.8%), and presentations (1.3%)
➤ Model outputs are parsed and graded against binary rubrics using an LLM judge. Each task is run 3 times and scored pass@1 - a pass requires every rubric test to pass
➤ In our APEX-Agents-AA implementation, 452 tasks run in our open-source Stirrup harness with tool management and usage from @mercor_ai's original MCP implementation. This provides a consistent, reproducible baseline for comparing raw model capability that aligns with realistic agent deployments

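Summary: Artificial Analysis has released the APEX-Agents-AA leaderboard, which uses Mercor's APEX-Agents benchmark to evaluate AI agents on long-horizon professional tasks (investment banking, management consulting, corporate law). The evaluation runs 452 tasks through the Stirrup harness with MCP tools, covering message responses, document work, and more. GPT-5.4 leads at 33.3%, closely followed by Claude Opus 4.6 (33.0%) and Gemini 3.1 Pro Preview (32%). Scoring uses an LLM judge and pass@1.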
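The scoring rule in the announcement (binary rubrics, three runs per task, a run passes only if every rubric test passes) can be sketched as below. The aggregation is an assumption: the tweet does not say how the three runs are combined, so this sketch takes pass@1 as the per-task mean pass rate across runs, averaged over tasks. The function names and data layout are hypothetical and not taken from the Stirrup harness.

```python
from typing import Dict, List

# One run = the list of binary rubric outcomes produced by the judge (assumption).
Run = List[bool]


def run_passes(rubric_results: Run) -> bool:
    """A run passes only if every rubric test passes."""
    return all(rubric_results)


def pass_at_1(task_runs: Dict[str, List[Run]]) -> float:
    """Mean over tasks of the fraction of runs that passed (assumed aggregation)."""
    per_task = []
    for runs in task_runs.values():
        passed = [1.0 if run_passes(run) else 0.0 for run in runs]
        per_task.append(sum(passed) / len(passed))
    return sum(per_task) / len(per_task)


# Made-up example data: two tasks, three runs each.
example = {
    "task_a": [[True, True], [True, False], [True, True]],  # 2 of 3 runs pass
    "task_b": [[False], [False], [False]],                   # 0 of 3 runs pass
}
score = pass_at_1(example)
```

A single failed rubric item fails the whole run, which is part of why top scores on this kind of benchmark sit near one third even for frontier models.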

Demis Hassabis@demishassabis · Apr 8

Thanks for the great conversation @cleoabram (and some competitive Jenga)! Really enjoyed talking about all the amazing ways AI is helping to advance science & the incredible future it will enable!

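Summary: Demis Hassabis talks with Cleo Abram about the best ways AI is advancing science, the story behind AlphaFold, the frontier of drug discovery, the evolution of AI creativity, government and military applications, and a science-fiction vision of humans coexisting with AI.

Epoch AI@EpochAIResearch · Apr 8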

Who owns the world's compute? Our new Chip Ownership hub shows that Google leads, holding around 25% of all compute sold since 2022.

Summary: Epoch AI's new Chip Ownership hub shows that Google leads, holding about 25% of all compute sold worldwide since 2022.

swyx 🇬🇧@swyx · Apr 7

i'm being asked for oneliner descriptions of each track, so here goes (pushback/improvements welcome):
1. Claw track: This is the year of the personal agent - many people have been dreaming of a personal AI, from being a friend to an executive assistant. @steipete's OpenClaw created the category, and we've gathered maintainers and Claw competitors to preview what's next!
2. Context Engineering: LLM context lengths grow from 4000 to 1 million tokens, our jobs went from prompting to RAG to search to ever more complex context management for agents. This is the track for everyone who's watched @dexhorthy's keynotes and stressed about getting in the dumb zone.
3. Harness Engineering: The most exciting discovery in agent engineering is that harnesses are more responsible for variations in performance than the LLMs they build on. @_lopopolo ignited this category with the most extreme version of the dark factory harness we've ever seen, but here we have many of the best harness engineering ideas of 2026.
4. Evals & Observability: All serious AI engineering starts with evals & observability - you only get paid for what you can reliably maintain and improve. We are proud to feature perspectives from eval platforms like @braintrustdata_, LLM researchers like @maximelabonne, and benchmark authors.
5. Voice & Vision: The first of our multimodal AI tracks focuses on voice and vision AI, the first modalities humans had before the invention of writing. This is the track to catch up on TTS, ASR, OCR, and all the other use cases from the @elevenlabs decacorn to @mistralai's new model to @meetgranola to @huggingface and more!
6. Gemini: Last but certainly not least, London is home to @googledeepmind who have an amazing team of engineers, PMs, and researchers with updates on open models, evals, agents, WebMCP, and even a special presentation on Text Diffusion models!

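Summary: AI Engineer Europe Build Day announces six technical tracks focused on frontier AI engineering practice. The agenda covers personal agents (Claw), Context Engineering for long-context management, Harness Engineering for agent performance, Evals & Observability, Voice & Vision multimodal, and a Gemini track. From OpenClaw to Google DeepMind, topics include RAG, TTS, ASR, and WebMCP, reflecting AI engineering's shift from prompting toward complex agent systems.

Sundar Pichai@sundarpichai · Apr 7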

Most I’ve ever talked about latency budgets in a pub 😂thanks for having me @collision and @eladgil :) Cheeky Pint out tomorrow!

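Summary: A new episode of Cheeky Pint comes out tomorrow, with Sundar Pichai talking AI in a pub with @collision and Elad Gil. Pichai jokes that it is the most he has ever talked about latency budgets in a pub; the tone is relaxed and casual.

Yuchen Jin@Yuchenj_UW · Apr 7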

Crazy revenue growth at Anthropic. So they officially surpassed OpenAI’s $25B ARR reported a few days ago? The focus on coding models and enterprise clearly paid off. Once you’re locked into a year-long contract, switching to Codex isn’t easy. Claude Code shipping velocity is insane too, new feature every day. If they secure more GPUs and Google TPUs, this growth could accelerate even further.

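Summary: Anthropic's revenue growth is striking and may have surpassed OpenAI's $25B ARR. Its focus on coding models and enterprise has clearly paid off, and year-long contracts make it hard for locked-in customers to switch to Codex. Claude Code ships extremely fast, with new features almost daily. Anthropic has also signed an agreement with Google and Broadcom securing multiple gigawatts of TPU capacity starting in 2027.

Anthropic@AnthropicAI · Apr 7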

We've signed an agreement with Google and Broadcom for multiple gigawatts of next-generation TPU capacity, coming online starting in 2027, to train and serve frontier Claude models.

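Summary: Anthropic has signed an agreement with Google and Broadcom for multiple gigawatts of next-generation TPU capacity, coming online starting in 2027, to train and serve frontier Claude models.

François Chollet@fchollet · Apr 6

Science went from the initial observation of radioactivity to a working atom bomb over 47 years via only about 9 distinct key experiments -- extremely few data points -- and symbolic models concise enough they would fit on a single page. This is what extreme generalization looks like, and it is powered entirely by symbolic compression. Turn a handful of data points (deliberately collected) into a tractable plan to completely reshape reality, by reverse-engineering the causal symbolic rules behind the data.

Summary: Using the atomic bomb as an example, the tweet describes what extreme generalization looks like: science took only 47 years and about 9 key experiments to go from the first observation of radioactivity to a working nuclear weapon. That progress did not rely on big data but on symbolic compression, distilling a handful of deliberately collected data points into causal symbolic rules that fit on a single page. The core point is that by reverse-engineering the causal logic behind the data, humans can turn minimal information into a full plan to reshape reality, showing the decisive role of symbolic reasoning in pushing past cognitive limits.

François Chollet@fchollet · Apr 6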

Tutorial on fine tuning Gemma on TPU v5 using Kinetic + Keras + JAX. Easiest stack to fully leverage TPUs at scale.

Summary: A tutorial on fine-tuning Gemma on TPU v5 using Kinetic + Keras + JAX.

swyx 🇬🇧@swyx · Apr 5

always wanted to do one of those “is it coachella” announcements - here is our designer’s take on it!

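Summary: Google DeepMind returns as a Presenting Sponsor of AIE Europe in London this week, announcing its speaker lineup in a music-festival poster style: VP of Research Raia Hadsell and several product leads will attend, showing progress across modalities including Gemini 3.1, Embeddings 2, Veo 3, and Gemma 4.

swyx 🇬🇧@swyx · Apr 4

We have achieved agentic self improvement - i can just copy paste blogposts and tweets into @devinai and it oneshots the complete implementation. wasn't actually sure this was gonna work, jaw dropped when it did. this is very out of distribution of the underlying @GoogleDeepMind Gemini Flash Lite model but it Just Worked.

Summary: Paste a blog post or tweet directly into @devinai and it one-shots a complete implementation. The task is far out of distribution for the underlying Gemini Flash Lite model, yet it worked, which swyx describes as agentic self-improvement.

François Chollet@fchollet · Apr 4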

Good tutorial on using Keras Kinetic to fine-tune LLMs on the Keras + JAX + TPU stack!

Summary: A good tutorial on using Keras Kinetic to fine-tune LLMs on the Keras + JAX + TPU stack!

Nathan Lambert@natolambert · Apr 4

People are too obsessed with benchmarks for open models. The core determining factor of success often is: 1. Immediate & long term tooling support. 2. Finetunability. Tbh Gemma has struggled here in the past. Qwen has excelled at it. It's where the winners are crowned.

Summary: What determines an open model's success is often not benchmark scores but immediate and long-term tooling support and finetunability. Gemma has struggled here in the past, while Qwen has excelled, and that is where winners are decided.

François Chollet@fchollet · Apr 4

Perhaps the craziest thing that was introduced on the Keras community call today: Keras Kinetic, a new library that lets you run jobs on cloud TPU/GPU via a simple decorator -- like Modal but with TPU support. When you call a decorated function, Kinetic handles the entire remote execution pipeline: - Packages your function, local code, and data dependencies - Builds a container with your dependencies via Cloud Build (cached after first build) - Runs the job on a GKE cluster with the requested accelerator (TPU or GPU) - Returns the result to your local machine (logs are streamed in real time, and the function's return value is delivered back as if it ran locally)

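Summary: The Keras community introduced Kinetic, a library that lets developers run functions on cloud TPU/GPU via a decorator, similar to Modal but with TPU support. It handles code packaging, container builds through Cloud Build (with caching), scheduling on a GKE cluster, and returning results, with logs streamed in real time so remote execution feels like running locally.

François Chollet@fchollet · Apr 4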

First update of the call, from Sachin: Gemma 4 is out now on KerasHub! Best open-source model so far for reasoning and agentic workflows.

Summary: First update of the call, from Sachin: Gemma 4 is now available on KerasHub! The best open-source model so far for reasoning and agentic workflows.

Demis Hassabis@demishassabis · Apr 3

Gemma 4 outperforms models over 10x their size! (note the x-axis is log scale!)

Summary: Gemma 4 outperforms models more than 10x its size on benchmarks; the chart's x-axis is on a log scale, underscoring its parameter efficiency.

karminski-牙医@karminski3 · Apr 3

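http://x.com/i/article/2039985553492598784 # Gemma4 has 8 models: which one should you pick? One post explains it all! Google just released the Gemma4 family of open-weight models, and friends who have never touched local models keep asking me which one to deploy locally. Here, this post will get you up to speed quickly and painlessly. First, pick the ones with the "-it" suffix. That stands for Instruction Tuned, meaning the model has gone through large-scale instruction-following training and multi-turn dialogue alignment. The others are base models, meant for people who want to fine-tune themselves (so by the same logic, if you want to fine-tune, use the version without -it). I know A4B means 4B active parameters, so what does E4B mean? Put simply, it is a technique optimized for mobile: Per-Layer Embeddings. It does not save memory by itself, so Gemma-4-E2B does not need only 2B parameters' worth of memory; it still needs memory for the full 5.1B parameters, but its compute is only roughly that of a 2B model! (Think of it as turning part of the matrix math into table lookups, trading memory for compute; those tables of course take memory.) OK, prerequisites done! Now straight to model selection: For a local "lobster" (龙虾, i.e. an OpenClaw-style agent), pick Gemma-4-26B-A4B first! It is an MoE with 4B active parameters, and its prefill speed is also quite good, which especially suits scenarios like the lobster where the system prompt is hugely bloated. For coding / scripting / work that demands precision, pick Gemma-4-31B. Choosing this one means you want the best results; if you really can't run it, try 5-bit quantization. For reference, an Apple M2 Ultra running 8-bit has a theoretical speed of only about 25 token/s. I want a local voice assistant! Pick Gemma-4-E4B: it takes all-modality input, so write code to hook it up to a camera with a microphone, and the rest is up to your imagination. And with 4B active parameters it runs even on CPU. I just want to try it out on my Raspberry Pi: pick Gemma-4-E2B. You will get the ultimate local-model speed. As for quality, it is a bit better than an electronic parrot; it can do filtering jobs like "check whether this text contains any English", and since it takes all-modality input you can also try voice input. #Gemma4 #google #GoogleGemma #本地大模型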

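Summary: Google's Gemma4 family of open-weight models comes in several versions, and the right choice depends on the use case. The "-it" suffix marks instruction-tuned versions that work out of the box; versions without it are base models for your own fine-tuning. A4B means 4B active parameters, while E4B uses Per-Layer Embeddings, trading memory for compute to optimize for mobile. Recommendations: 26B-A4B for a balance of quality and speed; 31B for the best results on code or precise tasks; E4B for local all-modality apps; E2B for trying things on resource-constrained devices, with limited output quality.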
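The "about 25 token/s on an M2 Ultra at 8-bit" figure in the post is consistent with a simple bandwidth-bound estimate: during decoding a dense model reads roughly all of its weights once per token, so tokens per second is at most memory bandwidth divided by weight size. The sketch below is that back-of-the-envelope calculation only. The 800 GB/s figure is Apple's published M2 Ultra memory bandwidth and is an assumption not stated in the post; the estimate ignores KV-cache traffic and other overhead, and the helper names are hypothetical.

```python
def weight_bytes(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in bytes."""
    return params_billion * 1e9 * bits_per_weight / 8


def decode_tokens_per_second(bandwidth_gb_s: float, params_billion: float,
                             bits_per_weight: float) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    return bandwidth_gb_s * 1e9 / weight_bytes(params_billion, bits_per_weight)


M2_ULTRA_BANDWIDTH_GB_S = 800.0  # assumption: Apple's published spec, not from the post

# Gemma-4-31B dense at 8-bit: 31 GB of weights, so roughly 800 / 31 tokens per second.
dense_8bit = decode_tokens_per_second(M2_ULTRA_BANDWIDTH_GB_S, 31, 8)

# The same model at 5-bit reads fewer bytes per token, so the bound rises.
dense_5bit = decode_tokens_per_second(M2_ULTRA_BANDWIDTH_GB_S, 31, 5)

# E2B still has to store all 5.1B parameters, as the post points out.
e2b_8bit_gb = weight_bytes(5.1, 8) / 1e9
```

Under these assumptions the 8-bit bound comes out at about 25.8 token/s, in line with the post's figure, and the same arithmetic shows why lower-bit quantization raises the decode ceiling.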

Artificial Analysis@ArtificialAnlys · Apr 3

Google has released Gemma 4, a new family of multimodal open-weight models including Gemma 4 E2B, Gemma 4 E4B, Gemma 4 31B and Gemma 4 26B A4B @GoogleDeepMind’s new Gemma 4 family introduces four multimodal models supporting text, image, and video inputs. We evaluated Gemma 4 31B (dense) and Gemma 4 26B A4B (MoE), both with a 256k context window, while the other two smaller models support up to 128k. With 31B and 26B parameters respectively, both evaluated models can run on a single H100. On GPQA Diamond, our scientific reasoning evaluation, Gemma 4 31B (Reasoning) scores 85.7%, the second highest result we have recorded for an open-weights model with fewer than 40B parameters, just behind Qwen3.5 27B (Reasoning, 85.8%). It reaches this score using only ~1.2M output tokens, fewer than Qwen3.5 27B (~1.5M) and Qwen3.5 35B A3B (~1.6M). Gemma 4 26B A4B (Reasoning) scores 79.2%, ahead of gpt-oss-120B (high, 76.2%) but behind Qwen3.5 9B (Reasoning, 80.6%). We are now running the Artificial Analysis Intelligence Index on all four Gemma 4 models and will share a full update once those results are complete.

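Summary: Google DeepMind releases the Gemma 4 family of four multimodal open-weight models supporting text, image, and video input. The 31B (dense) and 26B A4B (MoE) models have a 256k context window and can run on a single H100; the two smaller models support up to 128k. On GPQA Diamond, Gemma 4 31B (Reasoning) scores 85.7%, just behind Qwen3.5 27B (85.8%), while using only ~1.2M output tokens; 26B A4B (Reasoning) scores 79.2%, ahead of gpt-oss-120B (76.2%) but behind Qwen3.5 9B (80.6%).

Google Gemini@GeminiApp · Apr 3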

We want to see what you’ve been making with Lyria 3 Pro in Gemini. 🎶 Share your creations in the replies 👇

Summary: Google invites users to share in the replies the music they have made with Lyria 3 Pro in Gemini.

Sundar Pichai@sundarpichai · Apr 3

Gemma 4 is here, and it’s packing an incredible amount of intelligence per parameter 👇

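Summary: The Gemma 4 open models are out in four sizes: 31B dense, 26B MoE, and effective 2B/4B, aimed at raw performance, low latency, and edge devices respectively. Google DeepMind calls them the best open models for their sizes, and Pichai highlights their intelligence per parameter.

Demis Hassabis@demishassabis · Apr 3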

Excited to launch Gemma 4: the best open models in the world for their respective sizes. Available in 4 sizes that can be fine-tuned for your specific task: 31B dense for great raw performance, 26B MoE for low latency, and effective 2B & 4B for edge device use - happy building!

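Summary: The Gemma 4 open models are released in 4 sizes: 31B dense for raw performance, 26B MoE for low latency, and effective 2B and 4B for edge devices, all fine-tunable for specific tasks.

Google DeepMind@GoogleDeepMind · Apr 3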

Meet Gemma 4: our new family of open models you can run on your own hardware. Built for advanced reasoning and agentic workflows, we’re releasing them under an Apache 2.0 license. Here’s what’s new 🧵

Summary: Google releases the Gemma 4 family of open models under an Apache 2.0 license, runnable on your own hardware and built for advanced reasoning and agentic workflows.

Nathan Lambert@natolambert · Apr 2

Nemotron Super / Ultra, Arcee Trinity Large (soon), Gemma 4 (eventually), Reflection's first models (maybe), GPT OSS 2? (maybe), Thinky? Other neolabs? Things looking up for open models built in the US in 2026. We had 0 for a bit there.

Summary: Nemotron Super/Ultra, Arcee Trinity Large (soon), Gemma 4 (eventually), and possibly Reflection's first models, GPT OSS 2, and Thinky: things are looking up for US-built open models in 2026, after a stretch with none.

Deedy@deedydas · Apr 2

Blows my mind that we currently possess the technology for Google Maps to turn all the street view images of the entire world into a video game you can play! In the future, we'll be able to say "yeah let's check out New York City 100 years ago!"

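Summary: The technology already exists for Google Maps to turn the world's Street View imagery into a playable video game. In the future it could let people explore cities such as New York as they were 100 years ago.

Google Gemini@GeminiApp · Apr 1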

Create personalized images that are out of this realm with Nano Banana 2 in Gemini. Try it for yourself and drop yours in the replies 👇

Summary: Gemini offers personalized image creation with Nano Banana 2. Users are invited to try it and share their results in the replies.