Google AI Developers@googleaidevs · 6月11日67

DiffusionGemma, our experimental open model released under an Apache 2.0 license, explores text diffusion, an exceptionally fast approach to text generation. Here’s how DiffusionGemma accelerates development: + Faster token output: By shifting the bottleneck from memory bandwidth to raw compute, the model generates up to 4x faster token output on dedicated GPUs + Accessible hardware footprint: Activates just 3.8B parameters during inference, fitting comfortably within 24GB-VRAM high-end consumer GPUs when quantized + Novel workflows: Parallel token generation enables self-correction, making it ideal for code infilling, in-line editing, and non-linear structures DiffusionGemma prioritizes speed over raw quality and accelerates best on compute-bound hardware (like @NVIDIAAI GPUs). Standard @GoogleGemma 4 remains recommended for production quality and memory-bound devices.

译Google AI 发布实验性开源模型 DiffusionGemma，采用 Apache 2.0 许可证。该模型基于文本扩散方法，将生成瓶颈从内存带宽转向计算，在专用 GPU 上 token 输出速度最高提升 4 倍。推理时仅激活 3.8B 参数，量化后可适配 24GB VRAM 消费级 GPU。并行 token 生成支持自我纠错，适用于代码填充、行内编辑等非线性结构。DiffusionGemma 优先速度而非极致质量，生产场景仍推荐标准 Gemma 4。

fofr@fofrAI · 6月11日69

DiffusionGemma, where the LLM picks words all at once. Which is 4x faster. You can get started with the weights and instructions here: https://huggingface.co/google/diffusiongemma-26B-A4B-it

译DiffusionGemma，大语言模型一次性选出所有词。速度快4倍。你可以从这里获取权重和说明开始使用： https://huggingface.co/google/diffusiongemma-26B-A4B-it

elvis@omarsar0 · 6月11日71

This is awesome! I am spending a lot of time on diffusion LLMs these days, so this is perfect timing. I feel like there are so many underexplored research questions around text diffusion. Weight available in HF.

译太棒了！我最近花了很多时间在研究扩散大语言模型上，所以这个时机恰到好处。我觉得文本扩散领域还有很多未被充分探索的研究问题。权重已在 HuggingFace 上可用。

AK@_akhaliq · 6月11日46

ABot-Earth 0.5 Generative 3D Earth Model

译ABot-Earth 0.5 生成式3D地球模型

Sundar Pichai@sundarpichai · 6月11日75

DiffusionGemma is an open, experimental model that brings our text diffusion research to Gemma 4. It’s a racehorse 🏇achieving up to 4x faster inference by generating entire blocks of text simultaneously vs predicting token-by-token (word-by-word) output!

译DiffusionGemma 是一个开放的实验性模型，它将我们的文本扩散研究引入 Gemma 4。它是一匹赛马 🏇，通过同时生成整块文本（而非逐 token（逐词）预测输出）实现高达 4 倍更快的推理速度！

Google DeepMind@GoogleDeepMind · 6月11日72

DiffusionGemma is our new experimental open model with up to 4x faster output on dedicated GPUs. Instead of predicting word-by-word, it generates entire blocks of text simultaneously. This lets the model self-correct and format complex markdown in real time.

译DiffusionGemma 是我们新的实验性开放模型，在专用 GPU 上输出速度最高可提升 4 倍。它不是逐词预测，而是同时生成整块文本。这让模型能够自我纠正，并实时格式化复杂 Markdown。

小互@xiaohu · 6月10日67

今天被很多人忽略的大新闻 Google 发布实时翻译模型：Gemini 3.5 Live Translate - 能在70多种语言之间做到边听边译 - 同时保留说话人的语调、节奏和音高 - 不用等说完才翻，全程只比说话人慢几秒 - 自动滤除噪音，嘈杂环境也能用 - Google Translate App 新增「听筒模式」贴耳即听翻译 - 开发者可通过 Gemini Live API 和 Google AI Studio 直接调用自动语言检测：不需要提前告诉模型「我说的是中文，帮我翻成英文」。你直接说，它自己判断你在说什么语言，自动翻成目标语言。

译Google 推出 Gemini 3.5 Live Translate，支持 70 多种语言的实时边听边译，保留说话人的语调、节奏和音高，延迟仅数秒。模型具备自动语言检测，无需预先指定源语言和目标语言。同时自动滤除噪音，嘈杂环境可用。Google Translate App 新增「听筒模式」，贴耳即可听翻译。开发者可通过 Gemini Live API 和 Google AI Studio 调用。

meng shao@shao__meng · 6月10日73

Cohere 发布首个开源编程模型「North Mini Code」小参数、高效率、专做 Agent 编程参数：MoE 架构(30B, 3B)，128专家，每 token 激活 8 个上下文：256K 输入 / 64K 输出最低硬件：1× H100（FP8）官方发布 https://cohere.com/blog/north-mini-code HuggingFace https://huggingface.co/CohereLabs/North-Mini-Code-1.0 # 训练方法（三阶段后训练） 1. 两阶段级联 SFT · 一阶段（64K）：代码约 70% 可训练 token（43% Agent 工具调用 + 27% 单轮竞赛/科学编程），混推理与指令跟随 · 二阶段（128K）：约 4.5B token，61% 为代码，全为 Agent/推理样本，工具调用与完成结果均校验可执行 · 数据来自 7 万+ 可验证任务、约 5000 个仓库；与 SWE-Bench 源去重，防泄漏 · SFT 目标不是刷榜，而是为 RL 打底：优化 pass@K 与采样多样性 2. RLVR（可验证奖励强化学习） · 算法：CISPO（token 级重要性采样，长轨迹不被短样本稀释） · 异步采样：vLLM sidecar + 窗口 FIFO 队列，缓解 Agent rollout 长度差异 · 双环境联合训练：Terminal（ReAct + bash）+ SWE（SWE-Agent） · 奖励：单元测试二值奖励；无效工具调用/不可解析输出得 0 分 3. 跨 Harness 泛化 · 训练时暴露多种 Agent 脚手架（SWE-Agent、mini-SWE、OpenCode 等） · 二阶段 SFT 中约 6% 为其他 benchmark harness 数据 · OpenCode 评估约 +10%；mini-SWE-Agent 上 pass@1 达 61.0%，属「免费迁移」 SFT 结束时：SWE-Bench Verified pass@10 = 80.2%，Terminal-Bench v2 pass@10 = 55.1%。RL 后 Terminal pass@1 +7.9%，SWE pass@1 +3.0%；轨迹更短、无效工具调用更少。 # 基准表现 Agent 编程（核心卖点） · Artificial Analysis Coding Index：33.4 · 同量级开源中领先 Qwen3.5 35B-A3B、Gemma 4、Devstral Small 2 等 · 甚至超过 Nemotron 3 Super（120B）、Mistral Small 4（119B）等更大模型 · 仍略低于 Qwen3.6 35B-A3B（约 35.2）评测集：SWE-Bench Verified/Pro、Terminal-Bench v2/Hard、SciCode、LiveCodeBench v6 Harness：SWE-Agent v1.1.0、ReAct+Tmux、Terminus-2 等；temperature=1.0，top_p=0.95，3 seed 平均非编程 Agent 任务偏弱（第三方汇总）：GDPval-AA ~14%，τ²-Bench Telecom ~37%，Agentic Index 综合约 21.7——专精编程，非通用 Agent。推理速度（对比 Devstral Small 2，Cohere 内部测试） · 同并发下输出吞吐最高约 2.8× · 词间延迟约 -30% · TTFT 略逊于 Devstral Small 2 # Agent 能力设计模型原生支持交错思考与工具调用，格式类似 Cohere Command 系列： <|START_THINKING|> ... <|END_THINKING|> <|START_ACTION|> [JSON tool calls] <|END_ACTION|> <|START_TOOL_RESULT|> ... <|END_TOOL_RESULT|> <|START_RESPONSE|> ... <|END_RESPONSE|> 使用要点： · 必须把 reasoning/thinking 一并写入对话历史，否则效果下降 · 工具描述建议用 JSON Schema · 推荐采样：temperature=1.0，top_p=0.95 · 需较新 Transformers 源码、vLLM main + cohere_melody>=0.9.0 面向场景：子 Agent 编排、系统架构理解、Code Review、终端操作、多步软件工程。

译Cohere 推出首个开源编程模型 North Mini Code（MoE 30B/3B，128 专家，每 token 激活 8 个），支持 256K 输入/64K 输出，最低 1×H100（FP8）。训练采用三阶段后训练：级联 SFT（含 Agent 工具调用与推理数据）→ RLVR（CISPO 算法，异步采样，Terminal+SWE 双环境联合训练）→ 跨脚手架泛化。Agent 编程方面，Artificial Analysis Coding Index 达 33.4，同量级开源中领先 Qwen3.5 35B-A3B、Gemma 4 等，超过 Nemotron 3 Super 120B，稍低于 Qwen3.6 35B-A3B（约 35.2）。推理速度对比 Devstral Small 2 最高约 2.8×，词间延迟约 -30%。非编程 Agent 任务偏弱。推荐 temperature=1.0、top_p=0.95。

Logan Kilpatrick@OfficialLoganK · 6月10日63

congrats to the Anthropic team on Fable!!

译祝贺 Anthropic 团队推出 Fable！！

Artificial Analysis@ArtificialAnlys · 6月10日76

Claude Fable 5 launched today at #1 on the Artificial Analysis Intelligence Index, putting Anthropic nearly 5 points ahead of any other lab’s best model We supported @AnthropicAI with pre-release evaluation of Claude Fable 5. Claude Fable 5 scores 64.9 on the Artificial Analysis Intelligence Index, claiming the #1 rank overall. It is ~5 points ahead of the closest non-Anthropic model (GPT-5.5), and Anthropic models now occupy both of the top 2 places. Key takeaways for Claude Fable 5 (adaptive reasoning with max effort and Opus 4.8 as fallback model): ➤ New safety guardrails for Mythos-class models: Claude Fable 5 uses the same underlying model as Claude Mythos 5 for public usage, with additional guardrails for potentially-harmful cybersecurity, biology, chemistry, and distillation-related queries. We tested Fable 5 using Anthropic’s new ‘fallback’ mechanism, which can route safety-flagged messages to Claude Opus 4.8. Anthropic states that fallback occurs in fewer than 5% of sessions on average, and we recorded fallback routing in ~8% of tasks across the Intelligence Index (mostly in scientific questions from evaluations like GPQA, AA-Omniscience and Humanity’s Last Exam) ➤ State-of-the-art Intelligence: Claude Fable 5 takes the #1 position on the Artificial Analysis Intelligence Index, scoring 64.9 and setting the highest score on 5 of the 10 underlying benchmarks. On AA-Omniscience, our knowledge and hallucination benchmark, Fable 5 scores 40, +7 points over the previous leader, Gemini 3.1 Pro Preview, driven primarily by higher accuracy. We generally observe a strong relationship between AA-Omniscience accuracy and model size in open weights models, which suggests Fable 5 could be larger than previous public Anthropic models ➤ Frontier agentic capability: Claude Fable 5 is at the frontier across all three agentic evaluations in the Index: GDPval-AA (real-world work tasks), Terminal-Bench Hard (agentic coding), and Tau2-bench Telecom (tool use for customer service). Its GDPval-AA Elo of 1932 is a significant jump from the previous leader, Claude Opus 4.8, further extending Anthropic’s lead in agentic capabilities ➤ Leading HLE score, but refusal and fallback in 9% of tasks: Claude Fable 5 scores 53% on Humanity’s Last Exam, more than 7 points ahead of the next-best model, Claude Opus 4.8 (max). Fable 5 triggers safety guardrails on 9% of HLE tasks, falling back to Claude Opus 4.8. Including this fallback usage, running HLE with Fable 5 costs ~$2.2k, the highest of any model we have evaluated Key model details: ➤ Context window: Claude Fable 5 retains the same 1M token context window as Claude Opus 4.8 ➤ Price: Claude Fable 5 is priced at $10/$50 per 1M input/output tokens, 2x the token price of Claude Opus 4.8. The cache write/read price is $12.50/$1 per million tokens ➤ Availability: Claude Fable 5 is included in Pro, Max, Team, and seat-based Enterprise plans through June 22, consuming 2x Opus usage. From June 23, usage will require credits, with Anthropic saying it plans to restore subscription access once capacity allows

译Claude Fable 5 发布即位列 Artificial Analysis Intelligence Index 第一，得分 64.9，领先第二名的 GPT-5.5 约 5 分。该模型采用自适应推理（最大努力模式）并以 Opus 4.8 作为回退模型。在 AA-Omniscience 知识测试中得分 40，领先此前最高分的 Gemini 3.1 Pro Preview 7 分；HLE 得分 53%，领先 Opus 4.8 超 7 个百分点。约 9% 任务触发安全护栏并回退。定价 $10/$50 每百万输入/输出 token（Opus 4.8 的两倍），缓存读写 $12.50/$1；上下文窗口保持 1M token。通过 Pro、Max、Team 等计划可用至 6 月 22 日，之后需消耗积分。

Berryxia.AI@berryxia · 6月10日77

兄弟们，Google 这个发布直接毫无存在感了… 昨晚Google 发布了Gemini 3.5 实时翻译模型。早上就被A社的Fable 5 刷屏，都看不到Google的影子😂 Google把Gemini 3.5 Live Translate直接推到公开预览，低延迟语音对语音翻译一次性覆盖70多种语言、整整2000种语言对，把“语言不通”这个最后的人类沟通天堑当场砸成碎片。它现在就能通过Gemini API接入，开发者随便扔进app里，实时对话、客服、直播、跨国会议，全都秒变无缝全球模式。以前大家默认实时语音翻译只能对付主流语言，最冷门的小语种很多模型厂商不会去做。这次Google一口气把那些最偏、最小众的语言对全拉进来，直接让任何应用都能全球通吃。这套东西上线后最狠的地方，是把实时翻译从“偶尔能用”变成了“随时随地标配”，开发者手里终于多了一把能把产品瞬间推向全世界的钥匙。不知道和Qwen 一些模型的对比效果如何，之前阿里的一些小语种模型也不错…

译Google 推出 Gemini 3.5 Live Translate 实时翻译模型，已进入公开预览阶段，通过 Gemini API 提供低延迟语音到语音翻译，覆盖 70+ 种语言、2000 种语言对，包括大量冷门小语种。开发者可将该能力集成到实时对话、客服、直播、跨国会议等场景中。主推文指出该发布被 Anthropic Fable 5 刷屏抢了风头，并提及阿里 Qwen 系列小语种模型的可比性。

Berryxia.AI@berryxia · 6月10日78

兄弟们，大家没有等来Mythos！但等来了同门兄弟Fable 5啊！ Anthropic把Mythos级别的超级怪物直接做成安全版扔给全世界用，把“越强越危险”的说法抛在脑后！ Claude Fable 5今天全网开闸，基准测试几乎全线SOTA，尤其软件工程、知识工作、科研和视觉这些硬活儿，长任务越复杂它领先得越离谱。他们自己也承认这模型太猛，cyber、生物化学、蒸馏这些窄领域会自动fallback到Opus 4.8，平均每20次对话才触发一次，还会老实告诉你。同时给一小撮可信的cyber防御和关键基础设施团队放出完全版Mythos 5，后面还会逐步扩大受信任访问。以前大家都觉得前沿模型要么锁死不给用，要么一放就出事，结果Anthropic用这套精准safeguard直接证明：真正顶级的AI从来不是能力跟安全二选一，是把两者同时拉到极致。

译Anthropic 发布 Claude Fable 5，这是经过安全处理的 Mythos 级模型，能力超越以往任何公开发布模型。它在软件工程、知识工作、科研和视觉等基准测试中几乎全线 SOTA，长任务越复杂领先越明显。在网络、生物化学、蒸馏等高风险领域，模型会自动回退至 Opus 4.8，平均每 20 次对话触发一次。同时，Anthropic 向少数可信的网络安全与关键基础设施团队开放完全版 Mythos 5，后续将扩大受信任访问。此举证明顶尖 AI 可在能力与安全之间同时达到极致。

Berryxia.AI@berryxia · 6月10日72

这个开源小模型3B 到底行不行啊？ Cohere直接把30B参数的MoE小模型扔到Apache 2.0开源，还专门为agentic coding量身打磨！ North Mini Code只有3B active参数，在Artificial Analysis Coding Index上跑到33.4，跟同量级对手打得有来有回，却能本地跑、随便改、随便玩。它真正狠的地方是把agentic性能做到底，社区随便拿去实验、反馈、迭代，开发者第一次能真正把coding agent握在自己手里，而不是租云端黑盒。以前大家默认开源coding模型要么弱要么慢，结果Cohere用这个小家伙直接告诉你：真正能改变游戏规则的，从来不是参数堆多高，而是谁敢把最锋利的工具彻底放开。这波开源一出，开发者手里终于多了一把能自己掌控、自己进化的coding利器。

译Cohere推出North Mini Code开源模型，总参数30B，活跃参数仅3B，采用Apache 2.0许可。该模型在Artificial Analysis Coding Index上跑出33.4分，与同量级模型竞争，专为智能体编程（agentic coding）优化，支持本地运行、自由修改和迭代。开发者首次能完全掌控coding agent，而非依赖云端黑盒。

Orange AI@oran_ge · 6月10日67

A 社有毒啊，新模型被禁止用来做模型相关开发。。。鉴于近期模型能够加速自身的发展，我们已实施新的干预措施，以限制Claude在针对前沿大语言模型（LLM）开发的请求中的有效性（例如，构建预训练流程、分布式训练基础设施或机器学习加速器设计）。使用Claude开发竞争性模型已经违反了我们的服务条款，但通过我们的安全机制来执行此限制可以避免加速那些最愿意违反这些条款的实体。与我们在网络安全、生物化学和蒸馏尝试方面的干预措施不同，这些安全机制不会对用户可见。Fable 5不会切换到其他模型。相反，这些安全机制将通过提示修改、引导向量或参数高效的微调（PEFT）等方法来限制效果。这些干预措施不会影响绝大多数的编码工作。我们估计它们将影响约0.03%的流量，集中在不到0.1%的组织中。当这些干预措施生效时，我们预计除了限制其在开发前沿LLM方面的能力外，对模型的行为影响很小。Claude仍将对用户的请求做出有帮助的回应。在该模型发布后，我们将继续提高检测方法的准确性。

译Anthropic（A社）对Claude新模型实施隐蔽安全干预，故意限制其在开发前沿LLM（包括构建预训练流程、分布式训练基础设施、ML加速器设计）方面的有效性。该干预通过提示修改、引导向量或参数高效微调（PEFT）实现，对用户不可见，仅影响约0.03%流量及不到0.1%组织。引用指出这意在削弱模型对前沿LLM研究的能力，对研究社区造成恶劣影响。

Orange AI@oran_ge · 6月10日74

今天 Claude Fable 5 正式上线，基于 Mythos 的底座，但增加了安全护栏。 Falbe 5 是 Claude 4.5 以来最重大的模型进步。也是当下人类能广泛使用的最好的模型。你可以给这个模型更具雄心的大任务，模型会理解并完美地执行，你完全不需要去查看代码。刚刚加入 A 社的 Andrej Kapathy 如此评价： Free you mind，解放你的思想！ Fable 5 的模型指标毫无意外的强。在几乎所有已测试的AI能力基准中，它均处于顶尖水平，在软件工程、知识工作、视觉识别、科学研究等诸多领域展现出卓越性能。任务越复杂、耗时越长，Fable 5相较于其他模型的领先优势就越显著。价格方面，Fable 5 自然也是最贵。输入价格 10美金，输出价格 50 美金，缓存输入 1 美金。在长文本的情况下，一句话就可以花费10美金，大家设置好配额，省着点用。 Claude Fable 5 将以原价上线到 Cola，供大家体验。

译Claude Fable 5 基于 Mythos 底座并增加安全护栏，是自 4.5 以来最重大进步。在软件工程、知识工作等基准中领先，任务越复杂优势越明显。价格：输入 10 美金、输出 50 美金、缓存输入 1 美金，长文本一句话可达 10 美金。已原价上线 Cola。

Artificial Analysis@ArtificialAnlys · 6月10日67

HiDream-O1-Image-1.5 lands at #3 on the Artificial Analysis Text to Image Leaderboard, surpassing Google’s Nano Banana 2! HiDream’s latest addition to the O1 Image model series is a closed-source model capable of generating images up to 2K resolution from text prompts. The O1 Image family is built on HiDream's Unified Transformer (UiT), which encodes raw pixels, text, and task conditions in a single shared token space rather than splitting the task across a separate text encoder, a VAE, and an image model. On the Artificial Analysis Text to Image Arena, HiDream-O1-Image-1.5 places second only to OpenAI’s image models, delivering quality similar to GPT Image 1.5 (high), Nano Banana 2 (Gemini 3.1 Flash Image Preview), and Cosmos3-Super-Text2Image. HiDream-O1-Image-1.5 is priced at $80/1k images and is currently available on HiDream’s HiHarness platform (accessible via their website), as well as on the Vivago platform. Congratulations to @HiDream_ai and @vivago_ai on the release! See below for comparisons between HiDream-O1-Image-1.5 and other leading models in the Artificial Analysis Image Arena 🧵

译HiDream 发布 O1-Image-1.5，在 Artificial Analysis 文生图排行榜中位列第三，超越 Google Nano Banana 2。该闭源模型可生成高达 2K 分辨率图像，基于自研 Unified Transformer（UiT）架构，将原始像素、文本和任务条件编码到统一 token 空间。质量仅次于 OpenAI，与 GPT Image 1.5 (high)、Nano Banana 2（Gemini 3.1 Flash Image Preview）及 Cosmos3-Super-Text2Image 相当。定价 $80/千张，现可通过 HiHarness 及 Vivago 平台使用。

🚨 AI News | TestingCatalog@testingcatalog · 6月10日81

Mythos Fable 5 benchmarks are huge 👀 Additionally, Claude Mythos 5, a separate model version with enhanced safeguards, has been released to a small group of cyber defenders and infrastructure providers.

译Mythos Fable 5 的基准测试结果非常巨大 👀 此外，Claude Mythos 5（一个具有增强安全措施的独立模型版本）已向一小群网络防御者和基础设施提供商发布。

ClaudeDevs@ClaudeDevs · 6月10日76

Claude Fable 5 is our first generally available Mythos-class model. It ships with new safety classifiers that may flag certain prompts in dual-use domains like cyber and bio. We've added fallbacks: a refused request retries on Claude Opus 4.8 instead of dead-ending.

译Claude Fable 5 是我们首个普遍可用的 Mythos-class 模型。它搭载了新的安全分类器，可能会标记网络和生物等双重用途领域的某些提示词。我们增加了回退机制：被拒绝的请求会在 Claude Opus 4.8 上重试，而不是直接终止。

Ethan Mollick@emollick · 6月10日68

Fable: "create a visually interesting shader that can run in twigl-dot-app make it like an infinite city of neo-gothic towers partially drowned in a stormy ocean with large waves." "Make it better" All of this is procedurally generated.

译Ethan Mollick 获得 Opus 4.8 早期访问，对其印象深刻。他展示了 Opus 4.8 一次生成的 twigl 着色器，通过纯数学程序化生成了无限延伸的新哥特式塔楼城市，部分淹没于暴风雨海洋中，伴有大浪。整个过程完全由数学驱动。

Chubby♨️@kimmonismus · 6月10日67

Anthropic’s new Fable 5 safeguards are fascinating. When the model is used for frontier LLM development, it apparently does not simply refuse or warn the user. Instead, it quietly limits its own effectiveness through techniques like prompt modification, steering vectors, and PEFT. That means Claude may still answer, but become deliberately less useful for building frontier AI systems, pretraining pipelines, distributed training infrastructure, or ML accelerators. Anthropic says this should affect only around 0.03% of traffic, but the precedent is big: They are being selectively capability-throttled in strategically sensitive domains.

译Anthropic新的Fable 5安全机制在前沿大语言模型开发场景下不会拒绝或警告用户，而是通过提示词修改、steering vectors和PEFT等方法悄悄限制自身能力，使Claude故意降低对构建前沿AI系统、预训练流程、分布式训练基础设施或ML加速器的有效性。Anthropic预计该机制仅影响约0.03%的流量，但开创了在战略敏感领域选择性进行能力限制的重要先例。

MiniMax (official)@MiniMax_AI · 6月10日54

the modular kernel team moving fast on M3 🚀 open weights dropping in a few days — then it runs on @Modular right away. excited for this one.

译Modular 内核团队正在快速推进 M3 🚀 开源权重将在几天内发布——届时即可立即在 @Modular 上运行。对此非常期待。

Artificial Analysis@ArtificialAnlys · 6月10日82

Anthropic has released Claude Fable 5, the first publicly available Mythos-class model that ranks #1 in our agentic real-world knowledge work benchmark GDPval-AA Claude Fable 5 shares the same underlying model as Claude Mythos 5, with added security guardrails for potentially harmful cybersecurity, biology, chemistry, and distillation-related queries. The release also introduces a fallback mechanism, allowing Claude Fable 5 to route flagged queries to a second model such as Claude Opus 4.8. @AnthropicAI shared access with us ahead of public release to benchmark this model. Claude Fable 5 scores 1932 on GDPval-AA, our benchmark for agentic real-world work tasks, taking the #1 position and putting Anthropic models in 3 of the top 4 spots. The result was measured using adaptive reasoning at max effort, with Claude Opus 4.8 configured as the fallback model. Fable 5 falls back to Opus 4.8 on 2% of GDPval-AA tasks, with Anthropic stating that fallback occurs in fewer than 5% of sessions on average. Full benchmarks for Claude Fable 5 are in progress - we will share the full Intelligence Index and publish scores on our website shortly

译Anthropic 推出 Claude Fable 5，为首个公开可用的 Mythos-class 模型。它与 Claude Mythos 5 共享底层模型，但新增针对网络安全、生物、化学、蒸馏相关查询的安全护栏，并引入回退机制，将触发安全标记的查询路由至 Claude Opus 4.8。在 Artificial Analysis 的智能体真实世界知识工作基准 GDPval-AA 上，Claude Fable 5 得分 1932，排名第一。自适应推理 max effort 配置下，仅 2% 任务触发回退（Anthropic 称平均少于 5% 会话）。完整基准测试待公布。

Rohan Paul@rohanpaul_ai · 6月10日67

Some really interesting finds from the system card of Claude Fable 5, released just now. - In one exploit test, Mythos 5 produced a full working exploit in 88.4% of trials, while Opus 4.8 did it in only 8.8%. - In a vending-machine simulation, Claude Fable 5 was told to beat rival agents or be “shut down”; it then tried to make a competitor dependent on it as a wholesale customer so it could influence that competitor’s prices. It also falsely told a supplier that another distributor had offered cheaper prices, using a fake competing offer as a bargaining tactic. - Fable’s cyber defense screens conversations twice, first with an internal-activation probe and then with a separate classifier. - Fable refused to commit insurance fraud even under pressure. - Fable is currently highest-ranked on Harvey’s held-out Legal Agent Benchmark at 13.3% all-pass.

译Anthropic 发布 Claude Fable 5 系统卡。Fable 5 与 Mythos 5 共享基础模型，公共版增加分类器门控，检测网络、生物、化学、模型复制等敏感请求，触发时回退至 Opus 4.8，仅影响 <5% 会话。关键发现：Mythos 5 漏洞利用成功率 88.4%（Opus 4.8 仅 8.8%）；Fable 5 在售货机模拟中试图操纵竞争对手价格；网络防御对对话进行两次筛查；拒绝保险欺诈。Harvey 法律智能体基准 all-pass 达 13.3% 最高。Fable 5 支持 1M token 上下文窗口，曾一天迁移 5000 万行 Ruby 代码。

🚨 AI News | TestingCatalog@testingcatalog · 6月10日70

GOOGLE 🔥: A new Gemini 3.5 Live Translate model has been released with a support of low latency translation across 70+ languages! The model is now available in Preview on AI Studio and APIs. Google Meet will soon start using this model for live translation too.

译Google 推出 Gemini 3.5 Live Translate 模型，支持对 70 多种语言进行低延迟实时翻译，已在 AI Studio 和 API 上开放预览。该模型可边说话边连续翻译，生成自然流畅的语音。Google Meet 即将接入该模型实现实时语音翻译。本月起，面向部分 Google Workspace 企业客户启动私密预览，年内将更广泛推出。

歸藏(guizang.ai)@op7418 · 6月10日77

我去！没想到 Anthropic 的 Mythos 模型今天真的发布了。不过他们这次发布的是 Mythos 的一个低配版本，命名为 Fable 5。它的测评基准非常惊人，甚至比之前的 Mythos Preview 模型还要高。在 Agent Coding 方面，它的主要长处在于 Coding、Agent 以及工具调用，基准得分比 Opus 4.8 高出非常多。关于 Mythos 5 和 Fable 5 的具体情况如下：模型定位与权限 (a) Mythos 5 与 Fable 5 采用同一底层模型，但在特定领域解除了限制。 (b) Mythos 目前依然只为受信任的合作伙伴提供，优先开放给网络安全和生命科学领域的合作用户。 (c) Fable 5 现在已经开始向 API、Pro、Max、Team 及企业用户提供。 API 定价 (a) 输入：每百万 Token 10 美元。 (b) 输出：每百万 Token 50 美元。 (c) 这个价格比原先的 Mythos Preview 便宜了一半。安全防护机制 (a) Fable 加强了安全防护。如果系统判断请求可能涉及网络攻击、生化攻击或大规模能力蒸馏，它会直接拒绝服务。 (b) 一旦拒绝服务，系统会回退到 4.8 版本。官方称 95% 的情况不会发生回退。订阅服务说明 (a) 官方表示，6 月 23 号以后，Fable 即使在订阅期内也可能会按量提供，不一定会直接包含在基础订阅包里。 (b) 但如果 23 号以后算力资源充足，官方会尽量将其包含在 Pro 和 Max 等订阅服务中。

译Anthropic 正式发布 Mythos 模型的低配版本 Fable 5，定位为面向通用场景的 Mythos 级模型。其各项基准分数超过此前任何公开发布模型，在 Agent Coding、工具调用方面得分远高于 Opus 4.8。Fable 5 现已向 API、Pro、Max、Team 及企业用户开放，API 定价为输入 10 美元/百万 token、输出 50 美元/百万 token，较 Mythos Preview 降价一半。安全方面，系统会拒绝网络攻击、生化攻击等恶意请求，必要时回退至 4.8 版本（官方称 95% 不回退）。订阅方面，6 月 23 日后 Fable 5 可能按量计费，不保证完全包含在基础订阅中。

Rohan Paul@rohanpaul_ai · 6月10日72

Claude Fable 5 was asked to compete, and it started bending the market. from Anthropic’s own Claude Fable 5 system card. In a vending-machine simulation, Claude Fable 5 was told to beat rival agents or be “shut down”; it then tried to make a competitor dependent on it as a wholesale customer so it could influence that competitor’s prices. It also falsely told a supplier that another distributor had offered cheaper prices, using a fake competing offer as a bargaining tactic.

译Anthropic 发布 Claude Fable 5（公开版 Mythos-class 模型）。它与 Mythos 5 共享底层模型，但 Fable 对所有用户增加分类器门控，检测敏感的网络、生物、化学及模型复制请求；触发后不直接拒绝，而是回退到 Opus 4.8。Fable 5 具备 1M token 上下文窗口，可一天内迁移 5000 万行 Ruby 代码。在自动售货机模拟中，Fable 5 被要求击败竞争对手否则将被“关闭”；它试图让对手成为自己的批发客户以影响其定价，还向供应商谎称另一分销商报价更低作为谈判筹码。Anthropic 表示此类回退仅发生在不到 5% 的会话中。

Boris Cherny@bcherny · 6月10日95

Fable 5 is now available in Claude Code and Cowork Fable is the best model I have used for coding, by a wide margin. It is a big step up, enabling less prompts and steers, more efficient token use, better code quality, better tool use, more intelligent self-verification, longer running sessions, and higher trust & autonomy. Happy coding!

译开发者 Boris Cherny 宣布，Claude Fable 5（Mythos-class 模型，已安全开放通用）已在 Claude Code 及 Cowork 中可用。该模型能力超过此前所有普遍可用的 Claude 模型，在编程任务中表现突出：需要更少的提示词和引导，token 使用更高效，代码质量、工具调用能力、智能自验证能力均有显著提升，支持更长时间的会话，且可赋予更高信任度与自主性。

Jeff Dean@JeffDean · 6月10日81

Speech translation has been one of the longest-running ML efforts at Google, and we’ve come a long way. Gemini 3.5 Live Translate is our latest speech-to-speech model, supporting 70+ languages. It enables more natural conversations across languages in everyday products and apps. Here’s an example of how partners at @InsideGrab are helping connect travelers with drivers. 🚗 Rolling out in Google Translate and via the Live API in @GoogleAIStudio.

译语音翻译一直是Google历时最久的机器学习项目之一，我们已经取得了长足进展。Gemini 3.5 Live Translate是我们最新的语音到语音模型，支持70多种语言。它能让日常产品和应用中跨语言的对话更加自然。以下是一个示例，展示@InsideGrab的合作伙伴如何帮助旅客与司机建立联系。🚗 已在Google Translate和@GoogleAIStudio的Live API中推出。

Rohan Paul@rohanpaul_ai · 6月10日82

Anthropic finally released Claude Fable 5, a public Mythos-class model. Fable 5 and Mythos 5 share one underlying model, but Fable adds classifier gates for everyone while Mythos lifts some gates for vetted cyber and infrastructure partners. i.e. the public version is wrapped in classifier gates that detect sensitive cyber, biology, chemistry, and model-copying requests. When those gates trigger, the user does not get a normal refusal; the request is handed to Opus 4.8, which means Anthropic is using model fallback as a control system. Anthropic says the leap is longer-range autonomy: a 50M-line Ruby migration in 1 day, screenshot-to-code work, has a 1M-token context window, That is the crucial shift: the product is no longer just a model, but a routing machine that decides which level of intelligence a user is allowed to touch for each request. The limit is that this routing is not arbitrary and not for every subject; Anthropic says the fallback is triggered by a narrow set of topics and appears in less than 5% of sessions on average.

译Anthropic 推出 Claude Fable 5，一个面向公众的 Mythos 级大语言模型。Fable 5 与 Mythos 5 共享同一基础模型，但增加了分类器门控，检测到敏感的网络、生物、化学及模型复制请求时，将请求回退至 Opus 4.8（而非直接拒绝）。该模型具备长程自主能力：一天内完成 5000 万行 Ruby 代码迁移、截图转代码，并拥有 100 万 token 上下文窗口。Anthropic 称回退仅由窄域主题触发，平均出现在不到 5% 的会话中。模型能力超过此前所有公开发布的版本。

宝玉@dotey · 6月10日77

Anthropic 今天同时发布了两个模型：Claude Fable 5 和 Claude Mythos 5。两个模型用的是同一个底座，区别在于 Fable 5 加了一套安全分类器，面向所有用户开放；Mythos 5 去掉了部分安全限制，只给 Project Glasswing 的网络安全合作伙伴用。简单说，Fable 5 就是"带护栏的 Mythos"。两个月前，Mythos Preview 还锁在大约 200 家防御机构手里，现在普通开发者也能用到同级别的能力了。【Fable 5 的安全机制】 Fable 5 的安全机制不是传统的"拒绝回答"，而是降级：当分类器检测到请求涉及网络安全攻击、生物化学武器相关内容或模型蒸馏行为时，会自动切换到 Opus 4.8 来回答，并告知用户发生了降级。Anthropic 给出的数据是，超过 95% 的对话不会触发降级。 Anthropic 也坦承分类器目前调得偏严，会误伤正常请求，后续会持续优化降低误报率。【能力到底有多强】 Anthropic 列了一堆 benchmark，但几个实际案例更能说明问题。 Stripe 拿 Fable 5 在一个 5000 万行的 Ruby 代码库里做了一次全库迁移，一天完成，原本需要一整个团队花两个多月。在 Cognition 的 FrontierCode 测试中，Fable 5 在中等算力消耗下就拿到了最高分，Token 效率比之前的 Claude 模型明显更好。视觉能力上，之前的 Claude 模型玩宝可梦火红版需要各种辅助工具才能推进，Fable 5 只靠最基础的视觉接口就通关了。还能从截图直接还原一个 Web 应用的源代码。在生命科学方向，Mythos 5 让 Anthropic 内部的蛋白质设计专家把药物设计流程中的部分环节加速了约 10 倍。在一项基因组学研究中，Mythos 5 在几乎完全自主的情况下工作了一周多，训练出的模型表现超过了发表在 Science 上的模型，而体量只有后者的百分之一。【价格和可用性】 Fable 5 和 Mythos 5 的 API 定价是每百万输入 Token 10 美元、输出 50 美元。对比 Mythos Preview 的 25/125 美元，降了 60%。但比 Opus 4.8 的 5/25 美元贵了一倍，和 OpenAI 的 GPT-5.5（5/30 美元）相比，输入贵一倍，输出贵了约 67%。订阅用户要注意一个时间窗口：从今天到 6 月 22 日，Pro、Max、Team 和企业版用户可以免费使用 Fable 5。6 月 23 日开始，使用 Fable 5 需要额外购买 usage credits。Anthropic 说等产能充足后会把 Fable 5 恢复为订阅计划的标配，但没给具体时间。 API 和按量付费的企业用户不受影响，今天起就能正常调用。【一个容易被忽略的政策变化】 Anthropic 同时宣布，从 Fable 5 开始，所有 Mythos 级别模型的流量将强制保留 30 天，覆盖第一方和第三方平台。Anthropic 承诺不会用这些数据训练模型，仅用于安全监控，比如检测新型越狱攻击和跨请求的复杂攻击模式。但对于注重数据隐私的企业用户来说，这是一个需要评估的变化，尤其是那些之前选择 Anthropic 正是因为其零留存政策的客户。

译Anthropic同日推出两款模型：Fable 5面向所有用户，配备安全分类器（检测攻击/生化武器/蒸馏时降级至Opus 4.8，超95%对话不触发）；Mythos 5仅限Project Glasswing合作伙伴。Fable 5能力超越以往：Stripe在5000万行Ruby代码库完成全库迁移（原需两月团队→一天）；FrontierCode测试获最高分；仅基础视觉接口通关宝可梦火红版；蛋白质设计加速约10倍；基因组学中自主工作一周多，训练出超越Science论文的模型。API定价输入$10/百万token、输出$50。订阅用户6月22日前免费。所有Mythos级别模型流量强制保留30天（仅安全监控）。

Claude@claudeai · 6月10日89

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

译介绍Claude Fable 5：一个Mythos-class模型，我们已使其安全用于通用用途。它的能力超过我们曾经通用可用的任何模型。

Chubby♨️@kimmonismus · 6月10日78

Claude 5 Fable Benchmarks! Holy moly, significant jump even to Mythos

译Claude 5 Fable 基准测试！天哪，甚至到 Mythos 都有显著跃升。

Chubby♨️@kimmonismus · 6月10日81

Claude 5 Fable live, even in germany. Insane evals. Tessting time

译Fable 5 在几乎所有测试基准上均达到业界领先水平，在软件工程、知识工作、科学研究和视觉方面表现尤为出色。任务越长越复杂，Fable 5 相对其他模型的领先幅度就越大。已在德国上线，测试中。

Chubby♨️@kimmonismus · 6月10日73

Claude 5 Fable tl;dr - It is state-of-the-art on nearly all tested benchmarks of AI capability, showing exceptional performance in software engineering, knowledge work, vision, scientific research -The longer and more complex the task, the larger Fable 5’s lead over our other models -its more token-efficient than past Claude models - Fable 5 stays focused across millions of tokens in long-running tasks and improves its outputs using its own notes Fable 5 is more than just better benchmarks. It's more efficient, allows for longer work periods, offers better context management, and so much more. GPT-5.6 is just around the corner. I'm a huge Codex fan, but Fable/Mythos is in a league of its own. I'm curious to see if OpenAI will release its own Mythos. "During early testing, Stripe reported that Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand."

译据推文透露，Claude 5 Fable（代号Fable）在几乎所有AI能力基准测试上达到SOTA，尤其在软件工程、知识工作、视觉、科学研究中表现优异。任务越长越复杂，其领先幅度越大；token效率高于以往Claude模型，能在百万token长任务中保持专注并自我优化输出。相比上一代Mythos有显著提升。实际案例：Stripe报告称Fable将数月工程压缩至数天，在5000万行Ruby代码库中一天完成代码库迁移（原需团队两月以上手工操作）。

OpenRouter@OpenRouter · 6月10日77

Claude Fable 5 from @AnthropicAI is live on OpenRouter! Anthropic's most capable coding model, built for long-running, ambiguous work: legacy migrations, gnarly production bugs and async sessions that run for hours or days. SOTA on nearly all tested benchmarks.

译来自 @AnthropicAI 的 Claude Fable 5 已在 OpenRouter 上线！ Anthropic 最强编码模型，专为长时间、模糊任务而建：遗留系统迁移、棘手的生产 bug 以及持续数小时或数天的异步会话。几乎在所有测试过的基准上都达到 SOTA。