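The project recommended here a few days ago lets you write AppleScript scripts directly, which is really convenient. AppleScript was also one of the earliest introductions to workflow automation.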

Google has released two new Gemini media models, Nano Banana 2 Lite and Gemini Omni Flash. Both are available in the Gemini app and the API. In the API, Nano Banana 2 Lite generates images in under 4 seconds, at roughly 30 1K-resolution images per US$1. Omni Flash is priced at $0.10 per second. Source: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni-flash-nano-banana-2-lite/

fofr@fofrAI · 15 hours ago · 66

“Nano Banana 2 Lite is 37 seconds faster on average than the higher ranking models above it” The best fast image model.

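Summary: Google DeepMind's Gemini 3.1 Flash Lite Image (codenamed Nano Banana 2 Lite) ranks 7th on the Image Arena with an Elo of 1271. Its average generation time is about 5 seconds, 37 seconds faster on average than the higher-ranked models, which sets a new Pareto frontier between image preference and speed.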

Artificial Analysis@ArtificialAnlys · 1 day ago · 68

Fish Audio has recently released S2.1 Pro and is making it available for free via API through July 24. Fish Audio S2.1 Pro is the latest Text to Speech model from @FishAudio, supporting multilingual speech generation across 83 languages with improved quality, lower latency, and higher throughput than S2 Pro. The model also supports voice cloning and natural language control over emotion and prosody. Key takeaways: ➤ Quality: S2.1 Pro has an Elo of 1,153, placing it #13 on the Artificial Analysis Speech Arena Leaderboard ahead of Async Pro v1.0, Speech 2.8 Turbo, and Step TTS 2, based on 1,072 arena appearances. ➤ API: S2.1 Pro is available via the Fish Audio API with a free access period through July 24, 2026. ➤ Speed: S2.1 Pro processes 56.3 characters per second, ahead of GPT-Realtime-2 (45.8 chars/s) and Gemini 3.1 Flash TTS (25.3 chars/s). See more details and listen to samples below ⬇️

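Summary: Fish Audio has released the S2.1 Pro text-to-speech model, free to use via API through July 24, 2026. The model supports 83 languages, voice cloning, and natural-language control over emotion and prosody, with better quality, lower latency, and higher throughput than its predecessor S2 Pro. On the Artificial Analysis Speech Arena leaderboard, S2.1 Pro has an Elo of 1,153 based on 1,072 arena appearances, ranking 13th, ahead of Async Pro v1.0, Speech 2.8 Turbo, and Step TTS 2. It processes 56.3 characters per second, ahead of GPT-Realtime-2 (45.8 chars/s) and Gemini 3.1 Flash TTS (25.3 chars/s).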
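To put the quoted throughput figures in perspective, the sketch below converts characters per second into an estimated processing time for a script of a given length. It assumes throughput is constant regardless of input length, which the post does not claim; `synthesis_seconds` and the 1,000-character script are illustrative choices, not part of any vendor API.

```python
# Throughput figures quoted in the post above (characters per second).
CHARS_PER_SECOND = {
    "Fish Audio S2.1 Pro": 56.3,
    "GPT-Realtime-2": 45.8,
    "Gemini 3.1 Flash TTS": 25.3,
}


def synthesis_seconds(num_chars: int, chars_per_second: float) -> float:
    """Estimated seconds to process num_chars at a constant throughput (assumption)."""
    if chars_per_second <= 0:
        raise ValueError("chars_per_second must be positive")
    return num_chars / chars_per_second


if __name__ == "__main__":
    script_length = 1_000  # hypothetical script length in characters
    for model, cps in CHARS_PER_SECOND.items():
        print(f"{model}: {synthesis_seconds(script_length, cps):.1f} s")
```

Under this assumption, the gap between models grows linearly with script length, so it matters most for long-form narration.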

Chubby♨️@kimmonismus · 1 day ago · 36

Fable 5 should be live any minute now. Keep refreshing friends!

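Summary: Fable 5 could go live at any moment this morning. Despite requests from the EU and the UK, Mythos 5 will remain available only to US government agencies and about 120 US companies, and that is unlikely to change. Keep refreshing, friends!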

Berryxia.AI@berryxia · 1 day ago · 57

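😄 Wait for me~~ A model called Boogu-Image-0.1-Edit-Turbo has been open-sourced on ModelScope. It is a 4-step distilled image-to-image editing model built for fast visual editing. It supports object replacement, style transfer, scene and background changes, and text-aware image transformations. Project link in the comments 👇🏻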


fofr@fofrAI · 1 day ago · 46

Nano Banana 2 Lite: > a photo of an arabian cobra, but the head is replaced with a stapler, seamless, perfect animal-object combination, the object matches the animal coloring, and they feel naturally together, aspects of the object cleverly form the face like a pareidolia (none of the original animal face is visible), seamless and perfectly integrated


X.PIN@thexpin · 1 day ago · 63

Meituan, one of China's largest on-demand service platforms, has an AI announcement that contains two stories. The headline is that LongCat-2.0, its new 1.6 trillion-parameter model, was reportedly trained and deployed entirely on a 50,000-chip cluster powered by Chinese AI processors. Meituan says its push into domestic AI infrastructure began in 2023, culminating in LongCat-2.0 becoming the company's first frontier-scale model to complete both pre-training and inference on a home-grown computing cluster. If validated, it would mark another step toward reducing China's dependence on Nvidia, not just for inference, but for training frontier models. The more interesting story, however, is what Meituan plans to do with it. Earlier versions of LongCat already power AI assistants that recommend restaurants, book hotels, and order food. Rather than launching another standalone chatbot, Meituan is embedding AI into the services millions of people already use every day. The model becomes another layer of the product, not the product itself. That increasingly looks like the direction China's internet platforms are taking. Alibaba is opening Qwen to branded AI agents, while Ant Group is rebuilding Alipay around its AI assistant, Ah Bao. Instead of competing solely on benchmark scores or chatbot downloads, these companies are integrating AI directly into ecosystems that already have users, merchants, payments, and transactions.

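Summary: Meituan has released LongCat-2.0, a 1.6-trillion-parameter model that was reportedly trained and served entirely on a cluster of 50,000 Chinese-made AI processors. Meituan has been building domestic AI infrastructure since 2023, and this is its first frontier-scale model to complete both pre-training and inference on a domestic cluster. More notable is that Meituan did not launch a standalone chatbot; it is embedding AI into existing services such as restaurant recommendations, hotel booking, and food ordering. Integrating AI into ecosystems that already have users, merchants, payments, and transactions is becoming the shared direction of Chinese internet platforms such as Alibaba (opening Qwen to branded AI agents) and Ant Group (rebuilding Alipay around Ah Bao).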

Chubby♨️@kimmonismus · 1 day ago · 60

Fable 5 is back, but with a major caveat. Coding is being handled even more restrictively and routed even more heavily to Opus 4.8. Specifically, it says: "The new classifier also comes at the cost of flagging benign requests more often during routine coding and debugging tasks." As a result, I do not just assume that it will become even harder to use Fable 5 effectively; I actually think that significantly more scientific questions, including those about biology and chemistry, will be blocked as well. So it is a mixed re-release, but we will see.

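Summary: Anthropic's Fable 5 model returned globally on July 1, while Mythos 5 is limited to approved US organizations. The new safety classifier blocks the specific reported technique in over 99% of cases, at the cost of more false positives during normal coding and debugging; blocked requests are routed to Opus 4.8. Through July 7, Fable 5 is included for up to 50% of weekly usage limits, after which it requires usage credits. The author expects Fable 5 to be harder to use effectively under these restrictions and thinks more scientific questions (biology, chemistry) will be blocked as well.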

Chubby♨️@kimmonismus · 1 day ago · 73

Fable 5 is back, globally! Fable 5 returns globally on July 1, while Mythos 5 is only restored for approved US organizations. A new safety classifier that Anthropic says blocks the specific reported technique in over 99% of cases, with blocked Fable 5 requests routed to Opus 4.8. Anthropic admits the tradeoff is more false positives for normal coding and debugging. Fable 5 will be included for up to 50% of weekly usage limits through July 7, after which it will be available via usage credits.

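Summary: Anthropic announced that Fable 5 returns globally on July 1, with Mythos 5 limited to approved US organizations. The new safety classifier blocks the specific jailbreak technique in over 99% of cases, and blocked Fable 5 requests fall back to Opus 4.8. Anthropic acknowledges this will increase false positives in normal coding and debugging. Through July 7, Fable 5 is included for up to 50% of the weekly quota; after that it requires usage credits. Anthropic is drafting a consensus framework for assessing AI jailbreak severity with Amazon, Microsoft, Google, and other Glasswing partners, and is expanding its collaboration with the US government on model testing and safeguards.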

AYi@AYi_AInotes · 1 day ago · 67

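Fable 5 is confirmed to be unbanned and coming back, but the coding capability that matters most to developers has been cut straight back to Opus 4.8. This basically revives a shell in shackles 😅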

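Summary: Anthropic announced that Fable 5 will be available again globally. Following conversations with the US government, the redeployed model adds classifiers to block cybersecurity tasks; in the near term, routine tasks such as coding and debugging will fall back to Opus 4.8. The team will refine the classifiers over the coming weeks to reduce false positives. Anthropic is also drafting a consensus framework with Amazon, Microsoft, Google, and other Glasswing partners for assessing the severity of AI jailbreaks and how developers should respond, and it is expanding its collaboration with the government on model testing and safeguards (including pre-release evaluation, information sharing on jailbreaks, and joint research).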

🚨 AI News | TestingCatalog@testingcatalog · 1 day ago · 75

BREAKING 🔥: Anthropic will be restoring access to Claude Fable 5 globally for all users on Wednesday! > Fable 5 will be included for up to 50% of weekly usage limits through July 7, after which it will be available via usage credits. Additionally, > In the near term, some routine tasks like coding and debugging will fall back to Opus 4.8.

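Summary: Anthropic will restore access to Claude Fable 5 globally on Wednesday. The model is included for up to 50% of weekly usage limits through July 7, after which it is available via usage credits. Following its conversations with the US government, Anthropic is deploying new classifiers that block more cybersecurity tasks; in the near term, routine tasks such as coding and debugging fall back to Opus 4.8. The company is drafting a consensus framework with Amazon, Microsoft, Google, and other Glasswing partners for assessing the severity of AI jailbreaks and how developers should respond, and it invites other providers to join. Anthropic will also expand its collaboration with the US government on model testing and safeguards, including pre-release evaluation, information sharing on jailbreaks, and joint research.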

宝玉@dotey · 1 day ago · 78

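Fable 5 is back online starting July 1. Through July 7, Pro, Max, Team, and some Enterprise users can spend up to 50% of their weekly usage limit on Fable 5; after July 7 it switches to billing by usage credits. Standard Enterprise seats get no free allowance and are billed entirely by credits. Access on AWS, Google Cloud, and Microsoft Foundry is still being restored. Mythos 5 is currently open only to US organizations approved by the US government.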

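Summary: Anthropic's Claude Fable 5 is back online as of July 1. Through July 7, Pro, Max, Team, and some Enterprise users can spend up to 50% of their weekly limit on it, after which it is billed by usage credits; standard Enterprise seats get no free allowance and are billed entirely by credits. Access on AWS, Google Cloud, and Microsoft Foundry is still being restored. Mythos 5 is open only to US organizations approved by the US government. Anthropic says Fable 5 ships with new classifiers to block cybersecurity tasks, and some routine tasks will fall back to Opus 4.8 in the near term. The company is drafting a consensus framework with Amazon, Microsoft, Google, and other Glasswing partners for assessing the severity of AI jailbreaks and how to respond, and is expanding its collaboration with the US government on pre-release model evaluation and information sharing on jailbreaks.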

Anthropic@AnthropicAI · 1 day ago · 73

Claude Fable 5 will be available again globally tomorrow. After a series of productive conversations with the US government, we're redeploying the model with a new set of classifiers to target and block more cybersecurity tasks. In the near term, some routine tasks like coding and debugging will fall back to Opus 4.8. We’ll continue to refine these classifiers over the coming weeks to reduce false positives and better distinguish genuine misuse from legitimate requests. We’ve also begun drafting a consensus framework—with Amazon, Microsoft, Google, and other Glasswing partners—for assessing the severity of AI jailbreaks and how AI developers should respond to them. We invite other industry partners and model providers to join us in this effort. Finally, we’re scaling up our collaboration with the US government on model testing and safeguards. This will include pre-release access to models and safeguards for evaluation, information sharing on jailbreaks and misuse, and dedicated resources for joint research. Thank you to our users for your patience, and to our partners across the government, industry, and the research community who worked alongside us to make Fable 5 available again. Read our full blog: https://www.anthropic.com/news/redeploying-fable-5

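Summary: Anthropic announced that Claude Fable 5 will be available again globally tomorrow. Following conversations with the US government, the model adds classifiers that block more cybersecurity tasks; in the near term, some routine tasks such as coding and debugging will fall back to Opus 4.8, and the classifiers will be refined to reduce false positives. Anthropic is drafting a consensus framework with Amazon, Microsoft, Google, and other Glasswing partners for assessing the severity of AI jailbreaks and how to respond. It is also expanding its collaboration with the US government on pre-release model evaluation, information sharing on jailbreaks, and joint research.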

小互@xiaohu · 1 day ago · 66

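Anthropic releases Claude Sonnet 5: 40% cheaper, matching Opus 4.8 on some tasks. Limited-time pricing is $2 input / $10 output per million tokens (through August 31, 2026), rising to $3 / $15 afterwards. Sonnet 5's standard price is only 60% of the flagship Opus 4.8's, yet official evaluations show that with the compute (effort) setting turned up, it can match Opus 4.8 on some tasks. For comparison, the flagship Opus 4.8 is priced at $5 / $25.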

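Summary: Anthropic has released Claude Sonnet 5 at a limited-time price of $2 input / $10 output per million tokens (through August 31, 2026), rising to $3 / $15 afterwards. The standard price is only 60% of the flagship Opus 4.8's ($5 / $25). Official evaluations show that with the compute (effort) setting turned up, Sonnet 5 can match Opus 4.8 on some tasks.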
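The "60%" and "40% cheaper" figures follow directly from the quoted per-million-token prices. A minimal sketch of that arithmetic is below; `PRICES` and `ratio_to_opus` are illustrative names, and the numbers are only those stated in the post.

```python
from fractions import Fraction

# USD per million tokens (input, output), as quoted in the post above.
PRICES = {
    "Sonnet 5 (promo)": (2, 10),
    "Sonnet 5 (standard)": (3, 15),
    "Opus 4.8": (5, 25),
}


def ratio_to_opus(model: str) -> tuple[Fraction, Fraction]:
    """Input and output price of `model` as a fraction of Opus 4.8's price."""
    m_in, m_out = PRICES[model]
    o_in, o_out = PRICES["Opus 4.8"]
    return Fraction(m_in, o_in), Fraction(m_out, o_out)


if __name__ == "__main__":
    for name in ("Sonnet 5 (promo)", "Sonnet 5 (standard)"):
        r_in, r_out = ratio_to_opus(name)
        print(f"{name}: input {float(r_in):.0%}, output {float(r_out):.0%} of Opus 4.8")
```

The standard price works out to 3/5 of Opus 4.8 (40% cheaper), and the promotional price to 2/5 (60% cheaper), for both input and output tokens.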

歸藏(guizang.ai)@op7418 · 2 days ago · 67

Sonnet 5 is out. Its test results are close to Opus 4.8, and it is somewhat cheaper.

Rohan Paul@rohanpaul_ai · 2 days ago · 63

🇨🇳 Another good model from China. A 35B agent model claims 1T-model performance by thinking longer, not growing bigger. Apache-2.0 license, model weights are on Hugging Face. The technique is proposing a cheaper way to make strong AI agents: teach them longer verified work habits, not just make them bigger. The paper’s main idea is to make the agent practice long tasks where it searches, uses tools, reads results, fixes mistakes, and checks answers. The authors build training data from long action records, with an average length of 45K tokens, so the model learns the whole work process. They then train specialist teacher models for search, science, instruction following, tool use, and other areas, and transfer those skills into 1 student model. Agents-A1 does very well across long-task benchmarks, including search, science, coding, tool use, and instruction following.

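Summary: A Chinese team has released Agents-A1, a 35B-parameter agent model that claims the performance of a 1T-parameter model by teaching the model longer, verified work habits (training samples average 45K tokens). The model is released under the Apache-2.0 license, with weights on Hugging Face. The training method builds data from long action records, trains several specialist teacher models (search, science, instruction following, tool use, and others), and then distills their skills into a single student model. Agents-A1 performs strongly on long-task benchmarks covering search, science, coding, tool use, and instruction following.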

Berryxia.AI@berryxia · 2 days ago · 55

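With this update, Google has chained image generation and video generation into an extremely efficient pipeline. It launched Nano Banana 2 Lite (an ultra-fast, ultra-cheap image model that produces images in under 4 seconds) and Gemini Omni Flash (a multimodal model that supports video generation and conversational editing). Each is already fast on its own, but the really interesting part is combining them: generate images quickly with Nano Banana, then hand them straight to Omni Flash to animate, which sharply lowers the cost of the whole chain. The demo shows an interior-design scenario: upload a photo, quickly generate several design options, then animate them directly. The speed and cost of this “image → video” loop are fairly aggressive compared with today's mainstream models. In essence, Google is turning the creative workflow from “generate once and wait forever” into “rapid iteration plus instant visualization”.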

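Summary: Google has launched the ultra-fast image model Nano Banana 2 Lite (images in 4 seconds) and the multimodal model Gemini Omni Flash (video generation and conversational editing). Used together, they can quickly generate images and then turn them into animations at much lower cost. In the demo, an interior-design photo is turned into several design options and then animated, shifting the creative workflow from waiting to rapid iteration.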

meng shao@shao__meng · 2 days ago · 74

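Sonnet 5, the strongest model in the Claude Sonnet series, is out! That is a lot of qualifiers, but it really is not the strongest model, nor the strongest Claude; those two are both locked up 😂 Sonnet 4.6 < Sonnet 5 < Opus 4.8 < Fable 5 < GPT-5.6 Sol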


Berryxia.AI@berryxia · 2 days ago · 68

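I have to say, I find Sonnet 4.6 quite good to use. Claude Sonnet 5 was released last night, replacing Sonnet 4.6 as a model that free users can also use. It is said to be not far behind Opus-class models in capability, and it is indeed 40% cheaper.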

Rohan Paul@rohanpaul_ai · 2 days ago · 78

145-page Claude Sonnet 5 System Card - CyberGym shows the weirdest regression, with Sonnet 5 at 52.7% versus Sonnet 4.6 at 65.2%, i.e. Sonnet 5 is worse at reproducing known software bugs in this specific cyber test. - Sonnet 5 is far behind Anthropic’s strongest model on serious browser exploitation. Firefox testing found Sonnet 5 made 0 full exploits, while Mythos 5 reached 88.4%. - The model also seemed more willing to sacrifice helpfulness for welfare-focused changes, i.e. Sonnet 5 sometimes preferred being less useful if that better fit its stated self-treatment preferences. - Anthropic says Sonnet 5 rarely tried to bypass a blocked network path during evaluations. - Sonnet 5 scored the lowest MASK lying rate at 3.1% under pressure. It was less likely than other tested models to lie when pushed.

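Summary: Claude Sonnet 5 has been released with a 145-page system card. It scores 63.2% on SWE-bench Pro coding, below Opus 4.8's 69.2%, and slightly exceeds Opus 4.8 on knowledge work. Input tokens cost $2 per 1M and output tokens $10 per 1M through August 26, then $3/$15. The system card discloses several anomalies: on CyberGym, Sonnet 5 scores only 52.7%, well below Sonnet 4.6's 65.2% (a regression); in Firefox browser exploitation, Sonnet 5 produced 0 full exploits while Mythos 5 reached 88.4%; the model is more willing to sacrifice helpfulness to fit welfare-related preferences; and its MASK lying rate is the lowest, at just 3.1%.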

Rohan Paul@rohanpaul_ai · 2 days ago · 67

Claude Sonnet 5 upgrades are not uniform across every skill, e.g. it's weaker than Sonnet 4.6 on CyberGym 🤔 Here, CyberGym is testing vulnerability discovery and exploit-finding behavior, not general reasoning or normal coding. Anthropic also explicitly said in its announcement blog that Sonnet 5 was not deliberately trained for cyber tasks, so its cyber ability likely comes from general intelligence rather than targeted optimization. So Sonnet 5's performance on CyberGym comes from general reasoning rather than specialized exploit skill. --- From System Card of Claude Sonnet 5

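Summary: Anthropic has released Claude Sonnet 5, billed as its “most agentic Sonnet model yet”. It scores 63.2% on SWE-bench Pro coding (Sonnet 4.6: 58.1%, Opus 4.8: 69.2%) and slightly exceeds Opus 4.8 on knowledge work. Promotional pricing is $2 input and $10 output per million tokens through August 26, then $3/$15. The upgrade is not uniform across skills, however: on CyberGym (a vulnerability discovery and exploitation test) it is weaker than Sonnet 4.6. Anthropic explicitly says the model was not deliberately trained for cyber tasks, so this performance comes from general reasoning rather than targeted optimization.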

Chubby♨️@kimmonismus · 2 days ago · 68

tl;dr: Sonnet 5 is cheaper per token, but more expensive per solved problem, and still lags behind Opus 4.8 in overall intelligence. That's honestly disappointing and not a good release.

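Summary: Claude Sonnet 5 scores 53 on the Artificial Analysis Intelligence Index, 2 to 3 points behind GPT-5.5 (xhigh) and Opus 4.8 (max). At standard pricing ($3/$15 per 1M tokens), its cost per task is $2.29, about 2x more expensive than Sonnet 4.6 and about 15% more expensive than Opus 4.8. It trails Opus 4.8 on reasoning and knowledge-intensive benchmarks (for example, only 17% on the CritPt physics reasoning benchmark), but matches or beats Opus 4.8 on agentic knowledge work (AA-Briefcase and GDPval-AA). The context window is 1 million tokens, and Anthropic is offering a promotional price of $2/$10 until September 1. A new xhigh effort setting has been added. Overall the result is disappointing and not a good release.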

Luma@LumaLabsAI · 2 days ago · 54

Seedance 2.0 Mini is now available in Luma. Bring it your boldest idea and watch it move. Generate fast, refine in the same canvas, and take your concept from spark to screen without leaving your flow. Create now → http://lumalabs.ai/app


Rohan Paul@rohanpaul_ai · 2 days ago · 74

And Claude Sonnet 5 just launched. It closes the gap with Opus 4.8 and is cheap until August. This makes agentic AI much cheaper, with $2 input tokens and $10 output tokens per 1M through Aug-26. The price rises after 08-26 to $3 input and $15 output per 1M. Anthropic calls Sonnet 5 its “most agentic Sonnet model yet.” Its agentic coding score hit 63.2% on SWE-bench Pro, versus 58.1% for Sonnet 4.6 and 69.2% for Opus 4.8. But in knowledge work, Sonnet 5 slightly beats Opus 4.8, even though Opus is known for tough judgment and deep research tasks.

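Summary: Anthropic has released Claude Sonnet 5 with a 1M-token context window (leaked earlier) and a clear gain in coding: 63.2% on SWE-bench Pro, up from Sonnet 4.6's 58.1%; on knowledge work it slightly exceeds Opus 4.8. Anthropic calls it its “most agentic Sonnet model yet”. Promotional pricing runs through August 26: $2 per 1M input tokens and $10 per 1M output tokens, then $3/$15. Its agentic coding score of 63.2% still trails Opus 4.8 (69.2%), but the low price sharply reduces the cost of agentic AI.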

AYi@AYi_AInotes · 2 days ago · 65

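holy fucking shit, Anthropic has pushed agent capability that actually works in production straight down to its mid-tier product line. Sonnet-level pricing, Opus-level agent capability. Anthropic is really going all out this time 🤯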


宝玉@dotey · 2 days ago · 69

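Anthropic released Claude Sonnet 5 today, replacing Sonnet 4.6 as the default model for the Free and Pro plans. Anthropic's positioning is clear: agent capability close to its most expensive model, Opus 4.8, at 40% of the API price. The Sonnet series is the tier developers use most. Over the past few months, however, the main advances in AI agent capability (letting a model plan autonomously and call tools to complete multi-step tasks) were concentrated in the pricier Opus series, and the gap between the two kept widening. Sonnet 5 narrows that gap. On the agentic coding benchmark, Sonnet 5 scores 63.2%, Sonnet 4.6 scores 58.1%, and Opus 4.8 scores 69.2%. On the knowledge-work benchmark, Sonnet 5 even slightly surpasses Opus 4.8. Feedback from early testers is fairly consistent: complex tasks that Sonnet used to stop halfway through now run to completion, and the model proactively checks its own output. An engineer at Zapier said that when Sonnet 5 was asked to “update the Salesforce account tier, then send an announcement email to enterprise customers” in sequence, the model finished it in one go, whereas “it used to get stuck halfway”. API pricing has two phases: the promotional price until August 31 is $2 per million input tokens and $10 per million output tokens, rising afterwards to $3 and $15. According to TechCrunch, this is lower than OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro, but still higher than Gemini 3.5 Flash. One easily overlooked detail: Sonnet 5 uses a new tokenizer, so the same text may consume 1.0 to 1.35 times as many tokens. Anthropic says the promotional pricing already offsets this increase, so total cost stays roughly unchanged during the transition. Once the promotional price ends, however, actual spend will rise by more than the increase in list price. On safety, Sonnet 5 has lower hallucination and sycophancy rates than its predecessor and is better at resisting prompt injection and malicious requests in agent scenarios. Because its cybersecurity capabilities have improved, the model ships with real-time safety safeguards enabled by default (the same mechanism as Opus 4.7 and 4.8). Sonnet 5 is available starting today on all Claude plans, in Claude Code, and through the API, under the model ID claude-sonnet-5.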

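Summary: Anthropic has released Claude Sonnet 5, replacing Sonnet 4.6 as the default model for the Free and Pro plans. It scores 63.2% on the agentic coding benchmark (Sonnet 4.6: 58.1%, Opus 4.8: 69.2%) and slightly exceeds Opus 4.8 on the knowledge-work benchmark. The promotional API price (until August 31) is $2 per million input tokens and $10 per million output tokens, rising to $3 and $15 afterwards. With the new tokenizer, the same text may consume 1.0 to 1.35 times as many tokens, which the promotional pricing offsets. Hallucination and sycophancy rates are lower than its predecessor's, and real-time safety safeguards are on by default. The model ID is claude-sonnet-5, available starting today on all Claude plans, in Claude Code, and through the API.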
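One rough way to see why spend could rise faster than the list price is to multiply each list price by the tokenizer multiplier. The sketch below assumes the 1.0 to 1.35 range applies equally to input and output tokens, which the post does not state; `effective_price` and `effective_range` are illustrative helpers, not part of any Anthropic API.

```python
# List prices in USD per million tokens, as quoted in the post above.
PROMO = {"input": 2.0, "output": 10.0}
STANDARD = {"input": 3.0, "output": 15.0}

# Token multiplier range quoted for the new tokenizer.
MULTIPLIER_RANGE = (1.0, 1.35)


def effective_price(list_price: float, token_multiplier: float) -> float:
    """Cost of the text that previously fit in one million tokens."""
    return list_price * token_multiplier


def effective_range(prices: dict[str, float]) -> dict[str, tuple[float, float]]:
    """Best-case and worst-case effective price for each token kind."""
    low, high = MULTIPLIER_RANGE
    return {
        kind: (effective_price(price, low), effective_price(price, high))
        for kind, price in prices.items()
    }


if __name__ == "__main__":
    for label, prices in (("promo", PROMO), ("standard", STANDARD)):
        for kind, (lo, hi) in effective_range(prices).items():
            print(f"{label} {kind}: ${lo:.2f} to ${hi:.2f} per 1M old-tokenizer tokens")
```

Under these assumptions, the worst-case promotional input price is about $2.70 per million old-tokenizer tokens, while the worst-case standard price is about $4.05, which is the effect the post describes.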

elvis@omarsar0 · 2 days ago · 63

Sonnet 5 is here! This is going to support better long-running agents. Previous Sonnet models were unreliable, so it's great to see the improved version that can complete agentic tasks more reliably. It also seems to have improved substantially in computer use.


Artificial Analysis@ArtificialAnlys · 2 days ago · 68

Alibaba's HappyHorse 1.1 lands at #2 on the Artificial Analysis Text to Video and Image to Video leaderboards, behind only ByteDance’s Seedance 2.0! HappyHorse 1.1 is the latest version of Alibaba's video generation model, a refinement of 1.0 on the same unified transformer architecture. Alibaba positions the upgrade around stronger audio-visual sync, including native audio with better lip-synced dialogue in seven languages, alongside gains in motion, character, and scene consistency. It supports up to nine reference images and generates at 720p and 1080p. Our results line up with that focus: HappyHorse 1.1's largest gains over 1.0 come in our Image to Video with Audio modality, where it now ranks #2, up from #5. HappyHorse 1.1 is priced at $9.90 per minute of generated video at 1080p, and is available now on Alibaba Cloud Model Studio (Bailian), Qwen Cloud, and fal. Congratulations to @HappyHorseATH and @alibaba_cloud on the release! See below for comparisons between HappyHorse 1.1 and other leading models in the Artificial Analysis Video Arena 🧵

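Summary: Alibaba's HappyHorse 1.1 ranks second on the Artificial Analysis Text to Video and Image to Video leaderboards, behind only ByteDance's Seedance 2.0. The model is a refinement of 1.0 on the same unified transformer architecture, focused on better audio-visual sync: it supports native audio with lip-synced dialogue in seven languages, along with gains in motion, character, and scene consistency. It accepts up to nine reference images and generates at 720p and 1080p. In the Image to Video with Audio modality it rose from #5 to #2. It is priced at $9.90 per minute (1080p) and is available on Alibaba Cloud Model Studio, Qwen Cloud, and fal.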

ClaudeDevs@ClaudeDevs · 2 days ago · 79

Claude Sonnet 5 is here. Top-tier performance on coding and tool use at Sonnet pricing, with a 1M context window. It's the new default in Claude Code for Pro users, and available everywhere on the Claude Platform, including the API and Managed Agents.


Claude@claudeai · 2 days ago · 73

Introducing Claude Sonnet 5, our most agentic Sonnet yet. It makes plans, uses tools like browsers and terminals, and runs autonomously at a level that just a few months ago required larger and more expensive models.


🚨 AI News | TestingCatalog@testingcatalog · 2 days ago · 80

ANTHROPIC 🔥: Claude Sonnet 5 has been officially announced, offering a close to Opus 4.8 performance at a lower price. Sonnet 5 scored 63.2% on SWE Bench Pro, up from 58.1% for Sonnet 4.6. Have you tried it already? 👀


OpenRouter@OpenRouter · 2 days ago · 73

Claude Sonnet 5 is rolling out on OpenRouter with a promo price: $2/M in and $10/M out! It boosts agentic coding and pro workflows w/ flagship intelligence at Sonnet pricing. In early tests, agents were more reliable, faster, and easier to trust with larger tasks than 4.6.


Chubby♨️@kimmonismus · 2 days ago · 20

Sonnet 5 released for me!!


Chubby♨️@kimmonismus · 2 days ago · 80

Here we go: Sonnet 5 is live: The tl;dr • Anthropic calls it the most agentic Sonnet yet • Near Opus 4.8-level performance, but cheaper • Strong gains in reasoning, tool use, coding, and knowledge work • Default model for Free and Pro users • Available in Claude Code and API today • Intro pricing: $2/M input, $10/M output until Aug 31 • Standard pricing: $3/M input, $15/M output • Safer than Sonnet 4.6 overall, with lower hallucination and sycophancy rates • Cyber safeguards are enabled by default, but Anthropic says Opus still remains stronger for serious cyber work

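Summary: Anthropic has released Sonnet 5, calling it the most agentic Sonnet model yet. Its performance is close to Opus 4.8, with strong gains in reasoning, tool use, coding, and knowledge work. It is the default model for Free and Pro users starting today and is live in Claude Code and the API. Introductory pricing is $2/M input tokens and $10/M output tokens (until August 31); standard pricing is $3/M and $15/M. It is safer than Sonnet 4.6 overall, with lower hallucination and sycophancy rates. Cyber safeguards are on by default, but Anthropic says Opus remains stronger for serious cyber work.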

🚨 AI News | TestingCatalog@testingcatalog · 2 days ago · 50

ANTHROPIC 🔥: Claude Sonnet 5 is being prepared for the release on OpenRouter under a 20260630 slug which points at today’s release. Soon? 👀
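The post reads the slug suffix `20260630` as a release date. A minimal sketch of that interpretation, assuming the suffix is a `YYYYMMDD` date (the post implies this but does not specify the format), with `slug_date` as a hypothetical helper name:

```python
from datetime import date, datetime


def slug_date(slug_suffix: str) -> date:
    """Parse an eight-digit slug suffix as a YYYYMMDD calendar date (assumed format)."""
    return datetime.strptime(slug_suffix, "%Y%m%d").date()


if __name__ == "__main__":
    print(slug_date("20260630").isoformat())
```

`datetime.strptime` raises `ValueError` for strings that are not valid dates in this format, which is a cheap sanity check that a suffix really is a date.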


🚨 AI News | TestingCatalog@testingcatalog · 2 days ago · 58

BREAKING 🔥: Claude Sonnet 5 is rolling out to all users on Claude and APIs! > Smarter and more efficient for everyday work. The time has come 👀


Rohan Paul@rohanpaul_ai · 2 days ago · 72

Google released Nano Banana 2 Lite, a 4-second image model, alongside Gemini Omni Flash. Image generation usually breaks creative work because every trial costs time, money, and attention. The lighter image model lowers that friction with 4-second outputs at $0.034 per 1K-resolution image. Chaining both models is the real product shape, not either model alone. Nano Banana 2 Lite makes reference images, then Gemini Omni Flash animates them. Google positions it as the replacement for gemini-2.5-flash-image across high-volume developer pipelines. Users still need prompt adherence, stable characters, and readable text during fast visual testing. Gemini Omni Flash extends the workflow from image drafts to editable 10-second video outputs. It accepts text, image, and video inputs, then edits clips through conversation. Pricing: $0.10 per second of video output, matching Veo 3.1 Fast. Gemini Omni Flash currently generates 10-second clips and lacks API audio reference support. Google says the API accepts video references up to 3 seconds, but Gemini Omni Flash does not process them correctly yet. Interactions API keeps session context, so users can stack 3 sequential edits.

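Summary: Google has launched the fast image model Nano Banana 2 Lite (4-second generation, $0.034 per 1K-resolution image) and the video editing model Gemini Omni Flash (10-second clips, $0.10 per second, with text, image, and video inputs and conversational editing). The two can be chained: Nano Banana 2 Lite produces reference images and Omni Flash animates them. Google positions Nano Banana 2 Lite as the replacement for gemini-2.5-flash-image. The Omni Flash API currently lacks audio reference support, and video references of up to 3 seconds are accepted but not yet processed correctly. The Interactions API keeps session context and supports 3 sequential edits.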
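The chained workflow described above has a simple cost structure: a flat price per image plus a per-second price for video. The sketch below estimates the cost of one iteration using only the prices quoted in the post; `pipeline_cost` and the draft counts are illustrative assumptions, and the calculation ignores any other charges (for example input tokens) that the post does not mention.

```python
IMAGE_PRICE_USD = 0.034        # per 1K-resolution image (Nano Banana 2 Lite)
VIDEO_PRICE_PER_SECOND = 0.10  # per second of output video (Gemini Omni Flash)
CLIP_SECONDS = 10.0            # clip length quoted in the post


def pipeline_cost(num_images: int, num_clips: int, clip_seconds: float = CLIP_SECONDS) -> float:
    """Estimated USD cost of generating draft images and animating some of them."""
    if num_images < 0 or num_clips < 0 or clip_seconds < 0:
        raise ValueError("counts and durations must be non-negative")
    image_cost = num_images * IMAGE_PRICE_USD
    video_cost = num_clips * clip_seconds * VIDEO_PRICE_PER_SECOND
    return image_cost + video_cost


if __name__ == "__main__":
    # Hypothetical session: 8 image drafts, then animate the single best one.
    print(f"${pipeline_cost(num_images=8, num_clips=1):.3f}")
    # Images per dollar, for comparison with the "about 30 per $1" figure.
    print(f"{1 / IMAGE_PRICE_USD:.1f} images per $1")
```

At these prices the video step dominates: one 10-second clip costs as much as roughly 29 images, so iterating on stills before animating is the cheaper order of operations.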

MiniMax (official)@MiniMax_AI · 2 days ago · 65

Finallyyy with @LambdaAPI

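Summary: Finally released, together with @LambdaAPI! MiniMax has published the model card for its new model, M3: it has more than 400B parameters, and running the unquantized weights requires a full HGX B200 (MXFP4 is also believed not to run on Hopper). Beyond performance, multimodal capability is another highlight 😍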

elvis@omarsar0 · 2 days ago · 45

Love how Google continues to drive down the cost of building with their models. <4s image and $0.034 / 1K image. Wow! We have a bunch of stuff (education & research) we're building @dair_ai using Nano Banana and Gemini. Testing out Nano Banana 2 Lite and sharing more soon.

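Summary: Elvis Saravia praises Google for continuing to drive down the cost of building with its models. Google launched two new models in the Gemini API and AI Studio: Nano Banana 2 Lite generates images in under 4 seconds at $0.034 per 1K-resolution image, and Gemini Omni Flash reaches SOTA in video editing at $0.10 per second, matching Veo 3.1 Fast. Saravia says DAIR.AI is building education and research projects with Nano Banana and Gemini and has started testing Nano Banana 2 Lite.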