OpenAI’s new GPT-5.5-Cyber just beat Mythos 5 on CyberGym. CyberGym measures whether an agent can reproduce known software vulnerabilities, so this is a strong signal of the model's usefulness for defensive vulnerability analysis. OpenAI also launched a major push to pair GPT-5.5-Cyber with human security teams to fix open source bugs before AI bug-hunting tools flood maintainers with low-quality reports. Vulnerability discovery is becoming much easier, so the scarce part is now remediation: confirming the bug, proving reachability, writing a fix, testing it, and giving humans enough evidence to merge safely. Daybreak is OpenAI’s new cybersecurity initiative to help trusted defenders find, verify, and patch vulnerable software much faster using AI. Within it, GPT-5.5-Cyber acts as a defensive security worker inside Codex: it scans code, checks whether a vulnerability is real and reachable, writes a patch, tests the patch, and gives humans evidence to approve it. The new GPT-5.5-Cyber checkpoint and the accompanying programs are all part of the company's limited “Trusted Access for Cyber” program and do not involve a public release.

Summary: OpenAI's new model GPT-5.5-Cyber beat Mythos 5 on the CyberGym benchmark, which tests an AI agent's ability to reproduce known software vulnerabilities, a strong signal for defensive vulnerability analysis. OpenAI is simultaneously expanding its Daybreak initiative, which includes: the Codex Security plugin (finds, verifies, and fixes vulnerabilities inside Codex); the full version of GPT-5.5-Cyber (for trusted defenders); the Cyber Partner Program (enabling security companies to build security products on OpenAI capabilities); and Patch the Planet (working with maintainers to protect critical open source projects). This round's model and programs fall under the “Trusted Access for Cyber” program and are not publicly released. OpenAI aims to use GPT-5.5-Cyber as a defensive security worker inside Codex that automatically scans code, confirms vulnerabilities are real and reachable, and writes and tests patches.

Tibo@thsottiaux · Jun 23 · 57

Let's Patch The Planet. Updates to codex security and a new GPT-5.5-Cyber. A day of celebration for cyber defense acceleration. https://openai.com/index/daybreak-securing-the-world/

elvis@omarsar0 · Jun 23 · 52

Guess which is Fugu Ultra? This is how recent models compare when generating endless procedural terrain (using Three.js). All of these are one-shotted! Just wild! Trying a few more examples. Will share soon!

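Summary: Sakana AI launched Fugu, a multi-agent orchestration system accessible through a single model API. Its 'Fugu Ultra' model matches Fable and Mythos in performance, offering frontier capability without export-control risk. In a comparison generating procedural terrain (Three.js), Fugu Ultra stood out with one-shot generation. More examples are coming soon.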

Sam Altman@sama · Jun 23 · 45

We want to help all companies be secure, working with the USG and the security ecosystem.
* The full version of GPT-5.5-Cyber is here; state of the art performance on CyberGym.
* Patch The Planet and Codex Security will help solve security problems instead of just finding them.

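Berryxia.AI@berryxia · Jun 23 · 66

This speed is absolutely ridiculous! Wow! The newly open-sourced Unlimited-OCR can process hundreds of pages of documents in one go, and its speed stays stable. The model was just released by Baidu on Hugging Face, and its core innovation is R-SWA (Reference Sliding Window Attention). It keeps the KV cache constant during decoding, so the cache does not blow up as the page count grows. The result: you can drop in a single image or a multi-page PDF and have it parsed in one pass, with much better speed and stability than traditional page-by-page processing. It scored 93 on OmniDocBench, 6 percentage points higher than DeepSeek-OCR. This is no longer just an accuracy gain; it turns the long-document OCR workflow from “chunking plus stitching by an external scheduler” into a true end-to-end single pass. The biggest headaches with multi-page documents used to be broken context and inconsistent formatting. Now the model sees the whole document's structure, layout, and logical relationships at once, so output quality naturally moves up a level. This pushes OCR another big step from “character recognition tool” toward “long-document understanding engine”. The technical approach is clear and practical. Baidu's OCR really is in a class of its own now, far ahead. Model link in the comments 👇

Summary: Baidu PaddlePaddle released Unlimited-OCR on Hugging Face. Its core innovation, R-SWA (Reference Sliding Window Attention), keeps the KV cache constant during decoding so it does not blow up with page count. The model can process hundreds of pages in one pass, with better speed and stability than page-by-page processing. It scores 93% on OmniDocBench, 6 percentage points above DeepSeek-OCR. This turns long-document OCR from “chunk and stitch” into an end-to-end single pass that directly understands the structure and layout of the whole document.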
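
The post does not describe how R-SWA works internally, so the following is only a toy sketch of the general property it claims: a cache whose size stays constant no matter how many tokens are decoded. The split into a fixed set of "reference" entries plus a sliding window of recent entries, the class name `BoundedKVCache`, and the sizes 16 and 128 are all assumptions made for illustration, not Unlimited-OCR's actual design.

```python
from collections import deque


class BoundedKVCache:
    """Toy cache: fixed reference entries plus a sliding window of recent entries."""

    def __init__(self, reference_entries, window_size):
        # Assumption: reference entries are kept for the whole decode.
        self.reference = list(reference_entries)
        # deque(maxlen=...) evicts the oldest entry automatically once full.
        self.window = deque(maxlen=window_size)

    def append(self, entry):
        """Add the entry for a newly decoded token."""
        self.window.append(entry)

    def __len__(self):
        return len(self.reference) + len(self.window)


cache = BoundedKVCache(reference_entries=range(16), window_size=128)
sizes = []
for token_id in range(10_000):
    cache.append(token_id)
    sizes.append(len(cache))

# The cache grows until the window fills, then stays flat.
print(max(sizes), len(cache))
```

An unbounded cache would hold 10,016 entries after the same loop; the bounded one never exceeds the reference count plus the window size.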

Nathan Lambert@natolambert · Jun 22 · 56

GLM-5.2 should be “DeepSeek moment” for agents. We enter a new world where the top end of agentic capabilities are available in open models. If you care about open, now is the time to inform regulators on how we should build a world with safe, frontier, open intelligence.

Chubby♨️@kimmonismus · Jun 22 · 55

It looks like we’re getting a whole range of new GPT models this Thursday: GPT-5.6, 5.6 Pro, and a new bidirectional voice model. Initial tests of the voice model were outstanding, this is exactly what I had hoped for two years ago!

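Summary: According to X user Kim, several new GPT models will be released this Thursday, including GPT-5.6, 5.6 Pro, and the bidirectional voice model GPT-Bidi-1. Early tests show the voice model performs excellently. The quoted tweet notes that 5.6 Pro can complete any task given the right prompt, and that GPT-Bidi-1 has a knowledge cutoff of August 2025 and has been anticipated since the GPT-4o era. The other GPT-5.6 models were previously tested as the kindle alpha version and are expected to ship as a new checkpoint.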

Chubby♨️@kimmonismus · Jun 22 · 38

It seems the first tests with Sonnet 5 are already underway. If this is confirmed, we're in for a great release!

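Summary: Sonnet 5 makes its first appearance. The model is extremely fast and used no reference image. Next week looks busy. Kim comments that, if the tests are confirmed, this will be a great release.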

Alibaba Cloud@alibaba_cloud · Jun 22 · 48

🚀 Introducing HappyHorse 1.1 — now officially live on Alibaba Cloud Model Studio! All HappyHorse 1.1 capabilities are available via API, providing enterprise customers and developers with a complete integration solution. This release delivers production-ready video synthesis systematically optimized across core content generation scenarios. 🔥 Launch Promotion: Enjoy a 40% OFF sitewide discount for the first 2 weeks! Optimize your integration costs today.

🚨 AI News | TestingCatalog@testingcatalog · Jun 22 · 64

BREAKING 🔥: Sakana AI announced the Sakana Fugu and Sakana Fugu Ultra systems, which perform on par with Claude Fable 5 and Mythos 5 across many benchmarks.
> Sakana AI is an AI lab from Japan, and Fugu is an orchestration model trained to operate other LLMs.
> It is available as an API but not yet accessible in the EEA region.
That's a natural evolution. Multi-model orchestration systems will outperform single-model systems, and they will become much more accessible for smaller labs and companies to build. Big players will have to consider building orchestration systems that rely on models built by competitors. It is already happening at Meta, Apple, and Microsoft, and will likely catch Google, Anthropic, and OpenAI as well eventually.

Summary: Sakana AI announced the Fugu and Fugu Ultra systems. Fugu is a multi-agent orchestration model trained to operate other LLMs and is accessed through a single model API. Fugu Ultra matches Claude Fable 5 and Mythos 5 on many benchmarks and is claimed to offer frontier capability while avoiding export-control risk. The system is currently served via API but is not yet available in the EEA. The tweet argues that orchestrated multi-model systems will outperform single-model systems and become easier for smaller labs and companies to build, and that this is already pushing giants such as Meta, Apple, and Microsoft to consider building orchestration systems on competitors' models.

Rohan Paul@rohanpaul_ai · Jun 21 · 50

The video where @mntruell (Michael Truell, co-founder and CEO of Cursor) announced Cursor’s new Composer model at Compile: Cursor now has 10 to 20X more compute than they previously had, allowing them to train this GPT-size model from scratch.

Chubby♨️@kimmonismus · Jun 21 · 67

Even the Vercel CEO is impressed/shocked at how good GLM-5.2 is at coding. Open source, open weights.

Chubby♨️@kimmonismus · Jun 21 · 44

I have a feeling that GPT-5.6 will be a big, positive surprise. Let's recall the information on GPT-5.6: "The company is separately preparing to release a new AI model, codenamed 5.6, which will be a “meaningful improvement” over the current flagship, GPT-5.5, OpenAI’s chief scientist, Jakub Pachocki, wrote in a message to staff."

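小互@xiaohu · Jun 19 · 65

The Doubao real-time voice model 3.0 API is live. The demo looks seriously impressive, and it can already do quite a lot. Full duplex: it can listen and speak at the same time, so you can interrupt at any point, like chatting with a real person. End-to-end: speech in, speech out, with no transcription step, which is faster and more natural. Precise instruction following plus timely participation: you can set a rule in one sentence, for example telling it in a group chat “stay quiet for now and join in when we get to the World Cup”, and it waits silently and then joins on its own once the topic actually comes up. The most important upgrade: it supports custom tools and can call them directly within a live conversation to complete tasks such as booking calendar events, sending email, summarizing documents, and running queries, all handled by a single spoken sentence inside the conversation flow. This amounts to a step from “voice assistant” toward “voice agent”.

Summary: The Doubao real-time voice model 3.0 API is officially live. It supports full duplex (listening and speaking at the same time, with interruption at any point) and end-to-end operation (speech in, speech out, no transcription), making interaction faster and more natural. It follows instructions precisely, for example waiting silently after being told “stay quiet for now and join in when we get to the World Cup”. The key upgrade is support for custom tools, which it can call directly within a live conversation to complete tasks (booking calendar events, sending email, summarizing documents, running queries, and so on), moving from “voice assistant” toward “voice agent”.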

Z.ai@Zai_org · Jun 19 · 54

Long-horizon is more than a concept. It should live in real-world scenarios, empowering AI builders to solve the problems that matter. And more scenarios are on the way.

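Summary: Zhipu's GLM-5.2 completed 48/70 trials across 35 challenging internal mobile development tasks (70 trials in total), more than double GLM-5.1's 21/70; Claude Fable 5 scored 56/70 over the same period. The main tweet says long-horizon capability should land in real-world scenarios, with more scenarios on the way.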
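
As a quick sanity check on the "more than double" claim, the trial counts quoted above can be turned into completion rates. This is only arithmetic on the figures in the post; the variable names are illustrative.

```python
# Trial counts quoted in the post (out of 70 trials each).
trials = 70
completed = {"GLM-5.2": 48, "GLM-5.1": 21, "Claude Fable 5": 56}

# Completion rate per model.
rates = {name: done / trials for name, done in completed.items()}

# Relative improvement of GLM-5.2 over GLM-5.1.
improvement = completed["GLM-5.2"] / completed["GLM-5.1"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print(f"GLM-5.2 vs GLM-5.1: {improvement:.2f}x")
```

The ratio comes out a little under 2.3x, which is consistent with "more than double", while GLM-5.2 still trails Claude Fable 5 by 8 trials.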

Chubby♨️@kimmonismus · Jun 19 · 45

Nice, sounds like next thursday is gonna be big: GPT-5.6 release incoming

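歸藏(guizang.ai)@op7418 · Jun 19 · 31

GPT-5.6 is coming soon

Summary: OpenAI is preparing the release of the GPT-5.6 model family, and GPT-5.6-Pro has been spotted in testing. It should be available soon.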

xAI@xai · Jun 19 · 66

Grok TTS delivers the most human-like speech

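Summary: xAI's Grok TTS model tops the blind-test Humanness Index from @Vapi_AI with a score of 96 (a real human scores 100). The index takes the same voice and quote, has each model clone it, and has listeners rate the results blind.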

🚨 AI News | TestingCatalog@testingcatalog · Jun 19 · 40

OPENAI 🔥: GPT-5.6 and GPT-5.6-Pro models may potentially arrive as soon as next week. Really soon 👀

🚨 AI News | TestingCatalog@testingcatalog · Jun 19 · 45

OPENAI 🔥: GPT-5.6 model family is being prepared for the upcoming release, as GPT-5.6-Pro has been spotted in testing. Soon 👀

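AYi@AYi_AInotes · Jun 19 · 74

Chop 84% off a 1.5TB model, fit it onto a local machine, and still keep 82% of its ability: that is GLM-5.2, the strongest open-source model, now shrunk to 238GB, so a 256GB Mac or a machine with comparable RAM/VRAM can run it. Tech blog: http://z.ai/blog/glm-5.2 Weights: http://huggingface.co/zai-org/GLM-5.2 API: https://docs.z.ai/guides/llm/glm-5.2 Coding plan: http://z.ai/subscribe

Summary: GLM-5.2 is released with open weights under the MIT license. The original 1.5TB model is compressed by 84% to 238GB and can run locally on a 256GB Mac or comparable hardware while retaining 82% of its performance. It has a 1M context window and significantly improved coding and agentic performance. Two reasoning effort levels are offered: GLM-5.2 (max) for maximum reasoning and GLM-5.2 (high) for a balance of performance and token efficiency. API pricing is the same as GLM-5.1.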
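
The size figures above are easy to cross-check. The sketch below assumes decimal units (1 TB = 1000 GB) and treats the 256GB machine's memory as fully available to the model, which is optimistic; both are assumptions for illustration, not details from the post.

```python
GB_PER_TB = 1000  # assumption: decimal units

original_gb = 1.5 * GB_PER_TB   # original model size quoted in the post
compressed_gb = 238             # compressed size quoted in the post
machine_gb = 256                # memory of the target machine

# Fraction of the original size removed by compression.
reduction = 1 - compressed_gb / original_gb

# Memory left over for KV cache, activations, and the OS.
headroom_gb = machine_gb - compressed_gb

print(f"size reduction: {reduction:.1%}")
print(f"headroom on a {machine_gb}GB machine: {headroom_gb}GB")
```

The reduction works out to roughly 84%, matching the post, but the remaining headroom is small, so long contexts may not fit alongside the weights on such a machine.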

SenseTime@SenseTime_AI · Jun 18 · 43

Speed matters, so we built an 8-step distilled LoRA of SenseNova-U1-8B-MoT-Infographic for you.
⚡️ 12.5x inference speedup
🎨 Infographic quality mostly on par with the base model
Get started with SenseNova-U1-8B-MoT-Infographic-LoRA-8step-V1.0:
💻 Github: https://github.com/OpenSenseNova/SenseNova-U1/blob/main/docs/base_vs_distill.md#run-base-and-distilled-model
🤗 https://huggingface.co/sensenova/SenseNova-U1-8B-MoT-LoRAs/blob/main/SenseNova-U1-8B-MoT-Infographic-LoRA-8step-V1.0.safetensors
👾 Discord: http://discord.gg/BuTXPHmQub

Summary: SenseTime released an 8-step distilled LoRA for the SenseNova-U1-8B-MoT-Infographic model (SenseNova-U1-8B-MoT-Infographic-LoRA-8step-V1.0), delivering a 12.5x inference speedup with infographic generation quality roughly on par with the base model. The weights are open-sourced on Hugging Face, and usage documentation is on GitHub.

Chubby♨️@kimmonismus · Jun 18 · 47

Anthropic's founder and co-founder are working hard to get Fable 5 back for everyone. Looking good, security issues are being addressed. Via Bloomberg

Alibaba Cloud@alibaba_cloud · Jun 18 · 45

See Qwen‑Robot Suite in action! 🤖 Bridging language and physical action, Qwen‑RobotNav, Qwen‑RobotManip, and Qwen‑RobotWorld redefine robotics with seamless instruction generalization and adherence to physical laws.

🚨 AI News | TestingCatalog@testingcatalog · Jun 18 · 64

Catnip has introduced MaineCoon, a new real-time interactive audio-visual model that puts a live AI character on screen. > This is a 22B streaming model built for real-time processing, that keeps the character alive rather than pausing to render. > The first frame lands in under a second, and the generation runs up to 7x faster than existing audio-visual models, holding around 47.5 FPS on a single H100.

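Summary: Catnip released MaineCoon, a 22B-parameter streaming real-time interactive audio-visual model that puts a live AI character on screen. The first frame is generated in under 1 second, and inference runs at 47.5 FPS on a single H100, up to 7x faster than existing audio-visual models. The model supports interactions of unlimited length and emphasizes continuous AI presence rather than turn-taking replies, aiming to upgrade passive video into real-time AI presence.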

SemiAnalysis@SemiAnalysis_ · Jun 18 · 60

Great work to @vllm_project team and @NVIDIA on smooth, out-of-the-box day 0 @MiniMax_AI M3 experience with @inferact EAGLE3 spec decode. Here are the details of ongoing M3 workstream: NVIDIA, Inferact and SemiAnalysis are working hard on enabling disaggregated inferencing (PR 45879), and the Inferact team is working on enabling FlashInfer M3 MoE kernels (PR 45723). Performance should be much better once those PRs land. Huge shoutout to @rogerw0108 & @mgoin_ and the maintainers for the rapid review and mentorship here!

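Summary: The vLLM team and NVIDIA worked together to deliver an out-of-the-box day 0 experience for the MiniMax M3 model, with Inferact's EAGLE3 speculative decoding integrated. Ongoing work: NVIDIA, Inferact, and SemiAnalysis are enabling disaggregated inference (PR 45879), and the Inferact team is enabling FlashInfer M3 MoE kernels (PR 45723); performance should improve significantly once these land. NVIDIA says M3 joins frontier open agentic models such as DeepSeek V4 and Kimi-K2.6. NVIDIA Blackwell Ultra delivers up to 5x the AI factory throughput of Hopper on M3 and exceeds 300 TPS/user. Further gains are expected from optimized kernels, NVFP4, and disaggregated inference with NVIDIA Dynamo.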

Chubby♨️@kimmonismus · Jun 18 · 40

Holy Sh*t: Seedance 2.5 coming early July. And still no text-to-video model has even come close to Seedance.

Artificial Analysis@ArtificialAnlys · Jun 17 · 65

Soniox has released Soniox v5 Real-Time: a low latency streaming Speech to Text model on the Pareto frontier for accuracy and latency, at the lowest price of any proprietary model tested.
Soniox v5 Real-Time is @soniox_ai's latest streaming Speech to Text (STT) model, joining Soniox v5 Async, their non-streaming model released last week. On AA-WER Streaming it occupies the middle of the Pareto frontier: faster than the most accurate models (Cartesia Ink-2, ElevenLabs Scribe v2 Realtime) and more accurate than the fastest (Deepgram Flux, Nova-3), while at a lower price than all of them.
AA-WER Streaming Overview
AA-WER Streaming reports WER and latency as a pair, measured from Silero VAD-detected end of speech on the same ~8 hours of audio as our non-streaming STT benchmark, AA-WER v2.0. We report both at two points: First Final (first final-denoted transcript, best for accuracy) and First Partial (first transcript-bearing event, best for when speed matters most).
Key takeaways
➤ First Final Transcription: Soniox v5 Real-Time achieves a 4.5% WER at 0.05s after end of speech, more accurate than the faster Deepgram Flux (7.4%, 0.02s) and Deepgram Nova-3 Realtime (6.7%, 0.06s), and faster than the more accurate Cartesia Ink-2 external endpoints (3.7%, 0.09s) and ElevenLabs Scribe v2 Realtime (3.6%, 0.14s)
➤ First Partial Transcription: The model achieves a 4.7% WER at 0.05s after end of speech, behind only Cartesia Ink-2 external endpoints (4.3%, 0.07s) and ElevenLabs Scribe v2 Realtime (3.6%, 0.13s) on accuracy, while faster than both
➤ Price: The model costs $2 per 1,000 minutes, the lowest of any proprietary streaming model tested, below Cartesia Ink-2 ($4), Deepgram Nova-3 Realtime ($4.80) and ElevenLabs Scribe v2 Realtime ($6.50)
➤ Language support: The model supports over 60 languages, providing language identification and real-time translation across multilingual conversation.
See more details below ⬇️

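Summary: Soniox released the v5 Real-Time streaming STT model, which sits on the accuracy vs. latency Pareto frontier of the AA-WER Streaming benchmark. First Final transcription reaches 4.5% WER (0.05s latency), more accurate than Deepgram Flux (7.4%, 0.02s) and Nova-3 Realtime (6.7%, 0.06s), and faster than Cartesia Ink-2 (3.7%, 0.09s) and ElevenLabs Scribe v2 Realtime (3.6%, 0.14s). First Partial transcription reaches 4.7% WER (0.05s latency), second in accuracy only to those two models while being faster. At $2 per 1,000 minutes, it is the cheapest proprietary streaming model tested. It supports 60+ languages and real-time translation.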
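
To make "on the Pareto frontier" concrete, the sketch below recomputes the frontier from the First Final (WER, latency) pairs quoted in the post. A model is on the frontier if no other model is at least as good on both metrics and strictly better on one. The `Result` type and function names are illustrative, and only the five models named in the post are included, so this is not the full benchmark.

```python
from typing import NamedTuple


class Result(NamedTuple):
    name: str
    wer: float        # word error rate in percent, lower is better
    latency_s: float  # seconds after end of speech, lower is better


# First Final figures as quoted in the post.
FIRST_FINAL = [
    Result("Soniox v5 Real-Time", 4.5, 0.05),
    Result("Deepgram Flux", 7.4, 0.02),
    Result("Deepgram Nova-3 Realtime", 6.7, 0.06),
    Result("Cartesia Ink-2", 3.7, 0.09),
    Result("ElevenLabs Scribe v2 Realtime", 3.6, 0.14),
]


def dominates(a: Result, b: Result) -> bool:
    """True if a is no worse than b on both metrics and better on at least one."""
    no_worse = a.wer <= b.wer and a.latency_s <= b.latency_s
    better = a.wer < b.wer or a.latency_s < b.latency_s
    return no_worse and better


def pareto_frontier(results: list[Result]) -> list[Result]:
    """Keep every result that no other result dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


frontier = [r.name for r in pareto_frontier(FIRST_FINAL)]
print(frontier)
```

On these five data points, Deepgram Nova-3 Realtime is the only model off the frontier, because Soniox v5 Real-Time is both more accurate and faster.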

SiliconFlow@SiliconFlowAI · Jun 17 · 72

Just dropped the entire War and Peace (~750K tokens) into GLM-5.2. Then asked it to analyze the book and build an interactive 3D character universe. The result: · 27 characters, 9 factions · ~50 relationships mapped across 66,000 lines No drift, no confusion, still had room to think GLM-5.2 is now live on SiliconFlow🔥 Time to give it a try and show us what you build👇

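Summary: Zhipu's GLM-5.2 is live on SiliconFlow and fully open source. After being given the complete text of War and Peace (about 750K tokens), the model analyzed it and built an interactive 3D character universe with 27 characters, 9 factions, and about 50 relationships mapped across 66,000 lines, with no drift and no confusion. GLM-5.2 ranks #1 among available models on CodeArena, supports a 1M context window, and offers production-grade coding on par with Opus 4.8. It provides dual thinking modes (max for depth, high for quality-cost balance). Pricing: $0.26/1.40/4.40 per million tokens for cached input/input/output.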
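
The pricing above can be turned into a rough cost estimate for a prompt the size of the War and Peace example. This is a minimal sketch: the function name `estimate_cost` is hypothetical, and it assumes cached input tokens are billed only at the cache rate and counted separately from uncached input tokens, which the post does not spell out.

```python
# USD per 1M tokens, as quoted in the post.
PRICE_PER_M = {"cached_input": 0.26, "input": 1.40, "output": 4.40}


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Return an estimated USD cost for one request.

    Assumption: cached tokens are billed at the cache rate only and are
    not included in input_tokens.
    """
    total = (
        cached_input_tokens * PRICE_PER_M["cached_input"]
        + input_tokens * PRICE_PER_M["input"]
        + output_tokens * PRICE_PER_M["output"]
    )
    return total / 1_000_000


# Prompt-only cost of a ~750K-token book, first uncached and then fully cached.
uncached = estimate_cost(input_tokens=750_000, output_tokens=0)
cached = estimate_cost(input_tokens=0, output_tokens=0, cached_input_tokens=750_000)

print(f"uncached prompt: ${uncached:.2f}")
print(f"cached prompt:   ${cached:.3f}")
```

Under these assumptions the uncached prompt costs a little over one dollar before any output tokens, and a fully cached repeat of the same prompt costs about a fifth of that.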

歸藏(guizang.ai)@op7418 · Jun 17 · 39

Jimeng has added Seedance 2.0 Mini. It is a lot cheaper, so it is worth playing with now.

🚨 AI News | TestingCatalog@testingcatalog · Jun 17 · 59

XAI 🔥: Grok Imagine 1.5 Fast has been rolled out! It features a better quality and faster generation time. > 720p videos now render in about 25 seconds, down from 40+ in our previous model.

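karminski-牙医@karminski3 · Jun 17 · 73

GLM-5.2 has just been officially released! Here are my hands-on results! Conclusion first: in this round of testing, the biggest improvement is in agent capability, and it is a qualitative change! In the test, GLM-5.2 went straight to its destination without ever searching for nearby locations. It turns out it had memorized the map at the very start! None of the 20-plus models I tested before could do this. For example, when earlier models wanted to go to a battery swap station, they all had to search for nearby stations first (which wastes a tool_call), whereas GLM-5.2 simply knew where the swap stations were! It never used the search function. This ability to internalize the needed data into context up front and reason over it across the entire 1M context is truly astonishing. Beyond that, its Agentic Coding ability on backend code also improved in this test, reaching second place on the overall leaderboard. The biggest weakness this test exposed is spatial understanding. Its strength is also its undoing: although it memorized all the swap station locations, the station it went to was not the nearest one. So it remembered the locations but still failed to reason about them against its current position before using them. This is its biggest weakness, and I strongly recommend the team optimize it. #GLM52 #智谱 #智谱AI #AgenticCoding #长上下文能力

Summary: GLM-5.2 is officially released, and hands-on testing shows a qualitative change in its agent capability. The model internalizes map data into its 1M context and knows the battery swap station locations directly, never calling the search function, the only one of more than 20 tested models to do so. Its backend Agentic Coding ability rises to second place on the overall leaderboard. Its weakness is spatial understanding: it remembers the swap station locations but cannot reason out the nearest one from its current position.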

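🚨 AI News | TestingCatalog@testingcatalog · Jun 17 · 80

ZAI 🔥: GLM-5.2 by @Zai_org scored 51 points on the Artificial Analysis Intelligence Index and placed 4th! This makes GLM-5.2 the new SOTA open-weight model. Besides that, GLM-5.2 ranked second on Frontend Code Arena, after the currently unavailable Claude Fable 5. Should be ZOTA! 👀

Summary: Z ai launched GLM-5.2, which scores 51 on the Artificial Analysis Intelligence Index, ranking fourth and becoming the open-weight SOTA. The model is the same size as GLM-5.1 (744B total / 40B active parameters) and gains 11 points on Intelligence Index v4.1. Scientific reasoning improves markedly: CritPt +16% to 21%, HLE +12% to 40%, GPQA Diamond +3% to 89%. The context window rises to 1M tokens. API pricing is $1.4/$4.4/$0.26 per 1M input/output/cache-hit tokens, with a cost of about $0.46 per task, placing it on the intelligence vs. cost Pareto frontier. It is MIT licensed and already live on third-party platforms such as DeepInfra.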

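数字生命卡兹克@Khazix0918 · Jun 17 · 56

Zhipu is the GOAT! The official scores are finally out too, and it really can go toe-to-toe with Opus 4.8.

Summary: Zhipu released GLM-5.2, an open-source model (MIT license) with significant gains on coding and agentic tasks and a 1M context window. It offers two reasoning effort levels: GLM-5.2 (max) for maximum reasoning and GLM-5.2 (high) for a balance of performance and token efficiency. API pricing is unchanged from GLM-5.1. Official evaluations show its performance is now competitive with Opus 4.8.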

DogeDesigner@cb_doge · Jun 17 · 49

Grok Imagine Video 1.5 Fast nearly doubles video generation speed. It can create a 6-second, 720p video in around 25 seconds, down from over 40 seconds with the previous model. That’s a massive speed upgrade. Here's the comparison:

Orange AI@oran_ge · Jun 17 · 71

GLM 5.2 from Zhipu was officially open-sourced today. Its significance is that GLM 5.2 is the first open-source model whose coding ability reaches Opus level. We integrated it into Cola right away as a beta model for everyone to test. Model pricing is the same as the official pricing. You are welcome to try it and share feedback.

DogeDesigner@cb_doge · Jun 17 · 45

All these videos were created using Grok Imagine 1.5. Big upgrade. Huge jump in quality. 🚀

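歸藏(guizang.ai)@op7418 · Jun 17 · 72

You can add Zhipu GLM-5.2 yourself in Codepilot's model management.

Summary: Zhipu's GLM-5.2 is officially released and open-sourced, positioned for long-horizon tasks. The model has a stable 1M-token context window and introduces thinking effort control. Architecturally, it uses an IndexShare mechanism in which every four sparse-attention layers share a single indexer, cutting per-token compute by about 2.9x at million-token context. Users can now add GLM-5.2 in Codepilot's model management.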

SiliconFlow@SiliconFlowAI · Jun 17 · 42

Code like a real G😎
Congrats to @Zai_org: GLM 5.2 ranks #1 among available models on CodeArena 💪
SiliconFlow is proud to be a T+0 launch partner🔥
💰 Input Cache/Input/Output: $0.26/1.40/4.40 per 1M tokens
📚 Usable 1M context for entire codebases and project-scale workflows
⚙️ Reliable long-horizon execution that stays on track through complex tasks
💪 Production-grade coding on par with Opus 4.8
🧠 Dual thinking modes: max for depth, high for quality-cost balance
And it's still fully open-source. Big shoutout to @Zai_org for keeping frontier models accessible to builders and the community 🙌
Get started today 👇

Summary: Zhipu's GLM 5.2 ranks first among available models on the CodeArena coding evaluation. SiliconFlow launched it on day zero, priced at $0.26/1.40/4.40 per million tokens for input cache/input/output, with a 1M context, reliable long-horizon task execution, and coding performance on par with Opus 4.8. It offers dual thinking modes: max for depth, high for quality-cost balance. The model is fully open source.