Claude Mythos is coming tomorrow!! Prepare yourself friends. It’s happening!!

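Summary: Reportedly, Anthropic plans to release the public version of Mythos tomorrow. This version will ship with substantial guardrails and will be less permissive than the version available to Project Glasswing partners, but it is expected to perform much better on long-horizon, multi-turn tasks. Get ready, friends, it's coming!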

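MiMo launches a 1,000 token/s ultra-fast model | Hands-on review. MiMo has released MiMo V2.5 Pro UltraSpeed, an ultra-fast model version that can output more than 1,000 tokens per second. It is also probably the first trillion-parameter (1T) model in the world to reach this speed. I (藏师傅) got early access and ran three tests, and it really is a pleasure to use. The first test was a fairly complex 3D mining mini-game. With no assets provided, I had it generate all of them in Three.js front-end code. The requirements were fairly complete; the first attempt had a few small problems, but after I gave it revision suggestions it completed the task very well. Metrics for this test: thinking TPS 804 tokens/s, peak speed 810 tokens/s, time to first response 4.71 seconds. The second test was an official website whose header contains a relatively complex 3D animation. Output was much faster this time: the peak reached 1,426 tokens/s, the first response took only 0.83 seconds, and it output 25,624 tokens in 32 seconds, generating 1,000 lines of code in total. The third test was a more complex official website. I asked for a header with these 3D effects: the edge of the Earth, a spaceship in orbit, interstellar dust, a route map, and a porthole-style HUD. The result was very good: the overall visual style, the states, the SVG animations, and the cockpit cards were all finely detailed, and there was a scrolling parallax effect as well. Output TPS for this run reached 1,136 tokens/s, with a first response of 4.5 seconds. The official test platform shows these figures in a data panel at the bottom. With streaming output, watching it produce a very complex 3D game in just 20 seconds is quite striking. Earlier ultra-fast inference options (Groq and the like) came with some drop in model capability or overall quality, but I saw no sign of that with MiMo during testing. Many companies have recently started offering this kind of ultra-fast API service; OpenAI and Anthropic, for example, both have a Fast mode. In agent scenarios, faster model output directly speeds up every agent step. If a task is expected to finish in one minute, you watch it until it is done and start testing right away. If it takes five minutes, you will probably go do something else and come back later, which inevitably wastes some time. The gain is even more noticeable with sub-agents and concurrency, because a large volume of results arrives sooner. Imagine launching one or two hundred sub-agents at once: with no loss in model capability and a 10x speedup, the experience is great. After all, this is essentially aimed at To B customers with very high efficiency requirements. I hope everyone starts competing on this and brings costs down so that ordinary users can use UltraSpeed models freely too.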

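Summary: MiMo has released the V2.5 Pro UltraSpeed ultra-fast model version, which outputs more than 1,000 tokens per second and is billed as the first trillion-parameter model in the world to reach that speed. Hands-on results: a complex 3D mini-game ran at 804 tokens/s (peak 810) with a first response of 4.71 seconds; a website 3D animation peaked at 1,426 tokens/s with a first response of 0.83 seconds and 25,624 tokens output in 32 seconds (1,000 lines of code); another complex website with 3D effects ran at 1,136 tokens/s with a first response of 4.5 seconds. Unlike earlier ultra-fast inference options, which commonly showed a drop in capability, MiMo showed no such sign. The model mainly targets To B customers with very high efficiency requirements, and the gain is most visible in agent and concurrent sub-agent scenarios.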
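The second test's figures can be turned into an average throughput with a little arithmetic. The sketch below is only illustrative: the function name `average_tps` is made up here, and the post does not say whether the 32 seconds include the 0.83-second first-response wait, so both readings are computed.

```python
def average_tps(total_tokens: int, wall_seconds: float,
                first_response_seconds: float = 0.0) -> float:
    """Average output tokens per second over the decoding window.

    first_response_seconds is subtracted from the wall time; pass 0.0 to
    treat the whole wall time as decoding time (an assumption either way).
    """
    decode_seconds = wall_seconds - first_response_seconds
    if decode_seconds <= 0:
        raise ValueError("decoding window must be positive")
    return total_tokens / decode_seconds


# Figures reported for the second test: 25,624 tokens in 32 s, 0.83 s first response.
end_to_end = average_tps(25624, 32)        # whole 32 s counted as output time
decode_only = average_tps(25624, 32, 0.83) # first-response wait excluded

print(f"end-to-end average: {end_to_end:.1f} tokens/s")
print(f"decode-only average: {decode_only:.1f} tokens/s")
```

Either way the average lands well below the reported 1,426 tokens/s peak, which is expected: a peak is a short-window maximum, while the average covers the whole run.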

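歸藏(guizang.ai)@op7418 · June 9 · 54

Could it be? I feel like they are capable of something like requiring mandatory KYC before letting you use it.

Summary: Anthropic is reportedly releasing a new AI model, "Mythos", tomorrow. The main post speculates that this may come with mandatory KYC.

🚨 AI News | TestingCatalog@testingcatalog · June 9 · 66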

ANTHROPIC 🔥: Claude Mythos is planned to be released as Claude Fable 5 according to checkpoints detected by Dev Mode, Hacker News reports and Sources. Anthropic is also hosting its 3rd developer event in Japan tomorrow. Soon? 👀

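Summary: ANTHROPIC 🔥: Claude Mythos is planned for release as Claude Fable 5, according to checkpoints detected via Dev Mode, Hacker News reports, and sources. Anthropic is also hosting its third developer event in Japan tomorrow. Soon? 👀

Artificial Analysis@ArtificialAnlys · June 9 · 68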

Grok debuts grok-imagine-video-1.5-preview, achieving #2 in Image to Video (With Audio) in the Artificial Analysis Video Arena, behind only ByteDance's Seedance 2.0! grok-imagine-video-1.5-preview is @xAI's latest video generation model, currently supporting only Image to Video with native audio, and durations up to 15s. It ranks #2 in the Image to Video (With Audio) Leaderboard, trailing only ByteDance's Seedance 2.0. In the Without Audio Leaderboard it places #3, behind Seedance 2.0 and xAI's own grok-imagine-video, which it performs very closely to. grok-imagine-video-1.5-preview costs $8.40 per minute of generated video, and is available now via xAI's API, with a broader rollout across the Grok app and X in progress. Congratulations to @xAI and @elonmusk on the release! See below for comparisons between grok-imagine-video-1.5-preview and other leading models in the Artificial Analysis Video Arena 🧵

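Summary: xAI has released the video generation model grok-imagine-video-1.5-preview, which currently ranks #2 on the Image to Video (With Audio) leaderboard in the Artificial Analysis Video Arena, behind only ByteDance's Seedance 2.0. The model supports image-to-video with native audio and durations up to 15 seconds. On the Without Audio leaderboard it ranks #3, just behind Seedance 2.0 and xAI's own grok-imagine-video. It costs $8.40 per minute of generated video, is available now through the xAI API, and is rolling out to the Grok app and X.

OpenBMB@OpenBMB · June 8 · 75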

🚀 VoxCPM2 Technical Report is now available on arXiv! VoxCPM2 is the latest speech generation model in the VoxCPM family. Built with 2B parameters and trained on over 2 million hours of multilingual speech data, it supports 30 languages and 9 Chinese dialects, along with natural-language voice design, controllable voice cloning, and high-fidelity continuation-based voice cloning. In this technical report, we provide a comprehensive overview of: 🔹 The VoxCPM2 architecture 🔹 A unified sequence formulation for speech generation and control 🔹 The design of AudioVAE for high-fidelity speech reconstruction 🔹 Large-scale multilingual training and evaluation 🔹 Benchmark results across zero-shot and instruction-following TTS tasks With 16kHz semantic encoding and 48kHz waveform reconstruction, VoxCPM2 delivers high-quality speech generation and achieves SOTA or highly competitive performance on public TTS benchmarks. To support open research and development, we have open-sourced the model weights, fine-tuning code, and inference toolkit under the Apache 2.0 license. 📄 Paper: https://arxiv.org/abs/2606.06928 💻 GitHub: https://github.com/OpenBMB/VoxCPM We hope VoxCPM2 helps advance the open-source multilingual speech ecosystem. Feedback, experiments, and contributions are always welcome! 🔥 #AI #OpenSource #TTS #SpeechAI #VoiceAI #GenerativeAI #MachineLearning

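Summary: OpenBMB (ModelBest) has published the VoxCPM2 technical report. VoxCPM2 is its latest speech generation model, with 2B parameters, trained on more than 2 million hours of multilingual speech data, and supporting 30 languages and 9 Chinese dialects. It offers natural-language voice design, controllable voice cloning, and high-fidelity continuation-based voice cloning. The report covers the architecture, the unified sequence formulation, AudioVAE for high-fidelity speech reconstruction, large-scale training and evaluation, and benchmark results on zero-shot and instruction-following TTS. With 16kHz semantic encoding and 48kHz waveform reconstruction, it reaches SOTA or highly competitive results on public TTS benchmarks. The model weights, fine-tuning code, and inference toolkit are open-sourced under Apache 2.0.

Xiaomi MiMo@XiaomiMiMo · June 8 · 82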

🚀 1,000+ TOKENS/S ON A 1T MODEL! 🚀 We are thrilled to release Xiaomi MiMo-V2.5-Pro-UltraSpeed in collaboration with @TileRT_AI , breaking the 1,000 tokens/s output speed on a 1 Trillion parameter model for the FIRST TIME! Not wafer-scale integration like Cerebras. Not pure on-chip SRAM chips like Groq. We achieve 1,000 tps on a 1T MoE model using just a SINGLE, STANDARD 8-GPGPU NODE. Read the full technical deep dive：https://mimo.xiaomi.com/blog/mimo-tilert-1000tps Want to experience the future of real-time AI? 👉 Apply for UltraSpeed now: https://platform.xiaomimimo.com/ultraspeed ⏳ Limited-Time Access: Application-based · Jun 8 – Jun 23 (PDT) 💬 Chat Experience: Completely FREE for a limited time — try the blazing-fast web chat now. ⚡ UltraSpeed API: Just 3x the price for a ~10x boost in output experience. 🤝 Enterprise & Large-Scale Needs: business-mimo@xiaomi.com

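Summary: Xiaomi MiMo, together with TileRT_AI, has released MiMo-V2.5-Pro-UltraSpeed, exceeding 1,000 tokens/s of output on a 1-trillion-parameter MoE model for the first time, using only a single standard 8-GPGPU node (not a Cerebras- or Groq-style approach). Web chat is free for a limited time, and the UltraSpeed API costs 3x the price for a roughly 10x better output experience. Applications are open June 8 to June 23 (PDT), and enterprises can email business-mimo@xiaomi.com.

🚨 AI News | TestingCatalog@testingcatalog · June 8 · 56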

Thanks to Ideogram for sending this ❤️ Ideogram 4.0 was one of the biggest releases last week! Especially for the open source community. Tested it 👀

Summary: Thanks to Ideogram for sending this ❤️ Ideogram 4.0 was one of the biggest releases last week, especially for the open-source community. Tested it 👀

Alibaba Cloud@alibaba_cloud · June 8 · 77

🔥 Launch Special for Qwen3.7-Plus: Get 20% OFF now! ✅ Multimodal Interactive Hybrid Agents ✅ Coding & Productivity Assistants ✅ Vision Agents ✅ Cross-Harness Generalization Don't miss the upgrade. 👇 https://int.alibabacloud.com/m/1000414123/ #Qwen #AI #Multimodal #AlibabaCloud #AgenticAI

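Summary: 🔥 Launch special for Qwen3.7-Plus: 20% off now! ✅ Multimodal interactive hybrid agents ✅ Coding and productivity assistants ✅ Vision agents ✅ Cross-harness generalization. Don't miss the upgrade. 👇 https://int.alibabacloud.com/m/1000414123/ #Qwen #AI #Multimodal #AlibabaCloud #AgenticAI

Berryxia.AI@berryxia · June 8 · 54

Whoa, isn't this eating Apple's lunch?! A small 6.6B model shuts up Siri and a bunch of cloud giants, and it runs locally on a Mac using only 7GB of memory. Mac-1, built by CJ Zafir's team, is not only absurdly small in parameter count, it also comes wired up to 487 native Mac tools. It can chain tool calls, reason automatically, and even send emails and book meetings, at 65 tok/s, with a pure Mac-style UI. Everyone used to assume agents needed a big model plus the cloud to be reliable, and this little local model is close to flipping the table on the "bigger is stronger" theory. What really stands out is that the application layer is built as a native Mac experience: comfortable for people to use, while the agent does the work in the background. The era of cloud SaaS agents may be ended by the combination of a small local model and native tools before it has really begun. It feels like this company pulled off what Apple couldn't! I'll test later whether Chinese support is just as smooth.

Summary: CJ Zafir's team has released Mac-1 (6.6B parameters), which runs locally on any Mac with only 7GB of memory (12GB is better). It supports 487 native macOS tools, can chain multiple tool calls, runs with reasoning enabled, and outputs about 65 tok/s. The application layer is built on native Mac UI/UX. The author argues that this combination of a small local model and native tools directly challenges cloud SaaS agents and may even take over Siri's job from Apple.

Chubby♨️@kimmonismus · June 6 · 55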

Holy, release is so close. It will be named „Claude Mythos 5“, a tier above Opus. I got the feeling coming week will be so huge

Summary: Wow, the release is so close. It will be named "Claude Mythos 5", a tier above Opus. I have a feeling next week will be huge.

🚨 AI News | TestingCatalog@testingcatalog · June 6 · 69

BREAKING 🔥: A new Claude Mythos 5 model slug has been spotted via Dev Mode. Claude Mythos is planned to be released as its own model class, besides Haiku, Sonnet and Opus model families. Soon? 👀

Summary: BREAKING 🔥: A new Claude Mythos 5 model slug has been spotted via Dev Mode. Claude Mythos is planned for release as its own model class, alongside the Haiku, Sonnet, and Opus model families. Soon? 👀

Rohan Paul@rohanpaul_ai · June 6 · 68

Google just made Gemma 4 much easier to run on phones and laptops by releasing QAT (Quantization-Aware Training) checkpoints that shrink the smallest model from 11.4GB to 1.1GB, or 0.84GB for text-only use. Normal PTQ (Post-Training Quantization) compresses after training and can damage quality because the model never learned to survive that rounding. QAT fixes this by simulating compression during training, so Gemma 4 learns while its weights are being squeezed, making the final compressed model less likely to lose reasoning quality. Google also built a mobile-focused format with static activations, channel-wise quantization, targeted 2-bit quantization, and KV cache optimization, which means the phone does less scaling work, stores some token-generation parts more aggressively, and keeps long chats from eating memory too fast.

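Summary: Google has released QAT (Quantization-Aware Training) checkpoints for Gemma 4, shrinking the smallest model from 11.4GB to 1.1GB (0.84GB for the text-only version) so that it runs more easily on phones and laptops. Conventional PTQ (Post-Training Quantization) can damage quality because the model never learned to cope with rounding; QAT simulates compression during training so the model learns while its weights are being squeezed, and the compressed version is less likely to lose reasoning ability. Google also built a mobile-optimized format with static activations, channel-wise quantization, targeted 2-bit quantization, and KV cache optimization, which reduces scaling work on the phone and keeps long chats from consuming memory too quickly.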
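The PTQ versus QAT distinction can be made concrete with a toy example. This is a minimal sketch of the general technique (fake quantization plus a straight-through estimator), not Google's Gemma 4 recipe, whose details the post does not give. The per-tensor symmetric scheme and the helper names `fake_quantize` and `qat_step` are assumptions chosen for illustration.

```python
def fake_quantize(weights, bits=4):
    """Snap weights to a symmetric integer grid, then map them back to floats.

    Assumption: one scale per tensor, chosen from the largest absolute weight.
    """
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 for 4-bit, 1 for 2-bit
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / qmax
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def qat_step(weights, x, target, lr=0.05, bits=4):
    """One SGD step on squared error where the forward pass sees quantized weights.

    Straight-through estimator: the gradient computed for the quantized weights
    is applied unchanged to the full-precision ("latent") weights.
    """
    quantized = fake_quantize(weights, bits)
    prediction = sum(q * xi for q, xi in zip(quantized, x))
    error = prediction - target
    return [w - lr * 2 * error * xi for w, xi in zip(weights, x)]


# PTQ: take already-trained weights and round them once, after the fact.
trained = [1.0, 0.4, -1.0]
ptq_weights = fake_quantize(trained, bits=2)

# QAT: keep training while every forward pass goes through the rounding.
latent = [0.0, 0.0]
for _ in range(20):
    latent = qat_step(latent, x=[1.0, 2.0], target=1.0, bits=4)
qat_weights = fake_quantize(latent, bits=4)
```

The point of the design is that `qat_step` measures the error after rounding, so training pressure goes toward weights that still work once they are snapped to the grid. Plain PTQ never gets that feedback.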

Chubby♨️@kimmonismus · June 6 · 42

Next week(s) is going to be absolutely insane. We're seeing so much testing of the Claude Mythos derivative, because it's been given to red team members, that a release is really imminent. According to all the rumors, GPT-5.6 is also coming very soon, and I'm pretty sure OpenAI and Anthropic are trying to outdo each other. And then there's Google with Gemini 3.5 Pro, which was announced at I/O as being released in early June. So, in all likelihood, next week will see a quantum leap. Get ready, friends.

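Summary: According to multiple rumors, Anthropic's Claude Mythos derivative has been handed to red-team testers and a release is imminent; OpenAI's GPT-5.6 is also coming very soon; and Google announced at I/O that Gemini 3.5 Pro would be released in early June. With the three labs releasing in quick succession, next week may bring a quantum leap in AI capability.

Chubby♨️@kimmonismus · June 6 · 71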

Google DeepMind released new Gemma 4 QAT models that make the model family much more efficient for local, on-device use. Using Quantization-Aware Training, the models are trained with compression in mind, which reduces memory needs while preserving more quality than standard post-training quantization. The release includes support for the popular Q4_0 format and a new mobile-specialized quantization format. Gemma 4 E2B can now run with around 1GB of memory (!), and the text-only version can even require less than 1GB (!). That makes local AI on phones, laptops, edge devices, and consumer GPUs far more practical. Really cool to see.

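Summary: Google DeepMind has released Gemma 4 QAT (Quantization-Aware Training) models optimized for local, on-device use. Quantization-aware training reduces memory needs while preserving more quality than standard post-training quantization. The release supports the Q4_0 format and a new mobile-specialized quantization format. Gemma 4 E2B can run with around 1GB of memory, and the text-only version needs even less than 1GB, making local AI on phones, laptops, edge devices, and consumer GPUs more practical.

OpenRouter@OpenRouter · June 6 · 60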

Live on OpenRouter: Riverflow 2.5 from @Sourceful. The first image model with an independent scoring rubric you control to guide its thinking and editing, with controllable reasoning effort to trade speed for quality. Free until Tuesday June 9. Fast and Pro below 🧵

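Summary: Live on OpenRouter: Riverflow 2.5 from @Sourceful. It is the first image model with an independent scoring rubric that you control to guide its thinking and editing, with controllable reasoning effort to trade speed for quality. Free until Tuesday, June 9. Fast and Pro below 🧵.

Google AI Developers@googleaidevs · June 6 · 72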

New @GoogleGemma 4 QAT (Quantization-Aware Training) checkpoints are here, so you can run models locally on consumer GPUs and mobile devices with minimal quality loss. What’s new: 🔹 GGUF (Q4_0): Checkpoints: Max local performance across all sizes and drafter models 🔹 Custom Mobile Schema: We shrunk Gemma 4 down to less than 1GB for mobile devices by using a custom mixed precision schema designed for edge hardware (featuring targeted 2-bit decoding layers, optimized KV caches, and static activations) By simulating compression during training rather than after (Post-Training Quantization), we've drastically reduced the memory footprint and accelerated decode speeds while preserving reasoning quality. https://blog.google/innovation-and-ai/technology/developers-tools/quantization-aware-training-gemma-4/

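Summary: Google has released Gemma 4 Quantization-Aware Training (QAT) checkpoints that run locally on consumer GPUs and mobile devices with minimal quality loss. The new checkpoints come in GGUF (Q4_0) format, covering all sizes and drafter models, for maximum local performance. A custom mobile schema uses mixed precision to shrink Gemma 4 to under 1GB, with targeted 2-bit decoding layers, optimized KV caches, and static activations. Simulating compression during training (rather than quantizing after training) greatly reduces the memory footprint and speeds up decoding while preserving reasoning quality.

Chubby♨️@kimmonismus · June 6 · 65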

Holy cow. Mythos really is next level

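Summary: A preview of output from the recently spotted "Oceanus" checkpoint has surfaced. It is rumored to be a version of Anthropic's upcoming Mythos model, which is planned for public release within "weeks".

🚨 AI News | TestingCatalog@testingcatalog · June 5 · 64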

MYTHOS 🔥: Another early preview of recently spotted "Oceanus" checkpoint output. "Oceanus" is rumored to be a version of the upcoming Mythos model, which is planned for public release within "weeks", according to Anthropic. "Oceanus" prompt 👀

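Summary: MYTHOS 🔥: Another early preview of output from the recently spotted "Oceanus" checkpoint. "Oceanus" is rumored to be a version of the upcoming Mythos model, which according to Anthropic is planned for public release within "weeks". "Oceanus" prompt 👀

Chubby♨️@kimmonismus · June 5 · 71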

Claude mythos will be on a completely different level. These outputs are insane

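Summary: @Lentils80 shared two striking outputs from Claude Mythos, produced zero-shot and with almost no effort: "This is the best output I have seen for this prompt since the Gemini A/B models in October 2025." The main post remarks that Claude Mythos will be on a completely different level and that these outputs are insane.

Elon Musk@elonmusk · June 5 · 67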

Grok model improvement

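Summary: The updated Grok-build model (still the 0.5T one) is much better than before. It is less lazy, more autonomous, and more accurate. We are still improving long-horizon tasks. Stay tuned, and try out the new usage limits in our nice TUI! 🚀

🚨 AI News | TestingCatalog@testingcatalog · June 5 · 72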

NVIDIA 🔥: Nemotron 3 Ultra has been released on Huggingface with 5x faster inference and 30% lower costs in comparison to other open models. > Nemotron-3-Ultra-550B-A55B-NVFP4 is a frontier-scale large language model (LLM) trained by NVIDIA, designed to deliver strong agentic, reasoning, and conversational capabilities.

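Summary: NVIDIA has released Nemotron 3 Ultra (Nemotron-3-Ultra-550B-A55B-NVFP4) on Huggingface, a 550B-parameter frontier-scale open MoE large language model designed for long-running AI agents. Compared with other open frontier models, inference is 5x faster and complex agentic tasks cost 30% less. The model offers strong agentic, reasoning, and conversational capabilities.

Chubby♨️@kimmonismus · June 5 · 66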

That’s so cool! I love the creativity of those guys. An open model for live music generation only 2.4B parameters. If you are bored on long flights you can now start creating bangers

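Summary: That's so cool! I love the creativity of those guys. An open model for live music generation with only 2.4B parameters. If you are bored on a long flight, you can now start creating bangers.

Rohan Paul@rohanpaul_ai · June 5 · 57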

I tried the newly launched Image-to-3D model, Rodin Gen-2.5, and the biggest improvement is control. It offers five different generation modes to fit a wide range of creative needs. You can generate a million-polygon model in as little as 4 seconds, with support for up to 10 million polygons for highly detailed outputs. Best of all, it comes with native 3D PBR materials, so your models look polished and production-ready right from the start. If you're creating assets at scale, Hyper 3D (@DeemosTech) also supports parallel batch generation, making it easy to speed up your workflow. On top of that, it features Break to Parts for instantly separating model components, as well as local editing capabilities, so you can modify specific areas without regenerating the entire model. From generation speed and model quality to flexible post-editing tools, Hyper 3D covers nearly every stage of the 3D creation pipeline that creators care about.

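Summary: Rohan Paul tried the newly launched image-to-3D model Rodin Gen-2.5; the biggest improvement is control. It offers five generation modes, produces a million-polygon model in as little as 4 seconds, and supports outputs of up to 10 million polygons. Native 3D PBR materials make models usable out of the box. Hyper 3D also supports parallel batch generation, Break to Parts for separating components, and local editing without regenerating the whole model, covering nearly every stage of the 3D creation pipeline.

Google AI Developers@googleaidevs · June 5 · 70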

Play our new open-weights music model, @GoogleMagenta RealTime 2, using a MIDI keyboard, live text prompts, and even hand gestures ✌️ https://x.com/GoogleMagenta/status/2062589313372594538

Summary: Google AI for Developers announced the open-weights real-time music model Magenta RealTime 2 (MRT2). The model can be played with a MIDI keyboard, live text prompts, and even hand gestures. MRT2 runs natively on a MacBook with latency under 200ms and comes with open weights, an open-source inference engine, and a companion set of apps and plugins.

Chubby♨️@kimmonismus · June 4 · 81

1/ NVIDIA shipped Nemotron 3 Ultra today, a fully open 550B model with 55B active params, with the weights, training data, and complete recipe all released openly. That alone is rare at this scale. The headline, however, is actually speed. Ultra is a hybrid Mamba-Attention MoE, an architecture built for fast decoding and a light memory footprint over long contexts, and NVIDIA clocks it at roughly 6x (!) the throughput of comparable open models on long-output agent workloads while holding the same accuracy. That's a serious engineering result, and it's aimed exactly where the industry is heading: autonomous agents that run long, multi-turn tasks where throughput per GPU is what actually costs money. It was pre-trained in 4-bit (NVFP4) across 20T tokens, the largest stable run of its kind shown to date. And the post-training introduces MOPD, where ten-plus specialist teacher models distill their skills into the student on its own rollouts, sometimes pushing it past the teachers themselves. The interesting aspect: This is a frontier-class model you can fully reproduce.

Summary: NVIDIA has officially released Nemotron 3 Ultra, a fully open MoE model with 550B total parameters (55B active), with the weights, training data, and complete recipe all public. It uses a hybrid Mamba-Attention architecture built for fast decoding and a light memory footprint over long contexts. On long-output agent workloads, throughput is roughly 6x that of comparable open models (elsewhere reported as 5x faster inference), and complex agentic tasks cost up to 30% less. The model was pre-trained in 4-bit (NVFP4) precision on 20T tokens, and post-training uses MOPD, in which more than ten specialist teacher models distill their skills into the student model. It is a frontier-class open model that can be fully reproduced.

SenseTime@SenseTime_AI · June 4 · 69

"𝗦𝗲𝗿𝗶𝗼𝘂𝘀𝗹𝘆 𝗶𝗺𝗽𝗿𝗲𝘀𝘀𝗶𝘃𝗲 𝘀𝘁𝘂𝗳𝗳". Thanks for the kind words, @gurru_tech — that's 𝗦𝗲𝗻𝘀𝗲𝗡𝗼𝘃𝗮 𝗨𝟭 turning prompts into professional infographics. Unified model that natively understands and generates text and images. Open-sourced. Run it yourself. 🎥Watch the video: https://youtu.be/HKz2e3STUwg 🎛️ SenseNova Studio: https://unify.light-ai.top/ (Try infographics; also join Discord for text-image interleaved gen) 🤗 https://huggingface.co/collections/sensenova/sensenova-u1 🛠️ https://github.com/OpenSenseNova/SenseNova-U1 👾 Discord: https://discord.com/invite/BuTXPHmQub @huggingface @github

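Summary: SenseTime has released SenseNova U1, a unified model that natively understands and generates text and images. The model is open-sourced and can be run locally. @gurru_tech praised it as "seriously impressive". An online demo platform (SenseNova Studio), HuggingFace models, GitHub code, and a Discord community are available.

SiliconFlow@SiliconFlowAI · June 4 · 72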

Post-training is having a moment — Nex-N2-Pro from neolab @NexEcosystem proves it. Built on Qwen3.5-397B-A17B, delivers GPT-5.5 and Claude Opus 4.7–level performance. 🎉 T+0 Support on SiliconFlow · Free for First 2 Weeks N2-Pro: 397B MoE / Reasoning Model / 262K context / VLM → Auto-adjusts reasoning depth, 30–50% fewer thinking tokens, no performance trade-off → SOTA performance on Terminal Bench 2.1, GDPVal, SWE-Verified → Excels at agentic coding, deep search, tool use → Plug-and-play with Claude Code, Cursor, OpenClaw, etc. Try it on SiliconFlow ⬇️

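Summary: neolab has released Nex-N2-Pro, a MoE reasoning model built on Qwen3.5-397B-A17B with 397B total parameters, 262K context, and multimodal (VLM) support, delivering GPT-5.5 and Claude Opus 4.7 level performance. The model auto-adjusts reasoning depth, using 30 to 50% fewer thinking tokens with no performance trade-off, and reaches SOTA on Terminal Bench 2.1, GDPVal, and SWE-Verified. It excels at agentic coding, deep search, and tool use, and works with tools such as Claude Code and Cursor. SiliconFlow offers T+0 support, free for the first two weeks.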

elvis@omarsar0 · June 4 · 74

NEW: NVIDIA ships 550B MoE open model for long-running agents. Very exciting times to see more open models to support local long-running coding agents.

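Summary: NVIDIA released Nemotron 3 Ultra today, a 550B MoE frontier-scale open model designed for long-running agents. Compared with other open frontier models, inference is 5x faster and complex agentic tasks cost 30% less.

Artificial Analysis@ArtificialAnlys · June 4 · 74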

NVIDIA has just released Nemotron 3 Ultra, the new most intelligent US open weights model, with leading speed for its intelligence Nemotron 3 Ultra scores 47.7 on the Artificial Analysis Intelligence Index, well ahead of the next strongest US open weights models, Gemma 4 31B (39.2), Nemotron 3 Super (36.0) and gpt-oss-120b (33.3), but behind the Chinese-led open weights frontier (Kimi K2.6 at 53.9). We partnered with @NVIDIA to evaluate this model for intelligence and speed ahead of its public release. These figures use the final NVFP4 weights that NVIDIA recommends for inference, but our tests show minimal intelligence impact compared to BF16 testing, with higher precision resulting in an Artificial Analysis Intelligence Index score of 48.2 vs. the NVFP4 score of 47.7. Key Takeaways: ➤ Nemotron 3 Ultra leads in speed for its intelligence: through BlackBox AI ahead of release, Nemotron 3 Ultra is served at over 400 output tokens per second - this is slightly faster than the typical serving speed of gpt-oss-120b despite being >4X larger, and comes with significantly greater intelligence ➤ Largest Nemotron 3 model so far: with approximately 550 billion total parameters and 55 billion active, Nemotron 3 Ultra is significantly larger than its siblings and is the largest and most intelligent US open weights model release ever ➤ Nemotron 3 Ultra is the leading US open weights model on the Artificial Analysis Intelligence and Agentic Indexes by far, but Gemma 4 31B scores ~1 point higher on the Coding Index (comprised of Terminal-Bench Hard and SciCode)

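Summary: NVIDIA has released Nemotron 3 Ultra, now the most intelligent US open weights model. It scores 47.7 on the Artificial Analysis Intelligence Index, ahead of Gemma 4 31B (39.2), Nemotron 3 Super (36.0), and gpt-oss-120b (33.3), but behind the Chinese open weights model Kimi K2.6 (53.9). The model has about 550B total parameters with 55B active and is served at over 400 tokens/s, slightly faster than gpt-oss-120b and with significantly greater intelligence. The NVFP4 weights score 47.7 and BF16 scores 48.2, a minimal difference.

StepFun@StepFun_ai · June 4 · 77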

Great to see Step 3.7 Flash live on @FireworksAI_HQ. Designed for inference from day one, Step 3.7 Flash combines a hardware-friendly architecture with MTP-assisted decoding to reach up to 400 tokens/s. Fast, multimodal, and ready to power capable agents in real-world workflows.

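Summary: StepFun's Step 3.7 Flash is now live on Fireworks AI. The model is a 198B sparse MoE multimodal model (VLM) with a 196B language backbone and a 1.8B vision encoder. It was designed for inference efficiency from the start, combining a hardware-friendly architecture with MTP-assisted decoding to reach 400 tokens/s. It offers native multimodal understanding and action, reliable tool use, and enhanced search, targets real-world agent workloads, and is released under the Apache 2.0 license.

StepFun@StepFun_ai · June 4 · 73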

Thanks @ArtificialAnlys for the detailed independent evaluation. Step 3.7 Flash is built with a clear focus on the intelligence-speed frontier: MTP-assisted decoding, 400+ output tokens/s, stronger agentic performance, native multimodal capabilities, and Apache 2.0 open weights. This is the direction we believe matters for production agent workloads: capable, efficient, and deployable at scale.

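Summary: StepFun has released the open-weights Step 3.7 Flash (Apache 2.0), a MoE model (198B total / 11B active parameters) with MTP-assisted decoding (3 prediction heads) and output speed above 400 tokens/s, more than double that of similarly sized models. It scores 42.6 on the Artificial Analysis Intelligence Index, up 4 points from Step 3.5 Flash. Agentic capability improves noticeably: GDPval-AA Elo rises to 1298 and TerminalBench Hard to 35.6%. A new 1.8B vision encoder brings an MMMU-Pro score of 75.3%. The context window is 256K tokens, and BF16, FP8, and NVFP4 versions are available. Weak points: AA-Omniscience accuracy is only 25.4%, with a hallucination rate of 84.4%.

DogeDesigner@cb_doge · June 4 · 65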

Grok Imagine Video 1.5 is now ranked #1 on the Video Arena Leaderboard. 🥇

Summary: Grok Imagine Video 1.5 is now ranked #1 on the Video Arena Leaderboard. 🥇

Artificial Analysis@ArtificialAnlys · June 4 · 67

StepFun's Step 3.7 Flash sits on the Intelligence vs Output Speed Pareto frontier, scoring 43 on the Artificial Analysis Intelligence Index and is served at over 400 output tokens/s. Step 3.7 Flash (open weights, Apache 2.0) is a significant upgrade on Step 3.5 Flash and stands out for its speed and gains in agentic performance (particularly GDPval-AA). 400 output tokens/s is more than double other models of a similar size class. Contributing to this speed is that the model has only 11B active parameters and the model ships with trained Multi-Token Prediction heads (3) that predict several tokens in a single forward pass, letting it decode multiple tokens at once using speculative decoding. Key results for Step 3.7 Flash with the high reasoning level: ➤ 4 point Intelligence Index improvement: Step 3.7 Flash scores 42.6 on the Artificial Analysis Intelligence Index, up 4 points from Step 3.5 Flash 2603 (38.5). It is equivalent to Qwen3.5 122B A10B (41.6) and trails MiniMax-M2.7 (49.6) and DeepSeek V4 Flash (Max Effort, 46.5) ➤ Speed-intelligence frontier: Step 3.7 Flash achieves ~400 output tokens/s on StepFun's first-party API, placing the model on the Intelligence vs Output Speed Pareto frontier. StepFun has released the weights for this model and we expect several third-party providers to serve this model ➤ Agentic capability improvements: Step 3.7 Flash improves over Step 3.5 Flash 2603 across our agentic evaluations, in both GDPval-AA (real-world agentic tasks) and TerminalBench Hard (agentic coding and terminal use). It achieves a GDPval-AA Elo of 1298, up from 1070 for Step 3.5 Flash 2603, and its TerminalBench Hard score increases to 35.6% from 32.6%. AA-LCR (Long Context Reasoning) improves to 63.7% from 54.3%. Scores for other evals remain relatively flat ➤ Weaker on knowledge and hallucination than peers: While Step 3.7 Flash trails competitors overall on AA-Omniscience (-38), it improves from Step 3.5 Flash 2603 (-44). It has an AA-Omniscience accuracy of 25.4% and a hallucination rate of 84.4% ➤ Native multimodal support, new in this generation: Step 3.7 Flash introduces a 1.8B-parameter vision encoder for native image understanding, where Step 3.5 Flash was text-only. On MMMU-Pro (multimodal reasoning) it scores 75.3%, roughly matching Qwen3.5 122B A10B (75.0%). Among its same-size open weights peers, MiniMax-M2.7, DeepSeek V4 Flash, and gpt-oss-120b are text-only. Key model details: ➤ Context window: 256K tokens ➤ Parameters: 198B total, 11B active (MoE). At BF16 native precision, Step 3.7 Flash requires ~400GB to store the weights. StepFun has also released FP8 (~200GB) and NVFP4 (~100GB) versions for lower-memory deployment ➤ License: Apache 2.0 ➤ Availability: Currently Step 3.7 Flash is available on @StepFun_ai 's first-party API

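Summary: StepFun has open-sourced Step 3.7 Flash (Apache 2.0), with 198B total and 11B active parameters (MoE) and a 256K context window. It scores 42.6 on the Artificial Analysis Intelligence Index, up 4 points from Step 3.5 Flash, and outputs over 400 tokens/s, accelerated by Multi-Token Prediction (3 heads). A new 1.8B vision encoder adds native multimodal support, with an MMMU-Pro score of 75.3%. Agentic capability improves: GDPval-AA Elo rises from 1070 to 1298, TerminalBench Hard reaches 35.6%, and AA-LCR reaches 63.7%. Knowledge and hallucination remain weak: AA-Omniscience accuracy is 25.4% and the hallucination rate is 84.4%. BF16, FP8, and NVFP4 weights are provided to lower deployment cost.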
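The weight-storage figures quoted for Step 3.7 Flash (~400GB at BF16, ~200GB at FP8, ~100GB at NVFP4) follow from parameter count times bytes per parameter. The sketch below reproduces that arithmetic. It assumes 2, 1, and 0.5 bytes per parameter and ignores per-block scale factors and any other overhead, so the real files will be somewhat larger. The name `weight_storage_gb` is illustrative.

```python
# Nominal bytes per parameter for each format (assumption: metadata such as
# per-block scaling factors is ignored).
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "NVFP4": 0.5}


def weight_storage_gb(total_params: float, fmt: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return total_params * BYTES_PER_PARAM[fmt] / 1e9


TOTAL_PARAMS = 198e9  # 198B total parameters, as reported for Step 3.7 Flash

for fmt in ("BF16", "FP8", "NVFP4"):
    print(f"{fmt}: ~{weight_storage_gb(TOTAL_PARAMS, fmt):.0f} GB")
```

Note that all 198B parameters must be stored even though only 11B are active per token. The active count affects compute and decode speed, not weight storage.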

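歸藏(guizang.ai)@op7418 · June 4 · 61

The Reve 2.0 image model is strong. It outputs native 4K, but the main thing is that it supports editing similar to what you get in PS after splitting an image into layers. You can click any part of the image to select it, and no intermediate processing is needed because the model has already done it for you. Whichever part you want to edit, you just click that part.

Summary: The Reve 2.0 image model supports native 4K output, and its main highlight is Photoshop-like layered editing. Users click any part of the image to select that region and edit it directly, with no complex intermediate processing. This greatly simplifies the workflow for local image edits.

Jeff Dean@JeffDean · June 4 · 75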

Check out our Gemma 4 12B model: it's a super capable open weights model that can run directly on your laptop.

Summary: Check out our Gemma 4 12B model: it is a very capable open weights model that can run directly on your laptop.

MiniMax (official)@MiniMax_AI · June 4 · 71

M3 is back in the free tier on @opencode 🚀 Jump in and try it while it lasts!

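Summary: MiniMax M3 is back in the free tier on OpenCode and can be tried for free while it lasts.

小互@xiaohu · June 4 · 73

Ideogram releases its first open-source AI image model: Ideogram 4.0 claims to push text rendering and layout control to the ceiling for open-source models. With traditional text-to-image, you could only write a prompt and pray the model put things in the right place. Ideogram 4.0 introduces bounding box control: you can use coordinates to specify exactly which region of the image each element goes in. Structured JSON prompts: Ideogram 4.0 accepts not only plain-text prompts but also a structured JSON prompt format. Multilingual text rendering: English OCR accuracy reaches 0.97 (X-Omni benchmark), and dense text rendering works across languages, including non-Latin scripts such as Chinese, Japanese, and Korean.

Summary: Ideogram has released its first open-source AI image model, Ideogram 4.0, focused on text rendering and layout control. The model introduces bounding box control, which lets users specify element positions precisely with coordinates; it supports a structured JSON prompt format in addition to plain text; and English OCR accuracy reaches 0.97 (X-Omni benchmark), with dense text rendering across languages, including non-Latin scripts such as Chinese, Japanese, and Korean.

Elon Musk@elonmusk · June 4 · 72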

Grok Imagine on Vercel

Summary: Grok Imagine Video 1.5 is now available on Vercel's AI Gateway. It supports image-to-video with synchronized audio in a single pass. Sample code: `await generateVideo({ model: 'xai/grok-imagine-video-1.5-preview', prompt: 'a rabbit sprinting through nyc' });`