GPT-5.3 Instant in ChatGPT is now rolling out to everyone. More accurate, less cringe. https://openai.com/index/gpt-5-3-instant/

Google DeepMind @GoogleDeepMind · Mar 2

Nano Banana 2 makes sophisticated visual creation faster, cheaper, and accessible to everyone. 🍌 Tap on each photo to see the details 👀

Google DeepMind @GoogleDeepMind · Feb 27

We’re launching Nano Banana 2, built on the latest Gemini Flash model. 🍌 It’s state-of-the-art for creating and editing images, combining Pro-level capabilities with lightning-fast speed. 🧵

Jim Fan @DrJimFan · Feb 25

What can half of GPT-1 do? We trained a 42M transformer called SONIC to control the body of a humanoid robot. It takes a remarkable amount of subconscious processing for us humans to squat, turn, crawl, sprint. SONIC captures this "System 1" - the fast, reactive whole-body intelligence - in a single model that translates any motion command into stable, natural motor signals. And it's all open-source!!

The key insight: motion tracking is the one, true scalable task for whole body control. Instead of hand-engineering rewards for every new skill, we use dense, frame-by-frame supervision from human mocap data. The data itself encodes the reward function: "configure your limbs in any human-like position while maintaining balance".

We scaled humanoid motion RL to an unprecedented scale: 100M+ mocap frames and 500,000+ parallel robots across 128 GPUs. NVIDIA Isaac Lab allows us to accelerate physics at 10,000x faster tick, giving robots many years of virtual experience in only hours of wall clock time.

After 3 days of training, the neural net transfers zero-shot to the real G1 robot with no finetuning. 100% success rate across 50 diverse real-world motion sequences.

One SONIC policy supports all of the following:
- VR whole-body teleoperation
- Human video. Just point a webcam to live stream motions.
- Text prompts. "Walk sideways", "dance like a monkey", "kick your left foot", etc.
- Music audio. The robot dances to the beat, adapting to tempo and rhythm.
- VLA foundation models. We plugged in GR00T N1.5 and achieved 95% success on mobile tasks.

We open-source the code and model checkpoints!! Deep dive in thread:

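Summary: SONIC is a 42M-parameter transformer (about half the size of GPT-1), trained in NVIDIA Isaac Lab on 100M+ mocap frames with 500,000+ parallel robots, using dense frame-level supervision in place of hand-engineered reward functions. After 3 days of training it transfers zero-shot to the real G1 robot and reaches a 100% success rate on 50 motion sequences. A single policy supports VR teleoperation, video-based motion capture, text commands, music response, and control by VLA models. The project is fully open-source.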
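
To make the "the data itself encodes the reward function" idea concrete, here is a minimal sketch of a per-frame motion-tracking reward. The post does not give SONIC's actual reward, so the exponential-of-squared-error form, the `sharpness` parameter, the function names, and the toy joint angles are all assumptions chosen for illustration.

```python
import math
from typing import Sequence


def tracking_reward(robot_pose: Sequence[float],
                    mocap_pose: Sequence[float],
                    sharpness: float = 2.0) -> float:
    """Reward in (0, 1] that peaks when the robot matches the reference frame.

    Assumed form (not taken from the post): exp(-sharpness * squared error).
    """
    if len(robot_pose) != len(mocap_pose):
        raise ValueError("poses must have the same number of joints")
    squared_error = sum((r - m) ** 2 for r, m in zip(robot_pose, mocap_pose))
    return math.exp(-sharpness * squared_error)


def episode_return(robot_frames, mocap_frames, sharpness: float = 2.0) -> float:
    """Sum the per-frame rewards, so every frame of the clip gives a signal."""
    return sum(tracking_reward(r, m, sharpness)
               for r, m in zip(robot_frames, mocap_frames))


# Hypothetical 3-joint reference clip (joint angles in radians), two frames long.
reference_clip = [[0.0, 0.5, -0.5], [0.1, 0.6, -0.4]]
perfect_rollout = [[0.0, 0.5, -0.5], [0.1, 0.6, -0.4]]
sloppy_rollout = [[0.3, 0.1, -0.9], [0.5, 0.0, 0.0]]

print(episode_return(perfect_rollout, reference_clip))  # 2.0 (one point per matched frame)
print(episode_return(sloppy_rollout, reference_clip))   # lower: poses drift from the clip
```

No skill-specific reward is written here: swapping in a different reference clip changes the target behavior without touching the reward code, which is the property the post attributes to dense mocap supervision.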

Claude @claudeai · Sep 30

Introducing Claude Sonnet 4.5—the best coding model in the world. It's the strongest model for building complex agents. It's the best model at using computers. And it shows substantial gains on tests of reasoning and math.

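Summary: Anthropic has released Claude Sonnet 4.5, which it calls the world's best coding model. It is described as the strongest model for building complex agents and for computer use, with substantial gains on reasoning and math tests.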

Jim Fan @DrJimFan · Sep 26

Go check out @yukez’s talk at CoRL! Project GR00T is cooking 🍳

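Summary: @yukez presents the latest Project GR00T research at CoRL 2025, announces updates to the NVIDIA Isaac GR00T platform, and discusses the technical challenges and new opportunities for humanoid robot foundation models.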

Jeff Dean @JeffDean · Sep 18

Very excited to see our Gemini models getting better and better at coding! An advanced version of Gemini 2.5 Deep Think at the 2025 International Collegiate Programming Contest (ICPC) World Finals achieved gold-medal level performance! 🎉 https://deepmind.google/discover/blog/gemini-achieves-gold-level-performance-at-the-international-collegiate-programming-contest-world-finals/

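Summary: An advanced version of Gemini 2.5 Deep Think achieved gold-medal-level performance at the 2025 ICPC World Finals, a sign that the Gemini models keep improving at coding and perform strongly on competition-level programming tasks.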

Noam Brown @polynoamial · Sep 16

GPT-5-Codex is 10x faster for the easiest queries, and will think 2x longer for the hardest queries that benefit most from more compute.

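Summary: OpenAI has released GPT-5-Codex, a version of GPT-5 optimized for agentic coding. It responds 10x faster on simple queries and thinks 2x longer on complex queries that benefit from more compute. It is available in the Codex CLI, the IDE extension, on the web, on mobile, and for code review on GitHub.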

Hao AI Lab @haoailab · Aug 7

[Lmgame Bench] 🔥 OpenAI has just released two open-weight reasoning models: gpt-oss-120B (~117B) and gpt-oss-20B (~21B). They are the first OpenAI models with open weights since GPT-2. We tested both in Lmgame Bench across 4 interactive games: 🧱 Sokoban | 🟦 Tetris | 🔢 2048 | 🍬 Candy Crush. Here’s how they ranked (out of 25): gpt-oss-120b → #12, gpt-oss-20b → #13

Jim Fan @DrJimFan · Aug 7

Would love to see the FSD Scaling Law, as it’s the only physical data flywheel at planetary scale. What’s the “emergent ability threshold” for model/data size?

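Summary: The post asks about the FSD scaling law and the emergent-ability threshold, calling FSD the world's only physical data flywheel. Tesla is training a new FSD model with roughly 10x the parameters and a large improvement in video compression loss; if all goes well, it will be released at the end of next month.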

Noam Brown @polynoamial · Aug 6

Our new @OpenAI open models

Summary: OpenAI has released two new open models; the official post says "Both of them" are now available. See openai.com/open-models for details.

Hao AI Lab @haoailab · Aug 5

Try FastWan at https://fastwan.fastvideo.org/!

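Summary: The FastVideo team has introduced the FastWan family of fast video generation models. The models use a new training recipe called "sparse distillation" that speeds up video denoising by 70x. On a single H200 GPU, a 5-second video takes only 5 seconds to generate. The team provides a live demo and has fully open-sourced the models, code, and data under the Apache-2.0 license.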

Hao AI Lab @haoailab · Aug 5

(1/n) 🚀 With FastVideo, you can now generate a 5-second video in 5 seconds on a single H200 GPU! Introducing the FastWan series, a family of fast video generation models trained via a new recipe we call “sparse distillation”, which speeds up video denoising by 70X! 🖥️ Live demo: https://fastwan.fastvideo.org/ (Thanks to @gmicloud for the support!) 🔗 Blog: https://hao-ai-lab.github.io/blogs/fastvideo_post_training/ 🔓 We fully open-source our models, code, and data under the Apache-2.0 license

DeepSeek @deepseek_ai · May 29

🚀 DeepSeek-R1-0528 is here! 🔹 Improved benchmark performance 🔹 Enhanced front-end capabilities 🔹 Reduced hallucinations 🔹 Supports JSON output & function calling ✅ Try it now: https://chat.deepseek.com/ 🔌 No change to API usage — docs here: https://api-docs.deepseek.com/guides/reasoning_model 🔗 Open-source weights: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528

Jim Fan @DrJimFan · Mar 21

We got lots of great community feedback on our open-source GR00T N1! Check out our GitHub, star, fork, contribute back! Let's solve generally intelligent robots together, one commit at a time. https://github.com/NVIDIA/Isaac-GR00T/

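Summary: NVIDIA has released GR00T N1, described as the world's first open-source foundation model for humanoid robots. With only 2B parameters, it uses a VLM plus Diffusion Transformer architecture for end-to-end control. The model is trained on real teleoperation data, 300,000+ simulated trajectories, and synthetic neural trajectories. It improves task performance by 30% on robots such as GR1 and 1X Neo, and can be deployed across embodiments on budget open-source robot arms priced on the order of a hundred.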

DeepSeek @deepseek_ai · Dec 13

🎉 DeepSeek-VL2 is here! Our next-gen vision-language model enters the MoE era. 🤖 DeepSeek-MoE arch + dynamic image tiling ⚡ 3B/16B/27B sizes for flexible use 🏆 Outstanding performance across all benchmarks 🧵 1/n

Lilian Weng @lilianweng · Sep 13

🍓 Finally o1 is out - our first model with general reasoning capabilities. Not only does it achieve impressive results on hard scientific tasks, it is also significantly improved on safety and robustness. https://openai.com/index/learning-to-reason-with-llms/ We found that reasoning in context about safety rules is a super efficient way of teaching models human values and principles. Truly, capability and safety are not two conflicting goals. 🤝
