# another scientific exploration from @TongPetersb, @DavidJFan, and @__JohnNguyen__ that might teach y…

- Source: Saining Xie (@sainingxie)
- Published: 2026-03-05 07:55
- AIHOT link: https://aihot.virxact.com/items/cmnz6dpf302acsl0fqq0hc80e
- Original link: https://x.com/sainingxie/status/2029345069833257165

## AI Summary

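Another scientific exploration from @TongPetersb, @DavidJFan, and @__JohnNguyen__ that may teach you something new, even if you work at a frontier lab.

There are many interesting observations here, but I will highlight just one:
- Attempts to scale DiTs with MoE have mostly been fruitless, which is something of an open secret in the industry.
- But the unexpected yet intuitive synergy between RAE and MoE might actually change that.

[Quoting @TongPetersb]: Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, we studied building foundation models from scratch with vision. We share our exploration: visual representations, data, world modeling, architecture, and scaling behavior! [1/9]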

## Full Text

another scientific exploration from @TongPetersb, @DavidJFan, and @__JohnNguyen__ that might teach you something new, even if you're in a frontier lab

lots of interesting observations here, but I'll highlight just one:
- it's kind of an open industry secret that trying to scale DiTs with MoE has mostly been fruitless.
- the unexpected, yet intuitive, synergy between RAE and MoE might actually change that.

### Quoted Tweet

> Peter Tong: Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, we studied building foundation models fr...
