这是一场关于AI架构的辩论。Transformer阵营指出,其凭借简单、硬件友好、可扩展的优势主导当下,核心是基于键值存储的记忆与注意力机制,并强调任何替代架构必须能在扩展性上与之匹敌,且需达到约10倍优势才能颠覆现有技术栈。Post-Transformer阵营则认为,当前大语言模型的推理更像是后置的文本步骤,真正的突破在于实现模型内部的“潜在推理”与持续学习能力,并指出长上下文不等于真正记忆,未来可能是混合架构。辩论还提到,当前公开基准测试易被优化,而困惑度(Perplexity)仍是评估前沿模型的有效指标。最后指出,尽管Transformer仍占主导,但前沿正在拓宽,并列举了Pathway的BDH、Sakana AI的CTMs和Liquid AI的LFMs等新兴架构作为例证。
This is probably the most entertaining way to understand one of AI's hardest AI debates.
Transformer vs Post-Transformer, argued by leading researchers, inside a real physical boxing ring.
Both technically deep and genuinely entertaining.
I was glued for the entire 1 hour 20 minutes. So many super cool points to learn.
🥊 Transformers