Thinking Machines公司发布的新型交互模型,旨在从根本上改变人机协作模式。该模型能够原生地同时实现聆听、观看、说话、打断、反应、后台思考和使用工具,而非依赖语音转文本等拼接技术。其目标是将AI从被动的“一问一答”工具,转变为能感知用户犹豫、主动介入、预测下一步并维持流畅对话的实时协作伙伴。这标志着AI交互范式从提供最终答案,转向在协作过程中保持“在场”的根本性转变。
I think this is bigger than it sounds at first glance.
Thinking Machines hasn't just unveiled "ChatGPT, but better." Instead, they've introduced something that addresses a much deeper issue: the very way we interact with AI.
So far, AI often feels like email with very clever replies. I say something. Then the model waits. Then it replies. Then I wait.
Thinking Machines' new Interaction Model attempts to break down precisely this barrier.
It can simultaneously listen, see, speak, interrupt, react, think in the background, and use tools. Not as a cobbled-together pipeline of speech-to-text, turn detection, and agent hacks, but as a native model capability!
Good collaboration doesn't happen because someone gives a perfect answer in the end. It happens because someone is present in the moment.