StepFun releases Step 3.7 Flash, built for efficient inference

StepFun (@StepFun_ai)

2026-06-02 11:45 · 19 days ago

AI summary

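StepFun has released Step 3.7 Flash, its inference-optimized model. It is a 196B MoE model designed for inference efficiency from the outset. It uses Multi-matrix Factorization Attention (MFA), which brings its KV-cache cost down to roughly 22% of a DeepSeek model's, and Attention-FFN Disaggregation (AFD), which enables efficient, hardware-optimized serving. The model is available through Fireworks AI under the Apache 2.0 license and can be used to build agentic applications.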

This is exactly the philosophy: don't bolt on efficiency, design for it from day one.

MFA + AFD aren't tricks. They're what lets Step 3.7 Flash serve at a fraction of the KV-cache cost.

Huge thanks to @FireworksAI_HQ for making Step 3.7 Flash one-click to run.

Go build something agentic with it.

Fireworks AI: Many research labs only consider inference efficiency after the fact. Step 3.7 Flash is a 196B MoE model, and built for inference from the start by @StepFun_ai....

Agents · Open source/repositories · Inference · Model release
