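The post argues that an AI agent's strength depends not only on the model but, even more, on the harness around it. The harness determines the model's inputs, the tools it can use, its memory, and how its actions are verified. The main progress should come from scaling this system, especially by improving context control, memory trustworthiness, and routing to tools or sub-agents. It stresses that long context is not the same as usable context, more memory is not the same as trustworthy memory, and more tools are not the same as knowing how to use them. This makes evaluation based only on a single benchmark score look weak. The next frontier is scaling the harness around the agent, not just scaling the model itself. The related paper is titled "From Model Scaling to System Scaling: Scaling the Harness in Agentic AI".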
Stronger agents will not come only from larger models, but from better systems around them.
The problem is that many AI agents are judged as if the model alone did the work, even though the real behavior also depends on memory, tools, context, routing, checks, and permissions.
This setup around the agent is called the harness: the system that decides what the model sees, what tools it can use, what it remembers, and which actions get checked.
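To make the definition concrete, here is a minimal Python sketch of a harness wrapped around a model call. It is an illustration only, not the paper's design: the `Harness` class, its field names, and the `"tool:argument"` action format are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Harness:
    """Hypothetical wrapper around a model call; every name here is illustrative."""
    model: Callable[[str], str]                      # maps a prompt to a proposed action
    tools: dict[str, Callable[[str], str]]           # the tools the model may use
    memory: list[str] = field(default_factory=list)  # what the agent remembers
    max_context_items: int = 3                       # how much memory the model sees

    def build_context(self, task: str) -> str:
        # Decides what the model sees: the most recent memory items plus the task.
        recent = self.memory[-self.max_context_items:]
        return "\n".join(recent + [task])

    def check(self, action: str) -> bool:
        # Decides which actions get through: "tool:argument" with a known tool only.
        name, sep, _ = action.partition(":")
        return bool(sep) and name in self.tools

    def step(self, task: str) -> str:
        action = self.model(self.build_context(task))
        if not self.check(action):
            result = f"rejected: {action}"
        else:
            name, _, arg = action.partition(":")
            result = self.tools[name](arg)
        # Record the outcome so later steps can see it.
        self.memory.append(f"{task} -> {result}")
        return result
```

With a stub model such as `lambda prompt: "upper:hello"` and `tools={"upper": str.upper}`, `step` runs the tool; if the stub proposes a tool that is not registered, the action is rejected before anything executes. The same model behaves differently depending only on the harness around it.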
Progress should come from scaling this harness, especially in three areas: better context control, more trustworthy memory, and better routing to tools or helper agents.
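One way to picture the three areas is as three small functions, sketched below in Python. These are deliberately simple stand-ins chosen for illustration, not methods from the paper: the relevance scores, the `verified` flag, and the keyword-based routing are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    verified: bool  # hypothetical flag: was this fact checked before it was stored?


def select_context(snippets: list[tuple[float, str]], budget_chars: int) -> list[str]:
    """Context control: keep the highest-scoring snippets that fit the budget."""
    chosen, used = [], 0
    for _score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        if used + len(text) <= budget_chars:
            chosen.append(text)
            used += len(text)
    return chosen


def trusted_memory(items: list[MemoryItem]) -> list[str]:
    """Memory trust: only verified items are handed back to the model."""
    return [m.text for m in items if m.verified]


def route(task: str, routes: dict[str, str], default: str) -> str:
    """Routing: pick a tool or helper agent by keyword match, else the default."""
    lowered = task.lower()
    for keyword, target in routes.items():
        if keyword in lowered:
            return target
    return default
```

Each function makes one of the post's distinctions explicit: a large pool of snippets is not usable context until something selects from it, stored items are not trustworthy until something filters them, and a list of tools is not useful until something decides which one a task goes to.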