AI Agent Coding Tasks Are Costly and Hard to Predict

elvis@omarsar0

2026-04-28 00:33 · 54 days ago

AI summary

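A systematic study of the token costs of AI agents on coding tasks finds that agents can consume roughly 1000 times as many tokens as chat or code reasoning, and that token usage on the same task can differ by up to 30x between runs. Higher token spend does not directly yield higher accuracy: performance peaks at intermediate cost and then saturates. Models also struggle to predict their own token usage, with the best self-prediction correlation reaching only 0.39. On the same task, one model may consume 1.5 million more tokens than another with no gain in quality. Together these results show that agent runtime cost is high-variance, weakly tied to quality, and unpredictable even to the model itself, which affects how teams plan budgets, route between models, and decide when to terminate a run.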
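To make the budgeting and run-termination point concrete, here is a minimal sketch of two checks a team might run on its own logs: the max-to-min spread of token usage across repeated runs of one task, and a simple stop rule based on the median of past runs. The function names, the 3x multiplier, and the sample numbers are all illustrative assumptions and do not come from the paper.

```python
from statistics import median


def spread_ratio(token_counts):
    """Max-to-min ratio of token usage across repeated runs of one task."""
    if not token_counts or min(token_counts) <= 0:
        raise ValueError("need at least one positive token count")
    return max(token_counts) / min(token_counts)


def should_stop(tokens_used, past_runs, multiplier=3.0):
    """Return True once a run exceeds `multiplier` times the median past usage.

    The multiplier is a hypothetical policy knob, not a recommended value.
    """
    return tokens_used > multiplier * median(past_runs)


# Made-up token counts for four runs of the same task.
past_runs = [40_000, 55_000, 120_000, 1_200_000]

ratio = spread_ratio(past_runs)            # spread between cheapest and costliest run
stop_now = should_stop(300_000, past_runs)  # check a run that has used 300k tokens so far
```

The median is used instead of the mean so that a single runaway run does not inflate the threshold for all later runs.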

How do AI agents spend your money:

DAIR.AI: How do AI Agents spend your money? Most teams treat agent token costs as a rounding error even though the data says they shouldn't. New paper presents the first...

Agents · Papers/Research · Deployment/Engineering

View original post on X
