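A Chinese team has released Agents-A1, a 35B-parameter agent model that is claimed to match the performance of 1T-parameter models by learning longer, verified work habits (training samples average 45K tokens). The model is released under the Apache-2.0 license, with weights open-sourced on Hugging Face. Training method: build data from long action records, train several expert teacher models (search, science, instruction following, tool use, and others), then distill their skills into a single student model. Agents-A1 performs strongly on long-task benchmarks covering search, science, coding, tool use, and instruction following.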
🇨🇳 Another good model from China.
A 35B agent model claims 1T-model performance by thinking longer, not growing bigger.
Apache-2.0 license; model weights are on Hugging Face.
The technique proposes a cheaper way to build strong AI agents: teach them longer, verified work habits instead of just making them bigger.
The paper's main idea is to make the agent practice long tasks where it searches, uses tools, reads results, fixes mistakes, and checks answers.
The authors build training data from long action records, averaging 45K tokens each, so the model learns the whole work process rather than only the final answer.
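To make "long action records" a bit more concrete, here is a minimal sketch of what one such record might look like and how an average length could be measured. Everything in it is an assumption for illustration: the `Step` and `Trajectory` classes, the step kinds, the whitespace token counter, and the rule of keeping only records whose final check passed are hypothetical and are not taken from the paper or the released code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    # Hypothetical step kinds: "search", "tool_call", "observation", "fix", "verify"
    kind: str
    text: str


@dataclass
class Trajectory:
    task: str
    steps: List[Step] = field(default_factory=list)
    verified: bool = False  # assumed flag: did the final answer pass a check?


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())


def trajectory_tokens(traj: Trajectory) -> int:
    # Length of the whole record: the task prompt plus every step in order.
    return count_tokens(traj.task) + sum(count_tokens(s.text) for s in traj.steps)


def keep_verified(trajs: List[Trajectory]) -> List[Trajectory]:
    # Assumed filtering rule: train only on records that ended in a passed check.
    return [t for t in trajs if t.verified]


def mean_length(trajs: List[Trajectory]) -> float:
    # Average record length in (approximate) tokens; 0.0 for an empty list.
    if not trajs:
        return 0.0
    return sum(trajectory_tokens(t) for t in trajs) / len(trajs)
```

The point of the sketch is that each training sample is the full sequence of steps (search, tool use, reading results, fixing, checking), not just a question-answer pair, which is why the samples get so long.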