AI Curated Feed · Smart Score: 62

Chinese Team Releases Agents-A1, a 35B-Parameter Agent Model

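Source: Twitter following list
Author: Rohan Paul (@rohanpaul_ai)
Published: 2026-07-01
Collected: 2026-07-01
Why the AI recommends this
The method reaches strong agent performance in a small model through long-sequence training and expert distillation, and may point to a lower-cost path for building agents.
Key takeaways
A Chinese team has released Agents-A1, a 35B-parameter agent model that claims to match the performance of 1T-parameter models by using longer, verified reasoning workflows. The model is released under the Apache-2.0 license with open weights. Its training data averages 45K tokens per example, skills from multiple domains are distilled from specialist teacher models into a single unified model, and it performs strongly on long-task benchmarks.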
Full text
🇨🇳 Another good model from China. A 35B agent model claims 1T-model performance by thinking longer, not growing bigger. Apache-2.0 license, model weights are on Hugging Face.

The technique is proposing a cheaper way to make strong AI agents: teach them longer verified work habits, not just make them bigger. The paper’s main idea is to make the agent practice long tasks where it searches, uses tools, reads results, fixes mistakes, and checks answers. The authors build training data from long action records, with an average length of 45K tokens, so the model learns the whole work process. They then train specialist teacher models for search, science, instruction following, tool use, and other areas, and transfer those skills into 1 student model. Agents-A1 does very well across long-task benchmarks, including search, science, coding, tool use, and instruction following.

![photo](https://pbs.twimg.com/media/HMGRKboboAALHTa.jpg)

Rohan Paul (@rohanpaul_ai): Agents-A1 turns raw sources like web pages, papers, code, and databases into training trails that record each action, tool result, mistake, and verification step. The big idea is that the model learns how to work through long tasks by practicing checked action paths, not just by memorizing final answers. https://t.co/3giViv68CD

Rohan Paul (@rohanpaul_ai): The pipeline first turns many task types, like search, science, engineering, agent work, and instruction following, into long training examples that show not just answers, but the actions, tool calls, checks, and fixes used to reach them. Then it trains one broad model with supervised fine-tuning, so the model learns the basic pattern of doing long tasks across many domains. Next, it trains separate expert teacher models for each domain, because search, science, tools, and instruction following need different habits. Finally, it makes 1 unified model learn from those teachers while using its own rollouts, so it can keep the specialists’ strengths without needing to deploy many separate models.
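
To make the thread's description more concrete, here is a minimal Python sketch of two ideas it mentions: a training trail that records each action, tool result, mistake, and verification step, and a distillation signal computed on the student's own rollout against a domain teacher. Everything here is an assumption for illustration: the names `Step`, `Trail`, `reverse_kl`, and `distill_loss` are hypothetical, and the thread does not say which loss Agents-A1 uses, so the per-token reverse KL below is just one common choice for on-policy distillation, not the authors' method.

```python
from dataclasses import dataclass, field
import math


@dataclass
class Step:
    # Hypothetical step kinds, taken from the thread's wording:
    # "action", "tool_result", "mistake", "verification".
    kind: str
    content: str


@dataclass
class Trail:
    # One long training example: a domain label plus the ordered steps.
    domain: str
    steps: list[Step] = field(default_factory=list)

    def is_verified(self) -> bool:
        # A trail counts as "checked" here if it has at least one verification step.
        return any(s.kind == "verification" for s in self.steps)


def reverse_kl(student: list[float], teacher: list[float]) -> float:
    # KL(student || teacher) at one token position.
    # Assumes both are probability distributions over the same vocabulary
    # and that the teacher assigns non-zero probability wherever the student does.
    return sum(p * math.log(p / q) for p, q in zip(student, teacher) if p > 0)


def distill_loss(student_probs: list[list[float]],
                 teacher_probs: list[list[float]]) -> float:
    # Mean per-token reverse KL over a rollout sampled from the student.
    # (Assumed loss; the source only says the student learns from teachers
    # "while using its own rollouts".)
    per_token = [reverse_kl(s, t) for s, t in zip(student_probs, teacher_probs)]
    return sum(per_token) / len(per_token)


# Toy trail for a search task.
trail = Trail(domain="search", steps=[
    Step("action", "search('Agents-A1 license')"),
    Step("tool_result", "Apache-2.0"),
    Step("verification", "license confirmed on model card"),
])

# Toy two-token rollout over a three-word vocabulary.
student = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
teacher = [[0.6, 0.3, 0.1], [0.5, 0.3, 0.2]]
loss = distill_loss(student, teacher)
```

In this toy setup the loss is zero only where student and teacher agree exactly, so the second token contributes nothing and the first contributes a small positive penalty. In a real pipeline the teacher would presumably be chosen by `trail.domain`, which is how one student could absorb several specialists.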
#ModelRelease #TechBreakthrough