AI Curated Feed | Smart score: 60

tufalabs wins the first ARC-AGI-3 milestone

Source: Twitter following list

Author: François Chollet (@fchollet)

Published: 2026-07-01

Indexed: 2026-07-01

Why the AI recommends this

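What sets it apart: the benchmark itself contains no language, and this approach reintroduces language through an agentic harness to tackle ARC-AGI-3's rule-free reasoning tasks. The design details of the harness are worth watching.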

Key takeaways

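tufalabs won the first milestone of the ARC-AGI-3 benchmark using Qwen 3.6, a 27B-parameter open-weights model, together with its own agentic harness, "The Duck". ARC-AGI-3 is harder than its predecessors: it has no rules and no explicit goals, so both must be discovered while solving the task.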

Full text

François Chollet (@fchollet) reposted a post by Machine Learning Street Talk (@MLStreetTalk):

ARC-AGI-3 is built different, it has dumbfounded almost all regular attempts so far because it's so much harder than anything that came before. It has no rules, it's agentic and has no explicit goals, they need to be discovered. @tufalabs won the first milestone of @arcprize

> There is no language built into the benchmark, but these guys "put the language back in", because in their view - it's the best way to climb up the notional "abstraction mountain" and effectively use many of the abstractions which have evolved over millions of years of language evolution.

> They built a novel harness "The Duck" around a 27B open weights model (Qwen 3.6) to solve extremely challenging and novel reasoning problems that require abstraction.

> This is the launch video of their winning agentic harness, "The Duck". We have also released an exclusive interview with them on MLST, just dropped.

> The million dollar question is: what will @fchollet think about how they've done it, and is this a step towards AGI?

https://video.twimg.com/amplify_video/2072315772462854145/vid/avc1/3840x2160/Vj1ixOGUWa4DB1g0.mp4?tag=28

#AI #TechBreakthrough #Models
