AI Curated Updates
Smart score: 65
New Frontier Red Team blog: Phase 2 of Project Fetch
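Why the AI recommends this
The test reveals a gap between AI's speed advantage in robot programming and its physical execution; later improvements are worth watching.
Key takeaways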
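In Phase 2 of Project Fetch, Anthropic's Frontier Red Team tested how well Claude can program a robodog. Opus 4.7, working on its own, was about 20x faster than last year's best human team (aided by Opus 4.1), but the robodog still failed to fetch a beach ball.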
Full text
New Frontier Red Team blog: Phase 2 of Project Fetch, where we test how well Claude can program a robodog.
Opus 4.7, on its own, was ~20x faster than last year's best human team aided by Opus 4.1. (The robodog, alas, still failed to fetch a beach ball.)
https://www.anthropic.com/research/project-fetch-phase-two