AI Featured Updates | Smart score: 75

Anthropic Brings Claude Fable 5 Back Online with New Safety Classifiers

Source: Twitter following list
Author: Thariq (@trq212)
Published: 2026-07-01
Collected: 2026-07-01
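Why the AI recommends this
Anthropic is letting some routine coding and debugging tasks fall back to Opus 4.8, and is leading the drafting of a consensus framework with the major cloud providers for assessing the severity of jailbreaks, which appears to be the industry's first explicit governance proposal of this kind.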
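Key takeaways
Anthropic announced that Claude Fable 5 will be available again globally tomorrow, redeployed with a new set of classifiers that target and block more cybersecurity tasks. In the near term, some routine coding and debugging tasks will fall back to Opus 4.8. The company is drafting a consensus framework with Amazon, Microsoft, Google, and other Glasswing partners for assessing the severity of AI jailbreaks, and is scaling up its collaboration with the US government on model testing, pre-release safety evaluation, information sharing, and joint research.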
Full text
Have seen some questions about the updated classifiers and wanted to clarify. As with the original classifiers, a small fraction of routine coding and debugging tasks will be flagged and fall back to Opus. We're excited for you guys to get access back tomorrow.

> **Quoted post, Anthropic (@AnthropicAI):**
> Claude Fable 5 will be available again globally tomorrow.
>
> After a series of productive conversations with the US government, we're redeploying the model with a new set of classifiers to target and block more cybersecurity tasks. In the near term, some routine tasks like coding and debugging will fall back to Opus 4.8. We’ll continue to refine these classifiers over the coming weeks to reduce false positives and better distinguish genuine misuse from legitimate requests.
>
> We’ve also begun drafting a consensus framework (with Amazon, Microsoft, Google, and other Glasswing partners) for assessing the severity of AI jailbreaks and how AI developers should respond to them. We invite other industry partners and model providers to join us in this effort.
>
> Finally, we’re scaling up our collaboration with the US government on model testing and safeguards. This will include pre-release access to models and safeguards for evaluation, information sharing on jailbreaks and misuse, and dedicated resources for joint research.
>
> Thank you to our users for your patience, and to our partners across the government, industry, and the research community who worked alongside us to make Fable 5 available again.
>
> Read our full blog: https://t.co/VHyum831ri
>
> https://x.com/AnthropicAI/status/2072163884430229756

Thariq (@trq212): And as we say in our blog, we're continuing to refine these safeguards to better distinguish genuine misuse from legitimate requests and reduce false positives.
#AIIndustry #TechUpdate #AISafety