AI Curated Updates | Smart Score: 80

GLM-5.2 Becomes the New Open-Weights SOTA

Source: Twitter following list

Author: 🚨 AI News | TestingCatalog (@testingcatalog)

Published: 2026-06-17

Collected: 2026-06-17

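Why the AI Recommends This

Adds GLM-5.2's specific benchmark scores and cost comparisons. Worth watching for the details of its improvements over GLM-5.1 and for its open-source license.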

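Key Takeaways

GLM-5.2, released by Zai_org, scored 51 on the Artificial Analysis Intelligence Index, ranking fourth and becoming the new open-weights SOTA model. The model has 744B total parameters and 40B active parameters, a 1M-token context window, and an MIT license, and is priced at $1.4/$0.26/$4.4 per 1M input/cache hit/output tokens. Compared with GLM-5.1, its Intelligence Index score rose by 11 points, with notable gains on scientific reasoning and several other benchmarks.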

Full Text

ZAI 🔥: GLM-5.2 by @Zai_org scored 51 points on the Artificial Analysis Intelligence Index and took 4th place! This makes GLM-5.2 the new SOTA open-weight model. Besides that, GLM-5.2 ranked second on Frontend Code Arena, after the currently unavailable Claude Fable 5. Should be ZOTA! 👀

![photo](https://pbs.twimg.com/media/HK_30yuXgAA5rWg.jpg)

![photo](https://pbs.twimg.com/media/HK_30yWXcAAK8qV.jpg)

> **Quoted post from Artificial Analysis (@ArtificialAnlys):**
>
> Z ai’s GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index, scoring 51, and it sits on the Pareto frontier of Intelligence vs Cost per Task
>
> @Zai_org’s GLM-5.2 is the same size as GLM-5.1 (744B total / 40B active parameters) but scores 11 points higher on the Intelligence Index v4.1, placing ahead of MiniMax-M3 (44) and DeepSeek V4 Pro (max, 44). On the first-party API it is priced in line with GLM-5.1 at $1.4/$4.4/$0.26 per 1M input/output/cache hit tokens
>
> Key results:
>
> ➤ GLM-5.2 is the leading open weights model on the Intelligence Index v4.1. At 51, it leads MiniMax-M3 (44), DeepSeek V4 Pro (max, 44) and Kimi K2.6 (43)
>
> ➤ Improvements across most evaluations, particularly scientific reasoning: GLM-5.2 gains over GLM-5.1 on most evaluations, led by scientific reasoning on CritPt (+16 points to 21%) and HLE (+12 points to 40%), alongside AA-LCR (+9 points to 71%), tau3 banking (+15 points to 27%) and SciCode (+7 points to 50%). TerminalBench v2.1 also improves (+16 points to 78%) and GPQA Diamond gains 3 points to 89%
>
> ➤ Leading open weights model on GDPval-AA v2 and competitive with proprietary models: GLM-5.2 scores 1524 on GDPval-AA v2, ahead of MiniMax-M3 (1418) and DeepSeek V4 Pro (max, 1328). This impressive result places GLM-5.2 in line with proprietary models including GPT-5.5 (xhigh reasoning). GDPval-AA v2 builds on the original GDPval-AA by baselining Elo to human performance at 1000, introducing a rotating panel of frontier-model judges, and raising the turn limit from 100 to 250 for longer-horizon agent trajectories
>
> ➤ GLM-5.2 uses more output tokens per task than other leading open weights models: the model uses 43k output tokens per Intelligence Index task, up from GLM-5.1 (26k) and above MiniMax-M3 (24k), Kimi K2.6 (35k) and DeepSeek V4 Pro (max, 37k)
>
> ➤ On the Intelligence vs. Cost per Task Pareto Frontier: GLM-5.2 is on the Pareto frontier of the Intelligence vs Cost per Task chart, with the lowest cost per task among models at its intelligence level. GLM-5.2 costs ~$0.46 per task, compared to GLM-5.1 ($0.25), Kimi K2.6 ($0.31), MiniMax-M3 ($0.18) and DeepSeek V4 Pro (max, $0.05)
>
> Additional Model Details:
>
> ➤ License: MIT
>
> ➤ Size: 744B total parameters, 40B active parameters, equivalent to GLM-5.1
>
> ➤ Context window: 1M tokens, up from 200K on GLM-5.1
>
> ➤ Pricing: $1.4/$0.26/$4.4 per 1M input/cache hit/output tokens
>
> ➤ Availability: Alongside Z ai's first-party API, GLM-5.2 is available across third-party providers including @DeepInfra, @novita_labs, @nebiusai, @parasailnetwork, @SiliconFlowAI, @gmi_cloud, @Baseten and @FireworksAI_HQ
>
> https://x.com/ArtificialAnlys/status/2067135640249209175
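To make the pricing figures concrete, here is a minimal Python sketch that turns the quoted per-1M-token prices into a per-request cost. The function name `api_cost` and its parameters are hypothetical, introduced only for illustration. The post gives only the average output token count per task (43k), not the input or cache-hit counts, so the example computes the output-token share alone and should not be read as a reconstruction of the ~$0.46 per-task figure.

```python
# Prices quoted in the post for the first-party API, in USD per 1M tokens.
PRICE_PER_M = {"input": 1.4, "cache_hit": 0.26, "output": 4.4}


def api_cost(input_tokens=0, cache_hit_tokens=0, output_tokens=0):
    """Return the USD cost of one request at the quoted per-1M-token prices.

    Hypothetical helper: the token counts are whatever the caller measured.
    """
    counts = {
        "input": input_tokens,
        "cache_hit": cache_hit_tokens,
        "output": output_tokens,
    }
    return sum(counts[kind] * PRICE_PER_M[kind] for kind in counts) / 1_000_000


# Output-only cost at the reported average of 43k output tokens per task.
output_only = api_cost(output_tokens=43_000)
print(f"output-only cost per task: ${output_only:.2f}")  # prints $0.19
```

At the quoted output price, 43k output tokens come to about $0.19, well under half of the reported ~$0.46 per task. The rest presumably comes from input and cached tokens, which the post does not break down.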

#ModelRelease #LLM #TechBreakthrough
