AI Featured Updates
Smart score: 80
GLM-5.2 Becomes the New Open-Weights SOTA
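Why AI recommends this
Adds specific benchmark scores and a cost comparison for GLM-5.2. Worth noting are the details of its improvements over GLM-5.1 and its open-source license.
Key takeaways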
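GLM-5.2, released by Zai_org, scored 51 on the Artificial Analysis Intelligence Index, ranking fourth and becoming the new open-weights SOTA model. The model has 744B total parameters and 40B active parameters, a 1M-token context window, and an MIT license, and is priced at $1.4/$0.26/$4.4 per 1M input/cache hit/output tokens. Compared with GLM-5.1, its Intelligence Index score rises by 11 points, with notable gains on scientific reasoning and several other benchmarks.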
Full text
ZAI 🔥: GLM-5.2 by @Zai_org scored 51 points on the Artificial Analysis Intelligence Index and took 4th place! That makes GLM-5.2 the new SOTA open-weight model.
GLM-5.2 also ranked second on Frontend Code Arena, behind the currently unavailable Claude Fable 5.
Should be ZOTA! 👀


> **Quoted post from Artificial Analysis (@ArtificialAnlys):**
> Z ai’s GLM-5.2 is the new leading open weights model on the Artificial Analysis Intelligence Index scoring 51 and it sits on the Pareto frontier of Intelligence vs Cost per Task
> @Zai_org’s GLM-5.2 is the same size as GLM-5.1 (744B total / 40B active parameters) but scores 11 points higher on the Intelligence Index v4.1, placing ahead of MiniMax-M3 (44) and DeepSeek V4 Pro (max, 44). On the first-party API it is priced in line with GLM-5.1 at $1.4/$4.4/$0.26 per 1M input/output/cache hit tokens
> Key results:
> ➤ GLM-5.2 is the leading open weights model on the Intelligence Index v4.1. At 51, it leads MiniMax-M3 (44), DeepSeek V4 Pro (max, 44) and Kimi K2.6 (43)
> ➤ Improvements across most evaluations, particularly scientific reasoning: GLM-5.2 gains over GLM-5.1 on most evaluations, led by scientific reasoning on CritPt (+16 points to 21%) and HLE (+12 points to 40%), alongside AA-LCR (+9 points to 71%), tau3 banking (+15 points to 27%) and SciCode (+7 points to 50%). TerminalBench v2.1 also improves (+16 points to 78%) and GPQA Diamond gains 3 points to 89%
> ➤ Leading open weights model on GDPval-AA v2 and competitive with proprietary models: GLM-5.2 scores 1524 on GDPval-AA v2, ahead of MiniMax-M3 (1418) and DeepSeek V4 Pro (max, 1328). This impressive result places GLM-5.2 in-line with proprietary models including GPT-5.5 (xhigh reasoning). GDPval-AA v2 builds on the original GDPval-AA by baselining Elo to human performance at 1000, introducing a rotating panel of frontier-model judges, and raising the turn limit from 100 to 250 for longer-horizon agent trajectories
> ➤ GLM-5.2 uses more output tokens per task than other leading open weights models: the model uses 43k output tokens per Intelligence Index task, up from GLM-5.1 (26k) and above MiniMax-M3 (24k), Kimi K2.6 (35k) and DeepSeek V4 Pro (max, 37k)
> ➤ On the Intelligence vs. Cost per Task Pareto Frontier: GLM-5.2 is on the Pareto frontier of the Intelligence vs Cost per Task chart, with the lowest cost per task among models at its intelligence level. GLM-5.2 costs ~$0.46 per task, compared to GLM-5.1 ($0.25), Kimi K2.6 ($0.31), MiniMax-M3 ($0.18) and DeepSeek V4 Pro (max, $0.05)
> Additional Model Details:
> ➤ License: MIT
> ➤ Size: 744B total parameters, 40B active parameters, equivalent to GLM-5.1
> ➤ Context window: 1M tokens, up from 200K on GLM-5.1
> ➤ Pricing: $1.4/$0.26/$4.4 per 1M input/cache hit/output tokens
> ➤ Availability: Alongside Z ai's first-party API, GLM-5.2 is available across third-party providers including @DeepInfra, @novita_labs, @nebiusai, @parasailnetwork, @SiliconFlowAI, @gmi_cloud, @Baseten and @FireworksAI_HQ
> https://x.com/ArtificialAnlys/status/2067135640249209175
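
The Pareto-frontier claim in the quoted post can be checked against the figures it gives. The Python sketch below is illustrative only. It uses just the five open-weights models whose score and cost per task appear in the post, whereas the actual Artificial Analysis chart covers more models, including proprietary ones. GLM-5.1's score of 40 is derived as 51 minus 11, and the names `Model`, `dominates`, and `pareto_frontier` are hypothetical helpers, not part of any Artificial Analysis tooling.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    score: int    # Intelligence Index v4.1 score (higher is better)
    cost: float   # USD per Intelligence Index task (lower is better)


# Figures taken from the quoted post; GLM-5.1's score is derived (51 - 11).
MODELS = [
    Model("GLM-5.2", 51, 0.46),
    Model("GLM-5.1", 40, 0.25),
    Model("Kimi K2.6", 43, 0.31),
    Model("MiniMax-M3", 44, 0.18),
    Model("DeepSeek V4 Pro (max)", 44, 0.05),
]


def dominates(a: Model, b: Model) -> bool:
    """True if `a` is at least as good as `b` on both axes and strictly better on one."""
    no_worse = a.score >= b.score and a.cost <= b.cost
    strictly_better = a.score > b.score or a.cost < b.cost
    return no_worse and strictly_better


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Keep only the models that no other model dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


frontier = pareto_frontier(MODELS)
print([m.name for m in frontier])
# → ['GLM-5.2', 'DeepSeek V4 Pro (max)']
```

Within this subset, only GLM-5.2 and DeepSeek V4 Pro (max) are undominated. GLM-5.2 is the most expensive per task, but no listed model matches its score at a lower cost, which is consistent with the post's description of it sitting on the frontier.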