AI Curated Feed · Smart score: 60

GLM-5.2 Shows a Substantial Gain in Mobile Development Capability

Source: Twitter following list

Author: Z.ai (@Zai_org)

Published: 2026-06-19

Indexed: 2026-06-19

Why the AI recommends this

These figures are the first to quantify GLM-5.2's gains on complex mobile development tasks, which makes them worth watching.

Key takeaway

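On Zhipu's internal mobile development benchmark, completed trials rose from 21/70 for GLM-5.1 to 48/70 for GLM-5.2, a more than twofold improvement that brings it close to Claude Fable 5 at 56/70.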

Full text

Z.ai (@Zai_org) reposted a post by Cunxiang Wang (@CunxiangWang):

GLM-5.2 is not only stronger on benchmarks, but also much better in real app development scenarios: iOS, Android, WeChat Mini Programs, and more. Behind this jump is a full loop from environment construction, evaluation, data optimization, reward design, to training. Real tasks, real execution, real improvement.

> **Quoted post by Zixuan Li (@ZixuanLi_):**
>
> GLM-5.2 delivers a substantial leap in app development capabilities, which also represent demanding long-horizon tasks.
>
> Results:
>
> - GLM-5.1: 21/70
> - GLM-5.2: 48/70
> - Claude Fable 5: 56/70
>
> That's more than a twofold improvement from GLM-5.1 to GLM-5.2.
>
> These come from an internal benchmark of 35 challenging mobile development tasks, each run twice for a total of 70 trials. We measured task completion, defined as core features working without major issues.
>
> https://x.com/ZixuanLi_/status/2067803136283005393
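
The reported counts can be checked with a few lines of arithmetic. The sketch below is only an illustration built from the numbers quoted in the post: the variable and function names (`completed`, `completion_rate`, `improvement`, `gap`) are made up for this example and are not part of any Z.ai tooling. It treats all 70 trials as equally weighted, which the post implies but does not state.

```python
from fractions import Fraction

# 35 tasks, each run twice, as described in the quoted post.
TRIALS = 35 * 2

# Completed trials per model, copied from the post (names are illustrative).
completed = {"GLM-5.1": 21, "GLM-5.2": 48, "Claude Fable 5": 56}


def completion_rate(model: str) -> Fraction:
    """Share of the 70 trials a model completed, as an exact fraction."""
    return Fraction(completed[model], TRIALS)


# Ratio of GLM-5.2 completions to GLM-5.1 completions.
improvement = Fraction(completed["GLM-5.2"], completed["GLM-5.1"])

# Remaining gap between GLM-5.2 and Claude Fable 5, in trials.
gap = completed["Claude Fable 5"] - completed["GLM-5.2"]

for model, count in completed.items():
    print(f"{model}: {count}/{TRIALS} = {float(completion_rate(model)):.1%}")
print(f"GLM-5.1 -> GLM-5.2: {float(improvement):.2f}x")
print(f"Gap to Claude Fable 5: {gap} trials")
```

Under these assumptions the completion rates are 30.0%, 68.6%, and 80.0%, the GLM-5.1 to GLM-5.2 ratio is about 2.29x (consistent with "more than a twofold improvement"), and GLM-5.2 trails Claude Fable 5 by 8 trials.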

#ModelRelease #LLM #Benchmark
