AI Featured Updates · Smart score: 60

GLM-5.2 Substantially Improves Mobile Development Capability

Source: Twitter following list
Author: Z.ai (@Zai_org)
Published: 2026-06-19
Collected: 2026-06-19
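Why AI recommends this
These figures are the first to quantify GLM-5.2's gains on complex mobile development tasks, which makes them worth watching.
Key takeaway
On an internal mobile development benchmark, Zhipu's GLM-5.2 completed 48 of 70 trials, up from 21 of 70 for GLM-5.1. That is more than a twofold improvement and brings it close to Claude Fable 5, which completed 56 of 70.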
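The "more than twofold" claim can be checked directly from the three reported counts. The sketch below is only an illustration: the counts come from the quoted post, while the variable and function names are hypothetical and not part of any Z.ai tooling.

```python
from fractions import Fraction

# Completed trials out of 70, as reported in the quoted post.
TRIALS = 70
completed = {"GLM-5.1": 21, "GLM-5.2": 48, "Claude Fable 5": 56}


def completion_rate(done: int, total: int = TRIALS) -> Fraction:
    """Return the exact fraction of trials completed."""
    return Fraction(done, total)


rates = {name: completion_rate(n) for name, n in completed.items()}

# Ratio of GLM-5.2 to GLM-5.1 completion rates: 48/21 reduces to 16/7.
improvement = rates["GLM-5.2"] / rates["GLM-5.1"]

# Remaining gap to Claude Fable 5, in trials.
gap_to_fable = completed["Claude Fable 5"] - completed["GLM-5.2"]

for name, rate in rates.items():
    print(f"{name}: {float(rate):.1%}")
print(f"GLM-5.1 -> GLM-5.2: {float(improvement):.2f}x")
print(f"Gap to Claude Fable 5: {gap_to_fable} trials")
```

These are per-trial rates rather than per-task rates, because the post counts each of the 35 tasks twice.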
Full text
Z.ai (@Zai_org) reposted a post by Cunxiang Wang (@CunxiangWang):

GLM-5.2 is not only stronger on benchmarks, but also much better in real app development scenarios: iOS, Android, WeChat Mini Programs, and more. Behind this jump is a full loop from environment construction, evaluation, data optimization, reward design, to training. Real tasks, real execution, real improvement.

> **Quoted original post by Zixuan Li (@ZixuanLi_):**
> GLM-5.2 delivers a substantial leap in app development capabilities, which also represent demanding long-horizon tasks.
>
> Results:
> - GLM-5.1: 21/70
> - GLM-5.2: 48/70
> - Claude Fable 5: 56/70
>
> That's more than a twofold improvement from GLM-5.1 to GLM-5.2.
>
> These come from an internal benchmark of 35 challenging mobile development tasks, each run twice for a total of 70 trials. We measured task completion, defined as core features working without major issues.
>
> https://x.com/ZixuanLi_/status/2067803136283005393
#ModelRelease #LLM #Benchmark