AI Curated Feed · Smart score: 60

SVG Generation Benchmark Released

Source: Twitter following list

Author: ModelScope (@ModelScope2022)

Published: 2026-06-25

Collected: 2026-06-25

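Why AI recommends this

This benchmark focuses on the ability of LLMs to generate SVG and relies on large-scale human rating, offering a new reference point for evaluating multimodal generation.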

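Key takeaways

Rapidata has released an SVG benchmark on ModelScope that compares 30 frontier LLMs on static SVG generation. The human evaluation covers 188,754 head-to-head comparisons, 500 prompts, and 1,355,161 human responses. Claude Fable 5 Thinking ranks first with an Elo score of 1232.9.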

Full text

Rapidata SVG Benchmark just landed on ModelScope, comparing 30 frontier LLMs on static SVG generation from text prompts, with 1.35M+ human votes across preference, coherence, and prompt alignment. 🚀
🤖 https://t.co/DUNsYVKVHY
📊 Scale: 188,754 head-to-head comparisons, 500 prompts, 14,872 rasterized SVG images, and 1,355,161 human responses
🎨 Evaluation target: raw SVG markup generated by LLMs, rendered to 768x768 PNGs, then ranked by humans instead of automated metrics
🏆 Overall ranking: Claude Fable 5 Thinking leads with 1232.9 ELO, followed by Claude Fable 5 and Gemini 3.1 Pro Preview
License: CC-BY-4.0 for the benchmark prompts, with generated outputs governed by each model provider's terms.
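The ranking is reported as Elo scores derived from head-to-head human votes. The post does not say how Rapidata computes them, so the following is only a minimal sketch of a textbook sequential Elo update, not the benchmark's actual method. The starting rating of 1000, the K-factor of 32, and the model names are all assumptions made for illustration.

```python
from collections import defaultdict


def expected_score(r_a, r_b):
    """Probability that a player rated r_a beats one rated r_b (standard Elo curve)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(ratings, winner, loser, k=32.0):
    """Apply one comparison result in place; k=32 is an assumed K-factor."""
    gain = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += gain
    ratings[loser] -= gain


def rank(comparisons, start=1000.0, k=32.0):
    """Turn (winner, loser) pairs into a list of (name, rating), best first."""
    ratings = defaultdict(lambda: start)  # assumed starting rating for every model
    for winner, loser in comparisons:
        update_elo(ratings, winner, loser, k)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes: each tuple is (preferred model, other model).
votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
]
leaderboard = rank(votes)
```

Sequential updates like this depend on the order of the votes. Leaderboards built from a fixed set of comparisons often fit all ratings jointly instead (for example with a Bradley-Terry model), and the post does not say which approach is used here.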

#Benchmark #Technology #ModelRelease
