Fugu-MT Paper Translation (Abstract): VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing

Paper overview: VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing

arxiv url: http://arxiv.org/abs/2603.29852v1
Date: Sun, 22 Feb 2026 10:39:14 GMT
Status: Translation complete
System update date: 2026-04-06 02:36:13.155309
Title: VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing
Authors: Juan Rodriguez, Haotian Zhang, Abhay Puri, Tianyang Zhang, Rishav Pramanik, Meng Lin, Xiaoqing Xie, Marco Terral, Darsh Kaushik, Aly Shariff, Perouz Taslakian, Spandana Gella, Sai Rajeswar, David Vazquez, Christopher Pal, Marco Pedersoli
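Abstract summary: We introduce VectorGym, a comprehensive benchmark suite for Scalable Vector Graphics (SVG). VectorGym addresses the lack of realistic, challenging benchmarks aligned with professional design workflows. Our evaluation reveals significant performance gaps, positioning VectorGym as a rigorous framework for visual code generation.
Reference score (independently computed attention score): 28.04909245044009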
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce VectorGym, a comprehensive benchmark suite for Scalable Vector Graphics (SVG) that spans generation from text and sketches, complex editing, and visual understanding. VectorGym addresses the lack of realistic, challenging benchmarks aligned with professional design workflows. Our benchmark comprises four tasks with expert human-authored annotations: the novel Sketch2SVG task (VG-Sketch); a new SVG editing dataset (VG-Edit) featuring complex, multi-step edits with higher-order primitives; Text2SVG generation (VG-Text); and SVG captioning (VG-Cap). Unlike prior benchmarks that rely on synthetic edits, VectorGym provides gold-standard human annotations that require semantic understanding and design intent. We also propose a multi-task reinforcement learning approach that jointly optimizes across all four tasks using rendering-based rewards. Our method, built on GRPO with curriculum learning, trains a Qwen3-VL 8B model that achieves state-of-the-art performance among open-source models, surpassing much larger models including Qwen3-VL 235B and matching GPT-4o. We also introduce a VLM-as-a-Judge metric for SVG generation, validated through human correlation studies. Our evaluation of frontier VLMs reveals significant performance gaps, positioning VectorGym as a rigorous framework for advancing visual code generation. VectorGym is publicly available on huggingface.co/datasets/ServiceNow/VectorGym.
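The abstract says the training method is built on GRPO with rendering-based rewards but gives no implementation details. As a rough illustration only, the group-relative advantage step commonly associated with GRPO can be sketched as follows. The reward values are hypothetical placeholders (for example, similarity scores between a rendered candidate SVG and a target image), and the use of the population standard deviation and the `eps` constant are assumptions, not the authors' stated choices.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Every candidate generated for the same prompt is scored, and each
    score is centered and scaled by the group's mean and standard
    deviation. eps (an assumed constant) avoids division by zero when
    all rewards in the group are identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rendering-based rewards for four SVG candidates
# sampled for a single prompt (values invented for illustration).
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
```

Candidates that score above their group's mean receive a positive advantage and those below receive a negative one, so no separate learned value function is needed to provide a baseline.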


Related papers