Fugu-MT Paper Translation (Abstract): GeoSVG-RL: Geometry-Aware Reinforcement Learning for Layout-Constrained Text-to-SVG Diagram Generation

Paper overview: GeoSVG-RL: Geometry-Aware Reinforcement Learning for Layout-Constrained Text-to-SVG Diagram Generation

arxiv url: http://arxiv.org/abs/2605.25447v1
Date: Mon, 25 May 2026 05:56:44 GMT
Status: Translation complete
Last updated in system: 2026-05-26 19:50:19.335335
Title: GeoSVG-RL: Geometry-Aware Reinforcement Learning for Layout-Constrained Text-to-SVG Diagram Generation
Authors: Sifan Li, Yujun Cai, Hongkai Chen, Yiwei Wang
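Abstract summary: We introduce GeoSVG-RL, a specialized reinforcement learning framework for layout-constrained text-to-SVG generation. The model first produces a structured layout plan that serves as a geometric contract for the subsequent generation of the SVG code. GeoSVG-RL substantially improves structural reliability, particularly in arrow-anchor accuracy and text-in-box rates.
Reference score (independently computed attention level): 29.64540884592851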
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generating structured, editable diagrams remains a significant challenge for contemporary large language models, despite their proficiency in general-purpose vector code generation. The primary difficulty lies in the structural fragility of the output; minor errors such as misaligned connector endpoints, text labels overlapping borders, or complex layouts drifting beyond the canvas boundaries render the resulting SVG files functionally unusable for professional applications. To address these issues, we introduce GeoSVG-RL, a specialized reinforcement learning framework designed for layout-constrained text-to-SVG generation. Unlike standard training objectives that rely solely on maximizing token-level likelihood, our approach optimizes the policy against explicit, executable geometric feedback. The model first produces a structured layout plan that serves as a geometric contract for the subsequent generation of the SVG code. This code is then rendered through a browser-backed verifier, enabling the calculation of fine-grained rewards across six critical dimensions: rendering validity, canvas fitting, precise anchor placement, text containment, graph consistency, and code cleanliness. We utilize Group Relative Policy Optimization (GRPO) to refine the model, sampling multiple candidates per prompt to facilitate updates based on relative quality. Starting from a supervised warm-start phase on synthetic data, GeoSVG-RL achieves substantial gains in structural reliability, particularly in arrow-anchor accuracy and text-in-box rates. Quantitative evaluations demonstrate that our method consistently outperforms current state-of-the-art systems in local geometric precision and the preservation of graph connectivity, providing a robust pathway toward automated yet reliable technical illustration.
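
The abstract describes two mechanisms that a small example may make more concrete: a geometric reward such as the text-in-box rate, and GRPO's use of several candidates per prompt to score each one relative to its group. The following is a minimal sketch under stated assumptions, not the authors' implementation. The `Box` type, the function names, and the use of axis-aligned bounding boxes are all hypothetical; the paper obtains geometry from a browser-backed verifier, which is not reproduced here. The advantage follows the commonly used GRPO form (reward minus group mean, divided by group standard deviation); using the population standard deviation and the `eps` constant are assumptions.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned bounding box in canvas coordinates."""
    x: float
    y: float
    width: float
    height: float

    def contains(self, other: "Box") -> bool:
        # True when `other` lies entirely inside this box (edges may touch).
        return (
            self.x <= other.x
            and self.y <= other.y
            and other.x + other.width <= self.x + self.width
            and other.y + other.height <= self.y + self.height
        )


def text_in_box_rate(pairs):
    """Fraction of (container, label) pairs whose label fits in its container.

    An empty diagram is treated as fully compliant (an assumption).
    """
    if not pairs:
        return 1.0
    return sum(container.contains(label) for container, label in pairs) / len(pairs)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of one prompt's candidates against their group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    node = Box(0, 0, 10, 10)
    pairs = [(node, Box(2, 2, 5, 5)), (node, Box(8, 8, 5, 5))]
    print(text_in_box_rate(pairs))
    print(group_relative_advantages([1.0, 0.0]))
```

In a full pipeline, a per-candidate reward would presumably combine all six dimensions listed in the abstract before normalization; how they are weighted is not stated in the abstract, so the sketch covers only one of them.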


Related papers