Fugu-MT Paper Translation (Abstract): DiagramRAG: A Lightweight Framework to Retrieve Scientific Diagram for Figure Generation

Paper overview: DiagramRAG: A Lightweight Framework to Retrieve Scientific Diagram for Figure Generation

arxiv url: http://arxiv.org/abs/2605.27931v1
Date: Wed, 27 May 2026 04:03:22 GMT
Status: Translation complete
Last updated in system: 2026-05-28 17:38:55.737465
Title: DiagramRAG: A Lightweight Framework to Retrieve Scientific Diagram for Figure Generation
Authors: Xinjiang Yu, Junyi Han, Zhuofan Chen, Chi Zhang, Xiangyu Fu, Jingyuan Tan, Zirui You, Yixiang Jian, Yu-Ping Wang, Chengliang Chai
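Abstract summary: We introduce DiagramRAG, a lightweight retrieval-augmented framework for sketch-based scientific diagram completion. Given a user sketch, DiagramRAG retrieves reference diagrams that are semantically relevant to the sketch content and topologically compatible with its structure. Experiments show that DiagramRAG achieves F1-scores of 0.848 and 0.802 on DiagramBank and FigureBench, respectively, and improves generation quality with a VLM-as-a-Judge score of 7.170.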
Reference score (independently computed attention level): 9.701968199439387
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Scientific diagrams are essential for communicating complex methodologies in academic papers. A natural way for researchers to specify such diagrams is through rough sketches, where text labels, connectors, and spatial arrangements express early semantic and topological intentions. However, sketches are usually incomplete, making them insufficient for directly producing publication-quality diagrams. Existing sketch-based generation methods mainly reconstruct the sketch itself, while recent text-driven diagram generation frameworks rely on textual semantics and do not fully exploit the topological structure contained in sketches. In this paper, we introduce DiagramRAG, a lightweight retrieval-augmented framework for sketch-based scientific diagram completion. Given a user sketch, DiagramRAG retrieves reference diagrams that are both semantically relevant to the sketch content and topologically compatible with its structure, and uses them to guide downstream diagram generation. To enable efficient structure-aware retrieval, we represent diagrams as knowledge graphs, synthesize sketch variants at different simplification levels, and train an embedding model to align sketches with compatible diagrams in a shared space. The retrieved references further provide content, topology, and visual priors for completing and rendering the final diagram. Experiments show that DiagramRAG achieves F1-scores of 0.848 and 0.802 on DiagramBank and FigureBench, respectively, and improves generation quality with the best VLM-as-a-Judge score of 7.170, while reducing inference latency to 35.48 seconds per sample. Our code and data are available at https://anonymous.4open.science/r/DiagramRAG-A262 and https://huggingface.co/datasets/anonymous-review-a262/DiagramSketch.
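The retrieval step described in the abstract (matching a sketch to compatible diagrams in a shared embedding space) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `retrieve_top_k`, the diagram IDs, and the three-dimensional vectors are all hypothetical, and cosine similarity is an assumed scoring rule that the abstract does not specify. In the paper the embeddings would come from a trained model; here they are hand-written toy values.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(sketch_vec, diagram_index, k=2):
    """Return the k (diagram_id, score) pairs most similar to the sketch embedding."""
    scored = [(cosine(sketch_vec, vec), diagram_id)
              for diagram_id, vec in diagram_index.items()]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(diagram_id, score) for score, diagram_id in scored[:k]]


# Hypothetical toy index: diagram ID -> embedding (assumed values, not real data).
index = {
    "pipeline_a": [0.9, 0.1, 0.0],
    "pipeline_b": [0.7, 0.6, 0.1],
    "tree_c": [0.0, 0.2, 0.9],
}
sketch = [1.0, 0.2, 0.0]  # assumed embedding of a user sketch

results = retrieve_top_k(sketch, index, k=2)
for diagram_id, score in results:
    print(f"{diagram_id}: {score:.3f}")
```

With these toy vectors the two pipeline-like diagrams rank above the tree-like one, which is the behavior a structure-aware embedding is meant to produce. A real system would likely swap the linear scan for an approximate nearest-neighbor index once the diagram collection grows.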
