Fugu-MT Paper Translation (Abstract): Automatic Slide Updating with User-Defined Dynamic Templates and Natural Language Instructions

Paper overview: Automatic Slide Updating with User-Defined Dynamic Templates and Natural Language Instructions

arxiv url: http://arxiv.org/abs/2604.17894v1
Date: Mon, 20 Apr 2026 07:11:24 GMT
Status: Translation complete
Last updated in system: 2026-04-21 21:52:52.743618
Title: Automatic Slide Updating with User-Defined Dynamic Templates and Natural Language Instructions
Title (reference translation): Automatic slide updating with user-defined dynamic templates and natural language instructions
Authors: Kun Zhou, Jiakai He, Wenmian Yang, Zhensheng Wang, Yiquan Zhang, Weijia Jia
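Abstract summary: Existing automation methods mostly follow fixed template filling and cannot support dynamic updates for diverse, user-authored slide decks. We introduce DynaSlide, a large-scale benchmark with 20,036 real-world instruction-execution triples (source slide, user instruction, target slide) grounded in a shared external database. SlideAgent is an agent-based framework that combines multimodal slide parsing, natural language instruction grounding, and tool-augmented reasoning for tables, charts, and textual conclusions.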
Reference score (independently computed attention score): 22.596430902964272
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Presentation slides are a primary medium for data-driven reporting, yet keeping complex, analytics-style decks up to date remains labor-intensive. Existing automation methods mostly follow fixed template filling and cannot support dynamic updates for diverse, user-authored slide decks. We therefore define "Dynamic Slide Update via Natural Language Instructions on User-provided Templates" and introduce DynaSlide, a large-scale benchmark with 20,036 real-world instruction-execution triples (source slide, user instruction, target slide) grounded in a shared external database and built from business reporting slides under bring-your-own-template (BYO-template) conditions. To tackle this task, we propose SlideAgent, an agent-based framework that combines multimodal slide parsing, natural language instruction grounding, and tool-augmented reasoning for tables, charts, and textual conclusions. SlideAgent updates content while preserving layout and style, providing a strong reference baseline on DynaSlide. We further design end-to-end and component-level evaluation protocols that reveal key challenges and opportunities for future research. The dataset and code are available at https://github.com/XiaoZhou2024/SlideAgent.
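Abstract (reference translation): Presentation slides are a primary medium for data-driven reporting, but keeping complex, analytics-style decks up to date is labor-intensive. Existing automation methods mostly follow fixed template filling and cannot support dynamic updates for diverse, user-authored slide decks. We therefore define "Dynamic Slide Update via Natural Language Instructions on User-provided Templates" and introduce DynaSlide, a large-scale benchmark with 20,036 real-world instruction-execution triples (source slide, user instruction, target slide). To tackle this task, we propose SlideAgent, an agent-based framework that combines multimodal slide parsing, natural language instruction grounding, and tool-augmented reasoning for tables, charts, and textual conclusions. SlideAgent updates content while preserving layout and style, providing a strong reference baseline on DynaSlide. We further design end-to-end and component-level evaluation protocols to reveal key challenges and opportunities for future research. The dataset and code are available at https://github.com/XiaoZhou2024/SlideAgent.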

Related papers