Fugu-MT Paper Translation (Abstract): UniPPTBench: A Unified Benchmark for Presentation Generation Across Diverse Input Settings

Paper overview: UniPPTBench: A Unified Benchmark for Presentation Generation Across Diverse Input Settings

arxiv url: http://arxiv.org/abs/2605.17356v1
Date: Sun, 17 May 2026 09:50:16 GMT
Status: Translation complete
System update date: 2026-05-19 17:57:47.917642
Title: UniPPTBench: A Unified Benchmark for Presentation Generation Across Diverse Input Settings
Title (reference translation): UniPPTBench: A Unified Benchmark for Presentation Generation Across Multiple Input Settings
Authors: Bo Zhao, Maosheng Pang, Chen Zhang, Huan Yang, Yixin Cao, Wei Ji
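Abstract summary: Existing work typically focuses on presentation generation under isolated input settings. Real-world use cases span diverse scenarios, including vague user prompts, long documents, multimodal materials, and multiple heterogeneous sources. The proposed UniPPTBench is a unified benchmark for presentation generation across four representative input settings.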
Reference score (independently calculated attention level): 23.076274859522883
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing works typically focus on presentation generation under isolated input settings, whereas real-world use cases span diverse scenarios, including vague user prompts, long documents, multimodal materials, and multiple heterogeneous sources. Moreover, current evaluations are often insufficiently scenario-specific. They mainly rely on generic presentation-quality criteria, such as visual appeal, layout quality, and overall coherence, but fail to assess the core capabilities required by different input settings, including grounded compression, visual-text alignment, and cross-source synthesis. Consequently, the field lacks a unified benchmark and a scenario-aware evaluation framework for faithfully diagnosing presentation-generation systems across diverse real-world settings. We present UniPPTBench, a unified benchmark for presentation generation across four representative input settings: vague-prompt, long-document, multimodal-document, and multi-source generation. We further introduce UniPPTEval, a scenario-aware evaluation protocol that combines shared metrics for cross-setting comparison with scenario-specific metrics tailored to the core requirements of each setting. We also provide transparent reference baselines to support reproducible comparison. Experiments on UniPPTBench reveal substantial performance variation across settings and recurring failure modes in content grounding, multimodal integration, and cross-source synthesis. In particular, strong performance on generic presentation-quality metrics does not necessarily imply strong task fulfillment in grounded scenarios. Together, UniPPTBench and UniPPTEval provide a faithful and diagnostic foundation for evaluating presentation generation across diverse real-world scenarios. Code and data will be publicly available.
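Abstract (reference translation): Existing work typically focuses on presentation generation under isolated input settings, whereas real-world use cases span diverse scenarios, including vague user prompts, long documents, multimodal materials, and multiple heterogeneous sources. Moreover, current evaluations are often insufficiently scenario-specific. They rely mainly on generic presentation-quality criteria such as visual appeal, layout quality, and overall coherence, but fail to assess the core capabilities required by different input settings, including grounded compression, visual-text alignment, and cross-source synthesis. As a result, the field lacks a unified benchmark and a scenario-aware evaluation framework for faithfully diagnosing presentation-generation systems across diverse real-world settings. The proposed UniPPTBench is a unified benchmark for presentation generation across four representative input settings: vague-prompt, long-document, multimodal-document, and multi-source generation. The authors further introduce UniPPTEval, a scenario-aware evaluation protocol that combines shared metrics for cross-setting comparison with scenario-specific metrics tailored to the core requirements of each setting. Transparent reference baselines are also provided to support reproducible comparison. Experiments on UniPPTBench reveal substantial performance variation across settings and recurring failure modes in content grounding, multimodal integration, and cross-source synthesis. In particular, strong performance on generic presentation-quality metrics does not necessarily imply strong task fulfillment in grounded scenarios. Together, UniPPTBench and UniPPTEval provide a faithful and diagnostic foundation for evaluating presentation generation across diverse real-world scenarios. Code and data will be made publicly available.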


Related papers