Fugu-MT Paper Translation (Abstract): PosterIQ: A Design Perspective Benchmark for Poster Understanding and Generation

Paper overview: PosterIQ: A Design Perspective Benchmark for Poster Understanding and Generation

arxiv url: http://arxiv.org/abs/2603.24078v1
Date: Wed, 25 Mar 2026 08:33:51 GMT
Status: Translation complete
Last updated in system: 2026-03-26 21:06:11.211242
Title: PosterIQ: A Design Perspective Benchmark for Poster Understanding and Generation
Title (reference translation): PosterIQ: A Design-Perspective Benchmark for Poster Understanding and Generation
Authors: Yuheng Feng, Wen Zhang, Haodong Duan, Xingxing Zou
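Abstract summary: PosterIQ is a design-driven benchmark for poster understanding and generation. It includes 7,765 image-annotation instances and 822 generation prompts spanning real, professional, and synthetic cases.
Reference score (independently computed attention level): 27.097615059097322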
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present PosterIQ, a design-driven benchmark for poster understanding and generation, annotated across composition structure, typographic hierarchy, and semantic intent. It includes 7,765 image-annotation instances and 822 generation prompts spanning real, professional, and synthetic cases. To bridge visual design cognition and generative modeling, we define tasks for layout parsing, text-image correspondence, typography/readability and font perception, design quality assessment, and controllable, composition-aware generation with metaphor. We evaluate state-of-the-art MLLMs and diffusion-based generators, finding persistent gaps in visual hierarchy, typographic semantics, saliency control, and intention communication; commercial models lead on high-level reasoning but act as insensitive automatic raters, while generators render text well yet struggle with composition-aware synthesis. Extensive analyses show PosterIQ is both a quantitative benchmark and a diagnostic tool for design reasoning, offering reproducible, task-specific metrics. We aim to catalyze models' creativity and integrate human-centred design principles into generative vision-language systems.
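Abstract (reference translation): We propose PosterIQ, a design-driven benchmark for poster understanding and generation, annotated across composition structure, typographic hierarchy, and semantic intent. It includes 7,765 image-annotation instances and 822 generation prompts spanning real, professional, and synthetic cases. To bridge visual design cognition and generative modeling, we define tasks for layout parsing, text-image correspondence, typography/readability, font perception, design quality assessment, and controllable, composition-aware generation with metaphor. We evaluate state-of-the-art MLLMs and diffusion-based generators and find persistent gaps in visual hierarchy, typographic semantics, saliency control, and intention communication. Extensive analyses show that PosterIQ is both a quantitative benchmark and a diagnostic tool for design reasoning, offering reproducible, task-specific metrics. We aim to catalyze models' creativity and integrate human-centered design principles into generative vision-language systems.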


Related papers