Fugu-MT Paper Translation (Abstract): BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation

Paper overview: BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation

arxiv url: http://arxiv.org/abs/2603.25732v1
Date: Thu, 26 Mar 2026 17:59:16 GMT
Status: Translation complete
Last updated in system: 2026-03-27 20:52:48.426623
Title: BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation
Title (reference translation): BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation
Authors: Yan Li, Zezi Zeng, Ziwei Zhou, Xin Gao, Muzhao Tian, Yifan Yang, Mingxi Cheng, Qi Dai, Yuqing Yang, Lili Qiu, Zhendong Wang, Zhengyuan Yang, Xue Yang, Lijuan Wang, Ji Li, Chong Luo
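Abstract summary: BizGenEval is a systematic benchmark for commercial visual content generation. It contains 400 carefully curated prompts and 8000 human-verified checklist questions. The results reveal a substantial capability gap between current generative models and the requirements of professional visual content creation.
Reference score (attention score computed independently by the site): 96.52958279106777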
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in image generation models have expanded their applications beyond aesthetic imagery toward practical visual content creation. However, existing benchmarks mainly focus on natural image synthesis and fail to systematically evaluate models under the structured and multi-constraint requirements of real-world commercial design tasks. In this work, we introduce BizGenEval, a systematic benchmark for commercial visual content generation. The benchmark spans five representative document types: slides, charts, webpages, posters, and scientific figures, and evaluates four key capability dimensions: text rendering, layout control, attribute binding, and knowledge-based reasoning, forming 20 diverse evaluation tasks. BizGenEval contains 400 carefully curated prompts and 8000 human-verified checklist questions to rigorously assess whether generated images satisfy complex visual and semantic constraints. We conduct large-scale benchmarking on 26 popular image generation systems, including state-of-the-art commercial APIs and leading open-source models. The results reveal substantial capability gaps between current generative models and the requirements of professional visual content creation. We hope BizGenEval serves as a standardized benchmark for real-world commercial visual content generation.
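Abstract (reference translation): Recent advances in image generation models have extended their applications beyond aesthetic imagery to practical visual content creation. However, existing benchmarks focus mainly on natural image synthesis and cannot systematically evaluate models under the structured, multi-constraint requirements of real-world commercial design tasks. This paper introduces BizGenEval, a systematic benchmark for commercial visual content generation. The benchmark covers five representative document types (slides, charts, webpages, posters, and scientific figures) and evaluates four key capability dimensions (text rendering, layout control, attribute binding, and knowledge-based reasoning), forming 20 diverse evaluation tasks. BizGenEval contains 400 carefully curated prompts and 8000 human-verified checklist questions that rigorously assess whether generated images satisfy complex visual and semantic constraints. Large-scale benchmarking is conducted on 26 popular image generation systems, including state-of-the-art commercial APIs and leading open-source models. The results reveal a substantial capability gap between current generative models and the requirements of professional visual content creation. The authors hope that BizGenEval will serve as a standard benchmark for real-world commercial visual content generation.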
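
The counts stated in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the document types, capability dimensions, and totals come from the abstract, while the variable names are hypothetical and the even split of prompts across tasks is an assumption that the abstract does not confirm.

```python
from itertools import product

# Taken from the abstract.
DOCUMENT_TYPES = ["slides", "charts", "webpages", "posters", "scientific figures"]
CAPABILITIES = [
    "text rendering",
    "layout control",
    "attribute binding",
    "knowledge-based reasoning",
]
TOTAL_PROMPTS = 400
TOTAL_QUESTIONS = 8000

# Each (document type, capability) pair is one evaluation task.
tasks = list(product(DOCUMENT_TYPES, CAPABILITIES))

# Average follows directly from the stated totals.
avg_questions_per_prompt = TOTAL_QUESTIONS / TOTAL_PROMPTS

# ASSUMPTION: prompts are spread evenly over tasks; the abstract does not say so.
prompts_per_task_if_even = TOTAL_PROMPTS // len(tasks)

print(f"tasks: {len(tasks)}")
print(f"average checklist questions per prompt: {avg_questions_per_prompt}")
print(f"prompts per task (if evenly split): {prompts_per_task_if_even}")
```

The task count (5 x 4 = 20) matches the "20 diverse evaluation tasks" in the abstract. The per-prompt figure is an average; the real number of checklist questions per prompt may vary.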
