Fugu-MT Paper Translation (Abstract): TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design

Paper overview: TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design

arxiv url: http://arxiv.org/abs/2605.20731v1
Date: Wed, 20 May 2026 05:27:05 GMT
Status: Translation complete
Last updated in system: 2026-05-21 19:19:56.496499
Title: TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design
Title (reference translation): TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design
Authors: Haonan Zhu, Elad Hirsch, Alexandria Minetti, Allison Nulty, Purvanshi Mehta,
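Abstract summary: TASTE (Typography, Aesthetics, Spatial, Tone, Etc.): ten professional designers ranked the outputs of four current text-to-image models on nine criteria. TASTE places designer agreement on graphic design between food and movie preferences and photo-style image quality. No pre-trained system in the benchmark, including six open-weight VLM judges from 3B to 33B parameters, exceeds 0.55 macro agreement with the 5-designer majority.
Reference score (independently computed attention score): 43.31865418601155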
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Text-to-image models produce graphic design at production scale, but their supervision comes from photo-style preference data with a single overall verdict per comparison. Designers evaluate along several distinct axes, including typography, visual hierarchy, color harmony, layout, and brief fidelity, and a single label collapses them. We release TASTE (Typography, Aesthetics, Spatial, Tone, Etc.): ten professional designers ranked outputs from four current text-to-image models on nine criteria across two disjoint cohorts, yielding 1,600 ratings per criterion plus per-image hallucination flags on the holistic-preference cohorts. We pair the dataset with three contributions. First, a criterion-agnostic signal test framework, using Kendall's tau, majority probability, and Condorcet cycles against exact iid-uniform nulls at p = 4 and R = 5, places designer agreement on graphic design between food and movie preferences and photo-style image quality, with every TASTE criterion rejecting the random-rater null. Second, no pre-trained system in our benchmark, including six open-weight VLM judges from 3B to 33B parameters and three dedicated T2I scorers, HPSv2.1, PickScore-v1, and LAION-Aesthetic-V2, exceeds 0.55 macro agreement with the 5-designer majority; VLM judges trade off position bias against content sensitivity, so scaling moves along this frontier without improving accuracy. Third, a small pairwise-difference head trained on TASTE reaches 0.611, closing roughly half the gap to the 0.741 single-rater ceiling.
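Abstract (reference translation): Text-to-image models generate graphic design at production scale, but their supervision comes from photo-style preference data. Designers evaluate along several distinct axes, including typography, visual hierarchy, color harmony, layout, and brief fidelity, and a single label collapses them. TASTE (Typography, Aesthetics, Spatial, Tone, Etc.): ten professional designers ranked the outputs of four current text-to-image models on nine criteria across two disjoint cohorts, yielding 1,600 ratings per criterion plus per-image hallucination flags on the holistic-preference cohorts. The dataset comes with three contributions. First, a criterion-agnostic signal test framework, using Kendall's tau, majority probability, and Condorcet cycles against exact iid-uniform nulls at p = 4 and R = 5, places designer agreement on graphic design between food and movie preferences and photo-style image quality, and every TASTE criterion rejects the random-rater null. Second, no pre-trained system, including six open-weight VLM judges from 3B to 33B parameters and three dedicated T2I scorers (HPSv2.1, PickScore-v1, and LAION-Aesthetic-V2), exceeds 0.55 macro agreement with the 5-designer majority. Third, a small pairwise-difference head trained on TASTE reaches 0.611, closing roughly half the gap to the 0.741 single-rater ceiling.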
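
The agreement statistics named in the abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes that p = 4 is the number of items ranked (the four models) and R = 5 the number of raters per panel, that rankings have no ties, and that the test statistic is the mean pairwise Kendall's tau. All function names are hypothetical.

```python
from itertools import combinations, permutations, product


def kendall_tau(a, b):
    """Kendall's tau between two tie-free rankings (a[i] = rank of item i)."""
    pairs = list(combinations(range(len(a)), 2))
    concordant = sum((a[i] - a[j]) * (b[i] - b[j]) > 0 for i, j in pairs)
    # Without ties, discordant = len(pairs) - concordant.
    return (2 * concordant - len(pairs)) / len(pairs)


def mean_pairwise_tau(rankings):
    """Average Kendall's tau over all pairs of raters."""
    taus = [kendall_tau(a, b) for a, b in combinations(rankings, 2)]
    return sum(taus) / len(taus)


def has_condorcet_cycle(rankings):
    """True if the majority-vote tournament is not transitive.

    Assumes an odd number of raters, so no pairwise majority is tied.
    A tournament on p items is transitive exactly when its win counts
    are 0, 1, ..., p - 1.
    """
    p = len(rankings[0])
    wins = [0] * p
    for i, j in combinations(range(p), 2):
        i_preferred = sum(r[i] < r[j] for r in rankings)  # lower rank = better
        if 2 * i_preferred > len(rankings):
            wins[i] += 1
        else:
            wins[j] += 1
    return sorted(wins) != list(range(p))


def exact_null_pvalue(observed, p=4, R=5):
    """P(mean pairwise tau >= observed) when R raters rank p items iid uniformly.

    The statistic is unchanged by relabeling items, so the first rater is
    fixed to the identity ranking and only the other R - 1 are enumerated.
    """
    perms = list(permutations(range(p)))
    tau = {(a, b): kendall_tau(a, b) for a in perms for b in perms}
    n_pairs = R * (R - 1) // 2
    hits = total = 0
    for rest in product(perms, repeat=R - 1):
        panel = (perms[0],) + rest
        stat = sum(tau[a, b] for a, b in combinations(panel, 2)) / n_pairs
        hits += stat >= observed - 1e-12  # tolerance for float rounding
        total += 1
    return hits / total
```

With the default p = 4 and R = 5, `exact_null_pvalue` enumerates 24^4 = 331,776 panels, which is small enough to compute exactly rather than by sampling. A criterion "rejects the random-rater null" in this sketch when the p-value of its observed mean tau falls below the chosen significance level.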

Related papers