Fugu-MT Paper Translation (Abstract): Correlation-Weighted Multi-Reward Optimization for Compositional Generation

Paper overview: Correlation-Weighted Multi-Reward Optimization for Compositional Generation

arxiv url: http://arxiv.org/abs/2603.18528v1
Date: Thu, 19 Mar 2026 06:19:18 GMT
Status: Translation complete
Last updated in system: 2026-03-20 17:19:05.979615
Title: Correlation-Weighted Multi-Reward Optimization for Compositional Generation
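Title (reference translation): Compositional Generation via Correlation-Weighted Multi-Reward Optimization
Authors: Jungmyung Wi, Hyunsoo Kim, Donghyun Kim
Abstract summary: Our framework balances competing reward signals and emphasizes concepts that are partially satisfied yet inconsistently generated across samples. We apply our approach to train state-of-the-art diffusion models, SD3.5 and FLUX.1-dev, and demonstrate consistent improvements on challenging multi-concept benchmarks.
Reference score (independently computed attention score): 5.347765461028618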
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Text-to-image models produce images that align well with natural language prompts, but compositional generation has long been a central challenge. Models often struggle to satisfy multiple concepts within a single prompt, frequently omitting some concepts and achieving only partial success. Such failures highlight the difficulty of jointly optimizing multiple concepts during reward optimization, where competing concepts can interfere with one another. To address this limitation, we propose Correlation-Weighted Multi-Reward Optimization, a framework that leverages the correlation structure among concept rewards to adaptively weight each attribute concept in optimization. By accounting for interactions among concepts, our method balances competing reward signals and emphasizes concepts that are partially satisfied yet inconsistently generated across samples, improving compositional generation. Specifically, we decompose multi-concept prompts into pre-defined concept groups (e.g., objects, attributes, and relations) and obtain reward signals from dedicated reward models for each concept. We then adaptively reweight these rewards, assigning higher weights to conflicting or hard-to-satisfy concepts using correlation-based difficulty estimation. By focusing optimization on the most challenging concepts within each group, our method encourages the model to consistently satisfy all requested attributes simultaneously. We apply our approach to train state-of-the-art diffusion models, SD3.5 and FLUX.1-dev, and demonstrate consistent improvements on challenging multi-concept benchmarks, including ConceptMix, GenEval 2, and T2I-CompBench.
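Abstract (reference translation): Text-to-image models generate images that match natural language prompts well, but compositional generation has long been a central challenge. Models often struggle to satisfy multiple concepts within a single prompt, frequently omitting some of them and succeeding only partially. These failures highlight how difficult it is to jointly optimize multiple concepts during reward optimization, where competing concepts interfere with one another. To address this limitation, we propose Correlation-Weighted Multi-Reward Optimization, a framework that uses the correlation structure among concept rewards to adaptively weight each attribute concept. By taking interactions among concepts into account, our method balances competing reward signals, emphasizes concepts that are partially satisfied yet inconsistent across samples, and improves compositional generation. Specifically, we decompose multi-concept prompts into pre-defined concept groups (objects, attributes, and relations) and obtain reward signals from a dedicated reward model for each concept. We then adaptively reweight these rewards using correlation-based difficulty estimation, assigning higher weights to conflicting or hard-to-satisfy concepts. By concentrating optimization on the most difficult concepts within each group, our method encourages the model to satisfy all requested attributes simultaneously. We apply our approach to train state-of-the-art diffusion models, SD3.5 and FLUX.1-dev, and show consistent improvements on multi-concept benchmarks including ConceptMix, GenEval 2, and T2I-CompBench.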
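
Illustrative sketch (not from the paper): the abstract describes reweighting per-concept rewards by a correlation-based difficulty estimate but does not give the formula. The Python below is a minimal sketch under assumed definitions. The difficulty score (reward standard deviation times one minus the mean correlation with the other concepts), the normalization, and the function names `pearson`, `correlation_weights`, and `weighted_rewards` are all hypothetical choices made for illustration, not the authors' method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlation_weights(rewards):
    """Compute one weight per concept from a samples-by-concepts reward table.

    ASSUMPTION: difficulty = std * (1 - mean correlation with other concepts).
    This is a stand-in for the paper's unspecified difficulty estimate.
    """
    n_concepts = len(rewards[0])
    cols = [[row[k] for row in rewards] for k in range(n_concepts)]
    difficulty = []
    for k, col in enumerate(cols):
        mean = sum(col) / len(col)
        std = sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        others = [pearson(col, cols[j]) for j in range(n_concepts) if j != k]
        avg_corr = sum(others) / len(others) if others else 0.0
        # High std: inconsistently satisfied. Low correlation: conflicts with others.
        difficulty.append(std * (1.0 - avg_corr))
    total = sum(difficulty)
    if total == 0:
        # Every concept is constant across samples: fall back to uniform weights.
        return [1.0 / n_concepts] * n_concepts
    return [d / total for d in difficulty]


def weighted_rewards(rewards, weights):
    """Collapse each sample's per-concept rewards into one scalar reward."""
    return [sum(w * r for w, r in zip(weights, row)) for row in rewards]


# Four sampled images for one prompt, three concepts (binary rewards).
# Concept 0 is always satisfied; concepts 1 and 2 are never satisfied together.
rewards = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
]
weights = correlation_weights(rewards)
scalars = weighted_rewards(rewards, weights)
```

In this toy input the always-satisfied concept receives zero weight, while the two mutually conflicting concepts share the weight equally, which mirrors the abstract's idea of focusing optimization on conflicting or inconsistently satisfied concepts.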

Related papers