Fugu-MT Paper Translation (Abstract): CPO: Condition Preference Optimization for Controllable Image Generation

Paper overview: CPO: Condition Preference Optimization for Controllable Image Generation

arxiv url: http://arxiv.org/abs/2511.04753v1
Date: Thu, 06 Nov 2025 19:02:06 GMT
Status: Translation complete
System update date: 2025-11-10 21:00:44.562941
Title: CPO: Condition Preference Optimization for Controllable Image Generation
Title (reference translation): CPO: Condition Preference Optimization for Controllable Image Generation
Authors: Zonglin Lyu, Ming Li, Xinxin Liu, Chen Chen
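Abstract summary: ControlNet introduces image-based control signals into text-to-image generation. ControlNet++ improves pixel-level cycle consistency between generated images and the input control signal. This paper proposes performing preference learning over control conditions rather than generated images.
Reference score (independently computed attention score): 9.511718254502199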
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To enhance controllability in text-to-image generation, ControlNet introduces image-based control signals, while ControlNet++ improves pixel-level cycle consistency between generated images and the input control signal. To avoid the prohibitive cost of back-propagating through the sampling process, ControlNet++ optimizes only low-noise timesteps (e.g., $t < 200$) using a single-step approximation, which not only ignores the contribution of high-noise timesteps but also introduces additional approximation errors. A straightforward alternative for optimizing controllability across all timesteps is Direct Preference Optimization (DPO), a fine-tuning method that increases model preference for more controllable images ($I^{w}$) over less controllable ones ($I^{l}$). However, due to uncertainty in generative models, it is difficult to ensure that win--lose image pairs differ only in controllability while keeping other factors, such as image quality, fixed. To address this, we propose performing preference learning over control conditions rather than generated images. Specifically, we construct winning and losing control signals, $\mathbf{c}^{w}$ and $\mathbf{c}^{l}$, and train the model to prefer $\mathbf{c}^{w}$. This method, which we term \textit{Condition Preference Optimization} (CPO), eliminates confounding factors and yields a low-variance training objective. Our approach theoretically exhibits lower contrastive loss variance than DPO and empirically achieves superior results. Moreover, CPO requires less computation and storage for dataset curation. Extensive experiments show that CPO significantly improves controllability over the state-of-the-art ControlNet++ across multiple control types: over $10\%$ error rate reduction in segmentation, $70$--$80\%$ in human pose, and consistent $2$--$5\%$ reductions in edge and depth maps.
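Abstract (reference translation): ControlNet++ improves pixel-level cycle consistency between generated images and the input control signal. To avoid the prohibitive cost of back-propagating through the sampling process, ControlNet++ optimizes only low-noise timesteps (e.g., $t < 200$) using a single-step approximation. A straightforward alternative for optimizing controllability across all timesteps is Direct Preference Optimization (DPO), a fine-tuning method that increases model preference for more controllable images ($I^{w}$) over less controllable ones ($I^{l}$). However, due to uncertainty in generative models, it is difficult to ensure that win-lose image pairs differ only in controllability while other factors, such as image quality, stay fixed. To address this, we propose performing preference learning over control conditions rather than generated images. Specifically, we construct winning and losing control signals, $\mathbf{c}^{w}$ and $\mathbf{c}^{l}$, and train the model to prefer $\mathbf{c}^{w}$. This method, called \textit{Condition Preference Optimization} (CPO), eliminates confounding factors and yields a low-variance training objective. The proposed method theoretically exhibits lower contrastive loss variance than DPO and empirically achieves superior results. Moreover, CPO requires less computation and storage for dataset curation. Extensive experiments show that CPO significantly improves controllability over the state-of-the-art ControlNet++ across multiple control types.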
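
The abstract describes preference learning over a winning and a losing control signal but does not state the training objective. The sketch below is only an illustration of what a DPO-style contrastive loss over a condition pair could look like; it is an assumption, not the authors' formulation. The function name, the use of a frozen reference model, the per-sample denoising errors as inputs, and the `beta` scale are all hypothetical.

```python
import numpy as np


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)


def condition_preference_loss(err_w, err_l, ref_err_w, ref_err_l, beta=1.0):
    """Hypothetical DPO-style loss over a (c^w, c^l) condition pair.

    Each argument holds per-sample denoising errors (for example, the squared
    error of the predicted noise) for the SAME noisy image, evaluated under:
      err_w     : trainable model, winning condition c^w
      err_l     : trainable model, losing condition c^l
      ref_err_w : frozen reference model, c^w (assumed, not from the paper)
      ref_err_l : frozen reference model, c^l (assumed, not from the paper)
    """
    err_w, err_l, ref_err_w, ref_err_l = (
        np.asarray(a, dtype=float) for a in (err_w, err_l, ref_err_w, ref_err_l)
    )
    # Negative margin means the model fits the image better under c^w
    # (relative to the reference) than under c^l.
    margin = (err_w - ref_err_w) - (err_l - ref_err_l)
    # Standard logistic preference loss: small when the margin is very negative.
    return float(np.mean(-log_sigmoid(-beta * margin)))
```

Under these assumptions, the loss equals $\log 2$ when the model shows no preference between the two conditions, and it falls below $\log 2$ once the denoising error under $\mathbf{c}^{w}$ is lower than under $\mathbf{c}^{l}$. Because both terms share the same image and noise sample, only the condition differs within a pair, which is the property the abstract credits for removing confounding factors.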

Related papers