Fugu-MT Paper Translation (Summary): Fine-Tuning Diffusion Models via Intermediate Distribution Shaping

Paper summary: Fine-Tuning Diffusion Models via Intermediate Distribution Shaping

arxiv url: http://arxiv.org/abs/2510.02692v1
Date: Fri, 03 Oct 2025 03:18:47 GMT
Status: Translation complete
System update date: 2025-10-06 16:35:52.250425
Title: Fine-Tuning Diffusion Models via Intermediate Distribution Shaping
Authors: Gautham Govind Anil, Shaan Ul Haque, Nithish Kannen, Dheeraj Nagaraj, Sanjay Shakkottai, Karthikeyan Shanmugam
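Abstract summary: Policy gradient methods are widely used in the context of autoregressive generation. We show that GRAFT implicitly performs PPO with reshaped rewards. We then introduce P-GRAFT to shape distributions at intermediate noise levels. Motivated by this, we propose inverse noise correction to improve flow models without leveraging explicit rewards.
Reference score (independently computed attention score): 33.26998978897412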
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion models are widely used for generative tasks across domains. While pre-trained diffusion models effectively capture the training data distribution, it is often desirable to shape these distributions using reward functions to align with downstream applications. Policy gradient methods, such as Proximal Policy Optimization (PPO), are widely used in the context of autoregressive generation. However, the marginal likelihoods required for such methods are intractable for diffusion models, leading to alternative proposals and relaxations. In this context, we unify variants of Rejection sAmpling based Fine-Tuning (RAFT) as GRAFT, and show that this implicitly performs PPO with reshaped rewards. We then introduce P-GRAFT to shape distributions at intermediate noise levels and demonstrate empirically that this can lead to more effective fine-tuning. We mathematically explain this via a bias-variance tradeoff. Motivated by this, we propose inverse noise correction to improve flow models without leveraging explicit rewards. We empirically evaluate our methods on text-to-image (T2I) generation, layout generation, molecule generation, and unconditional image generation. Notably, our framework, applied to Stable Diffusion 2, improves over policy gradient methods on popular T2I benchmarks in terms of VQAScore and shows an $8.81\%$ relative improvement over the base model. For unconditional image generation, inverse noise correction improves FID of generated images at lower FLOPs/image.
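The general idea behind rejection-sampling-based fine-tuning mentioned in the abstract (sample from the current model, keep the highest-reward samples, refit the model on the kept samples) can be illustrated with a toy sketch. This is not the paper's GRAFT or P-GRAFT procedure, whose details are not given in the abstract. The one-dimensional Gaussian "model", the reward function, the target value 2.0, and all function names below are assumptions chosen purely for illustration.

```python
import random
import statistics


def reward(x: float) -> float:
    # Hypothetical reward: highest for samples near 2.0 (illustrative choice).
    return -(x - 2.0) ** 2


def raft_round(mu: float, sigma: float, n_samples: int, k: int,
               rng: random.Random) -> float:
    """One toy round: sample, keep the top-k by reward, refit the model mean."""
    # 1. Sample from the current "model", here a Gaussian N(mu, sigma^2).
    samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
    # 2. Rejection step: accept only the k highest-reward samples.
    accepted = sorted(samples, key=reward, reverse=True)[:k]
    # 3. "Fine-tune" on accepted samples. For a Gaussian with fixed sigma,
    #    the maximum-likelihood update of the mean is the sample mean.
    return statistics.fmean(accepted)


def run(rounds: int = 5, seed: int = 0) -> list:
    """Repeat the sample/select/refit loop and record the mean after each round."""
    rng = random.Random(seed)
    mu = 0.0  # the "pre-trained" model starts far from the high-reward region
    history = [mu]
    for _ in range(rounds):
        mu = raft_round(mu, sigma=1.0, n_samples=256, k=32, rng=rng)
        history.append(mu)
    return history


if __name__ == "__main__":
    for i, mu in enumerate(run()):
        print(f"round {i}: mean = {mu:.3f}")
```

In this toy setting the model mean drifts toward the high-reward region over successive rounds. A real diffusion model would replace the mean update with gradient-based training on the accepted samples.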
