Fine-Tuning Diffusion Models via Intermediate Distribution Shaping
- URL: http://arxiv.org/abs/2510.02692v1
- Date: Fri, 03 Oct 2025 03:18:47 GMT
- Title: Fine-Tuning Diffusion Models via Intermediate Distribution Shaping
- Authors: Gautham Govind Anil, Shaan Ul Haque, Nithish Kannen, Dheeraj Nagaraj, Sanjay Shakkottai, Karthikeyan Shanmugam
- Abstract summary: Policy gradient methods are widely used in the context of autoregressive generation. We show that GRAFT implicitly performs PPO with reshaped rewards. We then introduce P-GRAFT to shape distributions at intermediate noise levels. Motivated by this, we propose inverse noise correction to improve flow models without leveraging explicit rewards.
- Score: 33.26998978897412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models are widely used for generative tasks across domains. While pre-trained diffusion models effectively capture the training data distribution, it is often desirable to shape these distributions using reward functions to align with downstream applications. Policy gradient methods, such as Proximal Policy Optimization (PPO), are widely used in the context of autoregressive generation. However, the marginal likelihoods required for such methods are intractable for diffusion models, leading to alternative proposals and relaxations. In this context, we unify variants of Rejection sAmpling based Fine-Tuning (RAFT) as GRAFT, and show that this implicitly performs PPO with reshaped rewards. We then introduce P-GRAFT to shape distributions at intermediate noise levels and demonstrate empirically that this can lead to more effective fine-tuning. We mathematically explain this via a bias-variance tradeoff. Motivated by this, we propose inverse noise correction to improve flow models without leveraging explicit rewards. We empirically evaluate our methods on text-to-image (T2I) generation, layout generation, molecule generation and unconditional image generation. Notably, our framework, applied to Stable Diffusion 2, improves over policy gradient methods on popular T2I benchmarks in terms of VQAScore and shows an $8.81\%$ relative improvement over the base model. For unconditional image generation, inverse noise correction improves FID of generated images at lower FLOPs/image.
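The rejection-sampling fine-tuning loop that GRAFT generalizes can be sketched in a few lines. This is a hedged toy illustration only: the scalar "samples", the reward function, and the selection fraction below are stand-ins, not the paper's actual components, and the subsequent fine-tuning step on the kept set is omitted.

```python
import random

def raft_select(samples, reward_fn, keep_frac=0.1):
    """RAFT-style selection step: score each candidate with a reward
    model and keep only the top fraction. In the full algorithm the
    diffusion model is then fine-tuned on this filtered set."""
    ranked = sorted(samples, key=reward_fn, reverse=True)
    k = max(1, int(len(ranked) * keep_frac))
    return ranked[:k]

# Toy stand-in: "samples" are scalars and the reward prefers values near 2.0.
random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(100)]
selected = raft_select(samples, reward_fn=lambda x: -abs(x - 2.0))
```

Iterating generate-select-fine-tune rounds of this form reshapes the model's distribution toward high-reward regions; the abstract's claim is that this procedure implicitly performs PPO with reshaped rewards.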
Related papers
- Improving Conditional VAE with approximation using Normalizing Flows [0.0]
Variational Autoencoders and Generative Adversarial Networks remained the state-of-the-art (SOTA) generative models until 2022. Efforts to improve traditional models have stagnated as a result. In old-school fashion, we explore image generation with conditional Variational Autoencoders (CVAE) to incorporate desired attributes within the images.
arXiv Detail & Related papers (2025-11-12T03:46:00Z) - Score Distillation of Flow Matching Models [67.86066177182046]
We extend Score identity Distillation (SiD) to pretrained text-to-image flow-matching models. SiD works out of the box across these models, in both data-free and data-aided settings. This provides the first systematic evidence that score distillation applies broadly to text-to-image flow matching models.
arXiv Detail & Related papers (2025-09-29T17:45:48Z) - Coefficients-Preserving Sampling for Reinforcement Learning with Flow Matching [6.238027696245818]
Reinforcement Learning (RL) has emerged as a powerful technique for improving image and video generation in Diffusion and Flow Matching models. Our investigation reveals a significant drawback to this approach: SDE-based sampling introduces pronounced noise artifacts in the generated images. Our proposed method, Coefficients-Preserving Sampling (CPS), eliminates these noise artifacts.
arXiv Detail & Related papers (2025-09-07T07:25:00Z) - DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks [79.50756148780928]
This paper studies the problem of leveraging pretrained diffusion models for performing discriminative tasks. We extend the discriminative capability of pretrained frozen generative diffusion models from the classification task to the more complex object detection task, by "inverting" a pretrained layout-to-image diffusion model.
arXiv Detail & Related papers (2025-04-24T05:13:27Z) - MAP-based Problem-Agnostic diffusion model for Inverse Problems [8.161067848524976]
We propose a problem-agnostic diffusion model called the maximum a posteriori (MAP)-based guided term estimation method for inverse problems. This innovation allows us to better capture the intrinsic properties of the data, leading to improved performance.
arXiv Detail & Related papers (2025-01-25T08:30:15Z) - Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding [84.3224556294803]
Diffusion models excel at capturing the natural design spaces of images, molecules, DNA, RNA, and protein sequences.
We aim to optimize downstream reward functions while preserving the naturalness of these design spaces.
Our algorithm integrates soft value functions, which look ahead to how intermediate noisy states lead to high rewards in the future.
arXiv Detail & Related papers (2024-08-15T16:47:59Z) - Denoising Diffusion Bridge Models [54.87947768074036]
Diffusion models are powerful generative models that map noise to data using stochastic processes.
For many applications such as image editing, the model input comes from a distribution that is not random noise.
In our work, we propose Denoising Diffusion Bridge Models (DDBMs).
arXiv Detail & Related papers (2023-09-29T03:24:24Z) - Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from long inference times, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z) - ADIR: Adaptive Diffusion for Image Reconstruction [42.90778718695398]
Denoising diffusion models have recently achieved remarkable success in image generation, capturing rich information about natural image statistics. We introduce a conditional sampling framework that leverages the powerful priors learned by diffusion models while enforcing consistency with the available measurements. We employ LoRA-based adaptation using images that are semantically and visually similar to the degraded input, efficiently retrieved from a large and diverse dataset.
arXiv Detail & Related papers (2022-12-06T18:39:58Z) - DriftRec: Adapting diffusion models to blind JPEG restoration [16.596100244509575]
We utilize the high-fidelity generation abilities of diffusion models to solve blind JPEG restoration at high compression levels.
We show that our approach can escape the tendency of other methods to generate blurry images, and recovers the distribution of clean images significantly more faithfully.
arXiv Detail & Related papers (2022-11-12T22:29:42Z) - A Variational Perspective on Diffusion-Based Generative Models and Score
Matching [8.93483643820767]
We derive a variational framework for likelihood estimation for continuous-time generative diffusion.
We show that minimizing the score-matching loss is equivalent to maximizing a lower bound of the likelihood of the plug-in reverse SDE.
arXiv Detail & Related papers (2021-06-05T05:50:36Z) - Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
Blind image restoration is a common yet challenging problem in computer vision.
We propose a novel blind image restoration method, aiming to integrate the advantages of both model-based and data-driven approaches.
Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-art methods.
arXiv Detail & Related papers (2020-08-25T03:30:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.