Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized
Control
- URL: http://arxiv.org/abs/2402.15194v2
- Date: Wed, 28 Feb 2024 09:21:46 GMT
- Title: Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized
Control
- Authors: Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali,
Gabriele Scalia, Nathaniel Lee Diamant, Alex M Tseng, Tommaso Biancalani,
Sergey Levine
- Abstract summary: Diffusion models excel at capturing complex data distributions, such as those of natural images and proteins.
While diffusion models are trained to represent the distribution of the training dataset, we are often more concerned with other properties, such as the aesthetic quality of the generated images.
We present theoretical and empirical evidence demonstrating that our framework efficiently generates diverse samples with high genuine rewards.
- Score: 54.132297393662654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models excel at capturing complex data distributions, such as those
of natural images and proteins. While diffusion models are trained to represent
the distribution of the training dataset, we are often more concerned with
other properties, such as the aesthetic quality of the generated images or the
functional properties of generated proteins. Diffusion models can be fine-tuned
in a goal-directed way by maximizing the value of some reward function (e.g.,
the aesthetic quality of an image). However, these approaches may lead to
reduced sample diversity, significant deviations from the training data
distribution, and even poor sample quality due to the exploitation of an
imperfect reward function. The last issue often occurs when the reward function
is a learned model meant to approximate a ground-truth "genuine" reward, as is
the case in many practical applications. These challenges, collectively termed
"reward collapse," pose a substantial obstacle. To address this reward
collapse, we frame the fine-tuning problem as entropy-regularized control
against the pretrained diffusion model, i.e., directly optimizing
entropy-enhanced rewards with neural SDEs. We present theoretical and empirical
evidence demonstrating that our framework efficiently generates diverse samples
with high genuine rewards while mitigating the overoptimization of imperfect
reward models.
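In my notation (the paper's exact formulation may differ in details), entropy-regularized control of this kind maximizes expected reward minus a KL penalty to the pretrained model, and the optimum of such an objective has a well-known closed form:

```latex
\max_{\theta}\; \mathbb{E}_{x \sim p_{\theta}}\!\left[ r(x) \right]
  \;-\; \alpha\, \mathrm{KL}\!\left( p_{\theta} \,\|\, p_{\mathrm{pre}} \right),
\qquad
p^{\star}(x) \;\propto\; p_{\mathrm{pre}}(x)\, \exp\!\left( r(x) / \alpha \right)
```

The KL term anchors samples to the pretrained distribution, preserving diversity and realism, while alpha controls how aggressively the reward is pursued. Below is a minimal sketch of how such an objective can be optimized with a neural SDE: simulate the sampling SDE with a differentiable Euler-Maruyama loop, accumulate the path-space KL via the Girsanov identity (the KL between two SDEs with drifts b_ft, b_pre and diffusion sigma is the expected integral of ||b_ft - b_pre||^2 / (2 sigma^2) dt), and backpropagate through the whole trajectory. All names and constants (DriftNet, reward_fn, ALPHA, SIGMA) are illustrative assumptions, not the authors' code:

```python
# Hedged sketch: KL-regularized reward fine-tuning of an SDE sampler by
# backpropagating through an Euler-Maruyama simulation. Illustrative only.
import copy
import torch
import torch.nn as nn

DIM, STEPS, ALPHA, SIGMA = 2, 50, 0.1, 1.0   # assumed toy constants
DT = 1.0 / STEPS

class DriftNet(nn.Module):
    """Time-conditioned drift b(x, t) of the sampling SDE (assumed architecture)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(),
                                 nn.Linear(64, dim))

    def forward(self, x, t):
        t_col = torch.full_like(x[:, :1], t)          # broadcast scalar time
        return self.net(torch.cat([x, t_col], dim=1))

def reward_fn(x):
    """Placeholder (possibly learned, hence imperfect) reward model."""
    return -(x - 1.0).pow(2).sum(dim=1)

pre_drift = DriftNet(DIM)               # stands in for the pretrained model
ft_drift = copy.deepcopy(pre_drift)     # fine-tuned copy, initialized at pretrained
for p in pre_drift.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(ft_drift.parameters(), lr=1e-3)

for it in range(200):
    x = torch.randn(128, DIM)           # samples from the prior at t = 0
    kl = torch.zeros(x.shape[0])
    for i in range(STEPS):
        t = i * DT
        b_ft, b_pre = ft_drift(x, t), pre_drift(x, t)
        # Girsanov: path-space KL accumulates ||b_ft - b_pre||^2/(2 sigma^2) dt.
        kl = kl + (b_ft - b_pre).pow(2).sum(dim=1) * DT / (2 * SIGMA ** 2)
        x = x + b_ft * DT + SIGMA * DT ** 0.5 * torch.randn_like(x)
    # Entropy-regularized control objective: reward minus alpha * KL.
    loss = (-reward_fn(x) + ALPHA * kl).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

Backpropagating through the full trajectory is what makes the neural-SDE view practical here; the accumulated KL term is exactly what keeps the fine-tuned drift from straying arbitrarily far from the pretrained sampler.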
Related papers
- Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design [56.957070405026194]
We propose an algorithm that enables direct backpropagation of rewards through entire trajectories generated by diffusion models.
DRAKES can generate sequences that are both natural-like and yield high rewards.
arXiv Detail & Related papers (2024-10-17T15:10:13Z)
- Model Collapse in the Self-Consuming Chain of Diffusion Finetuning: A Novel Perspective from Quantitative Trait Modeling [10.159932782892865]
Generative models have reached a unique threshold where their outputs are indistinguishable from real data.
Severe degradation in performance has been observed when iterative loops of training and generation occur.
We propose Reusable Diffusion Finetuning (ReDiFine), a simple yet effective strategy inspired by genetic mutations.
arXiv Detail & Related papers (2024-07-04T13:41:54Z)
- Feedback Efficient Online Fine-Tuning of Diffusion Models [52.170384048274364]
We propose a novel reinforcement learning procedure that efficiently explores on the manifold of feasible samples.
We present a theoretical analysis providing a regret guarantee, as well as empirical validation across three domains.
arXiv Detail & Related papers (2024-02-26T07:24:32Z)
- Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between the distributions of synthetic images for different classes (a generic sketch follows this entry).
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
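The overlap-minimization entry above is terse, so here is a generic contrastive loss of the kind it describes: same-class features attract, different classes repel, which reduces the overlap between class-conditional feature distributions. This is my own illustration, not the paper's actual objective; `feats`, `labels`, and `tau` are assumed names.

```python
# Hedged sketch: InfoNCE-style loss that concentrates each class in feature
# space and pushes different classes apart. Illustrative, not the paper's loss.
import torch
import torch.nn.functional as F

def class_separation_loss(feats, labels, tau=0.1):
    """feats: (B, D) features; labels: (B,) integer class labels."""
    z = F.normalize(feats, dim=1)                       # unit-norm features
    sim = z @ z.t() / tau                               # pairwise similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float("-inf"))           # exclude self-pairs
    same = (labels[:, None] == labels[None, :]) & ~eye  # positive-pair mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Maximizing same-class log-probabilities separates the classes.
    pos = log_prob.masked_fill(~same, 0.0).sum(1) / same.sum(1).clamp(min=1)
    return -pos.mean()

# Illustrative usage with random features from any image encoder.
feats = torch.randn(32, 128, requires_grad=True)
labels = torch.randint(0, 4, (32,))
class_separation_loss(feats, labels).backward()
```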
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution (a generic guidance sketch follows this entry).
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
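The Steered Diffusion entry describes steering an unconditional model at sampling time. A generic way to do that, shown here only as an assumption about the general technique rather than the paper's algorithm, is to add the gradient of a differentiable task loss to the unconditional score inside a Langevin-style update; `score_model`, `eps`, and `scale` are assumed names.

```python
# Hedged sketch: plug-and-play guidance of an unconditional score model with
# the gradient of a task loss (masked consistency, as in inpainting).
import torch

def guided_langevin_step(x, score_model, t, y_obs, mask, eps, scale=1.0):
    """One steered Langevin step: unconditional score plus task-loss gradient."""
    x = x.detach().requires_grad_(True)
    task_loss = ((mask * (x - y_obs)) ** 2).sum()   # agree with observed pixels
    grad = torch.autograd.grad(task_loss, x)[0]
    with torch.no_grad():
        steered = score_model(x, t) - scale * grad  # steer toward the condition
        return x + eps * steered + (2 * eps) ** 0.5 * torch.randn_like(x)

# Illustrative usage with a standard-normal score as a stand-in model.
score_model = lambda x, t: -x
x = torch.randn(1, 3, 8, 8)
y_obs = torch.zeros_like(x)
mask = torch.zeros_like(x); mask[..., :4] = 1.0     # observed left half
for t in range(10):
    x = guided_langevin_step(x, score_model, t, y_obs, mask, eps=0.05)
```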
- RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment [32.752633250862694]
Generative foundation models are susceptible to implicit biases that can arise from extensive unsupervised training data.
We introduce a new framework, Reward rAnked FineTuning, designed to align generative models effectively (a toy sketch of the loop follows this entry).
arXiv Detail & Related papers (2023-04-13T18:22:40Z)
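Since RAFT's loop is simple to state (sample, rank by reward, finetune on the top slice), a toy version may help; the Gaussian generator and quadratic reward below are illustrative stand-ins, not RAFT's actual models.

```python
# Hedged sketch: reward-ranked finetuning. Sample from the current model,
# keep the highest-reward candidates, fit the model to them, repeat.
import torch

class GaussianGenerator(torch.nn.Module):
    """Toy learnable Gaussian, standing in for a generative foundation model."""
    def __init__(self, dim=2):
        super().__init__()
        self.mu = torch.nn.Parameter(torch.zeros(dim))

    def sample(self, n):
        return self.mu + torch.randn(n, self.mu.numel())

    def nll(self, x):
        return 0.5 * ((x - self.mu) ** 2).sum(dim=1).mean()

def reward_fn(x):
    """Placeholder reward preferring samples near (2, 2)."""
    return -((x - 2.0) ** 2).sum(dim=1)

gen = GaussianGenerator()
opt = torch.optim.Adam(gen.parameters(), lr=0.1)
for rnd in range(50):
    with torch.no_grad():
        cand = gen.sample(256)                        # sample candidates
        top = cand[reward_fn(cand).topk(32).indices]  # reward-ranked selection
    loss = gen.nll(top)                               # supervised fit to the best
    opt.zero_grad(); loss.backward(); opt.step()
```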
- Bi-Noising Diffusion: Towards Conditional Diffusion Models with Generative Restoration Priors [64.24948495708337]
We introduce a new method that brings predicted samples to the training data manifold using a pretrained unconditional diffusion model.
We perform comprehensive experiments to demonstrate the effectiveness of our approach on super-resolution, colorization, turbulence removal, and image-deraining tasks.
arXiv Detail & Related papers (2022-12-14T17:26:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.