Bridging the Gap: Addressing Discrepancies in Diffusion Model Training
for Classifier-Free Guidance
- URL: http://arxiv.org/abs/2311.00938v1
- Date: Thu, 2 Nov 2023 02:03:12 GMT
- Title: Bridging the Gap: Addressing Discrepancies in Diffusion Model Training
for Classifier-Free Guidance
- Authors: Niket Patel, Luis Salamanca, Luis Barba
- Abstract summary: Diffusion models have emerged as a pivotal advancement in generative models.
In this paper we aim to underscore a discrepancy between conventional training methods and the desired conditional sampling behavior.
We introduce an updated loss function that better aligns training objectives with sampling behaviors.
- Score: 1.6804613362826175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have emerged as a pivotal advancement in generative models,
setting new standards for the quality of the generated instances. In the current
paper we aim to underscore a discrepancy between conventional training methods
and the desired conditional sampling behavior of these models. While the
prevalent classifier-free guidance technique works well, it's not without
flaws. At higher values of the guidance scale parameter $w$, we often get
out-of-distribution samples and mode collapse, whereas at lower values of $w$ we
may not get the desired specificity. To address these challenges, we introduce
an updated loss function that better aligns training objectives with sampling
behaviors. Experimental validation with FID scores on CIFAR-10 demonstrates our
method's ability to produce higher quality samples with fewer sampling
timesteps, and to be more robust to the choice of guidance scale $w$. We also
experiment with fine-tuning Stable Diffusion on the proposed loss, to provide
early evidence that large diffusion models may also benefit from this refined
loss function.
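As a rough illustration of the guidance scale $w$ discussed in the abstract, the sketch below shows the standard classifier-free guidance combination of conditional and unconditional noise predictions, assuming the common convention $(1 + w)\,\epsilon_{cond} - w\,\epsilon_{uncond}$. The abstract does not state which convention the paper uses, and this is not the paper's updated loss function; the function name `cfg_combine` and the toy numbers are hypothetical.

```python
def cfg_combine(eps_cond, eps_uncond, w):
    """Mix two noise predictions with guidance scale w.

    Assumed convention: (1 + w) * eps_cond - w * eps_uncond,
    so w = 0 returns the plain conditional prediction.
    """
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


# Toy per-pixel noise predictions (illustrative values, not model outputs).
eps_cond = [0.5, -1.0, 2.0]
eps_uncond = [0.0, -0.5, 1.0]

# Larger w pushes the result further from the unconditional prediction.
for w in (0.0, 1.0, 5.0):
    print(w, cfg_combine(eps_cond, eps_uncond, w))
```

With these toy values, increasing $w$ extrapolates further along the direction from the unconditional to the conditional prediction, which is one way to picture why large scales can leave the data distribution.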
Related papers
- DOTA: Distributional Test-Time Adaptation of Vision-Language Models [52.98590762456236]
Training-free test-time dynamic adapter (TDA) is a promising approach to address this issue.
We propose a simple yet effective method for DistributiOnal Test-time Adaptation (Dota).
Dota continually estimates the distributions of test samples, allowing the model to continually adapt to the deployment environment.
arXiv Detail & Related papers (2024-09-28T15:03:28Z)
- Informed Correctors for Discrete Diffusion Models [32.87362154118195]
We propose a family of informed correctors that more reliably counteracts discretization error by leveraging information learned by the model.
We also propose $k$-Gillespie's, a sampling algorithm that better utilizes each model evaluation, while still enjoying the speed and flexibility of $\tau$-leaping.
Across several real and synthetic datasets, we show that $k$-Gillespie's with informed correctors reliably produces higher quality samples at lower computational cost.
arXiv Detail & Related papers (2024-07-30T23:29:29Z)
- Adding Conditional Control to Diffusion Models with Reinforcement Learning [59.295203871547336]
Diffusion models are powerful generative models that allow for precise control over the characteristics of the generated samples.
This work presents a novel method based on reinforcement learning (RL) to add additional controls, leveraging an offline dataset.
arXiv Detail & Related papers (2024-06-17T22:00:26Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Manifold Preserving Guided Diffusion [121.97907811212123]
Conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training.
We propose Manifold Preserving Guided Diffusion (MPGD), a training-free conditional generation framework.
arXiv Detail & Related papers (2023-11-28T02:08:06Z)
- Reducing Spatial Fitting Error in Distillation of Denoising Diffusion Models [13.364271265023953]
Knowledge distillation for diffusion models is an effective method to address this limitation with a shortened sampling process.
We attribute the degradation to the spatial fitting error occurring in the training of both the teacher and student model.
SFERD utilizes attention guidance from the teacher model and a designed semantic gradient predictor to reduce the student's fitting error.
We achieve an FID of 5.31 on CIFAR-10 and 9.39 on ImageNet 64$\times$64 with only one step, outperforming existing diffusion methods.
arXiv Detail & Related papers (2023-11-07T09:19:28Z)
- Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
- Towards Controllable Diffusion Models via Reward-Guided Exploration [15.857464051475294]
We propose a novel framework that guides the training phase of diffusion models via reinforcement learning (RL).
RL enables calculating policy gradients via samples from a pay-off distribution proportional to exponential scaled rewards, rather than from policies themselves.
Experiments on 3D shape and molecule generation tasks show significant improvements over existing conditional diffusion models.
arXiv Detail & Related papers (2023-04-14T13:51:26Z)
- Improved Denoising Diffusion Probabilistic Models [4.919647298882951]
We show that DDPMs can achieve competitive log-likelihoods while maintaining high sample quality.
We also find that learning variances of the reverse diffusion process allows sampling with an order of magnitude fewer forward passes.
We show that the sample quality and likelihood of these models scale smoothly with model capacity and training compute, making them easily scalable.
arXiv Detail & Related papers (2021-02-18T23:44:17Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)