Bridging the Gap: Addressing Discrepancies in Diffusion Model Training
for Classifier-Free Guidance
- URL: http://arxiv.org/abs/2311.00938v1
- Date: Thu, 2 Nov 2023 02:03:12 GMT
- Title: Bridging the Gap: Addressing Discrepancies in Diffusion Model Training
for Classifier-Free Guidance
- Authors: Niket Patel, Luis Salamanca, Luis Barba
- Abstract summary: Diffusion models have emerged as a pivotal advancement in generative models.
In this paper we aim to underscore a discrepancy between conventional training methods and the desired conditional sampling behavior.
We introduce an updated loss function that better aligns training objectives with sampling behaviors.
- Score: 1.6804613362826175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have emerged as a pivotal advancement in generative models,
setting new standards for the quality of the generated instances. In the current
paper we aim to underscore a discrepancy between conventional training methods
and the desired conditional sampling behavior of these models. While the
prevalent classifier-free guidance technique works well, it is not without
flaws. At higher values of the guidance scale parameter $w$ we often obtain
out-of-distribution samples and mode collapse, whereas at lower values of $w$ we
may not get the desired specificity. To address these challenges, we introduce
an updated loss function that better aligns training objectives with sampling
behaviors. Experimental validation with FID scores on CIFAR-10 demonstrates our
method's ability to produce higher-quality samples with fewer sampling
timesteps and to be more robust to the choice of the guidance scale $w$. We also
experiment with fine-tuning Stable Diffusion on the proposed loss, to provide
early evidence that large diffusion models may also benefit from this refined
loss function.
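For context, classifier-free guidance trains a single noise-prediction network with the condition randomly dropped during training and, at sampling time, combines the conditional and unconditional predictions using the guidance scale $w$. The sketch below illustrates only that standard recipe, not the updated loss proposed in this paper; the function names (`cfg_training_loss`, `cfg_noise_estimate`), the zero-vector null condition, and the dropout probability `p_uncond` are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def cfg_training_loss(eps_model, x0, cond, alphas_cumprod, p_uncond=0.1):
    """Standard epsilon-prediction loss with random condition dropout.
    eps_model(x_t, t, cond) predicts the noise added at timestep t; cond is
    replaced by a null (zero) embedding with probability p_uncond so the same
    network also learns the unconditional prediction."""
    b = x0.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,), device=x0.device)
    noise = torch.randn_like(x0)
    a_bar = alphas_cumprod[t].view(b, *([1] * (x0.dim() - 1)))
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

    # Randomly drop the condition for part of the batch (null condition = zeros here).
    drop = (torch.rand(b, device=x0.device) < p_uncond).view(b, *([1] * (cond.dim() - 1)))
    cond = torch.where(drop, torch.zeros_like(cond), cond)

    return F.mse_loss(eps_model(x_t, t, cond), noise)


def cfg_noise_estimate(eps_model, x_t, t, cond, w):
    """Guided noise estimate used at sampling time:
    eps = (1 + w) * eps(x_t, c) - w * eps(x_t, null).
    With this convention, w = 0 recovers plain conditional sampling and larger
    w pushes samples further toward the condition."""
    eps_cond = eps_model(x_t, t, cond)
    eps_uncond = eps_model(x_t, t, torch.zeros_like(cond))
    return (1.0 + w) * eps_cond - w * eps_uncond
```

Note that the guided estimate $(1 + w)\,\epsilon_\theta(x_t, c) - w\,\epsilon_\theta(x_t)$ extrapolates beyond either prediction the network was actually trained to make, which is one way to read the training/sampling discrepancy the abstract highlights.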
Related papers
- DOTA: Distributional Test-Time Adaptation of Vision-Language Models [52.98590762456236]
Training-free test-time dynamic adapter (TDA) is a promising approach to adapting vision-language models at test time.
We propose a simple yet effective method for DistributiOnal Test-time Adaptation (Dota).
Dota continually estimates the distributions of test samples, allowing the model to continually adapt to the deployment environment.
arXiv Detail & Related papers (2024-09-28T15:03:28Z) - Informed Correctors for Discrete Diffusion Models [32.87362154118195]
We propose a family of informed correctors that more reliably counteracts discretization error by leveraging information learned by the model.
We also propose $k$-Gillespie's, a sampling algorithm that better utilizes each model evaluation, while still enjoying the speed and flexibility of $\tau$-leaping.
Across several real and synthetic datasets, we show that $k$-Gillespie's with informed correctors reliably produces higher quality samples at lower computational cost.
arXiv Detail & Related papers (2024-07-30T23:29:29Z) - Adding Conditional Control to Diffusion Models with Reinforcement Learning [59.295203871547336]
Diffusion models are powerful generative models that allow for precise control over the characteristics of the generated samples.
This work presents a novel method based on reinforcement learning (RL) to add additional controls, leveraging an offline dataset.
arXiv Detail & Related papers (2024-06-17T22:00:26Z) - Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Diffusion Model with Perceptual Loss [4.67483805599143]
Diffusion models trained with mean squared error loss tend to generate unrealistic samples.
We show that the effectiveness of classifier-free guidance partly originates from it being a form of implicit perceptual guidance.
We propose a novel self-perceptual objective that results in diffusion models capable of generating more realistic samples.
arXiv Detail & Related papers (2023-12-30T01:24:25Z) - Manifold Preserving Guided Diffusion [121.97907811212123]
Conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training.
We propose Manifold Preserving Guided Diffusion (MPGD), a training-free conditional generation framework.
arXiv Detail & Related papers (2023-11-28T02:08:06Z) - Reducing Spatial Fitting Error in Distillation of Denoising Diffusion
Models [13.364271265023953]
Knowledge distillation for diffusion models is an effective method to shorten the sampling process.
We attribute the resulting quality degradation to the spatial fitting error occurring in the training of both the teacher and student model.
The proposed SFERD utilizes attention guidance from the teacher model and a designed semantic gradient predictor to reduce the student's fitting error.
We achieve an FID of 5.31 on CIFAR-10 and 9.39 on ImageNet 64$\times$64 with only one step, outperforming existing diffusion methods.
arXiv Detail & Related papers (2023-11-07T09:19:28Z) - Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z) - Towards Controllable Diffusion Models via Reward-Guided Exploration [15.857464051475294]
We propose a novel framework that guides the training phase of diffusion models via reinforcement learning (RL).
RL enables calculating policy gradients via samples from a payoff distribution proportional to exponentially scaled rewards, rather than from the policies themselves.
Experiments on 3D shape and molecule generation tasks show significant improvements over existing conditional diffusion models.
arXiv Detail & Related papers (2023-04-14T13:51:26Z) - Improved Denoising Diffusion Probabilistic Models [4.919647298882951]
We show that DDPMs can achieve competitive log-likelihoods while maintaining high sample quality.
We also find that learning variances of the reverse diffusion process allows sampling with an order of magnitude fewer forward passes (a sketch of this variance parameterization appears after this list).
We show that the sample quality and likelihood of these models scale smoothly with model capacity and training compute, making them easily scalable.
arXiv Detail & Related papers (2021-02-18T23:44:17Z) - Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
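As an aside on the Improved Denoising Diffusion Probabilistic Models entry above, the learned reverse-process variance described there is parameterized as a log-space interpolation between $\beta_t$ and the posterior variance $\tilde{\beta}_t$. Below is a minimal sketch of that interpolation, assuming the network output has already been mapped to an interpolation weight $v \in [0, 1]$; the tensor shapes and the clamping constant are illustrative choices, not part of the cited paper.

```python
import torch

def learned_reverse_variance(v, betas, alphas_cumprod, t):
    """Interpolated reverse-process variance from Improved DDPM
    (Nichol & Dhariwal, 2021): a per-dimension weight v in [0, 1] mixes
    log(beta_t) and log(beta_tilde_t).

    betas:          (T,) forward-process variances
    alphas_cumprod: (T,) cumulative products of (1 - beta)
    t:              (B,) integer timesteps
    v:              tensor whose leading dimension is the batch size B
    """
    alphas_cumprod_prev = torch.cat(
        [torch.ones(1, device=betas.device, dtype=betas.dtype), alphas_cumprod[:-1]]
    )
    # Posterior variance of q(x_{t-1} | x_t, x_0) at each timestep.
    beta_tilde = betas * (1.0 - alphas_cumprod_prev) / (1.0 - alphas_cumprod)

    # Broadcast the per-timestep scalars against the shape of v.
    shape = (-1,) + (1,) * (v.dim() - 1)
    log_beta = torch.log(betas[t]).view(shape)
    log_beta_tilde = torch.log(beta_tilde[t].clamp(min=1e-20)).view(shape)

    # Sigma_theta^2 = exp(v * log(beta_t) + (1 - v) * log(beta_tilde_t))
    return torch.exp(v * log_beta + (1.0 - v) * log_beta_tilde)
```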