Improving Diffusion-Based Generative Models via Approximated Optimal
Transport
- URL: http://arxiv.org/abs/2403.05069v1
- Date: Fri, 8 Mar 2024 05:43:00 GMT
- Title: Improving Diffusion-Based Generative Models via Approximated Optimal
Transport
- Authors: Daegyu Kim, Jooyoung Choi, Chaehun Shin, Uiwon Hwang, Sungroh Yoon
- Abstract summary: We introduce the Approximated Optimal Transport technique, a novel training scheme for diffusion-based generative models.
We achieve superior image quality and reduced sampling steps by employing AOT in training.
- Score: 41.25847212384836
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce the Approximated Optimal Transport (AOT) technique, a novel
training scheme for diffusion-based generative models. Our approach aims to
approximate and integrate optimal transport into the training process,
significantly enhancing the ability of diffusion models to estimate the
denoiser outputs accurately. This improvement leads to ODE trajectories of
diffusion models with lower curvature and reduced truncation errors during
sampling. We achieve superior image quality and reduced sampling steps by
employing AOT in training. Specifically, we achieve FID scores of 1.88 with
just 27 NFEs and 1.73 with 29 NFEs in unconditional and conditional
generations, respectively. Furthermore, when applying AOT to train the
discriminator for guidance, we establish new state-of-the-art FID scores of
1.68 and 1.58 for unconditional and conditional generations, respectively, each
with 29 NFEs. This outcome demonstrates the effectiveness of AOT in enhancing
the performance of diffusion models.
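The abstract describes pairing data and noise during training so that the learned denoiser approximates an optimal transport map. The paper's exact approximation scheme is not reproduced here, so the following is a minimal sketch of one common mini-batch approach, using SciPy's Hungarian solver; the function names and the EDM-style loss in the comment are illustrative assumptions.

```python
import torch
from scipy.optimize import linear_sum_assignment

def aot_pairing(x, noise):
    """Re-order a batch of noise samples so that each pair
    (x[i], noise[i]) approximately minimizes the total transport cost
    within the mini-batch (a Hungarian-assignment approximation of OT)."""
    # Pairwise squared Euclidean cost between data and noise samples.
    cost = torch.cdist(x.flatten(1), noise.flatten(1)) ** 2
    _, col = linear_sum_assignment(cost.detach().cpu().numpy())
    return noise[torch.as_tensor(col)]

# Hypothetical EDM-style training step using the pairing:
#   x0 = data batch; sigma = sampled noise level
#   eps = aot_pairing(x0, torch.randn_like(x0))
#   loss = ((denoiser(x0 + sigma * eps, sigma) - x0) ** 2).mean()
```

Restricting the assignment to each mini-batch is only an approximation of the true OT coupling, but it is enough to reduce crossing data-noise pairs and, per the abstract, straighten the resulting ODE trajectories.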
Related papers
- Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think [72.48325960659822]
One main bottleneck in training large-scale diffusion models for generation lies in effectively learning high-quality internal representations.
We study this by introducing a straightforward regularization called REPresentation Alignment (REPA), which aligns the projections of noisy input hidden states in denoising networks with clean image representations obtained from external, pretrained visual encoders.
The results are striking: our simple strategy yields significant improvements in both training efficiency and generation quality when applied to popular diffusion and flow-based transformers, such as DiTs and SiTs.
arXiv Detail & Related papers (2024-10-09T14:34:53Z)
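Since the entry above gives REPA only at a high level, here is a minimal sketch of an alignment regularizer of the kind described: project the denoiser's hidden states and pull them toward frozen features of the clean image. The projection head, encoder choice, and loss weight are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def repa_loss(hidden, clean_feats, proj):
    """Negative cosine similarity between projected noisy hidden states
    and features of the clean image from a frozen pretrained encoder
    (e.g., DINOv2); clean_feats should carry no gradient."""
    z = F.normalize(proj(hidden), dim=-1)          # proj: small trainable MLP head
    t = F.normalize(clean_feats.detach(), dim=-1)  # frozen external features
    return -(z * t).sum(dim=-1).mean()

# total_loss = denoising_loss + lambda_align * repa_loss(h, feats, proj_head)
```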
- Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
- Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning diffusion models remains an underexplored frontier in generative artificial intelligence (GenAI).
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion).
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z)
- CADS: Unleashing the Diversity of Diffusion Models through Condition-Annealed Sampling [27.795088366122297]
The Condition-Annealed Diffusion Sampler (CADS) can be used with any pretrained model and sampling algorithm.
We show that it boosts the diversity of diffusion models in various conditional generation tasks.
arXiv Detail & Related papers (2023-10-26T12:27:56Z)
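A minimal sketch of condition annealing as the entry above describes it: corrupt the conditioning signal early in sampling and anneal back to the clean condition. The piecewise-linear schedule, cutoffs, and noise scale below are illustrative assumptions; the paper's schedule and any rescaling may differ.

```python
import torch

def anneal_condition(c, t, t1=0.6, t2=0.9, noise_scale=0.25):
    """Mix the condition embedding c with Gaussian noise according to an
    annealing coefficient gamma(t); t runs from 1 (start of sampling)
    down to 0, so the condition is noisiest early and clean at the end."""
    gamma = min(max((t2 - t) / (t2 - t1), 0.0), 1.0)
    return gamma ** 0.5 * c + noise_scale * (1.0 - gamma) ** 0.5 * torch.randn_like(c)
```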
- Analyzing and Improving Optimal-Transport-based Adversarial Networks [9.980822222343921]
The Optimal Transport (OT) problem aims to find a transport plan that bridges two distributions while minimizing a given cost function.
OT theory has been widely utilized in generative modeling.
Our approach achieves an FID score of 2.51 on CIFAR-10 and 5.99 on CelebA-HQ-256, outperforming unified OT-based adversarial approaches.
arXiv Detail & Related papers (2023-10-04T06:52:03Z)
- Protein Design with Guided Discrete Diffusion [67.06148688398677]
A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling.
We propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models.
NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods.
arXiv Detail & Related papers (2023-05-31T16:31:24Z)
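The entry above says NOS performs design directly in sequence space; one way to read that (hedged, since the paper's algorithm is not reproduced here) is Langevin-style guidance on the model's continuous hidden states, with a KL term keeping the guided token distribution close to the unguided one. All names and the update rule below are illustrative.

```python
import torch
import torch.nn.functional as F

def nos_step(h, value_fn, logits_fn, n_steps=5, step_size=0.1, kl_weight=1.0):
    """Ascend a discriminator's value in hidden-state space while a KL
    penalty keeps the token distribution near the unguided one."""
    base = logits_fn(h).log_softmax(-1).detach()   # unguided log-probs
    h = h.detach().requires_grad_(True)
    for _ in range(n_steps):
        logits = logits_fn(h).log_softmax(-1)
        kl = F.kl_div(base, logits, log_target=True, reduction="batchmean")
        obj = value_fn(h) - kl_weight * kl
        (grad,) = torch.autograd.grad(obj, h)
        with torch.no_grad():                      # noisy ascent step
            h = h + step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(h)
        h.requires_grad_(True)
    return logits_fn(h)  # guided logits to sample the next tokens from
```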
- Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport [9.980822222343921]
We propose a novel generative model based on the semi-dual formulation of Unbalanced Optimal Transport (UOT).
Unlike OT, UOT relaxes the hard constraint on distribution matching. This relaxation provides better robustness against outliers, greater stability during training, and faster convergence.
Our model outperforms existing OT-based generative models, achieving FID scores of 2.97 on CIFAR-10 and 6.36 on CelebA-HQ-256.
arXiv Detail & Related papers (2023-05-24T06:31:05Z)
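For orientation, here is a standard statement of the unbalanced OT problem and its semi-dual over a single potential v; the notation follows common OT references and is not copied from the paper.

```latex
% Unbalanced OT: the hard marginal constraints of OT are replaced by
% divergence penalties D_{\Psi_i} (e.g., KL) on the marginals of \pi.
\[
  \min_{\pi \ge 0} \int c(x, y)\, \mathrm{d}\pi(x, y)
    + D_{\Psi_1}\!\left(\pi_0 \,\middle\|\, \mu\right)
    + D_{\Psi_2}\!\left(\pi_1 \,\middle\|\, \nu\right)
\]
% Semi-dual over a single potential v, with the c-transform
% v^c(x) = \inf_y \{ c(x, y) - v(y) \}:
\[
  \sup_{v}\; \mathbb{E}_{x \sim \mu}\!\left[-\Psi_1^*\!\left(-v^c(x)\right)\right]
    + \mathbb{E}_{y \sim \nu}\!\left[-\Psi_2^*\!\left(-v(y)\right)\right]
\]
```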
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
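The quantity being distilled in the entry above is the standard classifier-free-guidance combination; the single-stage matching loss below is a hedged sketch (the paper's actual procedure distills progressively and conditions the student on the guidance weight), with all names illustrative.

```python
import torch

def guided_eps(eps_cond, eps_uncond, w):
    """Classifier-free guidance: extrapolate the conditional prediction
    away from the unconditional one with guidance weight w."""
    return (1.0 + w) * eps_cond - w * eps_uncond

def distill_step(student, teacher, x_t, t, c, w):
    """Match the (frozen) guided teacher with a single student forward
    pass, so sampling no longer needs two teacher evaluations per step."""
    with torch.no_grad():
        target = guided_eps(teacher(x_t, t, c), teacher(x_t, t, None), w)
    return ((student(x_t, t, c, w) - target) ** 2).mean()
```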
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.