Structure-Guided Adversarial Training of Diffusion Models
- URL: http://arxiv.org/abs/2402.17563v2
- Date: Mon, 4 Mar 2024 14:51:40 GMT
- Title: Structure-Guided Adversarial Training of Diffusion Models
- Authors: Ling Yang, Haotian Qian, Zhilong Zhang, Jingwei Liu, Bin Cui
- Abstract summary: We introduce Structure-guided Adversarial training of Diffusion Models (SADM)
We compel the model to learn manifold structures between samples in each training batch.
SADM substantially improves existing diffusion transformers and outperforms existing methods in image generation and fine-tuning tasks.
- Score: 27.723913809313125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have demonstrated exceptional efficacy in various generative
applications. While existing models focus on minimizing a weighted sum of
denoising score matching losses for data distribution modeling, their training
primarily emphasizes instance-level optimization, overlooking valuable
structural information within each mini-batch, indicative of pair-wise
relationships among samples. To address this limitation, we introduce
Structure-guided Adversarial training of Diffusion Models (SADM). In this
pioneering approach, we compel the model to learn manifold structures between
samples in each training batch. To ensure the model captures authentic manifold
structures in the data distribution, we advocate adversarial training of the
diffusion generator against a novel structure discriminator in a minimax game,
distinguishing real manifold structures from the generated ones. SADM
substantially improves existing diffusion transformers (DiT) and outperforms
existing methods in image generation and cross-domain fine-tuning tasks across
12 datasets, establishing new state-of-the-art FIDs of 1.58 and 2.11 on
ImageNet for class-conditional image generation at resolutions of 256x256 and
512x512, respectively.
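To make the minimax idea above concrete, here is a minimal PyTorch sketch of training a diffusion denoiser against a structure discriminator that judges batch-level structure. It is purely illustrative: the pairwise-distance featurization of batch structure, the MLP discriminator, the toy noise schedule, the `denoiser(xt, t)` signature, and the loss weighting are assumptions for exposition, not the architecture or objectives used in the paper.

```python
# Illustrative sketch only: the structure featurization (pairwise distances),
# the MLP discriminator, the toy schedule, and the loss weight are assumptions,
# not the setup described in the SADM paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

def batch_structure(x: torch.Tensor) -> torch.Tensor:
    """Pairwise-distance matrix summarizing the manifold structure of a batch."""
    flat = x.flatten(start_dim=1)           # (B, D)
    return torch.cdist(flat, flat, p=2)     # (B, B)

class StructureDiscriminator(nn.Module):
    """Scores a batch-structure matrix as real (from data) or generated."""
    def __init__(self, batch_size: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(batch_size * batch_size, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s.flatten(start_dim=0).unsqueeze(0))  # (1, 1) logit

def training_step(denoiser, disc, x0, opt_g, opt_d, lambda_adv=0.1):
    """One minimax step: denoising loss plus adversarial structure loss."""
    b = x0.shape[0]
    t = torch.rand(b, device=x0.device)            # continuous noise levels in [0, 1)
    noise = torch.randn_like(x0)
    a = (1.0 - t).view(-1, 1, 1, 1)                # toy linear schedule (assumption)
    xt = a.sqrt() * x0 + (1.0 - a).sqrt() * noise
    x0_hat = denoiser(xt, t)                       # predicted clean samples

    # Discriminator update: real batch structure vs. generated batch structure.
    d_real = disc(batch_structure(x0))
    d_fake = disc(batch_structure(x0_hat.detach()))
    loss_d = F.softplus(-d_real).mean() + F.softplus(d_fake).mean()
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator update: denoising loss plus fooling the structure discriminator.
    loss_dn = F.mse_loss(x0_hat, x0)
    loss_adv = F.softplus(-disc(batch_structure(x0_hat))).mean()
    loss_g = loss_dn + lambda_adv * loss_adv
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```

The key design point the sketch tries to convey is that the discriminator never sees individual samples, only a summary of pair-wise relationships within the batch, so the adversarial signal pushes the generator toward realistic batch-level structure rather than per-sample realism.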
Related papers
- Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think [72.48325960659822]
One main bottleneck in training large-scale diffusion models for generation lies in effectively learning high-quality internal representations within the denoising network.
We study this by introducing a straightforward regularization called REPresentation Alignment (REPA), which aligns the projections of noisy input hidden states in denoising networks with clean image representations obtained from external, pretrained visual encoders.
The results are striking: our simple strategy yields significant improvements in both training efficiency and generation quality when applied to popular diffusion and flow-based transformers, such as DiTs and SiTs.
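Below is a minimal sketch of a REPA-style alignment regularizer. The two-layer projector, the assumption that a frozen encoder provides per-patch features of the clean image, and the way the loss is added to training are illustrative choices, not the exact configuration from the paper.

```python
# Illustrative sketch of a REPA-style alignment regularizer; the projector,
# the choice of frozen encoder, and the loss weight are assumptions here.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepresentationAligner(nn.Module):
    """Projects denoiser hidden states and aligns them with frozen encoder features."""
    def __init__(self, hidden_dim: int, target_dim: int):
        super().__init__()
        self.projector = nn.Sequential(
            nn.Linear(hidden_dim, target_dim), nn.SiLU(),
            nn.Linear(target_dim, target_dim),
        )

    def forward(self, denoiser_hidden: torch.Tensor, encoder_feat: torch.Tensor) -> torch.Tensor:
        # denoiser_hidden: (B, N, hidden_dim) tokens from an intermediate block, computed on noisy input
        # encoder_feat:    (B, N, target_dim) patch features of the clean image from a frozen encoder
        proj = self.projector(denoiser_hidden)
        # Negative cosine similarity, averaged over patches and the batch.
        return -F.cosine_similarity(proj, encoder_feat.detach(), dim=-1).mean()

# Usage inside a training step (names are placeholders):
#   loss = denoising_loss + lambda_repa * aligner(hidden_states, frozen_encoder_features)
```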
arXiv Detail & Related papers (2024-10-09T14:34:53Z)
- Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering [15.326641037243006]
Diffusion models can effectively learn the image distribution and generate new samples.
We provide theoretical insights into this phenomenon by leveraging key empirical observations.
We show that the minimal number of samples required to learn the underlying distribution scales linearly with the intrinsic dimensions.
arXiv Detail & Related papers (2024-09-04T04:14:02Z)
- Constrained Diffusion Models via Dual Training [80.03953599062365]
We develop constrained diffusion models based on desired distributions informed by requirements.
We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off between the objective and the constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z)
- Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
- Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction [75.91471250967703]
We introduce a novel sampling framework called Steerable Conditional Diffusion.
This framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement.
We achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities.
arXiv Detail & Related papers (2023-08-28T08:47:06Z)
- Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models [77.83923746319498]
We propose a framework called Diff-Instruct to instruct the training of arbitrary generative models.
We show that Diff-Instruct results in state-of-the-art single-step diffusion-based models.
Experiments on refining GAN models show that Diff-Instruct consistently improves the pre-trained generators of GAN models.
arXiv Detail & Related papers (2023-05-29T04:22:57Z)
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
- Few-Shot Diffusion Models [15.828257653106537]
We present Few-Shot Diffusion Models (FSDM), a framework for few-shot generation leveraging conditional DDPMs.
FSDM is trained to adapt the generative process conditioned on a small set of images from a given class by aggregating image patch information.
We empirically show that FSDM can perform few-shot generation and transfer to new datasets.
arXiv Detail & Related papers (2022-05-30T23:20:33Z)