Structure-Guided Adversarial Training of Diffusion Models
- URL: http://arxiv.org/abs/2402.17563v2
- Date: Mon, 4 Mar 2024 14:51:40 GMT
- Title: Structure-Guided Adversarial Training of Diffusion Models
- Authors: Ling Yang, Haotian Qian, Zhilong Zhang, Jingwei Liu, Bin Cui
- Abstract summary: We introduce Structure-guided Adversarial training of Diffusion Models (SADM)
We compel the model to learn manifold structures between samples in each training batch.
SADM substantially improves existing diffusion transformers and outperforms existing methods in image generation and fine-tuning tasks.
- Score: 27.723913809313125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have demonstrated exceptional efficacy in various generative
applications. While existing models focus on minimizing a weighted sum of
denoising score matching losses for data distribution modeling, their training
primarily emphasizes instance-level optimization, overlooking valuable
structural information within each mini-batch, indicative of pair-wise
relationships among samples. To address this limitation, we introduce
Structure-guided Adversarial training of Diffusion Models (SADM). In this
pioneering approach, we compel the model to learn manifold structures between
samples in each training batch. To ensure the model captures authentic manifold
structures in the data distribution, we advocate adversarial training of the
diffusion generator against a novel structure discriminator in a minimax game,
distinguishing real manifold structures from the generated ones. SADM
substantially improves existing diffusion transformers (DiT) and outperforms
existing methods in image generation and cross-domain fine-tuning tasks across
12 datasets, establishing a new state-of-the-art FID of 1.58 and 2.11 on
ImageNet for class-conditional image generation at resolutions of 256x256 and
512x512, respectively.
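To make the training setup concrete: the "weighted sum of denoising score matching losses" in the abstract is the standard diffusion objective, and SADM adds an adversarial structure term on top of it in a minimax game. The form below is a reconstruction from the abstract, not the paper's exact notation; \(\mathcal{L}_{\mathrm{struct}}\) and \(\lambda\) are placeholders.

```latex
\mathcal{L}_{\mathrm{DSM}}(\theta)
   = \mathbb{E}_{t,\,x_0,\,\epsilon}\Bigl[\, w(t)\,
       \bigl\lVert \epsilon_\theta(x_t, t) - \epsilon \bigr\rVert_2^2 \Bigr],
   \qquad x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon

% SADM-style minimax game (form assumed from the abstract): the diffusion
% generator \theta minimizes, the structure discriminator \phi maximizes.
\min_{\theta} \max_{\phi}\;
   \mathcal{L}_{\mathrm{DSM}}(\theta)
   + \lambda\, \mathcal{L}_{\mathrm{struct}}(\theta, \phi)
```

A minimal sketch of how one such training step could look in PyTorch is given below. The structure representation (pairwise cosine similarities within a batch), the discriminator architecture, the noise schedule, and the weighting `lam=0.1` are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pairwise_structure(x: torch.Tensor) -> torch.Tensor:
    """Pairwise cosine similarities within a mini-batch, used here as a simple
    proxy for the 'manifold structure' among samples (an assumption, not
    necessarily the paper's exact structure representation)."""
    flat = F.normalize(x.flatten(1), dim=1)
    return flat @ flat.t()  # (B, B) similarity matrix

class StructureDiscriminator(nn.Module):
    """Scores whether a batch-level structure matrix comes from real data."""
    def __init__(self, batch_size: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(batch_size * batch_size, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s.flatten())

def sadm_step(denoiser, disc, x0, opt_g, opt_d, T=1000, lam=0.1):
    """One minimax update: the discriminator learns to tell real batch structure
    from the structure of the model's one-step x0 predictions, while the
    diffusion model minimizes DSM plus an adversarial structure term."""
    b = x0.size(0)
    t = torch.randint(0, T, (b,), device=x0.device)
    eps = torch.randn_like(x0)
    # Toy cosine noise schedule; the paper's schedule may differ.
    abar = torch.cos(0.5 * torch.pi * t.float() / T).view(b, 1, 1, 1) ** 2
    x_t = abar.sqrt() * x0 + (1 - abar).sqrt() * eps

    def predict_x0():
        eps_hat = denoiser(x_t, t)
        return eps_hat, (x_t - (1 - abar).sqrt() * eps_hat) / abar.sqrt()

    # Discriminator update: real vs. generated batch structure.
    _, x0_hat = predict_x0()
    d_real = disc(pairwise_structure(x0))
    d_fake = disc(pairwise_structure(x0_hat.detach()))
    loss_d = (F.softplus(-d_real) + F.softplus(d_fake)).mean()
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator (diffusion model) update: DSM plus adversarial structure loss.
    eps_hat, x0_hat = predict_x0()
    loss_dsm = F.mse_loss(eps_hat, eps)  # denoising score matching
    loss_adv = F.softplus(-disc(pairwise_structure(x0_hat))).mean()
    loss_g = loss_dsm + lam * loss_adv
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_g.item(), loss_d.item()
```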
Related papers
- Diffusion Models without Classifier-free Guidance [41.59396565229466]
Model-guidance (MG) is a novel objective for training diffusion models that addresses and removes the commonly used classifier-free guidance (CFG).
Our approach goes beyond standard score modeling by incorporating the posterior probability of conditions.
Our method significantly accelerates training, doubles inference speed, and achieves quality that parallels and even surpasses concurrent diffusion models that use CFG.
arXiv Detail & Related papers (2025-02-17T18:59:50Z) - Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering [15.326641037243006]
Diffusion models can effectively learn the image distribution and generate new samples.
We provide theoretical insights into this phenomenon by leveraging key empirical observations.
We show that the minimal number of samples required to learn the underlying distribution scales linearly with the intrinsic dimension.
arXiv Detail & Related papers (2024-09-04T04:14:02Z) - Constrained Diffusion Models via Dual Training [80.03953599062365]
Diffusion processes are prone to generating samples that reflect biases in a training dataset.
We develop constrained diffusion models by imposing diffusion constraints based on desired distributions.
We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off between the objective and the constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z) - Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z) - Adaptive Training Meets Progressive Scaling: Elevating Efficiency in Diffusion Models [52.1809084559048]
We propose a novel two-stage divide-and-conquer training strategy termed TDC Training.
It groups timesteps based on task similarity and difficulty, assigning highly customized denoising models to each group, thereby enhancing the performance of diffusion models.
While two-stage training avoids the need to train each model separately, the total training cost is even lower than training a single unified denoising model.
arXiv Detail & Related papers (2023-12-20T03:32:58Z) - Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction [75.91471250967703]
We introduce a novel sampling framework called Steerable Conditional Diffusion.
This framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement.
We achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities.
arXiv Detail & Related papers (2023-08-28T08:47:06Z) - Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models [77.83923746319498]
We propose a framework called Diff-Instruct to instruct the training of arbitrary generative models.
We show that Diff-Instruct results in state-of-the-art single-step diffusion-based models.
Experiments on refining GAN models show that Diff-Instruct consistently improves the pre-trained generators of GAN models.
arXiv Detail & Related papers (2023-05-29T04:22:57Z) - On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z) - Few-Shot Diffusion Models [15.828257653106537]
We present Few-Shot Diffusion Models (FSDM), a framework for few-shot generation leveraging conditional DDPMs.
FSDM is trained to adapt the generative process conditioned on a small set of images from a given class by aggregating image patch information.
We empirically show that FSDM can perform few-shot generation and transfer to new datasets.
arXiv Detail & Related papers (2022-05-30T23:20:33Z)