Improving Diffusion-Based Generative Models via Approximated Optimal Transport
- URL: http://arxiv.org/abs/2403.05069v1
- Date: Fri, 8 Mar 2024 05:43:00 GMT
- Title: Improving Diffusion-Based Generative Models via Approximated Optimal Transport
- Authors: Daegyu Kim, Jooyoung Choi, Chaehun Shin, Uiwon Hwang, Sungroh Yoon
- Abstract summary: We introduce the Approximated Optimal Transport technique, a novel training scheme for diffusion-based generative models.
We achieve superior image quality and reduced sampling steps by employing AOT in training.
- Score: 41.25847212384836
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce the Approximated Optimal Transport (AOT) technique, a novel
training scheme for diffusion-based generative models. Our approach aims to
approximate and integrate optimal transport into the training process,
significantly enhancing the ability of diffusion models to estimate the
denoiser outputs accurately. This improvement leads to ODE trajectories of
diffusion models with lower curvature and reduced truncation errors during
sampling. We achieve superior image quality and reduced sampling steps by
employing AOT in training. Specifically, we achieve FID scores of 1.88 with
just 27 NFEs and 1.73 with 29 NFEs in unconditional and conditional
generations, respectively. Furthermore, when applying AOT to train the
discriminator for guidance, we establish new state-of-the-art FID scores of
1.68 and 1.58 for unconditional and conditional generations, respectively, each
with 29 NFEs. This outcome demonstrates the effectiveness of AOT in enhancing
the performance of diffusion models.
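The core idea of pairing data with noise under an approximate transport cost can be illustrated with a minibatch assignment. The sketch below shows the general minibatch-OT pairing idea, not necessarily the authors' exact AOT scheme; `denoiser`, `edm_loss`, and the squared-distance cost are hypothetical placeholders.

```python
# Minimal sketch: approximate an optimal-transport pairing between a minibatch
# of images and a minibatch of Gaussian noise, then train on the matched pairs.
# Illustrative only -- not necessarily the authors' exact AOT algorithm.
import torch
from scipy.optimize import linear_sum_assignment

def aot_pair_noise(images: torch.Tensor, noise: torch.Tensor) -> torch.Tensor:
    """Reorder `noise` so each image gets the noise sample minimizing the
    total squared transport cost over the batch (Hungarian assignment)."""
    cost = torch.cdist(images.flatten(1), noise.flatten(1)) ** 2  # (B, B) costs
    _, cols = linear_sum_assignment(cost.detach().cpu().numpy())
    return noise[cols]

# Hypothetical use inside a training step:
#   eps = aot_pair_noise(x, torch.randn_like(x))
#   loss = edm_loss(denoiser, x, eps, sigma)   # placeholder diffusion loss
```

Pairing each image with nearby noise rather than i.i.d. noise is what straightens the ODE trajectories, which in turn reduces truncation error at low NFE counts.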
Related papers
- Diffusion Models without Classifier-free Guidance [41.59396565229466]
Model-guidance (MG) is a novel objective for training diffusion models that removes the need for the commonly used classifier-free guidance (CFG); a sketch of the CFG combination it eliminates follows this entry.
The approach goes beyond standard conditional modeling by incorporating the posterior probability of the conditions.
The method significantly accelerates training, doubles inference speed, and achieves exceptional quality that parallels, and even surpasses, concurrent diffusion models that use CFG.
arXiv Detail & Related papers (2025-02-17T18:59:50Z)
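For context, the classifier-free guidance that MG aims to eliminate blends conditional and unconditional predictions at sampling time. A minimal sketch, assuming a `model(x, t, cond)` interface that returns a noise estimate:

```python
import torch

def cfg_eps(model, x, t, cond, w: float = 3.0) -> torch.Tensor:
    """Classifier-free guidance: extrapolate from the unconditional toward
    the conditional prediction with guidance scale w (w = 0 disables it)."""
    eps_uncond = model(x, t, None)   # null-conditioned prediction
    eps_cond = model(x, t, cond)     # condition-aware prediction
    return eps_uncond + w * (eps_cond - eps_uncond)
```

MG trains a single model so that this two-pass combination becomes unnecessary at inference, which is where the claimed doubling of inference speed comes from.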
- Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
- Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI).
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion).
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z)
- Adaptive Training Meets Progressive Scaling: Elevating Efficiency in Diffusion Models [52.1809084559048]
We propose a novel two-stage divide-and-conquer training strategy termed TDC Training.
It groups timesteps based on task similarity and difficulty, assigning highly customized denoising models to each group, thereby enhancing the performance of diffusion models.
The two-stage scheme avoids training each specialized model separately, and its total training cost is even lower than that of training a single unified denoising model (a routing sketch follows this entry).
arXiv Detail & Related papers (2023-12-20T03:32:58Z)
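The grouping idea can be pictured as routing each timestep to a specialist denoiser. The sketch below uses uniform timestep intervals purely for illustration; TDC groups by task similarity and difficulty, which need not be uniform.

```python
import torch

class GroupedDenoiser(torch.nn.Module):
    """Dispatch each timestep to the denoiser responsible for its group.
    Uniform interval grouping is an illustrative assumption only."""
    def __init__(self, experts: list, t_max: int):
        super().__init__()
        self.experts = torch.nn.ModuleList(experts)
        self.t_max = t_max

    def forward(self, x: torch.Tensor, t: int) -> torch.Tensor:
        # Map timestep t in [0, t_max) to one of len(experts) groups.
        k = min(t * len(self.experts) // self.t_max, len(self.experts) - 1)
        return self.experts[k](x, t)
```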
- Analyzing and Improving Optimal-Transport-based Adversarial Networks [9.980822222343921]
The Optimal Transport (OT) problem aims to find a transport plan that bridges two distributions while minimizing a given cost function (the Kantorovich formulation is recalled after this entry).
OT theory has been widely utilized in generative modeling.
Our approach achieves an FID score of 2.51 on CIFAR-10 and 5.99 on CelebA-HQ-256, outperforming unified OT-based adversarial approaches.
arXiv Detail & Related papers (2023-10-04T06:52:03Z)
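For reference, the Kantorovich form of the OT problem mentioned above seeks a coupling of the two distributions with minimal expected cost:

```latex
\min_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\,\mathrm{d}\pi(x,y),
\qquad
\Pi(\mu,\nu) = \{\pi \ge 0 \mid \pi(\cdot \times \mathcal{Y}) = \mu,\;
\pi(\mathcal{X} \times \cdot) = \nu\}.
```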
- Protein Design with Guided Discrete Diffusion [67.06148688398677]
A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling.
We propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models.
NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods.
arXiv Detail & Related papers (2023-05-31T16:31:24Z)
- Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport [9.980822222343921]
We propose a novel generative model based on the semi-dual formulation of Unbalanced Optimal Transport (UOT).
Unlike OT, UOT relaxes the hard constraint on distribution matching (a standard formulation follows this entry). This relaxation provides better robustness against outliers, stability during training, and faster convergence.
Our model outperforms existing OT-based generative models, achieving FID scores of 2.97 on CIFAR-10 and 6.36 on CelebA-HQ-256.
arXiv Detail & Related papers (2023-05-24T06:31:05Z)
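One standard way to write the relaxation mentioned above replaces the hard marginal constraints of OT with divergence penalties; with KL penalties (a common choice, assumed here) the unbalanced problem reads:

```latex
\min_{\pi \ge 0}\; \int c(x,y)\,\mathrm{d}\pi(x,y)
\;+\; \tau_1\,\mathrm{KL}(\pi_x \,\|\, \mu)
\;+\; \tau_2\,\mathrm{KL}(\pi_y \,\|\, \nu),
```

where \pi_x, \pi_y are the marginals of \pi and \tau_1, \tau_2 control how strictly the marginals must match; the paper works with a semi-dual of a formulation of this type.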
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach generates images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach generates high-fidelity images using as few as 1 to 4 denoising steps (a simplified distillation sketch follows this entry).
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
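A simplified, single-stage caricature of guided-model distillation (the paper itself uses a two-stage procedure; `cfg_eps` is the guidance helper sketched earlier, and the w-conditioned student interface is an assumption):

```python
import torch

def distill_step(student, teacher, x, t, cond, opt) -> float:
    """One guided-distillation step: the student, conditioned on the guidance
    scale w, learns to match the teacher's two-pass CFG output in one pass."""
    w = float(torch.rand(())) * 4.0               # random guidance scale in [0, 4)
    with torch.no_grad():
        target = cfg_eps(teacher, x, t, cond, w)  # two teacher forward passes
    pred = student(x, t, cond, w)                 # one student forward pass
    loss = torch.mean((pred - target) ** 2)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

Folding the guidance scale into the student is what removes the second forward pass per step, and progressive distillation then cuts the step count itself.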
This list is automatically generated from the titles and abstracts of the papers on this site.