Adding Additional Control to One-Step Diffusion with Joint Distribution Matching
- URL: http://arxiv.org/abs/2503.06652v2
- Date: Wed, 12 Mar 2025 12:24:26 GMT
- Title: Adding Additional Control to One-Step Diffusion with Joint Distribution Matching
- Authors: Yihong Luo, Tianyang Hu, Yifan Song, Jiacheng Sun, Zhenguo Li, Jing Tang
- Abstract summary: JDM is a novel approach that minimizes the reverse KL divergence between image-condition joint distributions. By deriving a tractable upper bound, JDM decouples fidelity learning from condition learning. This asymmetric distillation scheme enables our one-step student to handle controls unknown to the teacher model.
- Score: 58.37264951734603
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While diffusion distillation has enabled one-step generation through methods like Variational Score Distillation, adapting distilled models to emerging new controls -- such as novel structural constraints or the latest user preferences -- remains challenging. Conventional approaches typically require modifying the base diffusion model and redistilling it -- a process that is both computationally intensive and time-consuming. To address these challenges, we introduce Joint Distribution Matching (JDM), a novel approach that minimizes the reverse KL divergence between image-condition joint distributions. By deriving a tractable upper bound, JDM decouples fidelity learning from condition learning. This asymmetric distillation scheme enables our one-step student to handle controls unknown to the teacher model and facilitates improved classifier-free guidance (CFG) usage and seamless integration of human feedback learning (HFL). Experimental results demonstrate that JDM, with just one step, surpasses baseline methods such as multi-step ControlNet in most cases, while achieving state-of-the-art performance in one-step text-to-image synthesis through improved CFG usage or HFL integration.
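For intuition, the decoupling described in the abstract matches the chain rule for KL divergence on the joint distribution; the following is an illustrative decomposition (with q_theta the one-step student and p the target joint), not the paper's exact bound:

```latex
D_{\mathrm{KL}}\big(q_\theta(x,c)\,\|\,p(x,c)\big)
  = \underbrace{D_{\mathrm{KL}}\big(q_\theta(x)\,\|\,p(x)\big)}_{\text{fidelity}}
  + \underbrace{\mathbb{E}_{x\sim q_\theta(x)}\Big[D_{\mathrm{KL}}\big(q_\theta(c\mid x)\,\|\,p(c\mid x)\big)\Big]}_{\text{condition learning}}
```

Bounding the two terms separately is what lets fidelity come from a condition-agnostic teacher while condition learning uses a separate signal, which is consistent with the "asymmetric distillation" claim.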
Related papers
- Taming Flow Matching with Unbalanced Optimal Transport into Fast Pansharpening [10.23957420290553]
We propose the Optimal Transport Flow Matching framework to achieve one-step, high-quality pansharpening.
The OTFM framework enables simulation-free training and single-step inference while maintaining strict adherence to pansharpening constraints.
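For context, here is a minimal sketch of the standard (conditional) flow-matching objective that OTFM builds on; the unbalanced-OT coupling of (x0, x1) pairs is not shown, and the `v_theta` signature is an assumption:

```python
import torch

def flow_matching_loss(v_theta, x0, x1):
    """Regress a velocity field onto the straight-line velocity between a
    source sample x0 (e.g. upsampled low-res input or noise) and a target
    sample x1. Generic objective only; OTFM's unbalanced-OT pairing of
    (x0, x1) is not reproduced here."""
    t = torch.rand(x0.shape[0], 1, 1, 1, device=x0.device)  # per-sample time in [0, 1)
    xt = (1 - t) * x0 + t * x1   # point on the linear interpolation path
    target = x1 - x0             # constant velocity along that path
    return ((v_theta(xt, t.flatten()) - target) ** 2).mean()
```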
arXiv Detail & Related papers (2025-03-19T08:10:49Z)
- Efficient Distillation of Classifier-Free Guidance using Adapters [0.0]
Adapter guidance distillation (AGD) is a novel approach that simulates CFG in a single forward pass.
AGD keeps the base model frozen and only trains minimal additional parameters.
We show that AGD achieves comparable or superior FID to CFG across multiple architectures.
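A sketch of the distillation target this implies, assuming the standard two-pass CFG formulation (the `student(x_t, t, cond, w)` interface is a hypothetical stand-in):

```python
import torch

def cfg_two_pass(eps_model, x_t, t, cond, w):
    """Standard classifier-free guidance: two forward passes per step."""
    eps_c = eps_model(x_t, t, cond)
    eps_u = eps_model(x_t, t, None)  # unconditional pass
    return eps_u + w * (eps_c - eps_u)

def agd_distillation_loss(student, teacher, x_t, t, cond, w):
    """Train the adapter (the student's only trainable part) so that a
    single conditional pass matches the two-pass guided teacher output."""
    with torch.no_grad():
        target = cfg_two_pass(teacher, x_t, t, cond, w)
    pred = student(x_t, t, cond, w)  # one forward pass
    return ((pred - target) ** 2).mean()
```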
arXiv Detail & Related papers (2025-03-10T12:55:08Z)
- One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step.
To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration.
Our method achieves strong performance on both full and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z)
- Sequential Controlled Langevin Diffusions [80.93988625183485]
Two popular methods are (1) Sequential Monte Carlo (SMC), where the transport is performed through successive densities via prescribed Markov chains and resampling steps, and (2) recently developed diffusion-based sampling methods, where a learned dynamical transport is used.
We present a principled framework for combining SMC with diffusion-based samplers by viewing both methods in continuous time and considering measures on path space.
This culminates in the new Sequential Controlled Langevin Diffusion (SCLD) sampling method, which is able to utilize the benefits of both methods and reaches improved performance on multiple benchmark problems, in many cases using only 10% of the training budget of previous diffusion-based samplers.
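For orientation, a generic skeleton of SMC with Langevin proposals and ESS-based resampling; the learned control and path-space weights that define SCLD are abstracted into the two function arguments:

```python
import numpy as np

def smc_langevin(drift, log_weight_increment, x, n_steps, step_size, rng):
    """Skeleton of SMC with Langevin-type proposals: move particles with a
    (possibly learned/controlled) drift, accumulate importance weights, and
    resample when the effective sample size (ESS) collapses."""
    n = x.shape[0]
    logw = np.zeros(n)
    for k in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + step_size * drift(x, k) + np.sqrt(2.0 * step_size) * noise
        logw += log_weight_increment(x, k)  # e.g. annealing density ratio
        w = np.exp(logw - logw.max())
        w /= w.sum()
        if 1.0 / np.sum(w ** 2) < n / 2:    # ESS below half the particle count
            idx = rng.choice(n, size=n, p=w)
            x, logw = x[idx], np.zeros(n)
    return x
```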
arXiv Detail & Related papers (2024-12-10T00:47:10Z)
- SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance [12.973835034100428]
This paper presents SNOOPI, a novel framework designed to enhance the guidance in one-step diffusion models during both training and inference.
First, by varying the guidance scale of both teacher models, we broaden their output distributions, resulting in a more robust VSD loss that enables the one-step student (SB) to perform effectively across diverse backbones while maintaining competitive performance.
Second, we propose a training-free method called Negative-Away Steer Attention (NASA), which integrates negative prompts into one-step diffusion models via cross-attention to suppress undesired elements in generated images.
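One plausible reading of the NASA mechanism, reconstructed from the abstract alone (the combination rule and `alpha` are our assumptions, not the paper's exact design):

```python
import torch
import torch.nn.functional as F

def negative_away_cross_attention(q, k_pos, v_pos, k_neg, v_neg, alpha=0.5):
    """Attend separately to positive- and negative-prompt tokens, then steer
    the output away from the negative branch within a single forward pass."""
    d = q.shape[-1]
    attn_pos = F.softmax(q @ k_pos.transpose(-1, -2) / d ** 0.5, dim=-1) @ v_pos
    attn_neg = F.softmax(q @ k_neg.transpose(-1, -2) / d ** 0.5, dim=-1) @ v_neg
    return attn_pos - alpha * (attn_neg - attn_pos)  # extrapolate away from negatives
```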
arXiv Detail & Related papers (2024-12-03T18:56:32Z)
- Joint Diffusion models in Continual Learning [4.013156524547073]
We introduce JDCL - a new method for continual learning with generative rehearsal based on joint diffusion models.
Generative-replay-based continual learning methods try to mitigate catastrophic forgetting by retraining a model on a combination of new data and rehearsal data sampled from a generative model.
We show that the joint model's shared parametrization, combined with the knowledge distillation technique, allows for stable adaptation to new tasks without catastrophic forgetting.
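A minimal sketch of the generative-rehearsal loop described above (the generic pattern, not JDCL's exact recipe; `frozen.sample` returning a batch of inputs and labels is a hypothetical interface):

```python
import copy
import torch

def train_task(model, new_loader, task_loss, kd_loss, opt, replay_frac=0.5):
    """Freeze a copy of the current joint model, sample rehearsal data from
    it, and train on a mix of new and replayed samples with a distillation
    term against the frozen copy."""
    frozen = copy.deepcopy(model).eval()
    for x_new, y_new in new_loader:
        with torch.no_grad():
            x_old, y_old = frozen.sample(int(replay_frac * len(x_new)))
        x = torch.cat([x_new, x_old])
        y = torch.cat([y_new, y_old])
        loss = task_loss(model, x, y) + kd_loss(model, frozen, x)
        opt.zero_grad()
        loss.backward()
        opt.step()
```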
arXiv Detail & Related papers (2024-11-12T22:35:44Z)
- Improving Vector-Quantized Image Modeling with Latent Consistency-Matching Diffusion [55.185588994883226]
We introduce VQ-LCMD, a continuous-space latent diffusion framework within the embedding space that stabilizes training.
VQ-LCMD uses a novel training objective combining the joint embedding-diffusion variational lower bound with a consistency-matching (CM) loss.
Experiments show that the proposed VQ-LCMD yields superior results on FFHQ, LSUN Churches, and LSUN Bedrooms compared to discrete-state latent diffusion models.
arXiv Detail & Related papers (2024-10-18T09:12:33Z)
- Multi-Granularity Semantic Revision for Large Language Model Distillation [66.03746866578274]
We propose a multi-granularity semantic revision method for LLM distillation.
At the sequence level, we propose a sequence correction and re-generation strategy.
At the token level, we design a distribution-adaptive clipping Kullback-Leibler loss as the distillation objective function (a sketch follows this list).
At the span level, we leverage the span priors of a sequence to compute the probability correlations within spans, and constrain the teacher and student's probability correlations to be consistent.
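A hedged guess at what the token-level clipping KL objective could look like; a fixed `eps` stands in for whatever adaptive rule the paper actually uses:

```python
import torch
import torch.nn.functional as F

def clipped_kl(teacher_logits, student_logits, eps=2.0):
    """Forward KL with per-token log-ratio clipping, so that a few tokens
    where teacher and student disagree wildly cannot dominate the
    distillation gradient."""
    p = F.softmax(teacher_logits, dim=-1)
    log_q = F.log_softmax(student_logits, dim=-1)
    log_ratio = torch.log(p + 1e-9) - log_q
    log_ratio = log_ratio.clamp(-eps, eps)  # clip each token's contribution
    return (p * log_ratio).sum(dim=-1).mean()
```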
arXiv Detail & Related papers (2024-07-14T03:51:49Z)
- Improving Consistency Models with Generator-Augmented Flows [16.049476783301724]
Consistency models imitate the multi-step sampling of score-based diffusion in a single forward pass of a neural network.
They can be learned in two ways: consistency distillation and consistency training.
We propose a novel flow that transports noisy data towards their corresponding outputs derived from a consistency model.
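As background for this entry, a minimal sketch of the standard consistency-training objective (the generic formulation, not the paper's generator-augmented flow; `f_theta` and `f_ema` are hypothetical callables):

```python
import torch

def consistency_training_loss(f_theta, f_ema, x, t1, t2):
    """Outputs of the consistency function at two adjacent noise levels on
    the same trajectory are forced to agree (t2 > t1); f_ema is a frozen
    EMA copy of f_theta used as the target network."""
    z = torch.randn_like(x)
    with torch.no_grad():
        target = f_ema(x + t1 * z, t1)  # target at the smaller noise level
    pred = f_theta(x + t2 * z, t2)
    return ((pred - target) ** 2).mean()
```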
arXiv Detail & Related papers (2024-06-13T20:22:38Z)
- Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models.
They rely on sequential denoising steps during sample generation.
We propose a novel method that integrates denoising phases directly into the model's architecture.
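The title suggests GAN-style distillation; here is a generic adversarial knowledge-distillation step as a hedged sketch (it assumes a sigmoid-output discriminator and does not reproduce Adv-KD's phase-wise architecture):

```python
import torch

def adv_kd_step(student, discriminator, teacher_samples, noise, opt_g, opt_d):
    """One GAN-style distillation step: the one-step student generator tries
    to fool a discriminator that separates teacher samples from student
    samples."""
    fake = student(noise)
    # discriminator update: real = teacher samples, fake = student samples
    d_loss = -(torch.log(discriminator(teacher_samples) + 1e-8).mean()
               + torch.log(1 - discriminator(fake.detach()) + 1e-8).mean())
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()
    # generator update: push student samples toward "real"
    g_loss = -torch.log(discriminator(fake) + 1e-8).mean()
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```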
arXiv Detail & Related papers (2024-05-31T08:19:44Z)
- One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls [77.42510898755037]
One More Step (OMS) is a compact network that incorporates an additional simple yet effective step during inference.
OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters.
Once trained, various pre-trained diffusion models with the same latent domain can share the same OMS module.
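A hypothetical usage sketch of the plug-and-play pattern the abstract describes (all names and the `oms_module` interface are our assumptions):

```python
def sample_with_oms(oms_module, sampler, base_model, prompt, z_T):
    """One extra learned step maps the true starting noise z_T into the
    latent the frozen base model expects; ordinary sampling then proceeds
    with the base model's parameters untouched."""
    z = oms_module(z_T, prompt)             # the "one more step"
    return sampler(base_model, z, prompt)   # unchanged multi-step sampler
```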
arXiv Detail & Related papers (2023-11-27T12:02:42Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
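A generic sketch of DRO with a parametric likelihood-ratio adversary, to make the setting concrete (this illustrates the objective, not the paper's three specific techniques; all interfaces are assumptions):

```python
import torch

def dro_step(model, adversary, x, y, loss_fn, opt_model, opt_adv, tau=1.0):
    """The adversary reweights examples toward high-loss subpopulations; the
    model minimizes the reweighted loss. `loss_fn` must return per-example
    losses and `adversary` maps inputs to reweighting logits."""
    losses = loss_fn(model(x), y)  # shape (batch,)
    # adversary ascends the reweighted loss (model's losses detached)
    w = torch.softmax(adversary(x).squeeze(-1) / tau, dim=0) * len(losses)
    opt_adv.zero_grad()
    (-(w * losses.detach()).mean()).backward()
    opt_adv.step()
    # model descends under the updated, now-fixed weights
    with torch.no_grad():
        w = torch.softmax(adversary(x).squeeze(-1) / tau, dim=0) * len(losses)
    opt_model.zero_grad()
    (w * losses).mean().backward()
    opt_model.step()
```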
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.