Distribution Matching Distillation Meets Reinforcement Learning
- URL: http://arxiv.org/abs/2511.13649v2
- Date: Wed, 19 Nov 2025 17:27:53 GMT
- Title: Distribution Matching Distillation Meets Reinforcement Learning
- Authors: Dengyang Jiang, Dongyang Liu, Zanyi Wang, Qilong Wu, Liuzhuozheng Li, Hengzhuang Li, Xin Jin, David Liu, Zhen Li, Bo Zhang, Mengmeng Wang, Steven Hoi, Peng Gao, Harry Yang
- Abstract summary: We propose DMDR, a novel framework that incorporates Reinforcement Learning (RL) techniques into the distillation process. We show that, for the RL of the few-step generator, the DMD loss itself is a more effective regularization than the traditional ones. Experiments demonstrate that DMDR achieves leading visual quality and prompt coherence among few-step methods, and even exhibits performance that exceeds the multi-step teacher.
- Score: 30.960105413888943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distribution Matching Distillation (DMD) distills a pre-trained multi-step diffusion model into a few-step one to improve inference efficiency. However, the performance of the latter is often capped by the former. To circumvent this limitation, we propose DMDR, a novel framework that incorporates Reinforcement Learning (RL) techniques into the distillation process. We show that, for the RL of the few-step generator, the DMD loss itself is a more effective regularization than the traditional ones. In turn, RL can guide the mode-coverage process in DMD more effectively. Together, these allow us to unlock the capacity of the few-step generator by conducting distillation and RL simultaneously. Meanwhile, we design dynamic distribution guidance and dynamic renoise sampling training strategies to improve the initial distillation process. Experiments demonstrate that DMDR achieves leading visual quality and prompt coherence among few-step methods, and even exhibits performance that exceeds the multi-step teacher.
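To make the coupling concrete, below is a minimal, hypothetical PyTorch sketch of a DMDR-style joint update: the few-step generator maximizes a reward while the DMD loss serves as the RL regularizer. Every name here (`generator`, `real_score`, `fake_score`, `reward`, `lambda_dmd`) and the simplified renoising schedule are illustrative assumptions, not the paper's actual interfaces.

```python
import torch
import torch.nn.functional as F

def dmd_regularizer(x, t, real_score, fake_score):
    """DMD term used as the RL regularizer: nudge generator samples toward
    the teacher ("real") score and away from the fake-score estimate."""
    eps = torch.randn_like(x)
    t_b = t.view(-1, *([1] * (x.dim() - 1)))
    x_t = x + t_b * eps  # simplified renoising; the paper's schedule may differ
    with torch.no_grad():
        # Approximate distribution-matching gradient: fake score minus real score.
        grad = fake_score(x_t, t) - real_score(x_t, t)
    # Surrogate loss whose gradient w.r.t. the generator follows `grad`.
    return 0.5 * F.mse_loss(x_t, (x_t - grad).detach())

def dmdr_step(generator, real_score, fake_score, reward, z, t, lambda_dmd=1.0):
    """One joint update: RL reward maximization regularized by the DMD loss."""
    x = generator(z)                         # few-step generator sample
    rl_term = -reward(x).mean()              # maximize a differentiable reward
    reg_term = dmd_regularizer(x, t, real_score, fake_score)
    return rl_term + lambda_dmd * reg_term

# Toy wiring (purely illustrative):
gen = torch.nn.Linear(8, 8)
s_real = lambda x, t: -x                     # stand-in teacher score
s_fake = lambda x, t: -0.5 * x               # stand-in fake-score estimate
r = lambda x: -(x ** 2).mean(dim=1)          # stand-in reward model
loss = dmdr_step(gen, s_real, s_fake, r, torch.randn(4, 8), torch.rand(4))
loss.backward()
```

In the actual method the fake-score estimator would itself be updated online and the reward could be optimized with a policy-gradient objective; the sketch only illustrates how the DMD term can slot in as the regularizer for the RL update.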
Related papers
- ReDiF: Reinforced Distillation for Few Step Diffusion [21.686373820429736]
Distillation addresses the slow sampling problem in diffusion models by creating models with a smaller size or fewer steps. We propose a reinforcement-learning-based distillation framework for diffusion models.
arXiv Detail & Related papers (2025-12-28T06:27:24Z)
- Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals [48.14879329270912]
Phased DMD is a multi-step distillation framework that bridges the idea of phase-wise distillation with Mixture-of-Experts. Phased DMD is built upon two key ideas: progressive distribution matching and score matching within subintervals (a toy sketch of the subinterval idea appears after this list). Experimental results demonstrate that Phased DMD preserves output diversity better than DMD while retaining key generative capabilities.
arXiv Detail & Related papers (2025-10-31T17:55:10Z) - Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis [65.77083310980896]
We propose Adversarial Distribution Matching (ADM) to align latent predictions between real and fake score estimators for score distillation. Our proposed method achieves superior one-step performance on SDXL compared to DMD2 while consuming less GPU time. Additional experiments that apply multi-step ADM distillation to SD3-Medium, SD3.5-Large, and CogVideoX set a new benchmark for efficient image and video synthesis.
arXiv Detail & Related papers (2025-07-24T16:45:05Z)
- DDIL: Diversity Enhancing Diffusion Distillation With Imitation Learning [57.3467234269487]
Diffusion models excel at generative modeling (e.g., text-to-image), but sampling requires multiple denoising network passes. Progressive distillation and consistency distillation have shown promise by reducing the number of passes. We show that DDIL consistently improves on the baseline algorithms of progressive distillation (PD), latent consistency models (LCM), and Distribution Matching Distillation (DMD2).
arXiv Detail & Related papers (2024-10-15T18:21:47Z)
- Relational Diffusion Distillation for Efficient Image Generation [27.127061578093674]
The high latency of diffusion models hinders their wide application on edge devices with scarce computing resources. We propose Relational Diffusion Distillation (RDD), a novel distillation method tailored specifically for distilling diffusion models. RDD yields a 1.47 FID decrease at 1 sampling step compared to state-of-the-art diffusion distillation methods, while achieving a 256x speed-up.
arXiv Detail & Related papers (2024-10-10T07:40:51Z)
- Presto! Distilling Steps and Layers for Accelerating Music Generation [49.34961693154768]
Presto! is an approach to inference acceleration for score-based diffusion transformers. We develop a new score-based distribution matching distillation (DMD) method for the EDM family of diffusion models. To reduce the cost per step, we develop a simple but powerful improvement to a recent layer distillation method.
arXiv Detail & Related papers (2024-10-07T16:24:18Z)
- Unleashing the Power of One-Step Diffusion based Image Super-Resolution via a Large-Scale Diffusion Discriminator [81.81748032199813]
Diffusion models have demonstrated excellent performance for real-world image super-resolution (Real-ISR). We propose a new One-Step Diffusion model with a larger-scale Discriminator for SR. Our discriminator is able to distill noisy features from any time step of diffusion models in the latent space.
arXiv Detail & Related papers (2024-10-05T16:41:36Z)
- EM Distillation for One-step Diffusion Models [65.57766773137068]
We propose a maximum-likelihood-based approach that distills a diffusion model to a one-step generator model with minimal loss of quality. We develop a reparametrized sampling scheme and a noise cancellation technique that together stabilize the distillation process.
arXiv Detail & Related papers (2024-05-27T05:55:22Z)
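As noted in the Phased DMD entry above, here is a toy sketch of the "score matching within subintervals" idea: the timestep range is split into phases and a separate expert is trained per phase, so each distillation stage only matches the teacher score inside its own subinterval. The helpers (`phase_boundaries`, `train_phase`), the uniform split, and the stand-in networks are all assumptions for illustration, not that paper's implementation.

```python
import torch

def phase_boundaries(t_max, num_phases):
    # Uniform split of [0, t_max]; the paper may choose boundaries differently.
    edges = torch.linspace(0.0, t_max, num_phases + 1)
    return [(edges[i].item(), edges[i + 1].item()) for i in range(num_phases)]

def train_phase(expert, teacher_score, lo, hi, steps=10, batch=16, dim=8):
    """Score matching restricted to timesteps inside one subinterval."""
    opt = torch.optim.Adam(expert.parameters(), lr=1e-4)
    for _ in range(steps):
        t = lo + (hi - lo) * torch.rand(batch)  # timesteps confined to this phase
        x_t = torch.randn(batch, dim)           # stand-in noisy latents
        loss = ((expert(x_t) - teacher_score(x_t, t)) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return expert

# One expert per subinterval, trained from the high-noise end of the
# schedule down to the low-noise end.
experts = [
    train_phase(torch.nn.Linear(8, 8), lambda x, t: -x, lo, hi)
    for lo, hi in reversed(phase_boundaries(t_max=1.0, num_phases=3))
]
```

Training phases from the high-noise end downward mirrors the "progressive distribution matching" idea; real experts would of course condition on the timestep rather than being plain linear layers.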