One-step Diffusion with Distribution Matching Distillation
- URL: http://arxiv.org/abs/2311.18828v3
- Date: Tue, 5 Dec 2023 16:08:36 GMT
- Title: One-step Diffusion with Distribution Matching Distillation
- Authors: Tianwei Yin, Michaël Gharbi, Richard Zhang, Eli Shechtman, Fredo
Durand, William T. Freeman, Taesung Park
- Abstract summary: We introduce Distribution Matching Distillation (DMD), a procedure to transform a diffusion model into a one-step image generator.
We enforce that the one-step image generator matches the diffusion model at the distribution level by minimizing an approximate KL divergence.
Our method outperforms all published few-step diffusion approaches, reaching 2.62 FID on ImageNet 64x64 and 11.49 FID on zero-shot COCO-30k.
- Score: 50.45103465564635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models generate high-quality images but require dozens of forward
passes. We introduce Distribution Matching Distillation (DMD), a procedure to
transform a diffusion model into a one-step image generator with minimal impact
on image quality. We enforce that the one-step image generator matches the
diffusion model at the distribution level by minimizing an approximate KL
divergence whose gradient can be expressed as the difference between two score
functions, one of
the target distribution and the other of the synthetic distribution being
produced by our one-step generator. The score functions are parameterized as
two diffusion models trained separately on each distribution. Combined with a
simple regression loss matching the large-scale structure of the multi-step
diffusion outputs, our method outperforms all published few-step diffusion
approaches, reaching 2.62 FID on ImageNet 64x64 and 11.49 FID on zero-shot
COCO-30k, comparable to Stable Diffusion but orders of magnitude faster.
Utilizing FP16 inference, our model generates images at 20 FPS on modern
hardware.
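The abstract describes the DMD update: noise the one-step generator's output, evaluate two score functions on it (one for the target distribution, one for the generator's synthetic distribution), and use their difference as the approximate KL gradient. A minimal sketch of that gradient direction, using toy closed-form Gaussian scores as stand-ins for the two separately trained diffusion models (the Gaussian means, noise level, and function names here are illustrative assumptions, not the paper's networks):

```python
import numpy as np

rng = np.random.default_rng(0)

def score_real(x_t):
    # Stand-in for the frozen teacher's score; assumes the target
    # distribution is a unit-variance Gaussian with mean 2.0.
    return -(x_t - 2.0)

def score_fake(x_t):
    # Stand-in for the score model trained on generator samples; assumes
    # the synthetic distribution is a unit-variance Gaussian with mean 0.0.
    return -(x_t - 0.0)

def dmd_gradient(x, sigma_t):
    """Approximate KL gradient w.r.t. the generator output: the fake
    score minus the real score, evaluated at a noised sample."""
    noise = rng.standard_normal(x.shape)
    x_t = x + sigma_t * noise            # forward-diffuse the sample
    return score_fake(x_t) - score_real(x_t)

x = rng.standard_normal(4)               # pretend one-step generator outputs
grad = dmd_gradient(x, sigma_t=0.1)
```

With these toy scores the difference is constant, so a gradient-descent step `x - lr * grad` moves every sample toward the target mean, which is the intended behavior: the update pulls the synthetic distribution toward the real one.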
Related papers
- Regularized Distribution Matching Distillation for One-step Unpaired Image-to-Image Translation [1.8434042562191815]
We introduce Regularized Distribution Matching Distillation, applicable to unpaired image-to-image (I2I) problems.
We demonstrate its empirical performance in application to several translation tasks, including 2D examples and I2I between different image datasets.
arXiv Detail & Related papers (2024-06-20T22:22:31Z) - EM Distillation for One-step Diffusion Models [65.57766773137068]
We propose a maximum likelihood-based approach that distills a diffusion model to a one-step generator model with minimal loss of quality.
We develop a reparametrized sampling scheme and a noise cancellation technique that together stabilizes the distillation process.
arXiv Detail & Related papers (2024-05-27T05:55:22Z) - Distilling Diffusion Models into Conditional GANs [90.76040478677609]
We distill a complex multistep diffusion model into a single-step conditional GAN student model.
For efficient regression loss, we propose E-LatentLPIPS, a perceptual loss operating directly in diffusion model's latent space.
We demonstrate that our one-step generator outperforms cutting-edge one-step diffusion distillation models.
arXiv Detail & Related papers (2024-05-09T17:59:40Z) - SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, leveraging denoising diffusion models to capture internal distribution of patches from a single natural image.
It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales.
Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z) - Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance [95.12230117950232]
We show that a common latent space emerges from two diffusion models trained independently on related domains.
Applying CycleDiffusion to text-to-image diffusion models, we show that large-scale text-to-image diffusion models can be used as zero-shot image-to-image editors.
arXiv Detail & Related papers (2022-10-11T15:53:52Z) - On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.