Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
- URL: http://arxiv.org/abs/2403.12015v1
- Date: Mon, 18 Mar 2024 17:51:43 GMT
- Title: Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
- Authors: Axel Sauer, Frederic Boesel, Tim Dockhorn, Andreas Blattmann, Patrick Esser, Robin Rombach
- Abstract summary: Distillation methods aim to shift the model from many-shot to single-step inference.
We introduce Latent Adversarial Diffusion Distillation (LADD), a novel distillation approach overcoming the limitations of ADD.
In contrast to pixel-based ADD, LADD utilizes generative features from pretrained latent diffusion models.
- Score: 24.236841051249243
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models are the main driver of progress in image and video synthesis, but suffer from slow inference speed. Distillation methods, like the recently introduced adversarial diffusion distillation (ADD) aim to shift the model from many-shot to single-step inference, albeit at the cost of expensive and difficult optimization due to its reliance on a fixed pretrained DINOv2 discriminator. We introduce Latent Adversarial Diffusion Distillation (LADD), a novel distillation approach overcoming the limitations of ADD. In contrast to pixel-based ADD, LADD utilizes generative features from pretrained latent diffusion models. This approach simplifies training and enhances performance, enabling high-resolution multi-aspect ratio image synthesis. We apply LADD to Stable Diffusion 3 (8B) to obtain SD3-Turbo, a fast model that matches the performance of state-of-the-art text-to-image generators using only four unguided sampling steps. Moreover, we systematically investigate its scaling behavior and demonstrate LADD's effectiveness in various applications such as image editing and inpainting.
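The central idea of LADD, as the abstract describes it, is to run the adversarial game on latents rather than pixels: the discriminator judges generative features of a pretrained latent diffusion model instead of a fixed pixel-space DINOv2 network. The toy below sketches only that structure; the linear "critic", the hinge losses, and all shapes are illustrative stand-ins, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(z, w):
    """Toy critic: a linear score on flattened latent features.
    Stands in for a discriminator built on a latent model's features."""
    return float(z.ravel() @ w)

def hinge_d_loss(real_score, fake_score):
    # Standard hinge loss for the discriminator.
    return max(0.0, 1.0 - real_score) + max(0.0, 1.0 + fake_score)

def hinge_g_loss(fake_score):
    # The one-step student wants the critic to score its latents highly.
    return -fake_score

# Latents, not pixels: the discriminator sees small latent tensors
# (the 4x8x8 shape is made up) instead of decoded images.
z_teacher = rng.normal(size=(4, 8, 8))   # latent from the teacher/data side
z_student = rng.normal(size=(4, 8, 8))   # one-step student sample
w = rng.normal(size=z_teacher.size)

d_loss = hinge_d_loss(discriminator(z_teacher, w), discriminator(z_student, w))
g_loss = hinge_g_loss(discriminator(z_student, w))
```

Working in latent space is what removes the expensive pixel-space discriminator the abstract attributes to ADD; everything else in this sketch is generic adversarial training.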
Related papers
- Distilling Diffusion Models into Conditional GANs [90.76040478677609]
We distill a complex multistep diffusion model into a single-step conditional GAN student model.
For efficient regression loss, we propose E-LatentLPIPS, a perceptual loss operating directly in diffusion model's latent space.
We demonstrate that our one-step generator outperforms cutting-edge one-step diffusion distillation models.
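The E-LatentLPIPS idea above, a perceptual regression loss computed directly on latents rather than decoded pixels, can be caricatured as follows. The "feature extractor" here is a made-up scalar-layer toy, not the actual ensembled LPIPS-style network the paper trains.

```python
import numpy as np

def latent_features(z, kernels):
    """Toy stand-in for a perceptual feature extractor that runs on
    latents instead of decoded pixels (the E-LatentLPIPS idea)."""
    return [np.tanh(z * k) for k in kernels]  # hypothetical "layers"

def latent_perceptual_loss(z_a, z_b, kernels):
    # Mean squared distance accumulated over feature "layers".
    return sum(float(np.mean((fa - fb) ** 2))
               for fa, fb in zip(latent_features(z_a, kernels),
                                 latent_features(z_b, kernels)))

rng = np.random.default_rng(1)
z1 = rng.normal(size=(4, 8, 8))
kernels = [0.5, 1.0, 2.0]  # illustrative layer scalings

loss_same = latent_perceptual_loss(z1, z1, kernels)        # identical latents
loss_diff = latent_perceptual_loss(z1, z1 + 0.1, kernels)  # perturbed latents
```

Skipping the decoder on every regression step is what makes such a loss cheap enough for distillation training.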
arXiv Detail & Related papers (2024-05-09T17:59:40Z)
- Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis [20.2271205957037]
Hyper-SD is a novel framework that amalgamates the advantages of ODE Trajectory Preservation and Reformulation.
We introduce Trajectory Segmented Consistency Distillation to progressively perform consistent distillation within pre-defined time-step segments.
We incorporate human feedback learning to boost the performance of the model in a low-step regime and mitigate the performance loss incurred by the distillation process.
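The "pre-defined time-step segments" of Trajectory Segmented Consistency Distillation amount to partitioning the diffusion timestep range and enforcing consistency within each segment, progressively coarsening toward a single segment. The helper below only illustrates the partitioning; the segment counts and step total are made-up numbers, not the paper's schedule.

```python
import numpy as np

def segment_timesteps(num_steps, num_segments):
    """Split the timestep range [0, num_steps] into contiguous segments,
    as in trajectory-segmented consistency distillation (boundaries are
    illustrative; the actual schedule may differ)."""
    edges = np.linspace(0, num_steps, num_segments + 1, dtype=int)
    return [(int(edges[i]), int(edges[i + 1])) for i in range(num_segments)]

# A hypothetical progressive schedule: distill within 4 segments,
# then 2, then finally treat the whole trajectory as one segment.
schedule = [segment_timesteps(1000, k) for k in (4, 2, 1)]
```

Consistency is easier to enforce over a short sub-trajectory than over the full one, which is the motivation for starting with many segments.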
arXiv Detail & Related papers (2024-04-21T15:16:05Z)
- AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation [43.62480338471837]
Blind super-resolution methods based on Stable Diffusion showcase formidable generative capabilities in reconstructing clear high-resolution images with intricate details from low-resolution inputs.
Their practical applicability is often hampered by poor efficiency, stemming from the requirement of thousands or hundreds of sampling steps.
Inspired by the efficient adversarial diffusion distillation (ADD), we design AddSR to address this issue by incorporating the ideas of both distillation and ControlNet.
arXiv Detail & Related papers (2024-04-02T08:07:38Z)
- Efficient Diffusion Model for Image Restoration by Residual Shifting [63.02725947015132]
This study proposes a novel and efficient diffusion model for image restoration.
Our method avoids the need for post-acceleration during inference, thereby avoiding the associated performance deterioration.
Our method achieves superior or comparable performance to current state-of-the-art methods on three classical IR tasks.
arXiv Detail & Related papers (2024-03-12T05:06:07Z)
- LoRA-Enhanced Distillation on Guided Diffusion Models [0.0]
This research explores a novel approach that combines Low-Rank Adaptation (LoRA) with model distillation to efficiently compress diffusion models.
Results are remarkable, featuring a significant reduction in inference time due to the distillation process and a substantial 50% reduction in memory consumption.
arXiv Detail & Related papers (2023-12-12T00:01:47Z)
- Adversarial Diffusion Distillation [18.87099764514747]
Adversarial Diffusion Distillation (ADD) is a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1-4 steps.
We use score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal.
Our model clearly outperforms existing few-step methods in a single step and reaches the performance of state-of-the-art diffusion models (SDXL) in only four steps.
arXiv Detail & Related papers (2023-11-28T18:53:24Z)
- ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model [68.16230122583634]
ToddlerDiffusion is an interpretable 2D diffusion image-synthesis framework inspired by the human generation system.
Our approach decomposes the generation process into simpler, interpretable stages: generating contours, a palette, and a detailed colored image.
arXiv Detail & Related papers (2023-11-24T15:20:01Z)
- SinSR: Diffusion-Based Image Super-Resolution in a Single Step [119.18813219518042]
Super-resolution (SR) methods based on diffusion models exhibit promising results.
But their practical application is hindered by the substantial number of required inference steps.
We propose a simple yet effective method for achieving single-step SR generation, named SinSR.
arXiv Detail & Related papers (2023-11-23T16:21:29Z)
- ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting [70.83632337581034]
Diffusion-based image super-resolution (SR) methods are mainly limited by the low inference speed.
We propose a novel and efficient diffusion model for SR that significantly reduces the number of diffusion steps.
Our method constructs a Markov chain that transfers between the high-resolution image and the low-resolution image by shifting the residual.
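The residual-shifting chain described above can be sketched by its marginals: each state interpolates from the high-resolution image toward the low-resolution one by shifting the residual e = y_lr - x_hr, with optional noise. The schedule and noise scaling below are simplified illustrations, not ResShift's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(2)

def shift_residual(x_hr, y_lr, eta_t, kappa=0.0, rng=rng):
    """One marginal of a residual-shifting chain: move from the HR image
    toward the LR image by shifting the residual e = y_lr - x_hr.
    eta_t grows monotonically from 0 to 1 over the chain; kappa scales
    an optional noise term (schedule details here are illustrative)."""
    e = y_lr - x_hr
    noise = kappa * np.sqrt(eta_t) * rng.normal(size=x_hr.shape)
    return x_hr + eta_t * e + noise

x_hr = np.ones((8, 8))    # pretend high-resolution image
y_lr = np.zeros((8, 8))   # pretend (upsampled) low-resolution image

start = shift_residual(x_hr, y_lr, 0.0)  # chain starts at the HR image
end = shift_residual(x_hr, y_lr, 1.0)    # and ends at the LR image
```

Because the chain ends at the LR input rather than at pure Gaussian noise, far fewer diffusion steps are needed to traverse it, which is the efficiency claim of the abstract.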
arXiv Detail & Related papers (2023-07-23T15:10:02Z)
- Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from slow inference, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.