You Only Need One Step: Fast Super-Resolution with Stable Diffusion via
Scale Distillation
- URL: http://arxiv.org/abs/2401.17258v1
- Date: Tue, 30 Jan 2024 18:49:44 GMT
- Title: You Only Need One Step: Fast Super-Resolution with Stable Diffusion via
Scale Distillation
- Authors: Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat and Georgios
Tzimiropoulos
- Abstract summary: YONOS-SR is a stable diffusion-based approach for image super-resolution that yields state-of-the-art results using only a single DDIM step.
We propose a novel scale distillation approach to train our SR model.
- Score: 42.599077240711
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce YONOS-SR, a novel stable diffusion-based approach
for image super-resolution that yields state-of-the-art results using only a
single DDIM step. We propose a novel scale distillation approach to train our
SR model. Instead of directly training our SR model on the scale factor of
interest, we start by training a teacher model on a smaller magnification
scale, thereby making the SR problem simpler for the teacher. We then train a
student model for a higher magnification scale, using the predictions of the
teacher as a target during the training. This process is repeated iteratively
until we reach the target scale factor of the final model. The rationale behind
our scale distillation is that the teacher aids the student diffusion model
training by i) providing a target adapted to the current noise level rather
than using the same target coming from ground truth data for all noise levels
and ii) providing an accurate target as the teacher has a simpler task to
solve. We empirically show that the distilled model significantly outperforms
the model trained directly for high scales, particularly when few steps are
used during inference. Having a strong diffusion model that requires only one step allows
us to freeze the U-Net and fine-tune the decoder on top of it. We show that the
combination of the scale-distilled U-Net and fine-tuned decoder outperforms
state-of-the-art methods that require 200 steps, using only a single step.
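
To make the training recipe concrete, here is a minimal PyTorch-style sketch of one round of scale distillation as described above. Everything here is an illustrative assumption, not the authors' code: `teacher` and `student` stand in for conditioned SD U-Nets, `add_noise` for the scheduler's forward process, and the dataloader for paired low-resolution inputs and high-resolution latents. How the smaller-scale teacher is queried on the larger-scale problem is abstracted behind its call.

```python
import torch
import torch.nn.functional as F

def add_noise(x0, noise, t, num_steps=1000):
    """Toy forward diffusion with a linear alpha-bar schedule;
    a stand-in for Stable Diffusion's actual noise scheduler."""
    alpha_bar = (1.0 - t.float() / num_steps).view(-1, 1, 1, 1)
    return alpha_bar.sqrt() * x0 + (1.0 - alpha_bar).sqrt() * noise

def distill_one_scale(student, teacher, dataloader, num_iters, device="cuda"):
    """One round of scale distillation: the student, conditioned on the
    low-resolution input at the harder (larger) scale, regresses the
    teacher's prediction instead of a ground-truth-derived target."""
    opt = torch.optim.AdamW(student.parameters(), lr=1e-5)
    teacher.eval()
    for _, (lr_img, hr_latent) in zip(range(num_iters), dataloader):
        lr_img, hr_latent = lr_img.to(device), hr_latent.to(device)
        t = torch.randint(0, 1000, (hr_latent.size(0),), device=device)
        noisy = add_noise(hr_latent, torch.randn_like(hr_latent), t)
        with torch.no_grad():
            # The teacher solves a simpler, smaller-scale SR problem, so its
            # output is an accurate target adapted to the current noise level.
            target = teacher(noisy, t, cond=lr_img)
        loss = F.mse_loss(student(noisy, t, cond=lr_img), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return student

# Iterating the recipe: a teacher trained directly at an easy scale (e.g., x2)
# distills a x4 student, which in turn serves as the teacher for a x8 student.
```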
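And a sketch of why one step suffices at inference time, per the final claim of the abstract: with a strong distilled U-Net, the DDIM trajectory collapses to a single evaluation, after which the fine-tuned decoder maps the predicted latent to pixels. Again, all names and the schedule constant below are assumptions for illustration, not the authors' API.

```python
import torch

@torch.no_grad()
def one_step_sr(unet, decoder, lr_img, latent_shape, device="cuda"):
    """Single-DDIM-step super-resolution: one U-Net evaluation starting
    from pure noise, then the fine-tuned decoder."""
    x_T = torch.randn(latent_shape, device=device)            # start from noise
    t = torch.full((latent_shape[0],), 999, device=device)    # final timestep only
    eps = unet(x_T, t, cond=lr_img)                           # the single step
    # The deterministic DDIM update from t=T straight to t=0 reduces to the
    # usual x0 estimate; alpha_bar_T is an assumed schedule endpoint.
    alpha_bar_T = torch.tensor(1e-4, device=device)
    x0_hat = (x_T - (1.0 - alpha_bar_T).sqrt() * eps) / alpha_bar_T.sqrt()
    return decoder(x0_hat)                                    # HR image in pixel space
```

Presumably, freezing the U-Net is what makes the decoder cheap to fine-tune: with a deterministic one-step mapping from noise to latent, the decoder can be trained on pixel-space reconstruction without backpropagating through the diffusion process.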
Related papers
- Towards Flexible and Efficient Diffusion Low Light Enhancer [30.515393168075448]
Diffusion-based Low-Light Image Enhancement (LLIE) has demonstrated significant success in improving the visibility of low-light images.
We propose Reflectance-aware Diffusion with Distilled Trajectory (ReDDiT), a step distillation framework specifically designed for LLIE.
arXiv Detail & Related papers (2024-10-16T08:07:18Z)
- One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z)
- SFDDM: Single-fold Distillation for Diffusion models [4.688721356965585]
We propose a single-fold distillation algorithm, SFDDM, which can flexibly compress the teacher diffusion model into a student model of any desired step.
Experiments on four datasets demonstrate that SFDDM can sample high-quality data with the number of inference steps reduced to as little as approximately 1% of the teacher's.
arXiv Detail & Related papers (2024-05-23T18:11:14Z)
- Improved Distribution Matching Distillation for Fast Image Synthesis [54.72356560597428]
We introduce DMD2, a set of techniques that lift this limitation and improve DMD training.
First, we eliminate the regression loss and the need for expensive dataset construction.
Second, we integrate a GAN loss into the distillation procedure, discriminating between generated samples and real images.
arXiv Detail & Related papers (2024-05-23T17:59:49Z)
- One-Step Diffusion Distillation via Deep Equilibrium Models [64.11782639697883]
We introduce a simple yet effective means of distilling diffusion models directly from initial noise to the resulting image.
Our method enables fully offline training with just noise/image pairs from the diffusion model.
We demonstrate that the DEQ architecture is crucial to this capability, as GET matches a 5× larger ViT in terms of FID scores.
arXiv Detail & Related papers (2023-12-12T07:28:40Z)
- Adversarial Diffusion Distillation [18.87099764514747]
Adversarial Diffusion Distillation (ADD) is a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1-4 steps.
We use score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal.
Our model clearly outperforms existing few-step methods in a single step and reaches the performance of state-of-the-art diffusion models (SDXL) in only four steps.
arXiv Detail & Related papers (2023-11-28T18:53:24Z)
- SinSR: Diffusion-Based Image Super-Resolution in a Single Step [119.18813219518042]
Super-resolution (SR) methods based on diffusion models exhibit promising results, but their practical application is hindered by the substantial number of inference steps they require.
We propose a simple yet effective method for achieving single-step SR generation, named SinSR.
arXiv Detail & Related papers (2023-11-23T16:21:29Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has recently been proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes these limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.