You Only Need One Step: Fast Super-Resolution with Stable Diffusion via
Scale Distillation
- URL: http://arxiv.org/abs/2401.17258v1
- Date: Tue, 30 Jan 2024 18:49:44 GMT
- Title: You Only Need One Step: Fast Super-Resolution with Stable Diffusion via
Scale Distillation
- Authors: Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat and Georgios
Tzimiropoulos
- Abstract summary: YONOS-SR is a stable diffusion-based approach for image super-resolution that yields state-of-the-art results using only a single DDIM step.
We propose a novel scale distillation approach to train our SR model.
- Score: 42.599077240711
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce YONOS-SR, a novel stable diffusion-based approach
for image super-resolution that yields state-of-the-art results using only a
single DDIM step. We propose a novel scale distillation approach to train our
SR model. Instead of directly training our SR model on the scale factor of
interest, we start by training a teacher model on a smaller magnification
scale, thereby making the SR problem simpler for the teacher. We then train a
student model for a higher magnification scale, using the predictions of the
teacher as a target during the training. This process is repeated iteratively
until we reach the target scale factor of the final model. The rationale behind
our scale distillation is that the teacher aids the student diffusion model
training by i) providing a target adapted to the current noise level rather
than using the same target coming from ground truth data for all noise levels
and ii) providing an accurate target as the teacher has a simpler task to
solve. We empirically show that the distilled model significantly outperforms
the model trained for high scales directly, specifically with few steps during
inference. Having a strong diffusion model that requires only one step allows
us to freeze the U-Net and fine-tune the decoder on top of it. We show that the
combination of spatially distilled U-Net and fine-tuned decoder outperforms
state-of-the-art methods requiring 200 steps with only one single step.
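The scale distillation procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is a toy under stated assumptions, not the authors' implementation: the doubling schedule (2x, 4x, 8x), the average-pool degradation, working in pixel space rather than in Stable Diffusion's latent space, and all function names (`scale_schedule`, `downsample`, `add_noise`, `ddim_x0`, `student_sample`) are hypothetical choices made for illustration. The only standard piece is the closed-form clean-image estimate that a single DDIM step to t = 0 reduces to.

```python
import numpy as np


def scale_schedule(target_scale, start_scale=2):
    """Magnification factors to train in order (assumed doubling, e.g. 2 -> 4 -> 8)."""
    scales, s = [], start_scale
    while s < target_scale:
        scales.append(s)
        s *= 2
    scales.append(target_scale)
    return scales


def downsample(img, factor):
    """Average-pool a 2-D image by an integer factor (stand-in degradation).

    Assumes both image sides are divisible by `factor`.
    """
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def add_noise(x0, alpha_bar, rng):
    """Forward diffusion: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps


def ddim_x0(x_t, eps_pred, alpha_bar):
    """Clean-image estimate from a noise prediction (one DDIM step to t = 0)."""
    return (x_t - np.sqrt(1.0 - alpha_bar) * eps_pred) / np.sqrt(alpha_bar)


def student_sample(hr, student_scale, alpha_bar, rng, teacher=None, teacher_scale=None):
    """Build one training sample: (noisy input, student conditioning, target).

    Without a teacher (first stage) the target is the ground-truth image.
    With a teacher, the target is the teacher's prediction for the same noisy
    input, conditioned on a less degraded (smaller-scale) low-resolution image.
    """
    x_t, _ = add_noise(hr, alpha_bar, rng)
    lr_student = downsample(hr, student_scale)
    if teacher is None:
        target = hr
    else:
        target = teacher(x_t, downsample(hr, teacher_scale), alpha_bar)
    return x_t, lr_student, target


# Usage: a 4x student supervised by a toy 2x "teacher" (nearest-neighbour upsampler).
rng = np.random.default_rng(0)
hr = rng.random((16, 16))
toy_teacher = lambda x_t, lr, alpha_bar: np.kron(lr, np.ones((2, 2)))
x_t, lr, target = student_sample(hr, 4, 0.5, rng, toy_teacher, 2)
```

In a real setup the teacher and student would be diffusion U-Nets, and each stage's trained student would serve as the teacher for the next scale in `scale_schedule`.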
Related papers
- Towards Training One-Step Diffusion Models Without Distillation [72.80423908458772]
We show that one-step generative models can be trained directly without this distillation process.
We propose a family of distillation methods that achieve competitive results without relying on score estimation.
(2025-02-11T23:02:14Z)
- One Diffusion Step to Real-World Super-Resolution via Flow Trajectory Distillation [60.54811860967658]
FluxSR is a novel one-step diffusion Real-ISR based on flow matching models.
First, we introduce Flow Trajectory Distillation (FTD) to distill a multi-step flow matching model into a one-step Real-ISR.
Second, to improve image realism and address high-frequency artifact issues in generated images, we propose TV-LPIPS as a perceptual loss.
(2025-02-04T04:11:29Z)
- OFTSR: One-Step Flow for Image Super-Resolution with Tunable Fidelity-Realism Trade-offs [20.652907645817713]
OFTSR is a flow-based framework for one-step image super-resolution that can produce outputs with tunable levels of fidelity and realism.
We demonstrate that OFTSR achieves state-of-the-art performance for one-step image super-resolution, while having the ability to flexibly tune the fidelity-realism trade-off.
(2024-12-12T17:14:58Z)
- SFDDM: Single-fold Distillation for Diffusion models [4.688721356965585]
We propose a single-fold distillation algorithm, SFDDM, which can flexibly compress the teacher diffusion model into a student model of any desired step.
Experiments on four datasets demonstrate that SFDDM is able to sample high-quality data with steps reduced to as little as approximately 1%.
(2024-05-23T18:11:14Z)
- One-Step Diffusion Distillation via Deep Equilibrium Models [64.11782639697883]
We introduce a simple yet effective means of distilling diffusion models directly from initial noise to the resulting image.
Our method enables fully offline training with just noise/image pairs from the diffusion model.
We demonstrate that the DEQ architecture is crucial to this capability, as GET matches a $5\times$ larger ViT in terms of FID scores.
(2023-12-12T07:28:40Z)
- Adversarial Diffusion Distillation [18.87099764514747]
Adversarial Diffusion Distillation (ADD) is a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1-4 steps.
We use score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal.
Our model clearly outperforms existing few-step methods in a single step and reaches the performance of state-of-the-art diffusion models (SDXL) in only four steps.
(2023-11-28T18:53:24Z)
- SinSR: Diffusion-Based Image Super-Resolution in a Single Step [119.18813219518042]
Super-resolution (SR) methods based on diffusion models exhibit promising results, but their practical application is hindered by the substantial number of required inference steps.
We propose a simple yet effective method for achieving single-step SR generation, named SinSR.
(2023-11-23T16:21:29Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes limitations with an efficient data-free distillation algorithm.
(2023-06-08T20:30:55Z)