Unleashing the Power of One-Step Diffusion based Image Super-Resolution via a Large-Scale Diffusion Discriminator
- URL: http://arxiv.org/abs/2410.04224v3
- Date: Sun, 09 Mar 2025 16:37:34 GMT
- Title: Unleashing the Power of One-Step Diffusion based Image Super-Resolution via a Large-Scale Diffusion Discriminator
- Authors: Jianze Li, Jiezhang Cao, Zichen Zou, Xiongfei Su, Xin Yuan, Yulun Zhang, Yong Guo, Xiaokang Yang,
- Abstract summary: Diffusion models have demonstrated excellent performance for real-world image super-resolution (Real-ISR). We propose a new One-Step Diffusion model with a larger-scale Discriminator for SR. Our discriminator is able to distill noisy features from any time step of diffusion models in the latent space.
- Score: 81.81748032199813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have demonstrated excellent performance for real-world image super-resolution (Real-ISR), albeit at high computational costs. Most existing methods try to derive one-step diffusion models from multi-step counterparts through knowledge distillation (KD) or variational score distillation (VSD). However, these methods are limited by the capabilities of the teacher model, especially if the teacher model itself is not sufficiently strong. To tackle this issue, we propose a new One-Step \textbf{D}iffusion model with a larger-scale \textbf{D}iffusion \textbf{D}iscriminator for SR, called D$^3$SR. Our discriminator is able to distill noisy features from any time step of diffusion models in the latent space. In this way, our diffusion discriminator breaks through the limitations imposed by the presence of a teacher model. Additionally, we improve the perceptual loss with edge-aware DISTS (EA-DISTS) to enhance the model's ability to generate fine details. Our experiments demonstrate that, compared with previous diffusion-based methods requiring dozens or even hundreds of steps, our D$^3$SR attains comparable or even superior results in both quantitative metrics and qualitative evaluations. Moreover, compared with other methods, D$^3$SR achieves at least $3\times$ faster inference speed and reduces parameters by at least 30\%. We will release code and models at https://github.com/JianzeLi-114/D3SR.
Related papers
- One Diffusion Step to Real-World Super-Resolution via Flow Trajectory Distillation [60.54811860967658]
FluxSR is a novel one-step diffusion Real-ISR based on flow matching models.
First, we introduce Flow Trajectory Distillation (FTD) to distill a multi-step flow matching model into a one-step Real-ISR.
Second, to improve image realism and address high-frequency artifact issues in generated images, we propose TV-LPIPS as a perceptual loss.
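The FluxSR summary names TV-LPIPS as a perceptual loss targeting high-frequency artifacts but gives no formula. As a hedged illustration only, the snippet below implements the anisotropic total-variation (TV) term that such a loss could combine with LPIPS; the exact TV-LPIPS formulation in the paper may differ.

```python
import numpy as np

def total_variation(img: np.ndarray) -> float:
    """Anisotropic total variation of an H x W image:
    sum of absolute differences between neighboring pixels."""
    dh = np.abs(np.diff(img, axis=0)).sum()  # vertical neighbors
    dw = np.abs(np.diff(img, axis=1)).sum()  # horizontal neighbors
    return float(dh + dw)

flat = np.ones((8, 8))
checker = np.indices((8, 8)).sum(axis=0) % 2  # worst-case high-frequency pattern
print(total_variation(flat))     # → 0.0
print(total_variation(checker))  # → 112.0
```

A constant image has zero TV while a checkerboard maximizes it, which is why a TV penalty discourages the high-frequency artifacts the summary mentions.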
arXiv Detail & Related papers (2025-02-04T04:11:29Z) - Relational Diffusion Distillation for Efficient Image Generation [27.127061578093674]
Diffusion model's high delay hinders its wide application in edge devices with scarce computing resources.
We propose Relational Diffusion Distillation (RDD), a novel distillation method tailored specifically for distilling diffusion models.
Our proposed RDD leads to a 1.47 FID decrease under 1 sampling step compared to state-of-the-art diffusion distillation methods and achieves a 256x speed-up.
arXiv Detail & Related papers (2024-10-10T07:40:51Z) - One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z) - Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models.
They rely on sequential denoising steps during sample generation.
We propose a novel method that integrates denoising phases directly into the model's architecture.
arXiv Detail & Related papers (2024-05-31T08:19:44Z) - SFDDM: Single-fold Distillation for Diffusion models [4.688721356965585]
We propose a single-fold distillation algorithm, SFDDM, which can flexibly compress the teacher diffusion model into a student model of any desired step.
Experiments on four datasets demonstrate that SFDDM is able to sample high-quality data with sampling steps reduced to as little as approximately 1% of the teacher's.
arXiv Detail & Related papers (2024-05-23T18:11:14Z) - Distilling Diffusion Models into Conditional GANs [90.76040478677609]
We distill a complex multistep diffusion model into a single-step conditional GAN student model.
For efficient regression loss, we propose E-LatentLPIPS, a perceptual loss operating directly in diffusion model's latent space.
We demonstrate that our one-step generator outperforms cutting-edge one-step diffusion distillation models.
arXiv Detail & Related papers (2024-05-09T17:59:40Z) - Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning diffusion models remains an underexplored frontier in generative artificial intelligence (GenAI).
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion).
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z) - You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation [42.599077240711]
YONOS-SR is a stable diffusion-based approach for image super-resolution that yields state-of-the-art results using only a single DDIM step.
We propose a novel scale distillation approach to train our SR model.
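YONOS-SR is summarized as producing its result with a single DDIM step. For orientation only, the sketch below shows the standard DDIM clean-image estimate that a one-step sampler relies on; it is a generic illustration of DDIM's $x_0$-prediction identity, not the paper's scale-distillation method.

```python
import numpy as np

def ddim_one_step_x0(x_t: np.ndarray, eps_pred: np.ndarray,
                     alpha_bar_t: float) -> np.ndarray:
    """Recover the clean-image estimate in one DDIM step:
    x0 = (x_t - sqrt(1 - abar_t) * eps_pred) / sqrt(abar_t)."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((3, 3))
eps = rng.standard_normal((3, 3))
abar = 0.5
x_t = np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps  # forward-noised input
recovered = ddim_one_step_x0(x_t, eps, abar)          # exact if eps_pred = eps
```

When the network's noise prediction matches the true noise, this single deterministic step recovers the clean sample exactly, which is the property one-step distillation methods exploit.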
arXiv Detail & Related papers (2024-01-30T18:49:44Z) - One-Step Diffusion Distillation via Deep Equilibrium Models [64.11782639697883]
We introduce a simple yet effective means of distilling diffusion models directly from initial noise to the resulting image.
Our method enables fully offline training with just noise/image pairs from the diffusion model.
We demonstrate that the DEQ architecture is crucial to this capability, as GET matches a $5\times$ larger ViT in terms of FID scores.
arXiv Detail & Related papers (2023-12-12T07:28:40Z) - SinSR: Diffusion-Based Image Super-Resolution in a Single Step [119.18813219518042]
Super-resolution (SR) methods based on diffusion models exhibit promising results.
But their practical application is hindered by the substantial number of required inference steps.
We propose a simple yet effective method for achieving single-step SR generation, named SinSR.
arXiv Detail & Related papers (2023-11-23T16:21:29Z) - Reducing Spatial Fitting Error in Distillation of Denoising Diffusion Models [13.364271265023953]
Knowledge distillation for diffusion models is an effective way to shorten their sampling process, but it can degrade sample quality.
We attribute this degradation to the spatial fitting error occurring in the training of both the teacher and student models.
SFERD utilizes attention guidance from the teacher model and a designed semantic gradient predictor to reduce the student's fitting error.
We achieve an FID of 5.31 on CIFAR-10 and 9.39 on ImageNet 64$\times$64 with only one step, outperforming existing diffusion methods.
arXiv Detail & Related papers (2023-11-07T09:19:28Z) - BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes these limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.