Adversarial Diffusion Compression for Real-World Image Super-Resolution
- URL: http://arxiv.org/abs/2411.13383v1
- Date: Wed, 20 Nov 2024 15:13:36 GMT
- Title: Adversarial Diffusion Compression for Real-World Image Super-Resolution
- Authors: Bin Chen, Gehui Li, Rongyuan Wu, Xindong Zhang, Jie Chen, Jian Zhang, Lei Zhang
- Abstract summary: Real-world image super-resolution aims to reconstruct high-resolution images from low-resolution inputs degraded by complex processes.
One-step diffusion networks like OSEDiff and S3Diff alleviate the slow, multi-step inference of Stable Diffusion-based methods but still incur high computational costs.
This paper proposes a novel Real-ISR method, AdcSR, by distilling the one-step diffusion network OSEDiff into a streamlined diffusion-GAN model.
- Score: 16.496532580598007
- Abstract: Real-world image super-resolution (Real-ISR) aims to reconstruct high-resolution images from low-resolution inputs degraded by complex, unknown processes. While many Stable Diffusion (SD)-based Real-ISR methods have achieved remarkable success, their slow, multi-step inference hinders practical deployment. Recent SD-based one-step networks like OSEDiff and S3Diff alleviate this issue but still incur high computational costs due to their reliance on large pretrained SD models. This paper proposes a novel Real-ISR method, AdcSR, by distilling the one-step diffusion network OSEDiff into a streamlined diffusion-GAN model under our Adversarial Diffusion Compression (ADC) framework. We meticulously examine the modules of OSEDiff, categorizing them into two types: (1) Removable (VAE encoder, prompt extractor, text encoder, etc.) and (2) Prunable (denoising UNet and VAE decoder). Since direct removal and pruning can degrade the model's generation capability, we pretrain our pruned VAE decoder to restore its ability to decode images and employ adversarial distillation to compensate for performance loss. This ADC-based diffusion-GAN hybrid design effectively reduces complexity by 73% in inference time, 78% in computation, and 74% in parameters, while preserving the model's generation capability. Experiments manifest that our proposed AdcSR achieves competitive recovery quality on both synthetic and real-world datasets, offering up to 9.3$\times$ speedup over previous one-step diffusion-based methods. Code and models will be made available.
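The percentage reductions quoted in the abstract can be restated as ratios, which makes them easier to compare with the quoted "up to 9.3$\times$" speedup. The sketch below is only arithmetic on the three figures from the abstract; the helper name `reduction_to_factor` is hypothetical, and reading the percentages as relative to the OSEDiff teacher is an assumption, since the abstract does not name the baseline explicitly.

```python
def reduction_to_factor(reduction_percent: float) -> float:
    """Convert a percentage cost reduction into the ratio old_cost / new_cost.

    Hypothetical helper for illustration; not part of any AdcSR code release.
    """
    if not 0.0 <= reduction_percent < 100.0:
        raise ValueError("reduction must be in [0, 100)")
    return 100.0 / (100.0 - reduction_percent)


# Figures quoted in the abstract (baseline assumed to be OSEDiff).
reported_reductions = {
    "inference time": 73.0,
    "computation": 78.0,
    "parameters": 74.0,
}

factors = {
    name: reduction_to_factor(pct) for name, pct in reported_reductions.items()
}

for name, factor in factors.items():
    pct = reported_reductions[name]
    print(f"{name}: {pct:.0f}% reduction is roughly a {factor:.2f}x decrease")
```

Under this reading, a 73% cut in inference time corresponds to a speedup of about 3.7$\times$ over the same baseline. The separately quoted "up to 9.3$\times$" therefore presumably refers to a slower method among the previous one-step diffusion approaches.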
Related papers
- One Diffusion Step to Real-World Super-Resolution via Flow Trajectory Distillation [60.54811860967658]
FluxSR is a novel one-step diffusion Real-ISR based on flow matching models.
First, we introduce Flow Trajectory Distillation (FTD) to distill a multi-step flow matching model into a one-step Real-ISR.
Second, to improve image realism and address high-frequency artifact issues in generated images, we propose TV-LPIPS as a perceptual loss.
arXiv Detail & Related papers (2025-02-04T04:11:29Z) - RealOSR: Latent Unfolding Boosting Diffusion-based Real-world Omnidirectional Image Super-Resolution [11.290865218020386]
RealOSR is a novel diffusion-based approach for real-world ODISR (Real-ODISR) with single-step diffusion denoising.
RealOSR achieves significant improvements in visual quality and over 200$\times$ inference acceleration.
arXiv Detail & Related papers (2024-12-11T06:23:14Z) - TSD-SR: One-Step Diffusion with Target Score Distillation for Real-World Image Super-Resolution [25.994093587158808]
Pre-trained text-to-image diffusion models are increasingly applied to real-world image super-resolution (Real-ISR) tasks.
Given the iterative refinement nature of diffusion models, most existing approaches are computationally expensive.
We propose TSD-SR, a novel distillation framework specifically designed for real-world image super-resolution.
arXiv Detail & Related papers (2024-11-27T12:01:08Z) - ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution [28.945663118445037]
Real-world image super-resolution (Real-ISR) aims at restoring high-quality (HQ) images from low-quality (LQ) inputs corrupted by unknown and complex degradations.
We introduce ConsisSR to handle both semantic and pixel-level consistency.
arXiv Detail & Related papers (2024-10-17T17:41:52Z) - One-Step Effective Diffusion Network for Real-World Image Super-Resolution [11.326598938246558]
We propose a one-step effective diffusion network, namely OSEDiff, for the Real-ISR problem.
We finetune the pre-trained diffusion network with trainable layers to adapt it to complex image degradations.
Our OSEDiff model can efficiently and effectively generate HQ images in just one diffusion step.
arXiv Detail & Related papers (2024-06-12T13:10:31Z) - Binarized Diffusion Model for Image Super-Resolution [61.963833405167875]
Binarization, an ultra-compression algorithm, offers the potential for effectively accelerating advanced diffusion models (DMs).
Existing binarization methods result in significant performance degradation.
We introduce a novel binarized diffusion model, BI-DiffSR, for image SR.
arXiv Detail & Related papers (2024-06-09T10:30:25Z) - Invertible Diffusion Models for Compressed Sensing [22.293412255419614]
Invertible Diffusion Models (IDM) is a novel, efficient, end-to-end diffusion-based compressed sensing method.
Our IDM outperforms existing state-of-the-art CS networks by up to 2.64dB in PSNR.
Compared to the recent diffusion-based approach DDNM, our IDM achieves up to 10.09dB PSNR gain and 14.54 times faster inference.
arXiv Detail & Related papers (2024-03-25T17:59:41Z) - Iterative Token Evaluation and Refinement for Real-World Super-Resolution [77.74289677520508]
Real-world image super-resolution (RWSR) is a long-standing problem as low-quality (LQ) images often have complex and unidentified degradations.
We propose an Iterative Token Evaluation and Refinement framework for RWSR.
We show that ITER is easier to train than Generative Adversarial Networks (GANs) and more efficient than continuous diffusion models.
arXiv Detail & Related papers (2023-12-09T17:07:32Z) - Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z) - DR2: Diffusion-based Robust Degradation Remover for Blind Face Restoration [66.01846902242355]
Blind face restoration usually synthesizes degraded low-quality data with a pre-defined degradation model for training.
It is expensive and infeasible to include every type of degradation to cover real-world cases in the training data.
We propose Robust Degradation Remover (DR2) to first transform the degraded image to a coarse but degradation-invariant prediction, then employ an enhancement module to restore the coarse prediction to a high-quality image.
arXiv Detail & Related papers (2023-03-13T06:05:18Z) - Towards Lightweight Super-Resolution with Dual Regression Learning [58.98801753555746]
Deep neural networks have exhibited remarkable performance in image super-resolution (SR) tasks.
The SR problem is typically an ill-posed problem and existing methods would come with several limitations.
We propose a dual regression learning scheme to reduce the space of possible SR mappings.
arXiv Detail & Related papers (2022-07-16T12:46:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.