Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of
Experts And Frequency-augmented Decoder Approach
- URL: http://arxiv.org/abs/2310.12004v3
- Date: Wed, 13 Dec 2023 13:08:29 GMT
- Authors: Feng Luo, Jinxi Xiang, Jun Zhang, Xiao Han, Wei Yang
- Abstract summary: Diffusion priors, enhanced by pre-trained text-image models, have markedly improved image super-resolution (SR).
Latent-based methods use a feature encoder to transform the image and then generate the SR image in a compact latent space.
We propose a frequency compensation module that enhances the frequency components from latent space to pixel space.
- Score: 17.693287544860638
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent use of diffusion prior, enhanced by pre-trained text-image models,
has markedly elevated the performance of image super-resolution (SR). To
alleviate the huge computational cost required by pixel-based diffusion SR,
latent-based methods utilize a feature encoder to transform the image and then
implement the SR image generation in a compact latent space. Nevertheless,
there are two major issues that limit the performance of latent-based
diffusion. First, the compression of latent space usually causes reconstruction
distortion. Second, huge computational cost constrains the parameter scale of
the diffusion model. To counteract these issues, we first propose a frequency
compensation module that enhances the frequency components from latent space to
pixel space. The reconstruction distortion (especially for high-frequency
information) can be significantly decreased. Then, we propose to use
Sample-Space Mixture of Experts (SS-MoE) to achieve more powerful latent-based
SR, which steadily improves the capacity of the model without a significant
increase in inference costs. These carefully crafted designs contribute to
performance improvements in largely explored 4x blind super-resolution
benchmarks and extend to large magnification factors, i.e., 8x image SR
benchmarks. The code is available at https://github.com/amandaluof/moe_sr.
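The claim that expert mixing adds capacity without a significant increase in inference cost can be illustrated with a toy sketch. It assumes, purely for illustration, that experts are assigned to contiguous ranges of sampling time steps, so exactly one expert runs per denoising step. This routing rule is an assumption and is not taken from the abstract or from the authors' code; `select_expert`, `make_toy_expert`, and `sample` are hypothetical names, and the "experts" are trivial scaling functions standing in for denoising networks.

```python
import numpy as np


def select_expert(t, num_steps, num_experts):
    """Map time step t in [0, num_steps) to an expert index.

    Assumption: time steps are split into equal contiguous stages,
    one stage per expert.
    """
    if not 0 <= t < num_steps:
        raise ValueError("t must lie in [0, num_steps)")
    return t * num_experts // num_steps


def make_toy_expert(scale):
    """Hypothetical stand-in for a denoising network: scales the latent."""
    def expert(z, t):
        return scale * z
    return expert


def sample(z, experts, num_steps):
    """Run a toy reverse process, calling exactly one expert per step."""
    calls = [0] * len(experts)
    for t in reversed(range(num_steps)):
        k = select_expert(t, num_steps, len(experts))
        z = experts[k](z, t)
        calls[k] += 1
    return z, calls


# Four experts hold four sets of parameters, but each step evaluates only one.
experts = [make_toy_expert(s) for s in (0.9, 0.95, 0.99, 1.0)]
z0 = np.ones((4, 8, 8))  # toy latent: 4 channels, 8x8 spatial
z, calls = sample(z0, experts, num_steps=20)
print(calls, sum(calls))
```

Under this assumption the number of network evaluations equals the number of sampling steps regardless of how many experts exist, which is one way the total parameter count can grow while per-step compute stays fixed.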
Related papers
- Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images [7.920423405957888]
E$^2$DiffSR achieves superior objective metrics and visual quality compared to the state-of-the-art SR methods.
It reduces the inference time of diffusion-based SR methods to a level comparable to that of non-diffusion methods.
arXiv Detail & Related papers (2024-10-30T09:14:13Z)
- Realistic Extreme Image Rescaling via Generative Latent Space Learning [51.85790402171696]
We propose a novel framework called Latent Space Based Image Rescaling (LSBIR) for extreme image rescaling tasks.
LSBIR effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model to generate realistic HR images.
In the first stage, a pseudo-invertible encoder-decoder models the bidirectional mapping between the latent features of the HR image and the target-sized LR image.
In the second stage, the reconstructed features from the first stage are refined by a pre-trained diffusion model to generate more faithful and visually pleasing details.
arXiv Detail & Related papers (2024-08-17T09:51:42Z)
- High Frequency Matters: Uncertainty Guided Image Compression with Wavelet Diffusion [35.168244436206685]
We propose an efficient Uncertainty-Guided image compression approach with wavelet Diffusion (UGDiff).
Our approach focuses on high frequency compression via the wavelet transform, since high frequency components are crucial for reconstructing image details.
Comprehensive experiments on two benchmark datasets validate the effectiveness of UGDiff.
arXiv Detail & Related papers (2024-07-17T13:21:31Z)
- Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder [29.924160271522354]
Super-resolution (SR) and image generation are important tasks in computer vision and are widely adopted in real-world applications.
Most existing methods, however, generate images only at fixed-scale magnification and suffer from over-smoothing and artifacts.
Most relevant work applied Implicit Neural Representation (INR) to the denoising diffusion model to obtain continuous-resolution yet diverse and high-quality SR results.
We propose a novel pipeline that can super-resolve an input image or generate from a random noise a novel image at arbitrary scales.
arXiv Detail & Related papers (2024-03-15T12:45:40Z)
- ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting [70.83632337581034]
Diffusion-based image super-resolution (SR) methods are mainly limited by the low inference speed.
We propose a novel and efficient diffusion model for SR that significantly reduces the number of diffusion steps.
Our method constructs a Markov chain that transfers between the high-resolution image and the low-resolution image by shifting the residual.
arXiv Detail & Related papers (2023-07-23T15:10:02Z)
- Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from long inference times, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
- Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
- Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Models [9.245782611878752]
We enhance the diffusion model in several aspects such as network architecture, noise level, denoising steps, training image size, and perceptual/scheduler scores.
We also propose a U-Net based latent diffusion model which performs diffusion in a low-resolution latent space while preserving high-resolution information from the original input for the decoding process.
These modifications allow us to apply diffusion models to various image restoration tasks, including real-world shadow removal, HR non-homogeneous dehazing, stereo super-resolution, and bokeh effect transformation.
arXiv Detail & Related papers (2023-04-17T14:06:49Z)
- Towards Lightweight Super-Resolution with Dual Regression Learning [58.98801753555746]
Deep neural networks have exhibited remarkable performance in image super-resolution (SR) tasks.
SR is typically an ill-posed problem, and existing methods come with several limitations.
We propose a dual regression learning scheme to reduce the space of possible SR mappings.
arXiv Detail & Related papers (2022-07-16T12:46:10Z)
- Fourier Space Losses for Efficient Perceptual Image Super-Resolution [131.50099891772598]
We show that it is possible to improve the performance of a recently introduced efficient generator architecture solely with the application of our proposed loss functions.
We show that our losses' direct emphasis on the frequencies in Fourier-space significantly boosts the perceptual image quality.
The trained generator achieves results comparable to, and is 2.4x and 48x faster than, the state-of-the-art perceptual SR methods RankSRGAN and SRFlow, respectively.
arXiv Detail & Related papers (2021-06-01T20:34:52Z)