ClearSR: Latent Low-Resolution Image Embeddings Help Diffusion-Based Real-World Super Resolution Models See Clearer
- URL: http://arxiv.org/abs/2410.14279v1
- Date: Fri, 18 Oct 2024 08:35:57 GMT
- Title: ClearSR: Latent Low-Resolution Image Embeddings Help Diffusion-Based Real-World Super Resolution Models See Clearer
- Authors: Yuhao Wan, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Ming-Ming Cheng, Bo Li,
- Abstract summary: We present ClearSR, a new method that can better take advantage of latent low-resolution image (LR) embeddings for diffusion-based real-world image super-resolution (Real-ISR)
Our model can achieve better performance across multiple metrics on several test sets and generate more consistent SR results with LR images than existing methods.
- Score: 68.72454974431749
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present ClearSR, a new method that can better take advantage of latent low-resolution image (LR) embeddings for diffusion-based real-world image super-resolution (Real-ISR). Previous Real-ISR models mostly focus on how to activate more generative priors of text-to-image diffusion models to make the output high-resolution (HR) images look better. However, since these methods rely too much on the generative priors, the content of the output images is often inconsistent with the input LR ones. To mitigate the above issue, in this work, we explore using latent LR embeddings to constrain the control signals from ControlNet, and extract LR information at both detail and structure levels. We show that the proper use of latent LR embeddings can produce higher-quality control signals, which enables the super-resolution results to be more consistent with the LR image and leads to clearer visual results. In addition, we also show that latent LR embeddings can be used to control the inference stage, allowing for the improvement of fidelity and generation ability simultaneously. Experiments demonstrate that our model can achieve better performance across multiple metrics on several test sets and generate more consistent SR results with LR images than existing methods. Our code will be made publicly available.
Related papers
- DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution [42.26299332658843]
DiT4SR is one of the pioneering works to tame the large-scale DiT model for Real-ISR.
Instead of directly injecting embeddings extracted from low-resolution (LR) images like ControlNet, we integrate the LR embeddings into the original attention mechanism of DiT.
The LR guidance is injected into the generated latent via a cross-stream convolution layer, compensating for DiT's limited ability to capture local information.
arXiv Detail & Related papers (2025-03-30T20:27:22Z) - TASR: Timestep-Aware Diffusion Model for Image Super-Resolution [22.156869195433615]
We explore the temporal dynamics of information infusion through ControlNet.
We introduce a novel timestep-aware diffusion model that adaptively integrates features from both ControlNet and the pre-trained Stable Diffusion.
Our method enhances the transmission of LR information in the early stages of diffusion to guarantee image fidelity.
arXiv Detail & Related papers (2024-12-04T14:39:54Z) - Unveiling Hidden Details: A RAW Data-Enhanced Paradigm for Real-World Super-Resolution [56.98910228239627]
Real-world image super-resolution (Real SR) aims to generate high-fidelity, detail-rich high-resolution (HR) images from low-resolution (LR) counterparts.
Existing Real SR methods primarily focus on generating details from the LR RGB domain, often leading to a lack of richness or fidelity in fine details.
We pioneer the use of details hidden in RAW data to complement existing RGB-only methods, yielding superior outputs.
arXiv Detail & Related papers (2024-11-16T13:29:50Z) - RBSR: Efficient and Flexible Recurrent Network for Burst
Super-Resolution [57.98314517861539]
Burst super-resolution (BurstSR) aims at reconstructing a high-resolution (HR) image from a sequence of low-resolution (LR) and noisy images.
In this paper, we suggest fusing cues frame-by-frame with an efficient and flexible recurrent network.
arXiv Detail & Related papers (2023-06-30T12:14:13Z) - Real Image Super-Resolution using GAN through modeling of LR and HR
process [20.537597542144916]
We propose a learnable adaptive sinusoidal nonlinearities incorporated in LR and SR models by directly learn degradation distributions.
We demonstrate the effectiveness of our proposed approach in quantitative and qualitative experiments.
arXiv Detail & Related papers (2022-10-19T09:23:37Z) - Self-Supervised Learning for Real-World Super-Resolution from Dual
Zoomed Observations [66.09210030518686]
We present a novel self-supervised learning approach for real-world RefSR from observations at dual camera zooms (SelfDZSR)
For the first issue, the more zoomed (telephoto) image can be naturally leveraged as the reference to guide the SR of the lesser zoomed (short-focus) image.
For the second issue, SelfDZSR learns a deep network to obtain the SR result of short-focal image and with the same resolution as the telephoto image.
arXiv Detail & Related papers (2022-03-02T13:30:56Z) - A Deep Residual Star Generative Adversarial Network for multi-domain
Image Super-Resolution [21.39772242119127]
Super-Resolution Residual StarGAN (SR2*GAN) is a novel and scalable approach that super-resolves the LR images for the multiple LR domains using only a single model.
We demonstrate the effectiveness of our proposed approach in quantitative and qualitative experiments compared to other state-of-the-art methods.
arXiv Detail & Related papers (2021-07-07T11:15:17Z) - Best-Buddy GANs for Highly Detailed Image Super-Resolution [71.13466303340192]
We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated based on a low-resolution (LR) input.
Most methods along this line rely on a predefined single-LR-single-HR mapping, which is not flexible enough for the SISR task.
We propose best-buddy GANs (Beby-GAN) for rich-detail SISR. Relaxing the immutable one-to-one constraint, we allow the estimated patches to dynamically seek the best supervision.
arXiv Detail & Related papers (2021-03-29T02:58:27Z) - Real-World Super-Resolution of Face-Images from Surveillance Cameras [25.258587196435464]
We propose a novel framework for generation of realistic LR/HR training pairs.
Our framework estimates realistic blur kernels, noise distributions, and JPEG compression artifacts to generate LR images with similar image characteristics as the ones in the source domain.
For better perceptual quality we use a Generative Adrial Network (GAN) based SR model where we have exchanged the commonly used VGG-loss [24] with LPIPS-loss [52]
arXiv Detail & Related papers (2021-02-05T11:38:30Z) - Deep Generative Adversarial Residual Convolutional Networks for
Real-World Super-Resolution [31.934084942626257]
We propose a deep Super-Resolution Residual Convolutional Generative Adversarial Network (SRResCGAN)
It follows the real-world degradation settings by adversarial training the model with pixel-wise supervision in the HR domain from its generated LR counterpart.
The proposed network exploits the residual learning by minimizing the energy-based objective function with powerful image regularization and convex optimization techniques.
arXiv Detail & Related papers (2020-05-03T00:12:38Z) - Closed-loop Matters: Dual Regression Networks for Single Image
Super-Resolution [73.86924594746884]
Deep neural networks have exhibited promising performance in image super-resolution.
These networks learn a nonlinear mapping function from low-resolution (LR) images to high-resolution (HR) images.
We propose a dual regression scheme by introducing an additional constraint on LR data to reduce the space of the possible functions.
arXiv Detail & Related papers (2020-03-16T04:23:42Z) - PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of
Generative Models [77.32079593577821]
PULSE (Photo Upsampling via Latent Space Exploration) generates high-resolution, realistic images at resolutions previously unseen in the literature.
Our method outperforms state-of-the-art methods in perceptual quality at higher resolutions and scale factors than previously possible.
arXiv Detail & Related papers (2020-03-08T16:44:31Z) - DDet: Dual-path Dynamic Enhancement Network for Real-World Image
Super-Resolution [69.2432352477966]
Real image super-resolution(Real-SR) focus on the relationship between real-world high-resolution(HR) and low-resolution(LR) image.
In this article, we propose a Dual-path Dynamic Enhancement Network(DDet) for Real-SR.
Unlike conventional methods which stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pair.
arXiv Detail & Related papers (2020-02-25T18:24:51Z) - Characteristic Regularisation for Super-Resolving Face Images [81.84939112201377]
Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery.
Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data.
This renders the model overstretched with two tasks: consistifying the visual characteristics and enhancing the image resolution.
We formulate a method that joins the advantages of conventional SR and UDA models.
arXiv Detail & Related papers (2019-12-30T16:27:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.