Related papers: Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images

Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images

URL: http://arxiv.org/abs/2410.22830v1
Date: Wed, 30 Oct 2024 09:14:13 GMT
Title: Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images
Authors: Hanlin Wu, Jiangwei Mo, Xiaohui Sun, Jie Ma,
Abstract summary: E$2$DiffSR achieves superior objective metrics and visual quality compared to the state-of-the-art SR methods. It reduces the inference time of diffusion-based SR methods to a level comparable to that of non-diffusion methods.
Score: 7.920423405957888
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advancements in diffusion models have significantly improved performance in super-resolution (SR) tasks. However, previous research often overlooks the fundamental differences between SR and general image generation. General image generation involves creating images from scratch, while SR focuses specifically on enhancing existing low-resolution (LR) images by adding typically missing high-frequency details. This oversight not only increases the training difficulty but also limits their inference efficiency. Furthermore, previous diffusion-based SR methods are typically trained and inferred at fixed integer scale factors, lacking flexibility to meet the needs of up-sampling with non-integer scale factors. To address these issues, this paper proposes an efficient and elastic diffusion-based SR model (E$^2$DiffSR), specially designed for continuous-scale SR in remote sensing imagery. E$^2$DiffSR employs a two-stage latent diffusion paradigm. During the first stage, an autoencoder is trained to capture the differential priors between high-resolution (HR) and LR images. The encoder intentionally ignores the existing LR content to alleviate the encoding burden, while the decoder introduces an SR branch equipped with a continuous scale upsampling module to accomplish the reconstruction under the guidance of the differential prior. In the second stage, a conditional diffusion model is learned within the latent space to predict the true differential prior encoding. Experimental results demonstrate that E$^2$DiffSR achieves superior objective metrics and visual quality compared to the state-of-the-art SR methods. Additionally, it reduces the inference time of diffusion-based SR methods to a level comparable to that of non-diffusion methods.

Related papers

One-Step Diffusion-based Real-World Image Super-Resolution with Visual Perception Distillation [53.24542646616045]
We propose VPD-SR, a novel visual perception diffusion distillation framework specifically designed for image super-resolution (SR) generation.<n>VPD-SR consists of two components: Explicit Semantic-aware Supervision (ESS) and High-frequency Perception (HFP) loss.<n>The proposed VPD-SR achieves superior performance compared to both previous state-of-the-art methods and the teacher model with just one-step sampling.
arXiv Detail & Related papers (2025-06-03T08:28:13Z)
Single-Step Latent Consistency Model for Remote Sensing Image Super-Resolution [7.920423405957888]
We propose a novel single-step diffusion approach designed to enhance both efficiency and visual quality in RSISR tasks. The proposed LCMSR reduces the iterative steps of traditional diffusion models from 50-1000 or more to just a single step. Experimental results demonstrate that LCMSR effectively balances efficiency and performance, achieving inference times comparable to non-diffusion models.
arXiv Detail & Related papers (2025-03-25T09:56:21Z)
Pixel to Gaussian: Ultra-Fast Continuous Super-Resolution with 2D Gaussian Modeling [50.34513854725803]
Arbitrary-scale super-resolution (ASSR) aims to reconstruct high-resolution (HR) images from low-resolution (LR) inputs with arbitrary upsampling factors. We propose a novel ContinuousSR framework with a Pixel-to-Gaussian paradigm, which explicitly reconstructs 2D continuous HR signals from LR images using Gaussian Splatting.
arXiv Detail & Related papers (2025-03-09T13:43:57Z)
Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors [75.24313405671433]
Diffusion-based image super-resolution (SR) methods have achieved remarkable success by leveraging large pre-trained text-to-image diffusion models as priors. We introduce a novel one-step SR model, which significantly addresses the efficiency issue of diffusion-based SR methods. Unlike existing fine-tuning strategies, we designed a degradation-guided Low-Rank Adaptation (LoRA) module specifically for SR.
arXiv Detail & Related papers (2024-09-25T16:15:21Z)
Burst Super-Resolution with Diffusion Models for Improving Perceptual Quality [12.687175237915019]
Prior SR networks accepting the burst LR images are trained in a deterministic manner, which is known to produce a blurry SR image. Since such blurry images are perceptually degraded, we aim to reconstruct the sharp high-fidelity boundaries. In our proposed method, on the other hand, burst LR features are used to reconstruct the initial burst SR image.
arXiv Detail & Related papers (2024-03-28T13:58:05Z)
Improving the Stability and Efficiency of Diffusion Models for Content Consistent Super-Resolution [18.71638301931374]
generative priors of pre-trained latent diffusion models (DMs) have demonstrated great potential to enhance the visual quality of image super-resolution (SR) results. We propose to partition the generative SR process into two stages, where the DM is employed for reconstructing image structures and the GAN is employed for improving fine-grained details. Once trained, our proposed method, namely content consistent super-resolution (CCSR),allows flexible use of different diffusion steps in the inference stage without re-training.
arXiv Detail & Related papers (2023-12-30T10:22:59Z)
Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach [17.693287544860638]
latent-based diffusion for image super-resolution improved by pre-trained text-image models. latent-based methods utilize a feature encoder to transform the image and then implement the SR image generation in a compact latent space. We propose a frequency compensation module that enhances the frequency components from latent space to pixel space.
arXiv Detail & Related papers (2023-10-18T14:39:25Z)
Latent Multi-Relation Reasoning for GAN-Prior based Image Super-Resolution [61.65012981435095]
LAREN is a graph-based disentanglement that constructs a superior disentangled latent space via hierarchical multi-relation reasoning. We show that LAREN achieves superior large-factor image SR and outperforms the state-of-the-art consistently across multiple benchmarks.
arXiv Detail & Related papers (2022-08-04T19:45:21Z)
Towards Lightweight Super-Resolution with Dual Regression Learning [58.98801753555746]
Deep neural networks have exhibited remarkable performance in image super-resolution (SR) tasks. The SR problem is typically an ill-posed problem and existing methods would come with several limitations. We propose a dual regression learning scheme to reduce the space of possible SR mappings.
arXiv Detail & Related papers (2022-07-16T12:46:10Z)
Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling [139.25215100378284]
We propose a hierarchical conditional flow (HCFlow) as a unified framework for image SR and image rescaling. HCFlow learns a mapping between HR and LR image pairs by modelling the distribution of the LR image and the rest high-frequency component simultaneously. To further enhance the performance, other losses such as perceptual loss and GAN loss are combined with the commonly used negative log-likelihood loss in training.
arXiv Detail & Related papers (2021-08-11T16:11:01Z)
Blind Image Super-Resolution via Contrastive Representation Learning [41.17072720686262]
We design a contrastive representation learning network that focuses on blind SR of images with multi-modal and spatially variant distributions. We show that the proposed CRL-SR can handle multi-modal and spatially variant degradation effectively under blind settings. It also outperforms state-of-the-art SR methods qualitatively and quantitatively.
arXiv Detail & Related papers (2021-07-01T19:34:23Z)
SRDiff: Single Image Super-Resolution with Diffusion Probabilistic Models [19.17571465274627]
Single image super-resolution (SISR) aims to reconstruct high-resolution (HR) images from the given low-resolution (LR) ones. We propose a novel single image super-resolution diffusion probabilistic model (SRDiff) SRDiff is optimized with a variant of the variational bound on the data likelihood and can provide diverse and realistic SR predictions.
arXiv Detail & Related papers (2021-04-30T12:31:25Z)
Frequency Consistent Adaptation for Real World Super Resolution [64.91914552787668]
We propose a novel Frequency Consistent Adaptation (FCA) that ensures the frequency domain consistency when applying Super-Resolution (SR) methods to the real scene. We estimate degradation kernels from unsupervised images and generate the corresponding Low-Resolution (LR) images. Based on the domain-consistent LR-HR pairs, we train easy-implemented Convolutional Neural Network (CNN) SR models.
arXiv Detail & Related papers (2020-12-18T08:25:39Z)
Characteristic Regularisation for Super-Resolving Face Images [81.84939112201377]
Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery. Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data. This renders the model overstretched with two tasks: consistifying the visual characteristics and enhancing the image resolution. We formulate a method that joins the advantages of conventional SR and UDA models.
arXiv Detail & Related papers (2019-12-30T16:27:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.