An Efficient Remote Sensing Super Resolution Method Exploring Diffusion Priors and Multi-Modal Constraints for Crop Type Mapping
- URL: http://arxiv.org/abs/2510.23382v1
- Date: Mon, 27 Oct 2025 14:34:52 GMT
- Title: An Efficient Remote Sensing Super Resolution Method Exploring Diffusion Priors and Multi-Modal Constraints for Crop Type Mapping
- Authors: Songxi Yang, Tang Sui, Qunying Huang
- Abstract summary: Super resolution offers a way to harness medium- and even low-resolution but historically valuable remote sensing image archives. Current methods make limited use of auxiliary information as real-world constraints for reconstructing scientifically realistic images. We present an efficient LSSR framework for RSSR, supported by a new multimodal dataset of paired 30 m Landsat 8 and 10 m Sentinel 2 imagery.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Super resolution offers a way to harness medium- and even low-resolution but historically valuable remote sensing image archives. Generative models, especially diffusion models, have recently been applied to remote sensing super resolution (RSSR), yet several challenges remain. First, diffusion models are effective but require expensive resources to train from scratch and have slow inference speeds. Second, current methods make limited use of auxiliary information as real-world constraints for reconstructing scientifically realistic images. Finally, most current methods lack evaluation on downstream tasks. In this study, we present an efficient LSSR framework for RSSR, supported by a new multimodal dataset of paired 30 m Landsat 8 and 10 m Sentinel 2 imagery. Built on frozen pretrained Stable Diffusion, LSSR integrates cross-modal attention with auxiliary knowledge (Digital Elevation Model, land cover, month) and Synthetic Aperture Radar guidance, enhanced by adapters and a tailored Fourier NDVI loss to balance spatial details and spectral fidelity. Extensive experiments demonstrate that LSSR significantly improves crop boundary delineation and recovery, achieving state-of-the-art performance with Peak Signal-to-Noise Ratio/Structural Similarity Index Measure of 32.63/0.84 (RGB) and 23.99/0.78 (IR), and the lowest NDVI Mean Squared Error (0.042), while maintaining efficient inference (0.39 sec/image). Moreover, LSSR transfers effectively to NASA Harmonized Landsat and Sentinel (HLS) super resolution, yielding more reliable crop classification (F1: 0.86) than Sentinel-2 (F1: 0.85). These results highlight the potential of RSSR to advance precision agriculture.
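The abstract reports PSNR and an NDVI Mean Squared Error. As a rough illustration of how such metrics are commonly computed (this is not the authors' evaluation code), the sketch below uses the standard definitions NDVI = (NIR - Red) / (NIR + Red) and PSNR = 10 log10(MAX^2 / MSE). The function names, the small `eps` stabilizer, and the toy reflectance values are assumptions made for illustration only.

```python
import math


def ndvi(nir, red, eps=1e-8):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red) for flat lists of reflectances.

    eps is a hypothetical stabilizer that avoids division by zero on dark pixels.
    """
    return [(n - r) / (n + r + eps) for n, r in zip(nir, red)]


def mse(a, b):
    """Mean squared error between two equally sized flat lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(reference, estimate, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB; max_value is the peak pixel value."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


# Toy reflectances in [0, 1], invented for illustration (not from the paper).
hr_red = [0.10, 0.12, 0.30, 0.28]  # reference high-resolution red band
hr_nir = [0.50, 0.48, 0.32, 0.30]  # reference high-resolution near-infrared band
sr_red = [0.11, 0.12, 0.29, 0.27]  # super-resolved red band
sr_nir = [0.49, 0.47, 0.33, 0.30]  # super-resolved near-infrared band

ndvi_mse = mse(ndvi(hr_nir, hr_red), ndvi(sr_nir, sr_red))
red_psnr = psnr(hr_red, sr_red)
print(f"NDVI MSE: {ndvi_mse:.6f}, red-band PSNR: {red_psnr:.2f} dB")
```

Comparing NDVI maps rather than raw bands checks whether the relationship between the red and near-infrared bands survives super resolution, which is the spectral-fidelity concern the abstract raises.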
Related papers
- Task-Driven Prompt Learning: A Joint Framework for Multi-modal Cloud Removal and Segmentation [11.468907022707013]
TDP-CR is a task-driven framework that jointly performs cloud removal and land-cover segmentation.
Central to our approach is a Prompt-Guided Fusion mechanism, which utilizes a learnable degradation prompt to encode cloud thickness and spatial uncertainty.
Experiments on the LuojiaSET-OSFCR dataset demonstrate the superiority of our framework.
arXiv Detail & Related papers (2026-01-17T13:32:38Z)
- LPCAN: Lightweight Pyramid Cross-Attention Network for Rail Surface Defect Detection Using RGB-D Data [0.0]
This paper addresses the limitations of current vision-based rail defect detection methods.
We propose a Lightweight Pyramid Cross-Attention Network (LPCANet) that leverages RGB-D data for efficient and accurate defect identification.
LPCANet achieves state-of-the-art performance with only 9.90 million parameters, 2.50 G FLOPs, and 162.60 fps inference speed.
arXiv Detail & Related papers (2026-01-14T03:35:09Z)
- SERA-H: Beyond Native Sentinel Spatial Limits for High-Resolution Canopy Height Mapping [3.8902217877872034]
High-resolution mapping of canopy height is essential for forest management and biodiversity monitoring.
We present SERA-H, an end-to-end model combining a super-resolution module and temporal attention encoding.
Our model generates 2.5 m resolution height maps from freely available Sentinel-1 and Sentinel-2 time series data.
arXiv Detail & Related papers (2025-12-19T23:23:14Z)
- Dual-domain Adaptation Networks for Realistic Image Super-resolution [81.34345637776408]
Realistic image super-resolution (SR) focuses on transforming real-world low-resolution (LR) images into high-resolution (HR) ones.
Current methods struggle with limited real-world LR-HR data, impacting the learning of basic image features.
We introduce a novel approach, which is able to efficiently adapt pre-trained image SR models from simulated to real-world datasets.
arXiv Detail & Related papers (2025-11-21T12:57:23Z)
- STAR: A Benchmark for Astronomical Star Fields Super-Resolution [52.895107920663236]
We propose STAR, a large-scale astronomical SR dataset containing 54,738 flux-consistent star field image pairs.
We propose a Flux-Invariant Super Resolution (FISR) model that could accurately infer the flux-consistent high-resolution images from input photometry.
arXiv Detail & Related papers (2025-07-22T09:28:28Z)
- DiffFuSR: Super-Resolution of all Sentinel-2 Multispectral Bands using Diffusion Models [12.227962960123016]
This paper presents DiffFuSR, a modular pipeline for super-resolving all 12 spectral bands of Sentinel-2 Level-2A imagery.
The pipeline comprises two stages: (i) a diffusion-based super-resolution (SR) model trained on high-resolution RGB imagery from the NAIP and WorldStrat datasets, harmonized to simulate Sentinel-2 characteristics; and (ii) a learned fusion network that upscales the remaining multispectral bands using the super-resolved RGB image as a spatial prior.
arXiv Detail & Related papers (2025-06-13T13:18:09Z)
- One-Step Diffusion-based Real-World Image Super-Resolution with Visual Perception Distillation [53.24542646616045]
We propose VPD-SR, a novel visual perception diffusion distillation framework specifically designed for image super-resolution (SR) generation.
VPD-SR consists of two components: Explicit Semantic-aware Supervision (ESS) and High-frequency Perception (HFP) loss.
The proposed VPD-SR achieves superior performance compared to both previous state-of-the-art methods and the teacher model with just one-step sampling.
arXiv Detail & Related papers (2025-06-03T08:28:13Z)
- GenDR: Lightning Generative Detail Restorator [18.465568249533966]
We present a one-step diffusion model for generative detail restoration, GenDR, distilled from a tailored diffusion model with larger latent space.
Experimental results demonstrate that GenDR achieves state-of-the-art performance in both quantitative metrics and visual fidelity.
arXiv Detail & Related papers (2025-03-09T22:02:18Z)
- Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images [7.920423405957888]
E$^2$DiffSR achieves superior objective metrics and visual quality compared to the state-of-the-art SR methods.
It reduces the inference time of diffusion-based SR methods to a level comparable to that of non-diffusion methods.
arXiv Detail & Related papers (2024-10-30T09:14:13Z)
- Spatial Annealing for Efficient Few-shot Neural Rendering [73.49548565633123]
We introduce an accurate and efficient few-shot neural rendering method named Spatial Annealing regularized NeRF (SANeRF).
By adding merely one line of code, SANeRF delivers superior rendering quality and much faster reconstruction speed compared to current few-shot neural rendering methods.
arXiv Detail & Related papers (2024-06-12T02:48:52Z)
- ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting [70.83632337581034]
Diffusion-based image super-resolution (SR) methods are mainly limited by the low inference speed.
We propose a novel and efficient diffusion model for SR that significantly reduces the number of diffusion steps.
Our method constructs a Markov chain that transfers between the high-resolution image and the low-resolution image by shifting the residual.
arXiv Detail & Related papers (2023-07-23T15:10:02Z)
- SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery [36.216230299131404]
We propose SuperYOLO, which fuses multimodal data and performs high-resolution (HR) object detection on multiscale objects.
Our proposed model shows a favorable accuracy and speed tradeoff compared to the state-of-the-art models.
arXiv Detail & Related papers (2022-09-27T12:58:58Z)
- Fourier Space Losses for Efficient Perceptual Image Super-Resolution [131.50099891772598]
We show that it is possible to improve the performance of a recently introduced efficient generator architecture solely with the application of our proposed loss functions.
We show that our losses' direct emphasis on the frequencies in Fourier-space significantly boosts the perceptual image quality.
The trained generator achieves results comparable to, and is 2.4x and 48x faster than, the state-of-the-art perceptual SR methods RankSRGAN and SRFlow, respectively.
arXiv Detail & Related papers (2021-06-01T20:34:52Z)
- Frequency Consistent Adaptation for Real World Super Resolution [64.91914552787668]
We propose a novel Frequency Consistent Adaptation (FCA) that ensures the frequency domain consistency when applying Super-Resolution (SR) methods to the real scene.
We estimate degradation kernels from unsupervised images and generate the corresponding Low-Resolution (LR) images.
Based on the domain-consistent LR-HR pairs, we train easy-to-implement Convolutional Neural Network (CNN) SR models.
arXiv Detail & Related papers (2020-12-18T08:25:39Z)