SU-ESRGAN: Semantic and Uncertainty-Aware ESRGAN for Super-Resolution of Satellite and Drone Imagery with Fine-Tuning for Cross Domain Evaluation
- URL: http://arxiv.org/abs/2508.00750v1
- Date: Fri, 01 Aug 2025 16:25:21 GMT
- Title: SU-ESRGAN: Semantic and Uncertainty-Aware ESRGAN for Super-Resolution of Satellite and Drone Imagery with Fine-Tuning for Cross Domain Evaluation
- Authors: Prerana Ramkumar,
- Abstract summary: Generative Adversarial Networks (GANs) have achieved realistic super-resolution (SR) of images; however, they lack semantic consistency and per-pixel confidence. This paper introduces Semantic and Uncertainty-Aware ESRGAN (SU-ESRGAN), the first SR framework designed for satellite imagery to integrate ESRGAN with a segmentation loss via DeepLabv3 and Monte Carlo dropout for pixel-wise uncertainty maps. The novel model is valuable in satellite systems or UAVs that use wide field-of-view (FoV) cameras, trading off spatial resolution for coverage.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) have achieved realistic super-resolution (SR) of images; however, they lack semantic consistency and per-pixel confidence, limiting their credibility in critical remote sensing applications such as disaster response, urban planning and agriculture. This paper introduces Semantic and Uncertainty-Aware ESRGAN (SU-ESRGAN), the first SR framework designed for satellite imagery to integrate ESRGAN with a segmentation loss via DeepLabv3 for class-detail preservation and Monte Carlo dropout to produce pixel-wise uncertainty maps. SU-ESRGAN produces results (PSNR, SSIM, LPIPS) comparable to the baseline ESRGAN on aerial imagery. This novel model is valuable in satellite systems or UAVs that use wide field-of-view (FoV) cameras, trading off spatial resolution for coverage. The modular design allows integration into UAV data pipelines for on-board or post-processing SR to enhance imagery degraded by motion blur, compression and sensor limitations. Further, the model is fine-tuned to evaluate its performance on cross-domain applications. The tests are conducted on two drone-based datasets that differ in altitude and imaging perspective. Performance evaluation of the fine-tuned models shows stronger adaptation to the Aerial Maritime Drone Dataset, whose imaging characteristics align with the training data, highlighting the importance of domain-aware training in SR applications.
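The Monte Carlo dropout idea described in the abstract can be sketched in a few lines: keep dropout active at inference, run many stochastic forward passes, and report the per-pixel mean as the prediction and the per-pixel standard deviation as the uncertainty map. This is a minimal toy illustration, not the paper's code; `sr_forward` is a stand-in for the SR network, and all shapes, names and the dropout rate are assumptions.

```python
import random

def sr_forward(pixels, drop_p=0.2):
    """Toy stand-in for one stochastic SR forward pass: inverted dropout
    randomly zeroes values, so repeated passes give different outputs."""
    return [x * (0.0 if random.random() < drop_p else 1.0 / (1.0 - drop_p))
            for x in pixels]

def mc_dropout_uncertainty(pixels, passes=500):
    """Per-pixel mean prediction and std-dev (uncertainty) over many passes."""
    samples = [sr_forward(pixels) for _ in range(passes)]
    n = len(pixels)
    mean = [sum(s[i] for s in samples) / passes for i in range(n)]
    var = [sum((s[i] - mean[i]) ** 2 for s in samples) / passes for i in range(n)]
    std = [v ** 0.5 for v in var]
    return mean, std

random.seed(0)
lr_patch = [0.2, 0.5, 0.9, 0.4]  # flattened toy image patch, not real imagery
mean_map, uncertainty_map = mc_dropout_uncertainty(lr_patch)
```

Because inverted dropout preserves the expected activation, the mean map converges to the deterministic prediction, while the std map flags pixels whose output varies most across passes.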
Related papers
- Dual-domain Adaptation Networks for Realistic Image Super-resolution [81.34345637776408]
Realistic image super-resolution (SR) focuses on transforming real-world low-resolution (LR) images into high-resolution (HR) ones. Current methods struggle with limited real-world LR-HR data, impacting the learning of basic image features. We introduce a novel approach which is able to efficiently adapt pre-trained image SR models from simulated to real-world datasets.
arXiv Detail & Related papers (2025-11-21T12:57:23Z) - ZoomEarth: Active Perception for Ultra-High-Resolution Geospatial Vision-Language Tasks [49.99788276124186]
Existing dynamic resolution and token pruning methods are constrained by a passive perception paradigm. We present LRS-GRO, a large-scale benchmark dataset tailored for active perception in UHR RS processing. We propose ZoomEarth, an adaptive cropping-zooming framework with a novel Region-Guided reward that provides fine-grained guidance.
arXiv Detail & Related papers (2025-11-15T15:47:46Z) - Satellite Image Utilization for Dehazing with Swin Transformer-Hybrid U-Net and Watershed loss [0.0]
This paper proposes a hybrid dehazing framework that integrates Swin Transformer and U-Net to balance global context learning and local detail restoration. We introduce a composite loss function that combines L2 loss, guided loss, and a novel watershed loss, which enhances structural boundary preservation and ensures pixel-level accuracy. Experimental results demonstrate that the proposed method outperforms state-of-the-art models on both the RICE and SateHaze1K datasets.
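A composite loss of this kind is just a weighted sum of terms. The 1-D sketch below uses plain L2 plus a gradient-difference term as a crude stand-in for the boundary-preserving watershed loss, and omits the guided loss entirely; the weights and helper names are illustrative assumptions, not the paper's implementation.

```python
def l2_loss(pred, target):
    """Pixel-wise mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def gradient_loss(pred, target):
    """Stand-in for a boundary-preserving term: penalise differences between
    adjacent-pixel gradients (a crude proxy for a watershed-style loss)."""
    gp = [b - a for a, b in zip(pred, pred[1:])]
    gt = [b - a for a, b in zip(target, target[1:])]
    return sum((p - t) ** 2 for p, t in zip(gp, gt)) / len(gp)

def composite_loss(pred, target, w_l2=1.0, w_grad=0.5):
    # Weights are illustrative; the paper's exact weighting is not given here.
    return w_l2 * l2_loss(pred, target) + w_grad * gradient_loss(pred, target)

pred = [0.1, 0.4, 0.8, 0.9]
target = [0.0, 0.5, 0.7, 1.0]
loss = composite_loss(pred, target)
```

The gradient term rises when the prediction smooths over an edge that the target keeps sharp, which is the intuition behind weighting boundary structure separately from raw pixel error.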
arXiv Detail & Related papers (2025-08-31T13:07:20Z) - Annotation-Free Open-Vocabulary Segmentation for Remote-Sensing Images [51.74614065919118]
This paper introduces SegEarth-OV, the first framework for annotation-free open-vocabulary segmentation of RS images. We propose SimFeatUp, a universal upsampler that robustly restores high-resolution spatial details from coarse features. We also present a simple yet effective Global Bias Alleviation operation to subtract the inherent global context from patch features.
arXiv Detail & Related papers (2025-08-25T14:22:57Z) - STAR: A Benchmark for Astronomical Star Fields Super-Resolution [51.79340280382437]
We propose STAR, a large-scale astronomical SR dataset containing 54,738 flux-consistent star field image pairs. We also propose a Flux-Invariant Super Resolution (FISR) model that accurately infers flux-consistent high-resolution images from input photometry.
arXiv Detail & Related papers (2025-07-22T09:28:28Z) - Data Augmentation and Resolution Enhancement using GANs and Diffusion Models for Tree Segmentation [49.13393683126712]
Urban forests play a key role in enhancing environmental quality and supporting biodiversity in cities. Accurately detecting trees is challenging due to complex landscapes and the variability in image resolution caused by different satellite sensors or UAV flight altitudes. We propose a novel pipeline that integrates domain adaptation with GANs and diffusion models to enhance the quality of low-resolution aerial images.
arXiv Detail & Related papers (2025-05-21T03:57:10Z) - A Diffusion-Based Framework for Terrain-Aware Remote Sensing Image Reconstruction [4.824120664293887]
SatelliteMaker is a diffusion-based method that reconstructs missing data across varying levels of data loss. It takes a Digital Elevation Model (DEM) as a conditioning input and uses tailored prompts to generate realistic images. A VGG-Adapter module based on a Distribution Loss reduces distribution discrepancy and ensures style consistency.
arXiv Detail & Related papers (2025-04-16T14:19:57Z) - SatFlow: Generative model based framework for producing High Resolution Gap Free Remote Sensing Imagery [0.0]
We present SatFlow, a generative model-based framework that fuses low-resolution MODIS imagery and Landsat observations to produce frequent, high-resolution, gap-free surface reflectance imagery. Our model, trained via Conditional Flow Matching, demonstrates better performance in generating imagery with preserved structural and spectral integrity. This capability is crucial for downstream applications such as crop phenology tracking and environmental change detection.
arXiv Detail & Related papers (2025-02-03T06:40:13Z) - DemosaicFormer: Coarse-to-Fine Demosaicing Network for HybridEVS Camera [70.28702677370879]
Hybrid Event-Based Vision Sensor (HybridEVS) is a novel sensor integrating traditional frame-based and event-based sensors.
Despite its potential, the lack of Image signal processing (ISP) pipeline specifically designed for HybridEVS poses a significant challenge.
We propose a coarse-to-fine framework named DemosaicFormer which comprises coarse demosaicing and pixel correction.
arXiv Detail & Related papers (2024-06-12T07:20:46Z) - Semantic Guided Large Scale Factor Remote Sensing Image Super-resolution with Generative Diffusion Prior [13.148815217684277]
Large scale factor super-resolution (SR) algorithms are vital for maximizing the utilization of low-resolution (LR) satellite data captured from orbit.
Existing methods confront challenges in recovering SR images with clear textures and correct ground objects.
We introduce a novel framework, the Semantic Guided Diffusion Model (SGDM), designed for large scale factor remote sensing image super-resolution.
arXiv Detail & Related papers (2024-05-11T16:06:16Z) - DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z) - A comparative analysis of SRGAN models [0.0]
We evaluate the performance of SRGAN (Super Resolution Generative Adversarial Network) models, ESRGAN, Real-ESRGAN and EDSR, on a benchmark dataset of real-world images.
Some models seem to significantly increase the resolution of the input images while preserving their visual quality; this is assessed using the Tesseract OCR engine.
arXiv Detail & Related papers (2023-07-18T17:35:45Z) - Hierarchical Similarity Learning for Aliasing Suppression Image Super-Resolution [64.15915577164894]
A hierarchical image super-resolution network (HSRNet) is proposed to suppress the influence of aliasing.
HSRNet achieves better quantitative and visual performance than other works, and remits the aliasing more effectively.
arXiv Detail & Related papers (2022-06-07T14:55:32Z) - Frequency Consistent Adaptation for Real World Super Resolution [64.91914552787668]
We propose a novel Frequency Consistent Adaptation (FCA) that ensures the frequency domain consistency when applying Super-Resolution (SR) methods to the real scene.
We estimate degradation kernels from unsupervised images and generate the corresponding Low-Resolution (LR) images.
Based on the domain-consistent LR-HR pairs, we train easy-implemented Convolutional Neural Network (CNN) SR models.
arXiv Detail & Related papers (2020-12-18T08:25:39Z)
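The FCA pipeline above first estimates a degradation kernel, then uses it to synthesize domain-consistent LR images from HR ones. The LR-generation step can be sketched in 1-D; this is a minimal illustration assuming the kernel has already been estimated (the box kernel and signal here are toy placeholders, not FCA's learned kernel).

```python
def degrade(hr, kernel, scale=2):
    """Synthesize an LR signal from an HR one: convolve with an (assumed,
    already-estimated) degradation kernel, then subsample by `scale`."""
    k = len(kernel)
    pad = k // 2
    padded = [hr[0]] * pad + hr + [hr[-1]] * pad   # replicate-pad the edges
    blurred = [sum(padded[i + j] * kernel[j] for j in range(k))
               for i in range(len(hr))]
    return blurred[::scale]                        # decimate to LR resolution

box = [1 / 3, 1 / 3, 1 / 3]                        # stand-in for an estimated kernel
hr_signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
lr_signal = degrade(hr_signal, box, scale=2)
```

Pairing each HR sample with the LR output of its own estimated kernel is what keeps the training pairs consistent with the real scene's frequency characteristics.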
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.