RSDiff: Remote Sensing Image Generation from Text Using Diffusion Model
- URL: http://arxiv.org/abs/2309.02455v1
- Date: Sun, 3 Sep 2023 09:34:49 GMT
- Title: RSDiff: Remote Sensing Image Generation from Text Using Diffusion Model
- Authors: Ahmad Sebaq, Mohamed ElHelw
- Abstract summary: We propose an innovative and lightweight approach to generate high-resolution satellite images purely based on text prompts.
Our results demonstrate that our approach outperforms existing state-of-the-art (SoTA) models in generating satellite images with realistic geographical features, weather conditions, and land structures.
- Score: 1.0334138809056097
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Satellite imagery generation and super-resolution are pivotal tasks in remote
sensing, demanding high-quality, detailed images for accurate analysis and
decision-making. In this paper, we propose an innovative and lightweight
approach that employs two-stage diffusion models to gradually generate
high-resolution satellite images purely from text prompts. Our
pipeline comprises two interconnected diffusion models: a Low-Resolution
Generation Diffusion Model (LR-GDM) that generates low-resolution images from
text, and a Super-Resolution Diffusion Model (SRDM) conditioned on that output. The
LR-GDM synthesizes low-resolution images by computing the correlations between
the text embedding and the image embedding in a shared latent space, capturing
the essential content and layout of the desired scenes. Subsequently, the SRDM
takes the generated low-resolution image and its corresponding text prompt and
efficiently produces the high-resolution counterpart, infusing fine-grained
spatial details and enhancing visual fidelity. Experiments are conducted on the
commonly used dataset, Remote Sensing Image Captioning Dataset (RSICD). Our
results demonstrate that our approach outperforms existing state-of-the-art
(SoTA) models in generating satellite images with realistic geographical
features, weather conditions, and land structures while achieving remarkable
super-resolution results for increased spatial precision.
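A minimal sketch of how such a cascaded text-to-image pipeline could be wired together in PyTorch is shown below; the `text_encoder`, `lr_gdm`, `srdm`, and `scheduler` objects and their call signatures are assumptions made for illustration, not the authors' released interface.

```python
# Minimal sketch of a cascaded text-to-image pipeline in the spirit of RSDiff:
# a text-conditioned low-resolution generator followed by a super-resolution
# stage conditioned on both the text and the low-resolution result.
# `text_encoder`, `lr_gdm`, `srdm`, and `scheduler` are assumed components.
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate_satellite_image(text_encoder, lr_gdm, srdm, scheduler,
                             prompt: str, lr_size: int = 64, hr_size: int = 256):
    text_emb = text_encoder(prompt)                    # (1, seq_len, dim)

    # Stage 1: LR-GDM samples a low-resolution image from text alone.
    x = torch.randn(1, 3, lr_size, lr_size)
    for t in scheduler.timesteps:
        eps = lr_gdm(x, t, text_emb)                   # predicted noise
        x = scheduler.step(eps, t, x)                  # one reverse-diffusion step
    lr_image = x

    # Stage 2: SRDM refines, conditioned on the upsampled LR image and the text.
    lr_up = F.interpolate(lr_image, size=(hr_size, hr_size),
                          mode="bilinear", align_corners=False)
    x = torch.randn(1, 3, hr_size, hr_size)
    for t in scheduler.timesteps:
        eps = srdm(x, t, text_emb, lr_up)              # LR conditioning, e.g. via concat
        x = scheduler.step(eps, t, x)
    return x                                           # HR sample, roughly in [-1, 1]
```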
Related papers
- Controllable Reference-Based Real-World Remote Sensing Image Super-Resolution with Generative Diffusion Priors [13.148815217684277]
Super-resolution (SR) techniques can enhance the spatial resolution of remote sensing images by utilizing low-resolution (LR) images to reconstruct high-resolution (HR) images.
Existing RefSR methods struggle with real-world complexities, such as cross-sensor resolution gap and significant land cover changes.
We propose CRefDiff, a novel controllable reference-based diffusion model for real-world remote sensing image SR.
arXiv Detail & Related papers (2025-06-30T12:45:28Z) - A Diffusion-Based Framework for Terrain-Aware Remote Sensing Image Reconstruction [4.824120664293887]
SatelliteMaker is a diffusion-based method that reconstructs missing data across varying levels of data loss.
It takes the Digital Elevation Model (DEM) as a conditioning input and uses tailored prompts to generate realistic images.
A VGG-Adapter module based on a Distribution Loss reduces distribution discrepancy and ensures style consistency.
arXiv Detail & Related papers (2025-04-16T14:19:57Z) - HoliSDiP: Image Super-Resolution via Holistic Semantics and Diffusion Prior [62.04939047885834]
We present HoliSDiP, a framework that leverages semantic segmentation to provide both precise textual and spatial guidance for Real-ISR.
Our method employs semantic labels as concise text prompts while introducing dense semantic guidance through segmentation masks and our proposed spatial-CLIP Map.
arXiv Detail & Related papers (2024-11-27T15:22:44Z) - High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity [69.32473738284374]
We propose DiffDIS, a diffusion-driven segmentation model that taps into the potential of the pre-trained U-Net within diffusion models.
By leveraging the robust generalization capabilities and rich, versatile image representation prior of the SD models, we significantly reduce the inference time while preserving high-fidelity, detailed generation.
Experiments on the DIS5K dataset demonstrate the superiority of DiffDIS, achieving state-of-the-art results through a streamlined inference process.
arXiv Detail & Related papers (2024-10-14T02:49:23Z) - Conditional Brownian Bridge Diffusion Model for VHR SAR to Optical Image Translation [5.578820789388206]
This paper introduces a conditional image-to-image translation approach based on the Brownian Bridge Diffusion Model (BBDM).
We conduct comprehensive experiments on the MSAW dataset, a collection of paired SAR and optical images at 0.5 m Very-High-Resolution (VHR).
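For context, a Brownian Bridge diffusion replaces the usual pure-noise endpoint with the conditioning image, so the forward process drifts from the target optical image toward the source SAR image. The sketch below illustrates this with a simplified mixing and variance schedule (an assumption here, not the paper's exact formulation).

```python
# Rough illustration of a Brownian-Bridge-style forward sample: the state at
# step t interpolates between the target optical image x0 (t = 0) and the
# conditioning SAR image y (t = T). The schedule below is a simplification.
import torch

def brownian_bridge_sample(x0: torch.Tensor, y: torch.Tensor,
                           t: int, T: int, max_var: float = 1.0) -> torch.Tensor:
    m_t = t / T                                    # mixing weight: 0 at t=0, 1 at t=T
    var_t = 2.0 * max_var * m_t * (1.0 - m_t)      # variance vanishes at both endpoints
    noise = torch.randn_like(x0)
    return (1.0 - m_t) * x0 + m_t * y + var_t ** 0.5 * noise
```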
arXiv Detail & Related papers (2024-08-15T05:43:46Z) - Semantic Guided Large Scale Factor Remote Sensing Image Super-resolution with Generative Diffusion Prior [13.148815217684277]
Large scale factor super-resolution (SR) algorithms are vital for maximizing the utilization of low-resolution (LR) satellite data captured from orbit.
Existing methods confront challenges in recovering SR images with clear textures and correct ground objects.
We introduce a novel framework, the Semantic Guided Diffusion Model (SGDM), designed for large scale factor remote sensing image super-resolution.
arXiv Detail & Related papers (2024-05-11T16:06:16Z) - RS-Mamba for Large Remote Sensing Image Dense Prediction [58.12667617617306]
We propose the Remote Sensing Mamba (RSM) for dense prediction tasks in large VHR remote sensing images.
RSM is specifically designed to capture the global context of remote sensing images with linear complexity.
Our model achieves better efficiency and accuracy than transformer-based models on large remote sensing images.
arXiv Detail & Related papers (2024-04-03T12:06:01Z) - Bridging the Domain Gap: A Simple Domain Matching Method for Reference-based Image Super-Resolution in Remote Sensing [8.36527949191506]
Recently, reference-based image super-resolution (RefSR) has shown excellent performance in image super-resolution (SR) tasks.
We introduce a Domain Matching (DM) module that can be seamlessly integrated with existing RefSR models.
Our analysis reveals that domain gaps often arise between images from different satellites, and our model effectively addresses these challenges.
arXiv Detail & Related papers (2024-01-29T08:10:00Z) - Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution [15.391125077873745]
Scene Text Image Super-Resolution (STISR) aims to enhance the resolution and legibility of text within low-resolution (LR) images.
Previous methods predominantly employ discriminative Convolutional Neural Networks (CNNs) augmented with diverse forms of text guidance.
We introduce RGDiffSR, a Recognition-Guided Diffusion model for scene text image Super-Resolution, which exhibits great generative diversity and fidelity even in challenging scenarios.
arXiv Detail & Related papers (2023-11-22T11:10:45Z) - Exploiting Digital Surface Models for Inferring Super-Resolution for Remotely Sensed Images [2.3204178451683264]
This paper introduces a novel approach for forcing an SRR model to output realistic remote sensing images.
Instead of relying on feature-space similarities as a perceptual loss, the model considers pixel-level information inferred from the normalized Digital Surface Model (nDSM) of the image.
Based on visual inspection, the inferred super-resolution images exhibit noticeably superior quality.
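One plausible reading of that objective, sketched below under the assumption of a pretrained `ndsm_estimator` network (this is not the authors' code): the reconstruction loss is augmented with a pixel-level consistency term between height maps predicted from the super-resolved and ground-truth images, instead of a feature-space perceptual term.

```python
# Hypothetical nDSM-guided training loss for a super-resolution model.
# `ndsm_estimator` is an assumed pretrained network that predicts a normalized
# Digital Surface Model (height map) from an RGB image.
import torch
import torch.nn.functional as F

def ndsm_guided_loss(sr_image, hr_image, ndsm_estimator, weight: float = 0.1):
    recon = F.l1_loss(sr_image, hr_image)                # standard pixel reconstruction
    with torch.no_grad():
        ndsm_target = ndsm_estimator(hr_image)           # height map of the real HR image
    ndsm_pred = ndsm_estimator(sr_image)                 # height map of the SR output
    return recon + weight * F.l1_loss(ndsm_pred, ndsm_target)
```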
arXiv Detail & Related papers (2022-05-09T06:02:50Z) - High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided DSR.
We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z) - Real Image Super Resolution Via Heterogeneous Model Ensemble using GP-NAS [63.48801313087118]
We propose a new method for image super-resolution using a deep residual network with dense skip connections.
The proposed method won the first place in all three tracks of the AIM 2020 Real Image Super-Resolution Challenge.
arXiv Detail & Related papers (2020-09-02T22:33:23Z) - PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models [77.32079593577821]
PULSE (Photo Upsampling via Latent Space Exploration) generates high-resolution, realistic images at resolutions previously unseen in the literature.
Our method outperforms state-of-the-art methods in perceptual quality at higher resolutions and scale factors than previously possible.
arXiv Detail & Related papers (2020-03-08T16:44:31Z) - DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution [69.2432352477966]
Real image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images.
In this article, we propose a Dual-path Dynamic Enhancement Network (DDet) for Real-SR.
Unlike conventional methods which stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.