Reconciling Semantic Controllability and Diversity for Remote Sensing Image Synthesis with Hybrid Semantic Embedding
- URL: http://arxiv.org/abs/2411.14781v1
- Date: Fri, 22 Nov 2024 07:51:36 GMT
- Title: Reconciling Semantic Controllability and Diversity for Remote Sensing Image Synthesis with Hybrid Semantic Embedding
- Authors: Junde Liu, Danpei Zhao, Bo Yuan, Wentao Li, Tian Li
- Abstract summary: We present a Hybrid Semantic Embedding Guided Generative Adversarial Network (HySEGGAN) for controllable and efficient remote sensing image synthesis.
Motivated by feature description, we propose a hybrid semantic embedding method that coordinates fine-grained local semantic layouts.
A Semantic Refinement Network (SRN) is introduced, incorporating a novel loss function to ensure fine-grained semantic feedback.
- Score: 12.330893658398042
- Abstract: Significant advancements have been made in semantic image synthesis in remote sensing. However, existing methods still face formidable challenges in balancing semantic controllability and diversity. In this paper, we present a Hybrid Semantic Embedding Guided Generative Adversarial Network (HySEGGAN) for controllable and efficient remote sensing image synthesis. Specifically, HySEGGAN leverages hierarchical information from a single source. Motivated by feature description, we propose a hybrid semantic embedding method that coordinates fine-grained local semantic layouts to characterize the geometric structure of remote sensing objects without extra information. In addition, a Semantic Refinement Network (SRN) is introduced, incorporating a novel loss function to ensure fine-grained semantic feedback. The proposed approach mitigates semantic confusion and prevents geometric pattern collapse. Experimental results indicate that the method strikes an excellent balance between semantic controllability and diversity. Furthermore, HySEGGAN significantly improves the quality of synthesized images and achieves state-of-the-art performance as a data augmentation technique across multiple datasets for downstream tasks.
Related papers
- Trustworthy Image Semantic Communication with GenAI: Explainablity, Controllability, and Efficiency [59.15544887307901]
Image semantic communication (ISC) has garnered significant attention for its potential to achieve high efficiency in visual content transmission.
Existing ISC systems based on joint source-channel coding face challenges in interpretability, operability, and compatibility.
We propose a novel trustworthy ISC framework that employs Generative Artificial Intelligence (GenAI) for multiple downstream inference tasks.
arXiv Detail & Related papers (2024-08-07T14:32:36Z)
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S²RM to achieve high-quality cross-modality fusion.
It follows a three-part working strategy: distributing language features, spatial semantic recurrent coparsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z)
- Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations [61.132408427908175]
Zero-shot GAN adaptation aims to reuse well-trained generators to synthesize images of an unseen target domain.
With only a single representative text feature instead of real images, the synthesized images gradually lose diversity.
We propose a novel method to find semantic variations of the target text in the CLIP space.
arXiv Detail & Related papers (2023-08-21T08:12:28Z)
- Semantic-aware Network for Aerial-to-Ground Image Synthesis [42.360670351361584]
We propose a novel framework to explore the challenges by imposing enhanced structural alignment and semantic awareness.
We introduce a novel semantic-attentive feature transformation module that allows complex geographic structures to be reconstructed.
We also propose semantic-aware loss functions by leveraging a pre-trained segmentation network.
arXiv Detail & Related papers (2023-08-14T05:37:07Z)
- Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis [139.2216271759332]
We propose a novel ECGAN for the challenging semantic image synthesis task.
The semantic labels do not provide detailed structural information, making it challenging to synthesize local details and structures.
The widely adopted CNN operations such as convolution, down-sampling, and normalization usually cause spatial resolution loss.
We propose a novel contrastive learning method, which aims to enforce pixel embeddings belonging to the same semantic class to generate more similar image content.
arXiv Detail & Related papers (2023-07-22T14:17:19Z)
- Unsupervised Synthetic Image Refinement via Contrastive Learning and Consistent Semantic-Structural Constraints [32.07631215590755]
Contrastive learning (CL) has been successfully used to pull correlated patches together and push uncorrelated ones apart.
In this work, we exploit semantic and structural consistency between synthetic and refined images and adopt CL to reduce the semantic distortion.
arXiv Detail & Related papers (2023-04-25T05:55:28Z)
- Scale-Semantic Joint Decoupling Network for Image-text Retrieval in Remote Sensing [23.598273691455503]
We propose a novel Scale-Semantic Joint Decoupling Network (SSJDN) for remote sensing image-text retrieval.
Our proposed SSJDN outperforms state-of-the-art approaches in numerical experiments conducted on four benchmark remote sensing datasets.
arXiv Detail & Related papers (2022-12-12T08:02:35Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- Reinforcement Learning-powered Semantic Communication via Semantic Similarity [13.569045590522316]
We introduce a new semantic communication mechanism, whose key idea is to preserve the semantic information instead of strictly securing the bit-level precision.
We show that the commonly used bit-level metrics struggle to capture important semantic meaning and structures.
We put forward a reinforcement learning (RL)-based solution which allows us to simultaneously optimize any user-defined semantic measurement.
arXiv Detail & Related papers (2021-08-27T05:21:05Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z)