Dual Attention GANs for Semantic Image Synthesis
- URL: http://arxiv.org/abs/2008.13024v1
- Date: Sat, 29 Aug 2020 17:49:01 GMT
- Title: Dual Attention GANs for Semantic Image Synthesis
- Authors: Hao Tang, Song Bai, Nicu Sebe
- Abstract summary: We propose a novel Dual Attention GAN (DAGAN) to synthesize photo-realistic and semantically-consistent images.
We also propose two novel modules, i.e., position-wise Spatial Attention Module (SAM) and scale-wise Channel Attention Module (CAM)
DAGAN achieves remarkably better results than state-of-the-art methods, while using fewer model parameters.
- Score: 101.36015877815537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we focus on the semantic image synthesis task that aims at
transferring semantic label maps to photo-realistic images. Existing methods
lack effective semantic constraints to preserve the semantic information and
ignore the structural correlations in both spatial and channel dimensions,
leading to unsatisfactory blurry and artifact-prone results. To address these
limitations, we propose a novel Dual Attention GAN (DAGAN) to synthesize
photo-realistic and semantically-consistent images with fine details from the
input layouts without imposing extra training overhead or modifying the network
architectures of existing methods. We also propose two novel modules, i.e.,
position-wise Spatial Attention Module (SAM) and scale-wise Channel Attention
Module (CAM), to capture semantic structure attention in spatial and channel
dimensions, respectively. Specifically, SAM selectively correlates the pixels
at each position by a spatial attention map, leading to pixels with the same
semantic label being related to each other regardless of their spatial
distances. Meanwhile, CAM selectively emphasizes the scale-wise features at
each channel by a channel attention map, which integrates associated features
among all channel maps regardless of their scales. We finally sum the outputs
of SAM and CAM to further improve feature representation. Extensive experiments
on four challenging datasets show that DAGAN achieves remarkably better results
than state-of-the-art methods, while using fewer model parameters. The source
code and trained models are available at https://github.com/Ha0Tang/DAGAN.
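The abstract's description of SAM and CAM can be illustrated with a minimal NumPy sketch of dot-product attention: SAM builds a position-by-position attention map so that pixels sharing a semantic label can influence each other regardless of spatial distance, CAM builds a channel-by-channel map that integrates associated features across channels, and the two outputs are summed. This is a simplified illustration of the general mechanism, not the authors' implementation; the exact DAGAN formulation (learned projections, residual scaling, placement in the generator) may differ, and the feature tensor shown is synthetic.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(feat):
    # feat: (C, H, W). Correlate every position with every other position,
    # so pixels with the same semantic label can attend to each other
    # regardless of their spatial distance.
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)            # (C, N) with N = H*W positions
    attn = softmax(x.T @ x, axis=-1)      # (N, N) position-wise attention map
    return (x @ attn.T).reshape(C, H, W)  # aggregate features per position

def channel_attention(feat):
    # feat: (C, H, W). Correlate channels with each other to emphasize
    # associated scale-wise features among all channel maps.
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)            # (C, N)
    attn = softmax(x @ x.T, axis=-1)      # (C, C) channel-wise attention map
    return (attn @ x).reshape(C, H, W)

# Synthetic feature map standing in for a generator's intermediate features.
feat = np.random.rand(8, 4, 4).astype(np.float32)

# Sum the two branch outputs to fuse them, as the abstract describes.
fused = spatial_attention(feat) + channel_attention(feat)
print(fused.shape)
```

The key design point the sketch reflects is that the two branches operate on the same input in parallel and are fused by addition, so neither attention dimension gates the other.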
Related papers
- SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation [68.42476385214785]
We propose a novel Spatial-Semantic Map Guided (SSMG) diffusion model that adopts the feature map, derived from the layout, as guidance.
SSMG achieves superior generation quality with sufficient spatial and semantic controllability compared to previous works.
We also propose the Relation-Sensitive Attention (RSA) and Location-Sensitive Attention (LSA) mechanisms.
arXiv Detail & Related papers (2023-08-20T04:09:12Z)
- Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis [139.2216271759332]
We propose a novel ECGAN for the challenging semantic image synthesis task.
The semantic labels do not provide detailed structural information, making it challenging to synthesize local details and structures.
The widely adopted CNN operations such as convolution, down-sampling, and normalization usually cause spatial resolution loss.
We propose a novel contrastive learning method, which aims to enforce pixel embeddings belonging to the same semantic class to generate more similar image content.
arXiv Detail & Related papers (2023-07-22T14:17:19Z)
- Efficient Multi-Scale Attention Module with Cross-Spatial Learning [4.046170185945849]
A novel efficient multi-scale attention (EMA) module is proposed.
We focus on retaining per-channel information while decreasing the computational overhead.
We conduct extensive ablation studies and experiments on image classification and object detection tasks.
arXiv Detail & Related papers (2023-05-23T00:35:47Z)
- Multi-Granularity Denoising and Bidirectional Alignment for Weakly Supervised Semantic Segmentation [75.32213865436442]
We propose an end-to-end multi-granularity denoising and bidirectional alignment (MDBA) model to alleviate the noisy label and multi-class generalization issues.
The MDBA model can reach the mIoU of 69.5% and 70.2% on validation and test sets for the PASCAL VOC 2012 dataset.
arXiv Detail & Related papers (2023-05-09T03:33:43Z)
- Scale-Semantic Joint Decoupling Network for Image-text Retrieval in Remote Sensing [23.598273691455503]
We propose a novel Scale-Semantic Joint Decoupling Network (SSJDN) for remote sensing image-text retrieval.
Our proposed SSJDN outperforms state-of-the-art approaches in numerical experiments conducted on four benchmark remote sensing datasets.
arXiv Detail & Related papers (2022-12-12T08:02:35Z)
- Semantic Labeling of High Resolution Images Using EfficientUNets and Transformers [5.177947445379688]
We propose a new segmentation model that combines convolutional neural networks with deep transformers.
Our results demonstrate that the proposed methodology improves segmentation accuracy compared to state-of-the-art techniques.
arXiv Detail & Related papers (2022-06-20T12:03:54Z)
- Multi-Scale Feature Fusion: Learning Better Semantic Segmentation for Road Pothole Detection [9.356003255288417]
This paper presents a novel pothole detection approach based on single-modal semantic segmentation.
It first extracts visual features from input images using a convolutional neural network.
A channel attention module then reweighs the channel features to enhance the consistency of different feature maps.
arXiv Detail & Related papers (2021-12-24T15:07:47Z)
- Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-Supervision [83.33283892171562]
Example-guided image synthesis has recently been attempted to synthesize an image from a semantic label map and an exemplary image.
In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically different from the given label map.
We propose an end-to-end network for joint global and local feature alignment and synthesis.
arXiv Detail & Related papers (2020-04-18T18:17:40Z)
- Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis [194.1452124186117]
We propose a novel ECGAN for the challenging semantic image synthesis task.
Our ECGAN achieves significantly better results than state-of-the-art methods.
arXiv Detail & Related papers (2020-03-31T01:23:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.