Example-Guided Image Synthesis across Arbitrary Scenes using Masked
Spatial-Channel Attention and Self-Supervision
- URL: http://arxiv.org/abs/2004.10024v1
- Date: Sat, 18 Apr 2020 18:17:40 GMT
- Title: Example-Guided Image Synthesis across Arbitrary Scenes using Masked
Spatial-Channel Attention and Self-Supervision
- Authors: Haitian Zheng, Haofu Liao, Lele Chen, Wei Xiong, Tianlang Chen, Jiebo
Luo
- Abstract summary: Example-guided image synthesis aims to synthesize an image from a semantic label map and an exemplary image.
In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically different from the given label map.
We propose an end-to-end network for joint global and local feature alignment and synthesis.
- Score: 83.33283892171562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Example-guided image synthesis aims to synthesize an image from a
semantic label map and an exemplary image. In this task, the
additional exemplar image provides the style guidance that controls the
appearance of the synthesized output. Despite the controllability advantage,
the existing models are designed on datasets with specific and roughly aligned
objects. In this paper, we tackle a more challenging and general task, where
the exemplar is an arbitrary scene image that is semantically different from
the given label map. To this end, we first propose a Masked Spatial-Channel
Attention (MSCA) module which models the correspondence between two arbitrary
scenes via efficient decoupled attention. Next, we propose an end-to-end
network for joint global and local feature alignment and synthesis. Finally, we
propose a novel self-supervision task to enable training. Experiments on the
large-scale and more diverse COCO-stuff dataset show significant improvements
over the existing methods. Moreover, our approach provides interpretability and
can be readily extended to other content manipulation tasks including style and
spatial interpolation or extrapolation.
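The abstract describes MSCA as modeling correspondence between two arbitrary scenes "via efficient decoupled attention." The paper's exact formulation is not reproduced here; as a rough illustration of the decoupled idea (all function and variable names below are hypothetical, not from the paper), the following NumPy sketch separates a masked spatial-alignment step from a channel-reweighting step, rather than forming one large joint attention map:

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decoupled_spatial_channel_attention(query, exemplar, mask=None):
    """Hypothetical sketch of masked spatial-channel attention.

    query:    (C, N) features from the label-map branch (N = H*W locations)
    exemplar: (C, M) features from the exemplar scene (M = H'*W' locations)
    mask:     optional (M,) binary mask selecting valid exemplar regions
    Returns:  (C, N) exemplar features aligned to the query layout.
    """
    C, N = query.shape
    # Spatial step: each query location attends over exemplar locations.
    spatial_logits = query.T @ exemplar / np.sqrt(C)          # (N, M)
    if mask is not None:
        # Masked positions get a large negative logit, i.e. ~zero weight.
        spatial_logits = np.where(mask[None, :] > 0, spatial_logits, -1e9)
    spatial_attn = softmax(spatial_logits, axis=1)            # (N, M)
    spatially_aligned = exemplar @ spatial_attn.T             # (C, N)

    # Channel step: reweight channels by query/aligned-feature affinity.
    channel_logits = query @ spatially_aligned.T / np.sqrt(N) # (C, C)
    channel_attn = softmax(channel_logits, axis=1)
    return channel_attn @ spatially_aligned                   # (C, N)
```

The efficiency gain of decoupling comes from replacing one joint attention over spatial-and-channel interactions with two smaller maps: an (N, M) spatial map and a (C, C) channel map.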
Related papers
- Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting [49.87694319431288]
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources.
We propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs.
Experiments on incremental tasks (cardiac, fundus and prostate segmentation) show its clear advantage for alleviating concurrent appearance and semantic forgetting.
arXiv Detail & Related papers (2024-06-28T10:05:58Z)
- SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image [60.52991173059486]
We introduce SAMPLING, a Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image.
Our method demonstrates considerable performance gains in large-scale unbounded outdoor scenes using a single image on the KITTI dataset.
arXiv Detail & Related papers (2023-09-12T15:33:09Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows de facto Generative Adversarial Network (GAN)-based approaches.
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z)
- Dual Attention GANs for Semantic Image Synthesis [101.36015877815537]
We propose a novel Dual Attention GAN (DAGAN) to synthesize photo-realistic and semantically-consistent images.
We also propose two novel modules, i.e., a position-wise Spatial Attention Module (SAM) and a scale-wise Channel Attention Module (CAM).
DAGAN achieves remarkably better results than state-of-the-art methods, while using fewer model parameters.
arXiv Detail & Related papers (2020-08-29T17:49:01Z)
- Panoptic-based Image Synthesis [32.82903428124024]
Conditional image synthesis serves various applications, from content editing to content generation.
We propose a panoptic aware image synthesis network to generate high fidelity and photorealistic images conditioned on panoptic maps.
arXiv Detail & Related papers (2020-04-21T20:40:53Z)
- Contextual Encoder-Decoder Network for Visual Saliency Prediction [42.047816176307066]
We propose an approach based on a convolutional neural network pre-trained on a large-scale image classification task.
We combine the resulting representations with global scene information for accurately predicting visual saliency.
Compared to state-of-the-art approaches, the network is based on a lightweight image classification backbone.
arXiv Detail & Related papers (2019-02-18T16:15:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.