Controllable Image Synthesis via SegVAE
- URL: http://arxiv.org/abs/2007.08397v2
- Date: Fri, 17 Jul 2020 04:13:08 GMT
- Title: Controllable Image Synthesis via SegVAE
- Authors: Yen-Chi Cheng, Hsin-Ying Lee, Min Sun, Ming-Hsuan Yang
- Abstract summary: A semantic map is a commonly used intermediate representation for conditional image generation.
In this work, we specifically target generating semantic maps given a label-set consisting of desired categories.
The proposed framework, SegVAE, synthesizes semantic maps in an iterative manner using a conditional variational autoencoder.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Flexible user controls are desirable for content creation and image editing.
A semantic map is a commonly used intermediate representation for conditional
image generation. Compared to operating on raw RGB pixels, a semantic map
enables simpler user modification. In this work, we specifically target
generating semantic maps given a label-set consisting of desired categories.
The proposed framework, SegVAE, synthesizes semantic maps in an iterative
manner using a conditional variational autoencoder. Quantitative and qualitative
experiments demonstrate that the proposed model can generate realistic and
diverse semantic maps. We also apply an off-the-shelf image-to-image
translation model to generate realistic RGB images to better understand the
quality of the synthesized semantic maps. Furthermore, we showcase several
real-world image-editing applications including object removal, object
insertion, and object replacement.
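To make the iterative procedure concrete, below is a minimal sketch of a conditional-VAE loop that synthesizes one category mask at a time, conditioning each step on the canvas built so far. The network sizes, flat mask representation, and fixed generation order are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of SegVAE-style iterative semantic-map generation.
# Assumptions: flat 64x64 masks, MLP encoder/decoder, fixed class order.
import torch
import torch.nn as nn

class ConditionalVAE(nn.Module):
    def __init__(self, mask_dim=64 * 64, num_classes=10, z_dim=32):
        super().__init__()
        self.z_dim = z_dim
        cond_dim = mask_dim + num_classes  # canvas so far + one-hot target class
        # The encoder is only used at training time to infer q(z | mask, cond).
        self.encoder = nn.Sequential(
            nn.Linear(mask_dim + cond_dim, 256), nn.ReLU(),
            nn.Linear(256, 2 * z_dim),  # outputs mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.Linear(z_dim + cond_dim, 256), nn.ReLU(),
            nn.Linear(256, mask_dim), nn.Sigmoid(),  # per-pixel mask probability
        )

    def decode(self, z, cond):
        return self.decoder(torch.cat([z, cond], dim=-1))

@torch.no_grad()
def generate(model, label_set, num_classes=10, mask_dim=64 * 64):
    """Synthesize one binary mask per requested class, conditioning each
    step on the union of the masks generated so far."""
    canvas = torch.zeros(1, mask_dim)
    masks = {}
    for cls in label_set:
        onehot = torch.zeros(1, num_classes)
        onehot[0, cls] = 1.0
        cond = torch.cat([canvas, onehot], dim=-1)
        z = torch.randn(1, model.z_dim)  # sample from the prior for diversity
        mask = (model.decode(z, cond) > 0.5).float()
        masks[cls] = mask
        canvas = torch.clamp(canvas + mask, max=1.0)
    return masks

maps = generate(ConditionalVAE(), label_set=[2, 5, 7])
print({cls: int(m.sum()) for cls, m in maps.items()})
```

Sampling a fresh latent per step is what gives diverse layouts for the same label-set; an untrained model as above produces noise, but the control flow matches the iterative scheme the abstract describes.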
Related papers
- CIMGEN: Controlled Image Manipulation by Finetuning Pretrained Generative Models on Limited Data
A semantic map has information of objects present in the image.
One can easily modify the map to selectively insert, remove, or replace objects in the map.
The proposed method takes in the modified semantic map and alters the original image in accordance with the modified map, as sketched below.
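Such edits amount to rewriting label ids in the map. A minimal NumPy sketch, with made-up class ids and a rectangular insertion region, might look like this:

```python
# Illustrative semantic-map edits: remove, replace, and insert objects
# by rewriting label ids. Class ids and regions are made up for the sketch.
import numpy as np

SKY, ROAD, CAR, TREE = 0, 1, 2, 3
semantic_map = np.zeros((128, 128), dtype=np.int64)  # everything starts as SKY
semantic_map[64:, :] = ROAD
semantic_map[80:110, 30:70] = CAR

# Remove: relabel the object's pixels with the surrounding background class.
removed = semantic_map.copy()
removed[removed == CAR] = ROAD

# Replace: swap one object's class for another in place.
replaced = semantic_map.copy()
replaced[replaced == CAR] = TREE

# Insert: paint a new object region directly onto the map.
inserted = semantic_map.copy()
inserted[20:50, 90:120] = TREE

print(np.unique(inserted))  # [0 1 2 3]
```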
arXiv Detail & Related papers (2024-01-23T06:30:47Z)
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis
We propose a new class of GAN discriminators for semantic image synthesis that enables the generation of highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
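A common way to build such a discriminator is to train a small head on top of a frozen pretrained backbone. The sketch below assumes a torchvision ResNet-18 and a single-scale patch head, a simplification of the multi-scale design a method like this would use:

```python
# Sketch of a discriminator over frozen pretrained-backbone features.
# Assumptions: torchvision ResNet-18, single-scale 1x1-conv patch head.
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

class BackboneDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
        # Keep everything up to the last conv stage; freeze its weights.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        for p in self.features.parameters():
            p.requires_grad = False
        # Only this lightweight head is trained to predict real vs. fake.
        self.head = nn.Conv2d(512, 1, kernel_size=1)

    def forward(self, x):
        with torch.no_grad():
            f = self.features(x)  # (B, 512, H/32, W/32) frozen features
        return self.head(f)       # patch-wise real/fake logits

disc = BackboneDiscriminator()
logits = disc(torch.randn(2, 3, 256, 256))
print(logits.shape)  # torch.Size([2, 1, 8, 8])
```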
arXiv Detail & Related papers (2023-12-20T09:39:19Z)
- Wavelet-based Unsupervised Label-to-Image Translation
We propose a new unsupervised paradigm for semantic image synthesis (USIS) that makes use of a self-supervised segmentation loss and whole-image wavelet-based discrimination.
We test our methodology on 3 challenging datasets and demonstrate its ability to bridge the performance gap between paired and unpaired models.
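As an illustration of wavelet-based discrimination, a single-level 2-D discrete wavelet transform splits an image into frequency bands that a discriminator can inspect separately. This sketch assumes the PyWavelets library and a Haar wavelet:

```python
# Sketch of a whole-image wavelet decomposition as discriminator input.
# Assumptions: PyWavelets, single-level Haar transform, grayscale stand-in.
import numpy as np
import pywt

image = np.random.rand(256, 256).astype(np.float32)  # stand-in grayscale image

# One DWT level yields a low-frequency approximation and three
# high-frequency detail bands (horizontal, vertical, diagonal).
approx, (horizontal, vertical, diagonal) = pywt.dwt2(image, "haar")

# Stacking the four bands gives a frequency-aware input for a discriminator.
bands = np.stack([approx, horizontal, vertical, diagonal], axis=0)
print(bands.shape)  # (4, 128, 128)
```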
arXiv Detail & Related papers (2023-05-16T17:48:44Z)
- Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators
We propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects.
Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts.
Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks.
arXiv Detail & Related papers (2022-12-13T01:36:56Z)
- Semantic-shape Adaptive Feature Modulation for Semantic Image Synthesis
A fine-grained, part-level semantic layout benefits the generation of object details.
A Shape-aware Position Descriptor (SPD) is proposed to describe each pixel's positional feature.
A Semantic-shape Adaptive Feature Modulation (SAFM) block is proposed to combine the given semantic map and our positional features.
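The exact form of the SPD is not given in this summary. As a hedged illustration, a per-pixel, shape-aware positional feature can be derived from a distance transform of an object's binary mask, so that each pixel encodes where it sits within the shape:

```python
# Illustrative shape-aware positional feature from a distance transform.
# This is an assumption for illustration, not the paper's exact SPD.
import numpy as np
from scipy.ndimage import distance_transform_edt

mask = np.zeros((64, 64), dtype=bool)
mask[20:44, 16:48] = True  # one object's binary mask

# Distance from each interior pixel to the object boundary encodes where
# the pixel sits inside the shape (center vs. edge), regardless of location.
inside = distance_transform_edt(mask)
descriptor = inside / max(inside.max(), 1e-6)  # normalize to [0, 1]
print(descriptor.shape, float(descriptor.max()))  # (64, 64) 1.0
```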
arXiv Detail & Related papers (2022-03-31T09:06:04Z)
- FlexIT: Towards Flexible Semantic Image Translation
We propose FlexIT, a novel method which can take any input image and a user-defined text instruction for editing.
First, FlexIT combines the input image and text into a single target point in the CLIP multimodal embedding space.
We iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms.
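A minimal sketch of the first step, assuming the OpenAI clip package, a placeholder input file, and an equal-weight mix of the two embeddings (the actual weighting is a design choice of the method):

```python
# Sketch of building a single CLIP-space target from an image and a text
# instruction. The 0.5/0.5 mix and the input filename are placeholders.
import torch
import clip
from PIL import Image

device = "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("input.jpg")).unsqueeze(0).to(device)  # placeholder file
text = clip.tokenize(["a photo of a red sports car"]).to(device)

with torch.no_grad():
    img_emb = model.encode_image(image)
    txt_emb = model.encode_text(text)
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)

# The target point mixes image content with the text edit direction; the
# edit then iteratively pushes the image's embedding toward this target.
target = 0.5 * img_emb + 0.5 * txt_emb
target = target / target.norm(dim=-1, keepdim=True)
print(target.shape)  # torch.Size([1, 512])
```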
arXiv Detail & Related papers (2022-03-09T13:34:38Z)
- Linear Semantics in Generative Adversarial Networks
We aim to better understand the semantic representation of GANs, and enable semantic control in GAN's generation process.
We find that a well-trained GAN encodes image semantics in its internal feature maps in a surprisingly simple way.
We propose two few-shot image editing approaches, namely Semantic-Conditional Sampling and Semantic Image Editing.
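The "surprisingly simple" encoding means a purely linear map suffices: a 1x1 convolution over the internal feature maps acts as a per-pixel linear classifier. The sketch below uses random stand-in features rather than a real GAN's activations:

```python
# Sketch of a linear semantic probe over GAN feature maps.
# The feature tensor is random stand-in data; in practice it would be
# taken from a pretrained GAN's internal layers.
import torch
import torch.nn as nn

num_classes, channels = 8, 512
features = torch.randn(4, channels, 32, 32)  # stand-in internal feature maps

linear_probe = nn.Conv2d(channels, num_classes, kernel_size=1)  # purely linear
logits = linear_probe(features)
segmentation = logits.argmax(dim=1)  # per-pixel class prediction
print(segmentation.shape)  # torch.Size([4, 32, 32])
```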
arXiv Detail & Related papers (2021-04-01T14:18:48Z)
- Text-to-Image Generation Grounded by Fine-Grained User Attention
Localized Narratives is a dataset with detailed natural language descriptions of images paired with mouse traces.
We propose TReCS, a sequential model that exploits this grounding to generate images.
arXiv Detail & Related papers (2020-11-07T13:23:31Z)
- Panoptic-based Image Synthesis
Conditional image synthesis serves various applications, from content editing to content generation.
We propose a panoptic-aware image synthesis network to generate high-fidelity, photorealistic images conditioned on panoptic maps.
arXiv Detail & Related papers (2020-04-21T20:40:53Z)
- SESAME: Semantic Editing of Scenes by Adding, Manipulating or Erasing Objects
SESAME is a novel generator-discriminator pair for Semantic Editing of Scenes by Adding, Manipulating or Erasing objects.
In our setup, the user provides the semantic labels of the areas to be edited and the generator synthesizes the corresponding pixels.
We evaluate our model on a diverse set of datasets and report state-of-the-art performance on two tasks.
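A minimal sketch of how such an editing input can be assembled, with illustrative tensor shapes and label ids; the actual SESAME conditioning may differ:

```python
# Sketch of a semantic-editing input: the user marks a region and supplies
# labels for it; the generator sees the image with that region blanked out.
# Shapes and the class id are illustrative assumptions.
import torch

image = torch.rand(1, 3, 128, 128)          # original RGB image
edit_mask = torch.zeros(1, 1, 128, 128)     # 1 where the user wants an edit
edit_mask[:, :, 40:90, 40:90] = 1.0
labels = torch.zeros(1, 1, 128, 128, dtype=torch.long)
labels[:, :, 40:90, 40:90] = 3              # user-chosen class for the region

masked_image = image * (1.0 - edit_mask)    # erase the region to be synthesized
generator_input = torch.cat(
    [masked_image, edit_mask, labels.float()], dim=1
)  # the generator fills in pixels only for the labeled region
print(generator_input.shape)  # torch.Size([1, 5, 128, 128])
```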
arXiv Detail & Related papers (2020-04-10T10:19:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.