SESAME: Semantic Editing of Scenes by Adding, Manipulating or Erasing
Objects
- URL: http://arxiv.org/abs/2004.04977v2
- Date: Thu, 8 Oct 2020 14:52:01 GMT
- Title: SESAME: Semantic Editing of Scenes by Adding, Manipulating or Erasing
Objects
- Authors: Evangelos Ntavelis, Andrés Romero, Iason Kastanis, Luc Van Gool and
Radu Timofte
- Abstract summary: SESAME is a novel generator-discriminator pair for Semantic Editing of Scenes by Adding, Manipulating or Erasing objects.
In our setup, the user provides the semantic labels of the areas to be edited and the generator synthesizes the corresponding pixels.
We evaluate our model on a diverse set of datasets and report state-of-the-art performance on two tasks.
- Score: 127.7627687126465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in image generation gave rise to powerful tools for semantic
image editing. However, existing approaches can either operate on a single
image or require an abundance of additional information. They are not capable
of handling the complete set of editing operations, that is, addition,
manipulation or removal of semantic concepts. To address these limitations, we
propose SESAME, a novel generator-discriminator pair for Semantic Editing of
Scenes by Adding, Manipulating or Erasing objects. In our setup, the user
provides the semantic labels of the areas to be edited and the generator
synthesizes the corresponding pixels. In contrast to previous methods that
employ a discriminator that trivially concatenates semantics and image as an
input, the SESAME discriminator is composed of two input streams that
independently process the image and its semantics, using the latter to
manipulate the results of the former. We evaluate our model on a diverse set of
datasets and report state-of-the-art performance on two tasks: (a) image
manipulation and (b) image generation conditioned on semantic labels.
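To make the two-stream design concrete, below is a minimal PyTorch sketch of such a discriminator. This is not the authors' implementation: the class name, layer widths, and the FiLM-style scale-and-shift used to let the semantic stream manipulate the image stream are illustrative assumptions.

import torch
import torch.nn as nn

class TwoStreamDiscriminator(nn.Module):
    # Minimal sketch of a SESAME-style discriminator. The image and the
    # semantic label map are processed by independent streams; the semantic
    # features then manipulate the image features via a predicted per-pixel
    # scale and shift (a FiLM-style modulation, assumed here for
    # illustration) before a patch-level real/fake head.

    def __init__(self, img_channels=3, sem_channels=35, width=64):
        super().__init__()
        def stream(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, width, 4, stride=2, padding=1),
                nn.LeakyReLU(0.2),
                nn.Conv2d(width, width * 2, 4, stride=2, padding=1),
                nn.LeakyReLU(0.2),
            )
        self.img_stream = stream(img_channels)  # processes the image alone
        self.sem_stream = stream(sem_channels)  # processes the semantics alone
        self.to_gamma = nn.Conv2d(width * 2, width * 2, 3, padding=1)
        self.to_beta = nn.Conv2d(width * 2, width * 2, 3, padding=1)
        self.head = nn.Conv2d(width * 2, 1, 4, padding=1)  # patch scores

    def forward(self, image, semantics):
        f_img = self.img_stream(image)
        f_sem = self.sem_stream(semantics)
        # The semantic features manipulate the image features.
        f = f_img * (1 + self.to_gamma(f_sem)) + self.to_beta(f_sem)
        return self.head(f)

# Usage: score a 256x256 image against a 35-class one-hot label map.
disc = TwoStreamDiscriminator()
img = torch.randn(1, 3, 256, 256)
sem = torch.randn(1, 35, 256, 256)
print(disc(img, sem).shape)  # a spatial map of patch-level scores

The property this sketch preserves is that the semantics never enter the image stream directly; they influence the decision only through the learned modulation, which is the distinction the abstract draws against concatenation-based discriminators.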
Related papers
- Towards Image Semantics and Syntax Sequence Learning [8.033697392628424]
We introduce the concept of "image grammar", consisting of "image semantics" and "image syntax".
We propose a weakly supervised two-stage approach to learn the image grammar relative to a class of visual objects/scenes.
Our framework is trained to reason over patch semantics and detect faulty syntax.
arXiv Detail & Related papers (2024-01-31T00:16:02Z) - iEdit: Localised Text-guided Image Editing with Weak Supervision [53.082196061014734]
We propose a novel learning method for text-guided image editing.
It generates images conditioned on a source image and a textual edit prompt.
It shows favourable results against its counterparts in terms of image fidelity and CLIP alignment score, and qualitatively for editing both generated and real images.
arXiv Detail & Related papers (2023-05-10T07:39:14Z) - Entity-Level Text-Guided Image Manipulation [70.81648416508867]
We study a novel task of text-guided image manipulation at the entity level in the real world (eL-TGIM).
We propose an elegant framework, dubbed SeMani, for the Semantic Manipulation of real-world images.
In the semantic alignment phase, SeMani incorporates a semantic alignment module to locate the entity-relevant region to be manipulated.
In the image manipulation phase, SeMani adopts a generative model to synthesize new images conditioned on the entity-irrelevant regions and target text descriptions.
arXiv Detail & Related papers (2023-02-22T13:56:23Z) - Controllable Image Synthesis via SegVAE [89.04391680233493]
A semantic map is a commonly used intermediate representation for conditional image generation.
In this work, we specifically target generating semantic maps given a label-set consisting of desired categories.
The proposed framework, SegVAE, synthesizes semantic maps in an iterative manner using a conditional variational autoencoder; a toy sketch of this iterative decoding appears after the list below.
arXiv Detail & Related papers (2020-07-16T15:18:53Z) - Semantic Photo Manipulation with a Generative Image Prior [86.01714863596347]
GANs are able to synthesize images conditioned on inputs such as user sketch, text, or semantic labels.
It is hard for GANs to precisely reproduce an input image.
In this paper, we address these issues by adapting the image prior learned by GANs to image statistics of an individual image.
Our method can accurately reconstruct the input image and synthesize new content, consistent with the appearance of the input image.
arXiv Detail & Related papers (2020-05-15T18:22:05Z) - Semantic Image Manipulation Using Scene Graphs [105.03614132953285]
We introduce a spatio-semantic scene graph network that does not require direct supervision for constellation changes or image edits.
This makes it possible to train the system from existing real-world datasets with no additional annotation effort.
arXiv Detail & Related papers (2020-04-07T20:02:49Z)