Semantic Image Manipulation Using Scene Graphs
- URL: http://arxiv.org/abs/2004.03677v1
- Date: Tue, 7 Apr 2020 20:02:49 GMT
- Title: Semantic Image Manipulation Using Scene Graphs
- Authors: Helisa Dhamo, Azade Farshad, Iro Laina, Nassir Navab, Gregory D.
Hager, Federico Tombari, Christian Rupprecht
- Abstract summary: We introduce a spatio-semantic scene graph network that does not require direct supervision for constellation changes or image edits.
This makes it possible to train the system from existing real-world datasets with no additional annotation effort.
- Score: 105.03614132953285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image manipulation can be considered a special case of image generation where
the image to be produced is a modification of an existing image. Image
generation and manipulation have been, for the most part, tasks that operate on
raw pixels. However, the remarkable progress in learning rich image and object
representations has opened the way for tasks such as text-to-image or
layout-to-image generation that are mainly driven by semantics. In our work, we
address the novel problem of image manipulation from scene graphs, in which a
user can edit images by merely applying changes in the nodes or edges of a
semantic graph that is generated from the image. Our goal is to encode image
information in a given constellation and from there on generate new
constellations, such as replacing objects or even changing relationships
between objects, while respecting the semantics and style from the original
image. We introduce a spatio-semantic scene graph network that does not require
direct supervision for constellation changes or image edits. This makes it
possible to train the system from existing real-world datasets with no
additional annotation effort.
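To make the editing interface concrete: the user-facing operations reduce to relabeling nodes (object replacement) and relabeling edges (relationship changes) of a graph extracted from the image. A minimal sketch of such a structure, with hypothetical names rather than the authors' code:

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)  # object id -> category, e.g. {0: "boy"}
    edges: dict = field(default_factory=dict)  # (subject, object) -> predicate

    def replace_object(self, node_id, new_label):
        """Object replacement: swap a node's category, keeping its relationships."""
        self.nodes[node_id] = new_label

    def change_relationship(self, subj, obj, new_predicate):
        """Relationship change: re-label the edge between two existing objects."""
        self.edges[(subj, obj)] = new_predicate

# A graph extracted from an image is edited; the generator then re-synthesizes
# the image from the new constellation while keeping the original style.
g = SceneGraph(nodes={0: "boy", 1: "horse"}, edges={(0, 1): "beside"})
g.change_relationship(0, 1, "riding")   # "boy beside horse" -> "boy riding horse"
g.replace_object(1, "elephant")         # swap the object, relationships unchanged
```

In the paper's setting, the modified graph, together with features that preserve the original image's style, conditions the generator that re-synthesizes the image.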
Related papers
- CIMGEN: Controlled Image Manipulation by Finetuning Pretrained
Generative Models on Limited Data [14.469539513542584]
A semantic map contains information about the objects present in the image.
One can easily modify the map to selectively insert, remove, or replace objects.
The method proposed in this paper takes the modified semantic map and alters the original image in accordance with the modified map.
arXiv Detail & Related papers (2024-01-23T06:30:47Z)
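The map-editing step that CIMGEN builds on can be illustrated with a plain label map, where removing or replacing an object is a per-pixel relabeling. A minimal sketch with numpy; the class ids and helper names are assumptions, not the paper's code:

```python
import numpy as np

SKY, ROAD, CAR = 0, 1, 2          # hypothetical class ids (one integer per pixel)

sem_map = np.array([[SKY,  SKY,  SKY,  SKY ],
                    [ROAD, CAR,  CAR,  ROAD],
                    [ROAD, ROAD, ROAD, ROAD]])

def replace_class(m, old_id, new_id):
    """Editing the map is per-pixel relabeling: car -> road removes the car."""
    out = m.copy()
    out[out == old_id] = new_id
    return out

edited = replace_class(sem_map, CAR, ROAD)   # the car is gone from the map
# The edited map is fed to the finetuned generative model, which alters
# the original image in accordance with the modified map.
```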
- iEdit: Localised Text-guided Image Editing with Weak Supervision [53.082196061014734]
We propose a novel learning method for text-guided image editing.
It generates images conditioned on a source image and a textual edit prompt.
It shows favourable results against its counterparts in terms of image fidelity and CLIP alignment score, and qualitatively for editing both generated and real images.
arXiv Detail & Related papers (2023-05-10T07:39:14Z)
- Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training [112.94542676251133]
We propose to learn scene graph embeddings by directly optimizing their alignment with images.
Specifically, we pre-train an encoder to extract both global and local information from scene graphs.
The resulting method, called SGDiff, allows for the semantic manipulation of generated images by modifying scene graph nodes and connections.
arXiv Detail & Related papers (2022-11-21T01:11:19Z)
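The alignment objective described for SGDiff is in the spirit of CLIP-style contrastive pre-training: graph and image embeddings share a space, and matched pairs are pulled together. A hedged sketch of such a loss, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def graph_image_alignment_loss(graph_emb, image_emb, temperature=0.07):
    """InfoNCE-style loss: the i-th scene graph should match the i-th image
    and score higher than every mismatched pairing in the batch."""
    g = F.normalize(graph_emb, dim=-1)        # (batch, dim) from a graph encoder
    v = F.normalize(image_emb, dim=-1)        # (batch, dim) from an image encoder
    logits = g @ v.t() / temperature          # pairwise similarities
    targets = torch.arange(g.size(0))         # diagonal entries are the true pairs
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy usage with random embeddings standing in for real encoders.
loss = graph_image_alignment_loss(torch.randn(8, 256), torch.randn(8, 256))
```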
- Transforming Image Generation from Scene Graphs [11.443097632746763]
We propose a transformer-based approach conditioned on scene graphs that employs a decoder to autoregressively compose images.
The proposed architecture is composed of three modules: 1) a graph convolutional network that encodes the relationships of the input graph; 2) an encoder-decoder transformer that autoregressively composes the output image; 3) an auto-encoder that generates the representations used as input and output of each transformer generation step.
arXiv Detail & Related papers (2022-07-01T16:59:38Z)
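The three-module design reads as a pipeline: a graph network encodes the scene graph, a transformer autoregressively predicts discrete image codes, and an auto-encoder maps those codes to and from pixels. A schematic sketch under those assumptions; every module here is a simplified stand-in with illustrative sizes:

```python
import torch
import torch.nn as nn

class SceneGraphToImage(nn.Module):
    """Schematic three-module pipeline; all sizes are illustrative."""
    def __init__(self, num_labels=100, dim=256, vocab=512, seq_len=64):
        super().__init__()
        # 1) graph encoder (a linear layer stands in for a graph conv network)
        self.node_emb = nn.Embedding(num_labels, dim)
        self.graph_enc = nn.Linear(dim, dim)
        # 2) encoder-decoder transformer that composes the image autoregressively
        self.transformer = nn.Transformer(d_model=dim, batch_first=True)
        # 3) discrete codes that an auto-encoder would decode to pixels
        self.token_emb = nn.Embedding(vocab, dim)
        self.to_token = nn.Linear(dim, vocab)
        self.seq_len = seq_len

    @torch.no_grad()
    def forward(self, node_labels):
        ctx = self.graph_enc(self.node_emb(node_labels))        # encode graph nodes
        tokens = torch.zeros(node_labels.size(0), 1, dtype=torch.long)
        for _ in range(self.seq_len):                           # one code per step
            h = self.transformer(ctx, self.token_emb(tokens))
            nxt = self.to_token(h[:, -1:]).argmax(-1)           # greedy next code
            tokens = torch.cat([tokens, nxt], dim=1)
        return tokens[:, 1:]                                    # codes for the auto-encoder

codes = SceneGraphToImage()(torch.randint(0, 100, (1, 5)))      # 5 graph nodes in, 64 codes out
```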
- SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches [95.45728042499836]
We propose a new paradigm of sketch-based image manipulation: mask-free local image manipulation.
Our model automatically predicts the target modification region and encodes it into a structure style vector.
A generator then synthesizes the new image content based on the style vector and sketch.
arXiv Detail & Related papers (2021-11-30T02:42:31Z)
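SketchEdit's mask-free flow can be summarized as: predict where to edit from the image and sketch, encode that region into a style vector, synthesize, and blend. A schematic sketch; the module shapes and names are our assumptions:

```python
import torch
import torch.nn as nn

class MaskFreeEditor(nn.Module):
    """Schematic of the flow: predict the edit region, encode it into a style
    vector, synthesize, blend. All shapes and layers are illustrative."""
    def __init__(self, dim=64):
        super().__init__()
        self.region_net = nn.Conv2d(4, 1, 3, padding=1)    # image+sketch -> edit region
        self.style_enc = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                       nn.Linear(3, dim))  # region -> structure style vector
        self.generator = nn.Conv2d(4 + dim, 3, 3, padding=1)  # synthesizes new content

    def forward(self, image, sketch):
        x = torch.cat([image, sketch], dim=1)              # (B, 3+1, H, W)
        mask = torch.sigmoid(self.region_net(x))           # soft target-modification region
        style = self.style_enc(image * mask)               # encode the predicted region
        style_map = style[:, :, None, None].expand(-1, -1, image.size(2), image.size(3))
        edited = self.generator(torch.cat([x, style_map], dim=1))
        return mask * edited + (1 - mask) * image          # edit only inside the region

out = MaskFreeEditor()(torch.randn(1, 3, 32, 32), torch.randn(1, 1, 32, 32))
```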
- Unsupervised Image Transformation Learning via Generative Adversarial Networks [40.84518581293321]
We study the image transformation problem by learning the underlying transformations from a collection of images using Generative Adversarial Networks (GANs).
We propose an unsupervised learning framework, termed as TrGAN, to project images onto a transformation space that is shared by the generator and the discriminator.
arXiv Detail & Related papers (2021-03-13T17:08:19Z)
- Text as Neural Operator: Image Manipulation by Text Instruction [68.53181621741632]
In this paper, we study a setting that allows users to edit an image with multiple objects using complex text instructions to add, remove, or change the objects.
The inputs of the task are multimodal, including (1) a reference image and (2) a natural-language instruction that describes the desired modifications to the image.
We show that the proposed model performs favorably against recent strong baselines on three public datasets.
arXiv Detail & Related papers (2020-08-11T07:07:10Z)
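The title's framing suggests the instruction is encoded and then applied as a transformation of image features. A schematic sketch of that reading; the feature-wise modulation here is our assumption, not necessarily the paper's exact mechanism:

```python
import torch
import torch.nn as nn

class TextOperator(nn.Module):
    """Schematic 'text as operator': the instruction predicts a feature-wise
    transformation applied to image features. All sizes are illustrative."""
    def __init__(self, dim=128, vocab=1000):
        super().__init__()
        self.img_enc = nn.Linear(dim, dim)          # stand-in image encoder
        self.txt_enc = nn.EmbeddingBag(vocab, dim)  # stand-in instruction encoder
        self.to_scale = nn.Linear(dim, dim)         # text -> multiplicative modulation
        self.to_shift = nn.Linear(dim, dim)         # text -> additive modulation
        self.decoder = nn.Linear(dim, dim)          # stand-in image decoder

    def forward(self, img_feat, instruction_tokens):
        t = self.txt_enc(instruction_tokens)            # e.g. tokens of "remove the red car"
        h = self.img_enc(img_feat)
        h = self.to_scale(t) * h + self.to_shift(t)     # the text acts as an operator
        return self.decoder(h)

out = TextOperator()(torch.randn(2, 128), torch.randint(0, 1000, (2, 7)))
```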
- SESAME: Semantic Editing of Scenes by Adding, Manipulating or Erasing Objects [127.7627687126465]
SESAME is a novel generator-discriminator pair for Semantic Editing of Scenes by Adding, Manipulating or Erasing objects.
In our setup, the user provides the semantic labels of the areas to be edited and the generator synthesizes the corresponding pixels.
We evaluate our model on a diverse set of datasets and report state-of-the-art performance on two tasks.
arXiv Detail & Related papers (2020-04-10T10:19:19Z)
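SESAME's conditioning can be pictured as: blank out the pixels the user wants changed and supply the desired semantic labels for exactly that region. A minimal sketch of assembling such a generator input; the layout and channel order are our assumptions:

```python
import torch
import torch.nn.functional as F

def sesame_style_input(image, edit_mask, edit_labels, num_classes=10):
    """Assemble a conditioning input: blank the pixels to be edited and attach
    one-hot semantic labels for that region, so the generator only has to
    synthesize the labeled area."""
    onehot = F.one_hot(edit_labels, num_classes).permute(2, 0, 1).float()
    onehot = onehot * edit_mask                 # labels are given only inside the region
    blanked = image * (1 - edit_mask)           # erase the area the user wants changed
    return torch.cat([blanked, edit_mask, onehot], dim=0)

H = W = 16
mask = torch.zeros(1, H, W)
mask[:, 4:12, 4:12] = 1.0                                  # region the user marked for editing
labels = torch.full((H, W), 5, dtype=torch.long)           # desired class id for that region
cond = sesame_style_input(torch.rand(3, H, W), mask, labels)   # (3+1+10, H, W) generator input
```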
- Local Facial Attribute Transfer through Inpainting [3.4376560669160394]
The term attribute transfer refers to the task of altering images in such a way that the semantic interpretation of a given input image is shifted towards an intended direction.
Recent advances in attribute transfer are mostly based on generative deep neural networks, using various techniques to manipulate images in the latent space of the generator.
We present a novel method for the common sub-task of local attribute transfers, where only parts of a face have to be altered in order to achieve semantic changes.
arXiv Detail & Related papers (2020-02-07T22:57:01Z)
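Framing local attribute transfer as inpainting suggests a simple recipe: erase only the facial region that carries the attribute and let a conditional generator fill it in. A minimal sketch of that reading, with a stub in place of a real inpainting network:

```python
import torch

def attribute_transfer_by_inpainting(face, region_mask, attribute_code, inpaint_net):
    """Local attribute transfer as inpainting (our reading of the setup): erase
    only the region that carries the attribute, then synthesize it back
    conditioned on the desired attribute code."""
    erased = face * (1 - region_mask)                        # e.g. blank the mouth region
    return inpaint_net(erased, region_mask, attribute_code)  # fill in e.g. "smiling"

# 'inpaint_net' stands for any conditional inpainting generator; a toy stub:
stub = lambda erased, mask, attr: erased + mask * attr.view(1, -1, 1, 1).mean(1, keepdim=True)
face = torch.rand(1, 3, 64, 64)
mask = torch.zeros(1, 1, 64, 64); mask[..., 40:56, 20:44] = 1.0   # hypothetical mouth region
out = attribute_transfer_by_inpainting(face, mask, torch.randn(1, 8), stub)
```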
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.