Diverse Semantic Image Editing with Style Codes
- URL: http://arxiv.org/abs/2309.13975v1
- Date: Mon, 25 Sep 2023 09:22:18 GMT
- Title: Diverse Semantic Image Editing with Style Codes
- Authors: Hakan Sivuk, Aysegul Dundar
- Abstract summary: We propose a framework that can encode visible and partially visible objects with a novel mechanism to achieve consistency in the style encoding and final generations.
Our method achieves better quantitative results and also provides diverse results.
- Score: 6.7737387715834725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic image editing requires inpainting pixels following a semantic map.
It is a challenging task since this inpainting requires both harmony with the
context and strict compliance with the semantic maps. The majority of the previous methods proposed for this task try to encode all of the information from the erased image. However, when an object such as a car is added to a scene, its style cannot be encoded from the context alone. On the other hand, models that can output diverse generations struggle to produce images with seamless boundaries between the generated and unerased parts. Additionally, previous methods have no mechanism to encode the styles of visible and partially visible objects differently for better performance. In this work, we propose a framework that encodes visible and partially visible objects with a novel mechanism to achieve consistency between the style encoding and the final generations. We extensively compare our approach with previous conditional image generation and semantic image editing algorithms. Our experiments show that our method significantly improves over the state of the art, not only achieving better quantitative results but also providing diverse results. Please
refer to the project web page for the released code and demo:
https://github.com/hakansivuk/DivSem.
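For a concrete picture of what conditioning generation on per-region style codes can look like, the sketch below performs masked average pooling of encoder features over each semantic class, using only visible (unerased) pixels, and falls back to a random code for fully erased classes so that completions can vary. This is an illustrative PyTorch sketch under our own assumptions (the function name, tensor shapes, and fallback strategy are not taken from the paper); the released code at the link above is the authoritative implementation.

```python
# Illustrative sketch only: per-region style codes via masked average pooling.
# Names, shapes, and the random-fallback strategy are assumptions, not the
# paper's released implementation.
import torch


def region_style_codes(features, semantic_map, erased_mask, num_classes):
    """features:     (B, C, H, W) encoder features of the partially erased image
    semantic_map: (B, H, W) integer class labels of the target layout
    erased_mask:  (B, H, W) 1 where pixels were erased, 0 where visible
    (semantic_map and erased_mask are assumed resized to the feature resolution)
    Returns one style code per semantic class: (B, num_classes, C)."""
    b, c, h, w = features.shape
    codes = torch.zeros(b, num_classes, c, device=features.device)
    visible = (erased_mask == 0).float()
    for cls in range(num_classes):
        # Pool features only over pixels that belong to the class AND are visible.
        region = (semantic_map == cls).float() * visible       # (B, H, W)
        weight = region.unsqueeze(1)                            # (B, 1, H, W)
        denom = weight.sum(dim=(2, 3)).clamp(min=1.0)           # (B, 1)
        pooled = (features * weight).sum(dim=(2, 3)) / denom    # (B, C)
        # If a class has no visible pixels, fall back to a random code so the
        # generator can still produce diverse completions for that region.
        empty = (weight.sum(dim=(2, 3)) == 0).float()           # (B, 1)
        pooled = pooled * (1 - empty) + torch.randn_like(pooled) * empty
        codes[:, cls] = pooled
    return codes
```

A generator could then modulate its normalization layers with these per-class codes, in the spirit of region-adaptive normalization; how visible and partially visible objects are distinguished during encoding is the paper's contribution and is not captured by this sketch.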
Related papers
- LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation [121.45667242282721]
We propose a coarse-to-fine paradigm to achieve layout planning and image generation.
Our proposed method outperforms state-of-the-art models in terms of photorealistic layout and image generation.
arXiv Detail & Related papers (2023-08-09T17:45:04Z)
- Imagic: Text-Based Real Image Editing with Diffusion Models [19.05825157237432]
We demonstrate the ability to apply complex (e.g., non-rigid) text-guided semantic edits to a single real image.
Our proposed method requires only a single input image and a target text.
It operates on real images and does not require any additional inputs.
arXiv Detail & Related papers (2022-10-17T17:27:32Z)
- Boosting Image Outpainting with Semantic Layout Prediction [18.819765707811904]
We train a GAN to extend regions in the semantic segmentation domain instead of the image domain.
Another GAN model is trained to synthesize real images based on the extended semantic layouts.
Our approach can handle semantic clues more easily and hence works better in complex scenarios.
arXiv Detail & Related papers (2021-10-18T13:09:31Z)
- Context-Aware Image Inpainting with Learned Semantic Priors [100.99543516733341]
We introduce pretext tasks that are semantically meaningful for estimating the missing contents.
We propose a context-aware image inpainting model, which adaptively integrates global semantics and local features.
arXiv Detail & Related papers (2021-06-14T08:09:43Z)
- In&Out : Diverse Image Outpainting via GAN Inversion [89.84841983778672]
Image outpainting seeks a semantically consistent extension of the input image beyond its available content.
In this work, we formulate the problem from the perspective of inverting generative adversarial networks.
Our generator renders micro-patches conditioned on their joint latent code as well as their individual positions in the image.
arXiv Detail & Related papers (2021-04-01T17:59:10Z)
- Semantic-Guided Inpainting Network for Complex Urban Scenes Manipulation [19.657440527538547]
In this work, we propose a novel deep learning model to alter a complex urban scene by removing a user-specified portion of the image.
Inspired by recent works on image inpainting, our proposed method leverages semantic segmentation to model the content and structure of the image.
To generate reliable results, we design a new decoder block that combines the semantic segmentation and generation task.
arXiv Detail & Related papers (2020-10-19T09:17:17Z)
- Controllable Image Synthesis via SegVAE [89.04391680233493]
A semantic map is a commonly used intermediate representation for conditional image generation.
In this work, we specifically target generating semantic maps given a label set consisting of desired categories.
The proposed framework, SegVAE, synthesizes semantic maps in an iterative manner using a conditional variational autoencoder.
arXiv Detail & Related papers (2020-07-16T15:18:53Z)
- Swapping Autoencoder for Deep Image Manipulation [94.33114146172606]
We propose the Swapping Autoencoder, a deep model designed specifically for image manipulation.
The key idea is to encode an image with two independent components and enforce that any swapped combination maps to a realistic image.
Experiments on multiple datasets show that our model produces better results and is substantially more efficient than recent generative models; a toy sketch of the code-swapping idea appears after this list.
arXiv Detail & Related papers (2020-07-01T17:59:57Z)
- Semantic Image Manipulation Using Scene Graphs [105.03614132953285]
We introduce a semantic scene graph network that does not require direct supervision for constellation changes or image edits.
This makes it possible to train the system on existing real-world datasets with no additional annotation effort.
arXiv Detail & Related papers (2020-04-07T20:02:49Z)
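As a companion to the Swapping Autoencoder entry above, the toy sketch below illustrates the code-swapping idea: each image is encoded into a structure code and a texture code, and the generator is asked to produce realistic outputs from both the original and the swapped combinations. The module internals and losses are placeholders of our own, not the authors' code.

```python
# Toy sketch of code swapping, in the spirit of the Swapping Autoencoder.
# Encoders, generator, and losses are placeholders, not the paper's code.
import torch.nn as nn


class SwappingAutoencoderToy(nn.Module):
    def __init__(self, structure_enc: nn.Module, texture_enc: nn.Module,
                 generator: nn.Module):
        super().__init__()
        self.structure_enc = structure_enc   # image -> spatial structure code
        self.texture_enc = texture_enc       # image -> global texture code
        self.generator = generator           # (structure, texture) -> image

    def forward(self, img_a, img_b):
        s_a, t_a = self.structure_enc(img_a), self.texture_enc(img_a)
        t_b = self.texture_enc(img_b)
        recon_a = self.generator(s_a, t_a)   # reconstruction of image A
        hybrid = self.generator(s_a, t_b)    # structure of A with texture of B
        # A GAN loss would push both outputs toward realism during training.
        return recon_a, hybrid
```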
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.