EditGAN: High-Precision Semantic Image Editing
- URL: http://arxiv.org/abs/2111.03186v1
- Date: Thu, 4 Nov 2021 22:36:33 GMT
- Title: EditGAN: High-Precision Semantic Image Editing
- Authors: Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio
Torralba, Sanja Fidler
- Abstract summary: EditGAN is a novel method for high quality, high precision semantic image editing.
We show that EditGAN can manipulate images with an unprecedented level of detail and freedom.
We can also easily combine multiple edits and perform plausible edits beyond EditGAN training data.
- Score: 120.49401527771067
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Generative adversarial networks (GANs) have recently found applications in
image editing. However, most GAN-based image editing methods often require
large-scale datasets with semantic segmentation annotations for training, only
provide high-level control, or merely interpolate between different images.
Here, we propose EditGAN, a novel method for high quality, high precision
semantic image editing, allowing users to edit images by modifying their highly
detailed part segmentation masks, e.g., drawing a new mask for the headlight of
a car. EditGAN builds on a GAN framework that jointly models images and their
semantic segmentations, requiring only a handful of labeled examples, making it
a scalable tool for editing. Specifically, we embed an image into the GAN
latent space and perform conditional latent code optimization according to the
segmentation edit, which effectively also modifies the image. To amortize
optimization, we find editing vectors in latent space that realize the edits.
The framework allows us to learn an arbitrary number of editing vectors, which
can then be directly applied on other images at interactive rates. We
experimentally show that EditGAN can manipulate images with an unprecedented
level of detail and freedom, while preserving full image quality. We can also
easily combine multiple edits and perform plausible edits beyond EditGAN
training data. We demonstrate EditGAN on a wide variety of image types and
quantitatively outperform several previous editing methods on standard editing
benchmark tasks.
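To make the latent-optimization idea described above concrete, below is a minimal sketch of segmentation-conditioned latent editing. It is not the authors' released code: the generator G (assumed to jointly return an RGB image and per-pixel segmentation logits from a latent code), the inversion function invert, and the loss weights are all illustrative assumptions.

```python
# Minimal sketch of EditGAN-style segmentation-conditioned latent optimization.
# Assumptions (hypothetical, not the paper's released API): `G(w)` returns an
# (image, segmentation-logits) pair for latent code w, and `invert(image)` embeds
# a real image into that latent space. Loss weights are illustrative placeholders.
import torch
import torch.nn.functional as F

def find_editing_vector(G, invert, image, edited_mask,
                        steps=200, lr=0.05, w_seg=1.0, w_rgb=10.0):
    """Optimize a latent offset (an 'editing vector') that realizes a mask edit."""
    w0 = invert(image)                                  # embed the image in latent space
    delta = torch.zeros_like(w0, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)

    with torch.no_grad():
        _, seg0 = G(w0)
        # Pixels whose label the user changed in the edited part-segmentation mask
        edited_region = (seg0.argmax(1) != edited_mask).float()

    for _ in range(steps):
        img, seg = G(w0 + delta)                        # joint image + segmentation output
        seg_loss = F.cross_entropy(seg, edited_mask)    # match the edited part mask
        # Keep appearance fixed outside the edited region
        rgb_loss = ((img - image).abs() * (1.0 - edited_region.unsqueeze(1))).mean()
        loss = w_seg * seg_loss + w_rgb * rgb_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    return delta.detach()
```

The returned offset plays the role of an editing vector as described in the abstract: once found, it can be added to the latent codes of other images to apply the same edit at interactive rates, without re-running the optimization.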
Related papers
- DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image
Editing [66.43179841884098]
Large-scale Text-to-Image (T2I) diffusion models have revolutionized image generation over the last few years.
We propose DiffEditor to rectify two weaknesses in existing diffusion-based image editing.
Our method can efficiently achieve state-of-the-art performance on various fine-grained image editing tasks.
arXiv Detail & Related papers (2024-02-04T18:50:29Z)
- Edit One for All: Interactive Batch Image Editing [44.50631647670942]
This paper presents a novel method for interactive batch image editing using StyleGAN as the medium.
Given an edit specified by users in an example image (e.g., make the face frontal), our method can automatically transfer that edit to other test images.
Experiments demonstrate that edits performed using our method have similar visual quality to existing single-image-editing methods.
arXiv Detail & Related papers (2024-01-18T18:58:44Z)
- Optimisation-Based Multi-Modal Semantic Image Editing [58.496064583110694]
We propose an inference-time editing optimisation to accommodate multiple editing instruction types.
By allowing the influence of each loss function to be adjusted, we build a flexible editing solution that can be tailored to user preferences.
We evaluate our method using text, pose and scribble edit conditions, and highlight our ability to achieve complex edits.
arXiv Detail & Related papers (2023-11-28T15:31:11Z)
- Emu Edit: Precise Image Editing via Recognition and Generation Tasks [62.95717180730946]
We present Emu Edit, a multi-task image editing model which sets state-of-the-art results in instruction-based image editing.
We train it to multi-task across an unprecedented range of tasks, such as region-based editing, free-form editing, and Computer Vision tasks.
We show that Emu Edit can generalize to new tasks, such as image inpainting, super-resolution, and compositions of editing tasks, with just a few labeled examples.
arXiv Detail & Related papers (2023-11-16T18:55:58Z)
- Object-aware Inversion and Reassembly for Image Editing [61.19822563737121]
We propose Object-aware Inversion and Reassembly (OIR) to enable object-level fine-grained editing.
We use our search metric to find the optimal inversion step for each editing pair when editing an image.
Our method achieves superior performance in editing object shapes, colors, materials, categories, etc., especially in multi-object editing scenarios.
arXiv Detail & Related papers (2023-10-18T17:59:02Z)
- LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance [0.0]
LEDITS is a lightweight approach for real-image editing that combines the Edit Friendly DDPM inversion technique with Semantic Guidance.
This approach achieves versatile edits, ranging from subtle to extensive, as well as alterations in composition and style, while requiring no optimization or extensions to the architecture.
arXiv Detail & Related papers (2023-07-02T09:11:09Z)
- SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing [94.31103255204933]
We propose a unified model for open-domain image editing, focusing on color and tone adjustment.
Our model learns a unified editing space that is more semantic, intuitive, and easy to manipulate.
We show that by inverting image pairs into latent codes of the learned editing space, our model can be leveraged for various downstream editing tasks.
arXiv Detail & Related papers (2021-11-30T23:53:32Z)