SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing
- URL: http://arxiv.org/abs/2112.00180v1
- Date: Tue, 30 Nov 2021 23:53:32 GMT
- Title: SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing
- Authors: Jing Shi, Ning Xu, Haitian Zheng, Alex Smith, Jiebo Luo, Chenliang Xu
- Abstract summary: We propose a unified model for open-domain image editing, focusing on color and tone adjustment while preserving the original content and structure.
Our model learns a unified editing space that is more semantic, intuitive, and easy to manipulate.
We show that by inverting image pairs into latent codes of the learned editing space, our model can be leveraged for various downstream editing tasks.
- Score: 94.31103255204933
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recently, large pretrained models (e.g., BERT, StyleGAN, CLIP) have shown
great knowledge transfer and generalization capability on various downstream
tasks within their domains. Inspired by these efforts, in this paper we propose
a unified model for open-domain image editing focusing on color and tone
adjustment of open-domain images while keeping their original content and
structure. Our model learns a unified editing space that is more semantic,
intuitive, and easy to manipulate than the operation space (e.g., contrast,
brightness, color curve) used in many existing photo editing software. Our
model belongs to the image-to-image translation framework, which consists of an
image encoder and decoder, and is trained on pairs of before- and after-images
to produce multimodal outputs. We show that by inverting image pairs into
latent codes of the learned editing space, our model can be leveraged for
various downstream editing tasks such as language-guided image editing,
personalized editing, editing-style clustering, retrieval, etc. We extensively
study the unique properties of the editing space in experiments and demonstrate
superior performance on the aforementioned tasks.
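To make the abstract's architecture concrete, here is a minimal PyTorch sketch of the idea; the class names, the toy encoder, and the global gain/bias decoder are our assumptions, not the paper's actual networks. What it keeps from the abstract is the structure: an encoder inverts a before/after image pair into a latent editing code, and a decoder re-applies that code to an image while touching only color and tone.

```python
import torch
import torch.nn as nn

class EditEncoder(nn.Module):
    """Inverts a (before, after) image pair into a compact editing code z.
    The tiny conv stack here is illustrative, not the paper's design."""
    def __init__(self, z_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, z_dim),
        )

    def forward(self, before, after):
        return self.net(torch.cat([before, after], dim=1))

class EditDecoder(nn.Module):
    """Applies an editing code z to an image. Predicting only per-channel
    gains and biases restricts the edit to color/tone, so content and
    structure are preserved by construction (our simplification)."""
    def __init__(self, z_dim=64):
        super().__init__()
        self.to_params = nn.Linear(z_dim, 6)  # 3 RGB gains + 3 RGB biases

    def forward(self, image, z):
        gain, bias = self.to_params(z).chunk(2, dim=1)
        return image * (1 + gain[..., None, None]) + bias[..., None, None]

# One training step on a before/after pair: invert the pair into z, then
# reconstruct the after-image from the before-image and z. Sampling z from
# a prior instead would give the multimodal outputs the abstract mentions.
enc, dec = EditEncoder(), EditDecoder()
before, after = torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64)
z = enc(before, after)                       # latent editing code
loss = nn.functional.l1_loss(dec(before, z), after)
loss.backward()

# Downstream reuse: transfer an inverted edit to a new photo (personalized
# editing), or cluster/retrieve the codes by editing style.
edited = dec(torch.rand(1, 3, 64, 64), z[:1].detach())
```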
Related papers
- SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing [42.23117201457898]
We introduce a new framework that integrates a large language model (LLM) with a Text2Image generative model for scene graph-based image editing.
Our framework significantly outperforms existing image editing methods in terms of editing precision and scene aesthetics.
arXiv Detail & Related papers (2024-10-15T17:40:48Z) - UltraEdit: Instruction-based Fine-Grained Image Editing at Scale [43.222251591410455]
This paper presents UltraEdit, a large-scale (approximately 4 million editing samples) automatically generated dataset for instruction-based image editing.
Our key idea is to address the drawbacks in existing image editing datasets like InstructPix2Pix and MagicBrush, and provide a systematic approach to producing massive and high-quality image editing samples.
arXiv Detail & Related papers (2024-07-07T06:50:22Z) - Zero-shot Image Editing with Reference Imitation [50.75310094611476]
We present a new form of editing, termed imitative editing, to help users exercise their creativity more conveniently.
We propose a generative training framework, dubbed MimicBrush, which randomly selects two frames from a video clip, masks some regions of one frame, and learns to recover the masked regions using information from the other frame (see the sketch after this list).
We experimentally show the effectiveness of our method under various test cases as well as its superiority over existing alternatives.
arXiv Detail & Related papers (2024-06-11T17:59:51Z) - Optimisation-Based Multi-Modal Semantic Image Editing [58.496064583110694]
We propose an inference-time editing optimisation to accommodate multiple editing instruction types.
By allowing the influence of each loss function to be adjusted, we build a flexible editing solution that can be tuned to user preferences (see the sketch after this list).
We evaluate our method using text, pose and scribble edit conditions, and highlight our ability to achieve complex edits.
arXiv Detail & Related papers (2023-11-28T15:31:11Z) - Emu Edit: Precise Image Editing via Recognition and Generation Tasks [62.95717180730946]
We present Emu Edit, a multi-task image editing model which sets state-of-the-art results in instruction-based image editing.
We train it to multi-task across an unprecedented range of tasks, such as region-based editing, free-form editing, and computer vision tasks.
We show that Emu Edit can generalize to new tasks, such as image inpainting, super-resolution, and compositions of editing tasks, with just a few labeled examples.
arXiv Detail & Related papers (2023-11-16T18:55:58Z) - LayerDiffusion: Layered Controlled Image Editing with Diffusion Models [5.58892860792971]
LayerDiffusion is a semantic-based layered controlled image editing method.
We leverage a large-scale text-to-image model and employ a layered controlled optimization strategy.
Experimental results demonstrate the effectiveness of our method in generating highly coherent images.
arXiv Detail & Related papers (2023-05-30T01:26:41Z) - Zero-shot Image-to-Image Translation [57.46189236379433]
We propose pix2pix-zero, an image-to-image translation method that can preserve the content of the original image without manual prompting.
We propose cross-attention guidance, which aims to retain the cross-attention maps of the input image throughout the diffusion process (see the sketch after this list).
Our method does not need additional training for these edits and can directly use the existing text-to-image diffusion model.
arXiv Detail & Related papers (2023-02-06T18:59:51Z) - EditGAN: High-Precision Semantic Image Editing [120.49401527771067]
EditGAN is a novel method for high-quality, high-precision semantic image editing.
We show that EditGAN can manipulate images with an unprecedented level of detail and freedom.
We can also easily combine multiple edits and perform plausible edits beyond EditGAN training data.
arXiv Detail & Related papers (2021-11-04T22:36:33Z)