LayerDiffusion: Layered Controlled Image Editing with Diffusion Models
- URL: http://arxiv.org/abs/2305.18676v1
- Date: Tue, 30 May 2023 01:26:41 GMT
- Title: LayerDiffusion: Layered Controlled Image Editing with Diffusion Models
- Authors: Pengzhi Li, QInxuan Huang, Yikang Ding, Zhiheng Li
- Abstract summary: LayerDiffusion is a semantic-based layered controlled image editing method.
We leverage a large-scale text-to-image model and employ a layered controlled optimization strategy.
Experimental results demonstrate the effectiveness of our method in generating highly coherent images.
- Score: 5.58892860792971
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-guided image editing has recently experienced rapid development.
However, simultaneously performing multiple editing actions on a single image,
such as background replacement and specific subject attribute changes, while
maintaining consistency between the subject and the background remains
challenging. In this paper, we propose LayerDiffusion, a semantic-based layered
controlled image editing method. Our method enables non-rigid editing and
attribute modification of specific subjects while preserving their unique
characteristics and seamlessly integrating them into new backgrounds. We
leverage a large-scale text-to-image model and employ a layered controlled
optimization strategy combined with layered diffusion training. During the
diffusion process, an iterative guidance strategy is used to generate a final
image that aligns with the textual description. Experimental results
demonstrate the effectiveness of our method in generating highly coherent
images that closely align with the given textual description. The edited images
maintain a high similarity to the features of the input image and surpass the
performance of current leading image editing methods. LayerDiffusion opens up
new possibilities for controllable image editing.
Related papers
- Streamlining Image Editing with Layered Diffusion Brushes [8.738398948669609]
Our system renders a single edit on a 512x512 image within 140 ms using a high-end consumer GPU.
Our approach demonstrates efficacy across a range of tasks, including object attribute adjustments, error correction, and sequential prompt-based object placement and manipulation.
arXiv Detail & Related papers (2024-05-01T04:30:03Z) - DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image
Editing [66.43179841884098]
Large-scale Text-to-Image (T2I) diffusion models have revolutionized image generation over the last few years.
We propose DiffEditor to rectify two weaknesses in existing diffusion-based image editing.
Our method can efficiently achieve state-of-the-art performance on various fine-grained image editing tasks.
arXiv Detail & Related papers (2024-02-04T18:50:29Z) - Optimisation-Based Multi-Modal Semantic Image Editing [58.496064583110694]
We propose an inference-time editing optimisation to accommodate multiple editing instruction types.
By allowing to adjust the influence of each loss function, we build a flexible editing solution that can be adjusted to user preferences.
We evaluate our method using text, pose and scribble edit conditions, and highlight our ability to achieve complex edits.
arXiv Detail & Related papers (2023-11-28T15:31:11Z) - InFusion: Inject and Attention Fusion for Multi Concept Zero-Shot
Text-based Video Editing [27.661609140918916]
InFusion is a framework for zero-shot text-based video editing.
It supports editing of multiple concepts with pixel-level control over diverse concepts mentioned in the editing prompt.
Our framework is a low-cost alternative to one-shot tuned models for editing since it does not require training.
arXiv Detail & Related papers (2023-07-22T17:05:47Z) - DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models [66.43179841884098]
We propose a novel image editing method, DragonDiffusion, enabling Drag-style manipulation on Diffusion models.
Our method achieves various editing modes for the generated or real images, such as object moving, object resizing, object appearance replacement, and content dragging.
arXiv Detail & Related papers (2023-07-05T16:43:56Z) - PFB-Diff: Progressive Feature Blending Diffusion for Text-driven Image
Editing [8.19063619210761]
PFB-Diff is a Progressive Feature Blending method for Diffusion-based image editing.
Our method demonstrates its superior performance in terms of image fidelity, editing accuracy, efficiency, and faithfulness to the original image.
arXiv Detail & Related papers (2023-06-28T11:10:20Z) - Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion
Models [6.34777393532937]
We propose an accurate and quick inversion technique, Prompt Tuning Inversion, for text-driven image editing.
Our proposed editing method consists of a reconstruction stage and an editing stage.
Experiments on ImageNet demonstrate the superior editing performance of our method compared to the state-of-the-art baselines.
arXiv Detail & Related papers (2023-05-08T03:34:33Z) - MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image
Synthesis and Editing [54.712205852602736]
We develop MasaCtrl, a tuning-free method to achieve consistent image generation and complex non-rigid image editing simultaneously.
Specifically, MasaCtrl converts existing self-attention in diffusion models into mutual self-attention, so that it can query correlated local contents and textures from source images for consistency.
Extensive experiments show that the proposed MasaCtrl can produce impressive results in both consistent image generation and complex non-rigid real image editing.
arXiv Detail & Related papers (2023-04-17T17:42:19Z) - DiffEdit: Diffusion-based semantic image editing with mask guidance [64.555930158319]
DiffEdit is a method to take advantage of text-conditioned diffusion models for the task of semantic image editing.
Our main contribution is able to automatically generate a mask highlighting regions of the input image that need to be edited.
arXiv Detail & Related papers (2022-10-20T17:16:37Z) - EditGAN: High-Precision Semantic Image Editing [120.49401527771067]
EditGAN is a novel method for high quality, high precision semantic image editing.
We show that EditGAN can manipulate images with an unprecedented level of detail and freedom.
We can also easily combine multiple edits and perform plausible edits beyond EditGAN training data.
arXiv Detail & Related papers (2021-11-04T22:36:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.