FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing
- URL: http://arxiv.org/abs/2403.18605v2
- Date: Thu, 28 Mar 2024 03:56:07 GMT
- Title: FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing
- Authors: Trong-Tung Nguyen, Duc-Anh Nguyen, Anh Tran, Cuong Pham,
- Abstract summary: We introduce FlexEdit, a flexible and controllable editing framework for objects.
We iteratively adjust latents at each denoising step using our FlexEdit block.
Our framework employs an adaptive mask, automatically extracted during denoising, to protect the background.
- Score: 3.852667054327356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Our work addresses limitations seen in previous approaches for object-centric editing problems, such as unrealistic results due to shape discrepancies and limited control in object replacement or insertion. To this end, we introduce FlexEdit, a flexible and controllable editing framework for objects where we iteratively adjust latents at each denoising step using our FlexEdit block. Initially, we optimize latents at test time to align with specified object constraints. Then, our framework employs an adaptive mask, automatically extracted during denoising, to protect the background while seamlessly blending new content into the target image. We demonstrate the versatility of FlexEdit in various object editing tasks and curate an evaluation test suite with samples from both real and synthetic images, along with novel evaluation metrics designed for object-centric editing. We conduct extensive experiments on different editing scenarios, demonstrating the superiority of our editing framework over recent advanced text-guided image editing methods. Our project page is published at https://flex-edit.github.io/.
Related papers
- Move and Act: Enhanced Object Manipulation and Background Integrity for Image Editing [63.32399428320422]
We propose a tuning-free method with only two branches: inversion and editing.
This approach allows users to simultaneously edit the object's action and control the generation position of the edited object.
Impressive image editing results and quantitative evaluation demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-07-25T08:00:49Z) - LoMOE: Localized Multi-Object Editing via Multi-Diffusion [8.90467024388923]
We introduce a novel framework for zero-shot localized multi-object editing through a multi-diffusion process.
Our approach leverages foreground masks and corresponding simple text prompts that exert localized influences on the target regions.
A combination of cross-attention and background losses within the latent space ensures that the characteristics of the object being edited are preserved.
arXiv Detail & Related papers (2024-03-01T10:46:47Z) - DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image
Editing [66.43179841884098]
Large-scale Text-to-Image (T2I) diffusion models have revolutionized image generation over the last few years.
We propose DiffEditor to rectify two weaknesses in existing diffusion-based image editing.
Our method can efficiently achieve state-of-the-art performance on various fine-grained image editing tasks.
arXiv Detail & Related papers (2024-02-04T18:50:29Z) - Edit One for All: Interactive Batch Image Editing [44.50631647670942]
This paper presents a novel method for interactive batch image editing using StyleGAN as the medium.
Given an edit specified by users in an example image (e.g., make the face frontal), our method can automatically transfer that edit to other test images.
Experiments demonstrate that edits performed using our method have similar visual quality to existing single-image-editing methods.
arXiv Detail & Related papers (2024-01-18T18:58:44Z) - Optimisation-Based Multi-Modal Semantic Image Editing [58.496064583110694]
We propose an inference-time editing optimisation to accommodate multiple editing instruction types.
By allowing to adjust the influence of each loss function, we build a flexible editing solution that can be adjusted to user preferences.
We evaluate our method using text, pose and scribble edit conditions, and highlight our ability to achieve complex edits.
arXiv Detail & Related papers (2023-11-28T15:31:11Z) - Object-aware Inversion and Reassembly for Image Editing [61.19822563737121]
We propose Object-aware Inversion and Reassembly (OIR) to enable object-level fine-grained editing.
We use our search metric to find the optimal inversion step for each editing pair when editing an image.
Our method achieves superior performance in editing object shapes, colors, materials, categories, etc., especially in multi-object editing scenarios.
arXiv Detail & Related papers (2023-10-18T17:59:02Z) - LayerDiffusion: Layered Controlled Image Editing with Diffusion Models [5.58892860792971]
LayerDiffusion is a semantic-based layered controlled image editing method.
We leverage a large-scale text-to-image model and employ a layered controlled optimization strategy.
Experimental results demonstrate the effectiveness of our method in generating highly coherent images.
arXiv Detail & Related papers (2023-05-30T01:26:41Z) - PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor [135.17302411419834]
PAIR Diffusion is a generic framework that enables a diffusion model to control the structure and appearance of each object in the image.
We show that having control over the properties of each object in an image leads to comprehensive editing capabilities.
Our framework allows for various object-level editing operations on real images such as reference image-based appearance editing, free-form shape editing, adding objects, and variations.
arXiv Detail & Related papers (2023-03-30T17:13:56Z) - EditGAN: High-Precision Semantic Image Editing [120.49401527771067]
EditGAN is a novel method for high quality, high precision semantic image editing.
We show that EditGAN can manipulate images with an unprecedented level of detail and freedom.
We can also easily combine multiple edits and perform plausible edits beyond EditGAN training data.
arXiv Detail & Related papers (2021-11-04T22:36:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.