FEC: Three Finetuning-free Methods to Enhance Consistency for Real Image
Editing
- URL: http://arxiv.org/abs/2309.14934v1
- Date: Tue, 26 Sep 2023 13:43:06 GMT
- Title: FEC: Three Finetuning-free Methods to Enhance Consistency for Real Image
Editing
- Authors: Songyan Chen, Jiancheng Huang
- Abstract summary: We propose FEC, which consists of three sampling methods, each designed for different editing types and settings.
FEC achieves two important goals in the image editing task: 1) ensuring successful reconstruction, i.e., sampling a generated result that preserves the texture and features of the original real image; and 2) pairing with many editing methods to improve their performance.
None of our sampling methods requires fine-tuning of the diffusion model or time-consuming training on large-scale datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-conditional image editing is a very useful task that has recently
emerged with immeasurable potential. Most current real image editing methods
first reconstruct the image, and then editing is carried out by various methods
based on that reconstruction. Most methods use DDIM Inversion for
reconstruction; however, DDIM Inversion often fails to guarantee reconstruction
performance, i.e., it fails to produce results that preserve the original image
content. To address this reconstruction failure, we propose FEC, which consists
of three sampling methods, each designed for different editing types and
settings. The three methods of FEC achieve two important goals in the image
editing task: 1) ensuring successful reconstruction, i.e., sampling a generated
result that preserves the texture and features of the original real image; and
2) pairing with many editing methods, greatly improving their performance on
various editing tasks. In addition, none of our sampling methods requires
fine-tuning of the diffusion model or time-consuming training on large-scale
datasets. Hence, the time cost, memory use, and computation can be
significantly reduced.
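The abstract's premise can be made concrete with a toy round trip. The sketch below is a minimal, assumption-laden NumPy illustration (the timestep count, noise schedule, and `eps_model` are all invented here, and this is not the FEC method itself); it shows why deterministic DDIM inversion followed by DDIM sampling only reconstructs the input when the noise predictions agree across the two passes.

```python
import numpy as np

# Toy DDIM inversion -> DDIM sampling round trip. `eps_model` stands in for
# a trained noise-prediction U-Net; here it returns a fixed array, which
# makes the round trip exactly invertible. A real predictor also depends on
# the latent x, and that mismatch between the inversion and sampling
# trajectories is why DDIM Inversion can fail to preserve image content.

T = 50                                   # number of diffusion timesteps
betas = np.linspace(1e-4, 2e-2, T)       # linear noise schedule (assumed)
alphas_bar = np.cumprod(1.0 - betas)     # cumulative alpha products

rng = np.random.default_rng(0)
FIXED_EPS = rng.standard_normal((4, 4))  # toy "latent" is a 4x4 array

def eps_model(x, t):
    # Stand-in noise prediction (a real model would use both x and t).
    return FIXED_EPS

def ddim_step(x, t_from, t_to):
    # Deterministic DDIM update between two timesteps (either direction).
    a_from, a_to = alphas_bar[t_from], alphas_bar[t_to]
    eps = eps_model(x, t_from)
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def invert(x):
    # Deterministic path forward: image latent -> noise latent.
    for t in range(T - 1):
        x = ddim_step(x, t, t + 1)
    return x

def sample(x):
    # Deterministic path backward: noise latent -> image latent.
    for t in range(T - 1, 0, -1):
        x = ddim_step(x, t, t - 1)
    return x
```

With the fixed predictor, `sample(invert(x0))` recovers `x0` up to floating-point error; swapping in an x-dependent predictor breaks this exactness, which is the reconstruction failure mode that FEC's sampling methods aim to repair without any fine-tuning.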
Related papers
- OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision [32.33777277141083]
We present OmniEdit, an omnipotent editor that handles seven different image editing tasks seamlessly at any aspect ratio.
OmniEdit is trained with supervision from seven different specialist models to ensure task coverage.
We provide images with different aspect ratios to ensure that our model can handle any image in the wild.
arXiv Detail & Related papers (2024-11-11T18:21:43Z) - ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models [11.830273909934688]
Modern Text-to-Image (T2I) Diffusion models have revolutionized image editing by enabling the generation of high-quality images.
We propose ReEdit, a modular and efficient end-to-end framework that captures edits in both text and image modalities.
Our results demonstrate that ReEdit consistently outperforms contemporary approaches both qualitatively and quantitatively.
arXiv Detail & Related papers (2024-11-06T15:19:24Z) - TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models [53.757752110493215]
We focus on a popular line of text-based editing frameworks - the "edit-friendly" DDPM-noise inversion approach.
We analyze its application to fast sampling methods and categorize its failures into two classes: the appearance of visual artifacts, and insufficient editing strength.
We propose a pseudo-guidance approach that efficiently increases the magnitude of edits without introducing new artifacts.
arXiv Detail & Related papers (2024-08-01T17:27:28Z) - Preserving Identity with Variational Score for General-purpose 3D Editing [48.314327790451856]
Piva is a novel optimization-based method for editing images and 3D models based on diffusion models.
We pinpoint the limitations in 2D and 3D editing that cause detail loss and oversaturation.
We propose an additional score distillation term that enforces identity preservation.
arXiv Detail & Related papers (2024-06-13T09:32:40Z) - Unified Editing of Panorama, 3D Scenes, and Videos Through Disentangled Self-Attention Injection [60.47731445033151]
We propose a novel unified editing framework that combines the strengths of both approaches by utilizing only a basic 2D image text-to-image (T2I) diffusion model.
Experimental results confirm that our method enables editing across diverse modalities including 3D scenes, videos, and panorama images.
arXiv Detail & Related papers (2024-05-27T04:44:36Z) - Consolidating Attention Features for Multi-view Image Editing [126.19731971010475]
We focus on spatial control-based geometric manipulations and introduce a method to consolidate the editing process across various views.
We introduce QNeRF, a neural radiance field trained on the internal query features of the edited images.
We refine the process through a progressive, iterative method that better consolidates queries across the diffusion timesteps.
arXiv Detail & Related papers (2024-02-22T18:50:18Z) - Real-time 3D-aware Portrait Editing from a Single Image [111.27169315556444]
3DPE can edit a face image following given prompts, like reference images or text descriptions.
A lightweight module is distilled from a 3D portrait generator and a text-to-image model.
arXiv Detail & Related papers (2024-02-21T18:36:26Z) - Tuning-Free Inversion-Enhanced Control for Consistent Image Editing [44.311286151669464]
We present a novel approach called Tuning-free Inversion-enhanced Control (TIC).
TIC correlates features from the inversion process with those from the sampling process to mitigate the inconsistency in DDIM reconstruction.
We also propose a mask-guided attention concatenation strategy that combines contents from both the inversion and the naive DDIM editing processes.
arXiv Detail & Related papers (2023-12-22T11:13:22Z) - Object-aware Inversion and Reassembly for Image Editing [61.19822563737121]
We propose Object-aware Inversion and Reassembly (OIR) to enable object-level fine-grained editing.
We use our search metric to find the optimal inversion step for each editing pair when editing an image.
Our method achieves superior performance in editing object shapes, colors, materials, categories, etc., especially in multi-object editing scenarios.
arXiv Detail & Related papers (2023-10-18T17:59:02Z) - KV Inversion: KV Embeddings Learning for Text-Conditioned Real Image
Action Editing [15.831539388569473]
We propose KV Inversion, a method that can achieve satisfactory reconstruction performance and action editing.
Our method does not require training the Stable Diffusion model itself, nor does it require scanning a large-scale dataset to perform time-consuming training.
arXiv Detail & Related papers (2023-09-28T17:07:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.