CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models
- URL: http://arxiv.org/abs/2406.09368v1
- Date: Thu, 13 Jun 2024 17:50:28 GMT
- Title: CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models
- Authors: Yigit Ekin, Ahmet Burak Yildirim, Erdem Eren Caglar, Aykut Erdem, Erkut Erdem, Aysegul Dundar
- Abstract summary: CLIPAway is a novel approach leveraging CLIP embeddings to focus on background regions while excluding foreground elements.
It enhances inpainting accuracy and quality by identifying embeddings that prioritize the background.
Unlike other methods that rely on specialized training datasets or costly manual annotations, CLIPAway provides a flexible, plug-and-play solution.
- Score: 16.58831310165623
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Advanced image editing techniques, particularly inpainting, are essential for seamlessly removing unwanted elements while preserving visual integrity. Traditional GAN-based methods have achieved notable success, but recent advancements in diffusion models have produced superior results due to their training on large-scale datasets, enabling the generation of remarkably realistic inpainted images. Despite their strengths, diffusion models often struggle with object removal tasks without explicit guidance, leading to unintended hallucinations of the removed object. To address this issue, we introduce CLIPAway, a novel approach leveraging CLIP embeddings to focus on background regions while excluding foreground elements. CLIPAway enhances inpainting accuracy and quality by identifying embeddings that prioritize the background, thus achieving seamless object removal. Unlike other methods that rely on specialized training datasets or costly manual annotations, CLIPAway provides a flexible, plug-and-play solution compatible with various diffusion-based inpainting techniques.
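The projection idea described here can be sketched in a few lines. The snippet below is a minimal illustration, assuming two CLIP image embeddings are already available: one for the full image and one focused on the foreground object (e.g. from a masked or alpha-aware CLIP encoder). All tensor values are random stand-ins, and this is not the authors' code.

```python
# Minimal sketch: derive a background-focused embedding by projecting the
# foreground direction out of the full-image CLIP embedding.
import torch

def background_focused_embedding(e_full: torch.Tensor, e_fg: torch.Tensor) -> torch.Tensor:
    """e_full: (d,) embedding of the whole image; e_fg: (d,) foreground-focused embedding."""
    u = e_fg / e_fg.norm()             # unit vector along the foreground direction
    e_bg = e_full - (e_full @ u) * u   # remove the component aligned with the foreground
    return e_bg / e_bg.norm()          # renormalized background-focused embedding

e_full, e_fg = torch.randn(768), torch.randn(768)  # stand-ins for real CLIP features
e_bg = background_focused_embedding(e_full, e_fg)
```

Conditioning an image-prompted diffusion inpainter (IP-Adapter style) on `e_bg` instead of `e_full` is then the plug-and-play step the abstract alludes to: the guidance signal no longer carries the removed object, which discourages re-hallucinating it.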
Related papers
- VDOR: A Video-based Dataset for Object Removal via Sequence Consistency [19.05827956984347]
Existing datasets related to object removal serve as a valuable foundation for model validation and optimization.
We propose a novel video-based annotation pipeline for constructing a realistic illumination-aware object removal dataset.
By leveraging continuous real-world video frames, we minimize distribution gaps and accurately capture realistic lighting and shadow variations.
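As a rough illustration of such a pipeline (names and the frame/mask loading are hypothetical, not VDOR's actual code), one can pair a frame containing the object with a nearby frame where the same region is object-free, so the removal target carries real lighting and shadows:

```python
# Hypothetical frame-pairing sketch for an illumination-aware removal dataset.
import numpy as np

def pick_pair(frames: list[np.ndarray], masks: list[np.ndarray], max_gap: int = 30):
    """Return (input_frame, mask, target_frame) for one removal sample."""
    for i, m in enumerate(masks):
        if m.any():                                    # object visible in frame i
            for j in range(i + 1, min(i + max_gap, len(masks))):
                if not (masks[j] & m).any():           # same region now object-free
                    return frames[i], m, frames[j]     # real-background ground truth
    return None
```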
arXiv Detail & Related papers (2025-01-13T15:12:40Z)
- Edicho: Consistent Image Editing in the Wild [90.42395533938915]
Edicho steps in with a training-free solution based on diffusion models.
It features a fundamental design principle of using explicit image correspondence to direct editing.
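A toy version of correspondence-directed editing is shown below; it is illustrative only. The `corr` map would come from an off-the-shelf matcher, and Edicho applies the idea inside the diffusion process rather than on raw feature maps.

```python
# Illustrative: propagate an edit from one image to another via explicit correspondence.
import torch

def warp_by_correspondence(src_feat: torch.Tensor, corr: torch.Tensor) -> torch.Tensor:
    """src_feat: (C, H, W) features of the edited source image.
    corr: (H, W, 2) integer (y, x) source coordinates for every target pixel."""
    ys, xs = corr[..., 0], corr[..., 1]
    return src_feat[:, ys, xs]   # gather matched features into the target's layout
```

Copying matched features into the second image's layout is what keeps the same edit consistent across both views.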
arXiv Detail & Related papers (2024-12-30T16:56:44Z)
- Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance [4.295971864740951]
Attentive Eraser is a tuning-free method to empower pre-trained diffusion models for stable and effective object removal.
We introduce Attention Activation and Suppression (ASS), which re-engineers the self-attention mechanism.
We also introduce Self-Attention Redirection Guidance (SARG), which utilizes the self-attention redirected by ASS to guide the generation process.
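The redirection idea can be sketched at the level of a single self-attention map. The snippet assumes access to pre-softmax attention scores and a token-level mask of the object region; it is a generic illustration, not the paper's exact ASS/SARG formulation.

```python
# Generic sketch: suppress attention to object tokens so mass redirects to background.
import torch

def redirect_attention(scores: torch.Tensor, obj_mask: torch.Tensor) -> torch.Tensor:
    """scores: (heads, N, N) pre-softmax self-attention logits.
    obj_mask: (N,) boolean, True for tokens inside the object to erase."""
    scores = scores.clone()
    scores[:, :, obj_mask] = float("-inf")   # no query may attend to object keys
    return scores.softmax(dim=-1)            # remaining mass lands on background keys
```

Masking the object's keys before the softmax forces the probability mass onto background tokens, which is the sense in which attention is "redirected."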
arXiv Detail & Related papers (2024-12-17T14:56:59Z)
- ExpRDiff: Short-exposure Guided Diffusion Model for Realistic Local Motion Deblurring [61.82010103478833]
We develop a context-based local blur detection module that incorporates additional contextual information to improve the identification of blurry regions.
Considering that modern smartphones are equipped with cameras capable of providing short-exposure images, we develop a blur-aware guided image restoration method.
We formulate the above components into a simple yet effective network, named ExpRDiff.
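A schematic of how those two components might compose is given below; both module calls are placeholders standing in for ExpRDiff's learned networks, not its implementation.

```python
# Schematic blur-aware compositing: restore only where the blur detector fires.
import torch

def blur_aware_restore(blurry, short_exposure, detect_blur, restore_with_guidance):
    mask = detect_blur(blurry)                                # (1, H, W) soft mask in [0, 1]
    restored = restore_with_guidance(blurry, short_exposure)  # full-frame estimate
    return mask * restored + (1 - mask) * blurry              # keep sharp regions untouched
```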
arXiv Detail & Related papers (2024-12-12T11:42:39Z)
- EditScout: Locating Forged Regions from Diffusion-based Edited Images with Multimodal LLM [50.054404519821745]
We present a novel framework that integrates a multimodal Large Language Model for enhanced reasoning capabilities.
Our framework achieves promising results on MagicBrush, AutoSplice, and PerfBrush datasets.
Notably, our method excels on the PerfBrush dataset, a self-constructed test set featuring previously unseen types of edits.
arXiv Detail & Related papers (2024-12-05T02:05:33Z)
- MagicEraser: Erasing Any Objects via Semantics-Aware Control [40.683569840182926]
We introduce MagicEraser, a diffusion model-based framework tailored for the object erasure task.
MagicEraser achieves fine and effective control of content generation while mitigating undesired artifacts.
arXiv Detail & Related papers (2024-10-14T07:03:14Z)
- TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization [59.412236435627094]
TALE is a training-free framework harnessing the generative capabilities of text-to-image diffusion models.
We equip TALE with two mechanisms dubbed Adaptive Latent Manipulation and Energy-guided Latent Optimization.
Our experiments demonstrate that TALE surpasses prior baselines and attains state-of-the-art performance in image-guided composition.
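Energy-guided latent optimization, in its generic form, amounts to a gradient step on the latent between denoising steps. In the sketch below, `energy_fn` is a placeholder for a composition-quality objective, not TALE's actual energy.

```python
# Generic energy-guided step: nudge the latent down the gradient of an energy.
import torch

def energy_guided_step(latent: torch.Tensor, energy_fn, step_size: float = 0.1):
    latent = latent.detach().requires_grad_(True)
    e = energy_fn(latent)                         # scalar: lower = better composition
    grad, = torch.autograd.grad(e, latent)
    return (latent - step_size * grad).detach()   # apply between denoising steps
```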
arXiv Detail & Related papers (2024-08-07T08:52:21Z)
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
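The interpolation step can be shown generically, as below; the linear schedule, its direction, and the choice of blended features are assumptions made for illustration, not DiffUHaul's exact recipe.

```python
# Illustrative blend of attention features across denoising steps.
import torch

def blend_features(src_feat: torch.Tensor, tgt_feat: torch.Tensor,
                   i: int, early: int = 15) -> torch.Tensor:
    """i: current denoising step index (0 = first, noisiest step)."""
    alpha = min(i / early, 1.0)                 # ramp from source toward target
    return (1 - alpha) * src_feat + alpha * tgt_feat
```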
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
- DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing [94.24479528298252]
DragGAN is an interactive point-based image editing framework that achieves impressive editing results with pixel-level precision, but its generality is limited by the capacity of pre-trained GAN models.
By harnessing large-scale pretrained diffusion models, we greatly enhance the applicability of interactive point-based editing on both real and diffusion-generated images.
We present a challenging benchmark dataset called DragBench to evaluate the performance of interactive point-based image editing methods.
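At its core, this family of methods optimizes the diffusion latent so that features at a user-specified handle point move toward a target point. The sketch below compresses that "motion supervision" idea; `feat_fn`, standing in for UNet feature extraction, is a hypothetical placeholder.

```python
# Compressed sketch of point-based motion supervision on a diffusion latent.
import torch
import torch.nn.functional as F

def drag_step(latent, feat_fn, handle, target, lr: float = 0.01):
    """handle, target: (y, x) coordinates; feat_fn maps latent -> (C, H, W) features."""
    latent = latent.detach().requires_grad_(True)
    feat = feat_fn(latent)
    loss = F.l1_loss(feat[:, handle[0], handle[1]],
                     feat[:, target[0], target[1]].detach())  # pull handle toward target
    loss.backward()
    return (latent - lr * latent.grad).detach()
```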
arXiv Detail & Related papers (2023-06-26T06:04:09Z)