ReMOVE: A Reference-free Metric for Object Erasure
- URL: http://arxiv.org/abs/2409.00707v1
- Date: Sun, 1 Sep 2024 12:26:14 GMT
- Title: ReMOVE: A Reference-free Metric for Object Erasure
- Authors: Aditya Chandrasekar, Goirik Chakrabarty, Jai Bardhan, Ramya Hebbalaguppe, Prathosh AP
- Abstract summary: We introduce $\texttt{ReMOVE}$, a novel reference-free metric for assessing object erasure efficacy in diffusion-based image editing models post-generation.
- Score: 7.8330705738412
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce $\texttt{ReMOVE}$, a novel reference-free metric for assessing object erasure efficacy in diffusion-based image editing models post-generation. Unlike existing measures such as LPIPS and CLIPScore, $\texttt{ReMOVE}$ addresses the challenge of evaluating inpainting without a reference image, which is common in practical scenarios. It effectively distinguishes between object removal and object replacement, a key issue in diffusion models due to the stochastic nature of image generation. Traditional metrics fail to align with the intuitive definition of inpainting, which aims for (1) seamless object removal within masked regions while (2) preserving background continuity. $\texttt{ReMOVE}$ not only correlates with state-of-the-art metrics and aligns with human perception but also captures the nuanced aspects of the inpainting process, providing a finer-grained evaluation of the generated outputs.
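The abstract does not spell out how the score is computed. As a rough illustration of the idea of judging an erasure without a reference image, the sketch below compares the average feature of the patches inside the mask with the average feature of the patches outside it. Everything here is an assumption made for illustration: the function names, the use of cosine similarity, and the premise that per-patch feature vectors are already available from some encoder. It is not presented as the authors' formulation.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def region_consistency_score(patch_features, in_mask):
    """Hypothetical reference-free score (an assumption, not the paper's definition).

    patch_features: one feature vector per image patch of the edited image.
    in_mask: one boolean per patch, True if the patch lies in the erased region.
    Returns the cosine similarity between the mean masked-region feature and
    the mean background feature.
    """
    inside = [f for f, m in zip(patch_features, in_mask) if m]
    outside = [f for f, m in zip(patch_features, in_mask) if not m]
    if not inside or not outside:
        raise ValueError("mask must cover some, but not all, patches")
    return cosine_similarity(mean_vector(inside), mean_vector(outside))


# Toy usage with made-up 2-D features: three background patches, one masked patch.
background = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
mask = [False, False, False, True]
score_removed = region_consistency_score(background + [[0.9, 0.1]], mask)
score_replaced = region_consistency_score(background + [[0.0, 1.0]], mask)
```

Under these assumptions, a masked region filled with background-like content scores higher than one in which a different object was generated, which mirrors the removal-versus-replacement distinction the abstract describes.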
Related papers
- Improving Text-guided Object Inpainting with Semantic Pre-inpainting [95.17396565347936]
We decompose the typical single-stage object inpainting into two cascaded processes: semantic pre-inpainting and high-fidelity object generation.
To achieve this, we cascade a Transformer-based semantic inpainter and an object inpainting diffusion model, leading to a novel CAscaded Transformer-Diffusion framework.
arXiv Detail & Related papers (2024-09-12T17:55:37Z) - DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z) - RecDiffusion: Rectangling for Image Stitching with Diffusion Models [53.824503710254206]
We introduce a novel diffusion-based learning framework, RecDiffusion, for image stitching rectangling.
This framework combines Motion Diffusion Models (MDM) to generate motion fields, effectively transitioning from the stitched image's irregular borders to a geometrically corrected intermediary.
arXiv Detail & Related papers (2024-03-28T06:22:45Z) - SEMPART: Self-supervised Multi-resolution Partitioning of Image
Semantics [0.5439020425818999]
Our salient object detection and single object localization findings suggest that SEMPART produces high-quality masks rapidly without additional post-processing.
arXiv Detail & Related papers (2023-09-20T00:07:30Z) - Inst-Inpaint: Instructing to Remove Objects with Diffusion Models [18.30057229657246]
In this work, we are interested in an image inpainting algorithm that estimates which object is to be removed based on natural language input and simultaneously removes it.
We present a novel inpainting framework, Inst-Inpaint, that can remove objects from images based on the instructions given as text prompts.
arXiv Detail & Related papers (2023-04-06T17:29:50Z) - Semantics-Guided Object Removal for Facial Images: with Broad
Applicability and Robust Style Preservation [29.162655333387452]
Object removal and image inpainting in facial images is a task in which objects that occlude a facial image are specifically targeted, removed, and replaced by a properly reconstructed facial image.
Two different approaches, utilizing a U-net and a modulated generator respectively, have been widely endorsed for this task for their unique advantages, notwithstanding each method's innate disadvantages.
Here, we propose Semantics-Guided Inpainting Network (SGIN) which itself is a modification of the modulated generator, aiming to take advantage of its advanced generative capability and preserve the high-fidelity details of the original image.
arXiv Detail & Related papers (2022-09-29T00:09:12Z) - Self-Supervised Video Object Segmentation via Cutout Prediction and
Tagging [117.73967303377381]
We propose a novel self-supervised Video Object Segmentation (VOS) approach that strives to achieve better object-background discriminability.
Our approach is based on a discriminative learning loss formulation that takes into account both object and background information.
Our proposed approach, CT-VOS, achieves state-of-the-art results on two challenging benchmarks: DAVIS-2017 and Youtube-VOS.
arXiv Detail & Related papers (2022-04-22T17:53:27Z) - Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with
Conditional StyleGAN [88.62422914645066]
We present an algorithm for re-rendering a person from a single image under arbitrary poses.
Existing methods often have difficulties in hallucinating occluded contents photo-realistically while preserving the identity and fine details in the source image.
We show that our method compares favorably against the state-of-the-art algorithms in both quantitative evaluation and visual comparison.
arXiv Detail & Related papers (2021-09-13T17:59:33Z) - Enhanced Residual Networks for Context-based Image Outpainting [0.0]
Deep models struggle to understand context and extrapolation through retained information.
Current models use generative adversarial networks to generate results which lack localized image feature consistency and appear fake.
We propose two methods to improve this issue: the use of a local and global discriminator, and the addition of residual blocks within the encoding section of the network.
arXiv Detail & Related papers (2020-05-14T05:14:26Z) - Learning to Manipulate Individual Objects in an Image [71.55005356240761]
We describe a method to train a generative model with latent factors that are independent and localized.
This means that perturbing the latent variables affects only local regions of the synthesized image, corresponding to objects.
Unlike other unsupervised generative models, ours enables object-centric manipulation, without requiring object-level annotations.
arXiv Detail & Related papers (2020-04-11T21:50:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.