SmartEraser: Remove Anything from Images using Masked-Region Guidance
- URL: http://arxiv.org/abs/2501.08279v1
- Date: Tue, 14 Jan 2025 17:55:12 GMT
- Title: SmartEraser: Remove Anything from Images using Masked-Region Guidance
- Authors: Longtao Jiang, Zhendong Wang, Jianmin Bao, Wengang Zhou, Dongdong Chen, Lei Shi, Dong Chen, Houqiang Li
- Abstract summary: SmartEraser is built with a new removing paradigm called Masked-Region Guidance.
Masked-Region Guidance retains the masked region in the input, using it as guidance for the removal process.
We present Syn4Removal, a large-scale object removal dataset.
- Score: 114.36809682798784
- License:
- Abstract: Object removal has so far been dominated by the mask-and-inpaint paradigm, where the masked region is excluded from the input, leaving models relying on unmasked areas to inpaint the missing region. However, this approach lacks contextual information for the masked area, often resulting in unstable performance. In this work, we introduce SmartEraser, built with a new removing paradigm called Masked-Region Guidance. This paradigm retains the masked region in the input, using it as guidance for the removal process. It offers several distinct advantages: (a) it guides the model to accurately identify the object to be removed, preventing its regeneration in the output; (b) since the user mask often extends beyond the object itself, it aids in preserving the surrounding context in the final result. Leveraging this new paradigm, we present Syn4Removal, a large-scale object removal dataset, where instance segmentation data is used to copy and paste objects onto images as removal targets, with the original images serving as ground truths. Experimental results demonstrate that SmartEraser significantly outperforms existing methods, achieving superior performance in object removal, especially in complex scenes with intricate compositions.
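The Syn4Removal construction described in the abstract (paste a segmented object onto a clean image; the composite is the model input and the original image the ground truth) can be sketched as follows. This is a minimal illustrative assumption, not the paper's implementation; the function name and the naive hard-paste logic are hypothetical, and the real pipeline presumably also handles object scaling, placement, and blending.

```python
import numpy as np

def make_removal_pair(background, obj_pixels, obj_mask, top, left):
    """Composite an object onto a clean background image.

    background : (H, W, 3) uint8 clean image -- serves as ground truth
    obj_pixels : (h, w, 3) uint8 object crop
    obj_mask   : (h, w) bool instance-segmentation mask of the object
    Returns (input_image, ground_truth, removal_mask).
    """
    h, w = obj_mask.shape
    composite = background.copy()
    region = composite[top:top + h, left:left + w]
    region[obj_mask] = obj_pixels[obj_mask]          # hard paste of the object
    removal_mask = np.zeros(background.shape[:2], dtype=bool)
    removal_mask[top:top + h, left:left + w] = obj_mask
    return composite, background, removal_mask

# tiny demo: 8x8 gray background, 3x3 white square as the pasted object
bg = np.full((8, 8, 3), 128, dtype=np.uint8)
obj = np.full((3, 3, 3), 255, dtype=np.uint8)
mask = np.ones((3, 3), dtype=bool)
inp, gt, rm = make_removal_pair(bg, obj, mask, top=2, left=2)
```

Note how this yields exactly the supervision Masked-Region Guidance needs: the input still contains the object inside the mask, and the target is the untouched original image.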
Related papers
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
- Paint by Inpaint: Learning to Add Image Objects by Removing Them First [8.399234415641319]
We train a diffusion model to invert the inpainting process, effectively adding objects into images.
We provide detailed descriptions of the removed objects and use a Large Language Model to convert these descriptions into diverse, natural-language instructions.
arXiv Detail & Related papers (2024-04-28T15:07:53Z)
- Inpainting-Driven Mask Optimization for Object Removal [15.429649454099085]
This paper proposes a mask optimization method for improving the quality of object removal using image inpainting.
In our method, this domain gap is resolved by training the inpainting network with object masks extracted by segmentation.
To optimize the object masks for inpainting, the segmentation network is connected to the inpainting network and end-to-end trained to improve the inpainting performance.
arXiv Detail & Related papers (2024-03-23T13:52:16Z)
- Variance-insensitive and Target-preserving Mask Refinement for Interactive Image Segmentation [68.16510297109872]
Point-based interactive image segmentation can ease the burden of mask annotation in applications such as semantic segmentation and image editing.
We introduce a novel method, Variance-Insensitive and Target-Preserving Mask Refinement to enhance segmentation quality with fewer user inputs.
Experiments on GrabCut, Berkeley, SBD, and DAVIS datasets demonstrate our method's state-of-the-art performance in interactive image segmentation.
arXiv Detail & Related papers (2023-12-22T02:31:31Z)
- Completing Visual Objects via Bridging Generation and Segmentation [84.4552458720467]
MaskComp delineates the completion process through iterative stages of generation and segmentation.
In each iteration, the object mask is provided as an additional condition to boost image generation.
We demonstrate that the combination of one generation and one segmentation stage effectively functions as a mask denoiser.
arXiv Detail & Related papers (2023-10-01T22:25:40Z)
- SEMPART: Self-supervised Multi-resolution Partitioning of Image Semantics [0.5439020425818999]
Our salient object detection and single object localization findings suggest that SEMPART produces high-quality masks rapidly without additional post-processing.
arXiv Detail & Related papers (2023-09-20T00:07:30Z)
- AURA: Automatic Mask Generator using Randomized Input Sampling for Object Removal [26.81218265405809]
In this paper, we focus on generating the input mask to better remove objects using the off-the-shelf image inpainting network.
We propose an automatic mask generator inspired by the explainable AI (XAI) method, whose output can better remove objects than a semantic segmentation mask.
Experiments confirm that our method shows better performance in removing target class objects than the masks generated from the semantic segmentation maps.
arXiv Detail & Related papers (2023-05-13T07:51:35Z)
- Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z)
- Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of single-image depth estimation (SIDE) models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
arXiv Detail & Related papers (2022-06-07T06:42:44Z)
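The layered decomposition in the last entry (the mask and its inverse split the depth map into two layers that are refined independently and then merged) can be illustrated with a minimal sketch. The formulation and names below are assumptions for illustration, not the paper's actual pipeline, which additionally performs per-layer refinement and inpainting/outpainting.

```python
import numpy as np

def split_depth(depth, mask):
    """Split a depth map into the layer signified by the mask and its inverse."""
    front = np.where(mask, depth, 0.0)   # layer inside the mask
    back = np.where(mask, 0.0, depth)    # layer inside the inverse mask
    return front, back

def recompose(front, back, mask):
    # after per-layer refinement, the two layers are merged back with the mask
    return np.where(mask, front, back)

depth = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[True, False], [False, True]])
front, back = split_depth(depth, mask)
out = recompose(front, back, mask)  # identical to `depth` here, since no
                                    # refinement is applied between the steps
```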
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.