Magicremover: Tuning-free Text-guided Image inpainting with Diffusion
Models
- URL: http://arxiv.org/abs/2310.02848v1
- Date: Wed, 4 Oct 2023 14:34:11 GMT
- Title: Magicremover: Tuning-free Text-guided Image inpainting with Diffusion
Models
- Authors: Siyuan Yang, Lu Zhang, Liqian Ma, Yu Liu, JingJing Fu and You He
- Abstract summary: We propose MagicRemover, a tuning-free method that leverages the powerful diffusion models for text-guided image inpainting.
We introduce an attention guidance strategy to constrain the sampling process of diffusion models, enabling the erasing of instructed areas and the restoration of occluded content.
- Score: 24.690863845885367
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image inpainting aims to fill in the missing pixels with visually coherent
and semantically plausible content. Despite the great progress brought by
deep generative models, this task still suffers from (i) the difficulty of
large-scale realistic data collection and costly model training; and (ii) the
intrinsic limitations of traditional user-defined binary masks on objects
with unclear boundaries or transparent textures. In this paper, we propose
MagicRemover, a tuning-free method that leverages the powerful diffusion models
for text-guided image inpainting. We introduce an attention guidance strategy
to constrain the sampling process of diffusion models, enabling the erasing of
instructed areas and the restoration of occluded content. We further propose a
classifier optimization algorithm to improve denoising stability with
fewer sampling steps. Extensive comparisons are conducted between MagicRemover
and state-of-the-art methods, including quantitative evaluation and a user study,
demonstrating the significant improvement of MagicRemover on high-quality image
inpainting. We will release our code at https://github.com/exisas/Magicremover.
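The abstract's idea of constraining the sampling process with a guidance signal can be illustrated with a generic energy-guidance step. The sketch below is not the paper's algorithm: the quadratic energy, the `mask` standing in for an attention map, and every function name are hypothetical. It only shows the widely used pattern of shifting the predicted noise along the gradient of an energy, so that the denoised estimate moves toward lower energy in the instructed region.

```python
import math

def energy_grad(x, mask):
    """Gradient of the toy energy g(x) = sum(mask_i * x_i**2).

    Assumption: `mask` is a stand-in for an attention map that marks
    the region to erase; a real method would derive it from the model.
    """
    return [2.0 * m * xi for xi, m in zip(x, mask)]

def guided_eps(eps, x, mask, alpha_bar, scale):
    """Shift the noise prediction along the energy gradient."""
    sigma = math.sqrt(1.0 - alpha_bar)
    grad = energy_grad(x, mask)
    return [e + scale * sigma * g for e, g in zip(eps, grad)]

def predict_x0(x, eps, alpha_bar):
    """Standard estimate of the clean sample from x_t and predicted noise."""
    sigma = math.sqrt(1.0 - alpha_bar)
    return [(xi - sigma * e) / math.sqrt(alpha_bar) for xi, e in zip(x, eps)]

# Toy 1-D "image": positions 1 and 3 are the instructed region.
x_t = [1.0, -2.0, 0.5, 3.0]
mask = [0.0, 1.0, 0.0, 1.0]
eps = [0.0, 0.0, 0.0, 0.0]   # placeholder for a denoiser's noise prediction
alpha_bar = 0.75

eps_g = guided_eps(eps, x_t, mask, alpha_bar, scale=0.5)
x0_plain = predict_x0(x_t, eps, alpha_bar)
x0_guided = predict_x0(x_t, eps_g, alpha_bar)
```

In this toy setting the guided estimate shrinks the masked entries toward zero and leaves the unmasked entries untouched, which mirrors the goal of erasing an instructed area while preserving the rest of the image.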
Related papers
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
- Paint by Inpaint: Learning to Add Image Objects by Removing Them First [8.399234415641319]
We train a diffusion model to invert the inpainting process, effectively adding objects into images.
Detailed descriptions of the removed objects are provided, and a Large Language Model converts these descriptions into diverse, natural-language instructions.
arXiv Detail & Related papers (2024-04-28T15:07:53Z)
- Fill in the ____ (a Diffusion-based Image Inpainting Pipeline) [0.0]
Inpainting is the process of taking an image and generating lost or intentionally occluded portions.
Modern inpainting techniques have shown remarkable ability in generating sensible completions.
A critical gap in these existing models will be addressed, focusing on the ability to prompt and control what exactly is generated.
arXiv Detail & Related papers (2024-03-24T05:26:55Z)
- BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed
Dual-Branch Diffusion [61.90969199199739]
BrushNet is a novel plug-and-play dual-branch model engineered to embed pixel-level masked image features into any pre-trained DM.
Experiments demonstrate BrushNet's superior performance over existing models across seven key metrics, including image quality, mask region preservation, and textual coherence.
arXiv Detail & Related papers (2024-03-11T17:59:31Z)
- MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning [59.988458964353754]
Text-to-image diffusion models allow seamless generation of personalized images from scant reference photos.
Existing approaches perturb user images in an imperceptible way to render them "unlearnable" against malicious use.
We propose MetaCloak, which solves the bi-level poisoning problem with a meta-learning framework.
arXiv Detail & Related papers (2023-11-22T03:31:31Z)
- SuperInpaint: Learning Detail-Enhanced Attentional Implicit
Representation for Super-resolutional Image Inpainting [26.309834304515544]
We introduce a challenging image restoration task, referred to as SuperInpaint.
This task aims to reconstruct missing regions in low-resolution images and generate completed images with arbitrarily higher resolutions.
We propose the detail-enhanced attentional implicit representation that can achieve SuperInpaint with a single model.
arXiv Detail & Related papers (2023-07-26T20:28:58Z)
- Inst-Inpaint: Instructing to Remove Objects with Diffusion Models [18.30057229657246]
In this work, we are interested in an image inpainting algorithm that estimates which object to remove based on natural-language input and removes it at the same time.
We present a novel inpainting framework, Inst-Inpaint, that can remove objects from images based on the instructions given as text prompts.
arXiv Detail & Related papers (2023-04-06T17:29:50Z)
- In&Out : Diverse Image Outpainting via GAN Inversion [89.84841983778672]
Image outpainting seeks a semantically consistent extension of the input image beyond its available content.
In this work, we formulate the problem from the perspective of inverting generative adversarial networks.
Our generator renders micro-patches conditioned on their joint latent code as well as their individual positions in the image.
arXiv Detail & Related papers (2021-04-01T17:59:10Z)
- Semantic Layout Manipulation with High-Resolution Sparse Attention [106.59650698907953]
We tackle the problem of semantic image layout manipulation, which aims to manipulate an input image by editing its semantic label map.
A core problem of this task is how to transfer visual details from the input images to the new semantic layout while making the resulting image visually realistic.
We propose a high-resolution sparse attention module that effectively transfers visual details to new layouts at a resolution up to 512x512.
arXiv Detail & Related papers (2020-12-14T06:50:43Z)
- High-Resolution Image Inpainting with Iterative Confidence Feedback and
Guided Upsampling [122.06593036862611]
Existing image inpainting methods often produce artifacts when dealing with large holes in real applications.
We propose an iterative inpainting method with a feedback mechanism.
Experiments show that our method significantly outperforms existing methods in both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2020-05-24T13:23:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.