MagicRemover: Tuning-free Text-guided Image Inpainting with Diffusion Models
- URL: http://arxiv.org/abs/2310.02848v1
- Date: Wed, 4 Oct 2023 14:34:11 GMT
- Title: MagicRemover: Tuning-free Text-guided Image Inpainting with Diffusion Models
- Authors: Siyuan Yang, Lu Zhang, Liqian Ma, Yu Liu, Jingjing Fu, and You He
- Abstract summary: We propose MagicRemover, a tuning-free method that leverages the powerful diffusion models for text-guided image inpainting.
We introduce an attention guidance strategy to constrain the sampling process of diffusion models, enabling the erasing of instructed areas and the restoration of occluded content.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image inpainting aims to fill in missing pixels with visually
coherent and semantically plausible content. Despite the great progress
brought by deep generative models, this task still suffers from (i) the
difficulty of large-scale realistic data collection and costly model
training, and (ii) the intrinsic limitations of traditional user-defined
binary masks on objects with unclear boundaries or transparent texture. In
this paper, we propose
MagicRemover, a tuning-free method that leverages the powerful diffusion models
for text-guided image inpainting. We introduce an attention guidance strategy
to constrain the sampling process of diffusion models, enabling the erasing of
instructed areas and the restoration of occluded content. We further propose a
classifier optimization algorithm to improve denoising stability with fewer
sampling steps. Extensive comparisons between MagicRemover and
state-of-the-art methods, including quantitative evaluation and a user study,
demonstrate its significant improvement in high-quality image inpainting. We
will release our code at https://github.com/exisas/Magicremover.
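The attention-guidance idea above can be pictured as a classifier-guidance-style sampling loop: at each denoising step, the gradient of the cross-attention mass on the token naming the object nudges the latent away from configurations that still contain it. The sketch below is a minimal, hypothetical illustration; `unet_eps` and `attention_to_token` are toy stand-ins, not MagicRemover's released implementation.

```python
import torch

def unet_eps(z, t, text_emb):
    # Toy stand-in for the pre-trained noise predictor eps_theta(z_t, t, c);
    # a real pipeline would call a text-conditioned diffusion UNet here.
    return torch.tanh(z) * text_emb.mean()

def attention_to_token(z, t, text_emb, token_idx):
    # Toy stand-in for the cross-attention map of the token naming the object
    # to erase; real maps are read from the UNet's cross-attention layers.
    return torch.sigmoid(z.mean(dim=1, keepdim=True))

def ddim_step(z, eps, a_t, a_prev):
    # Deterministic DDIM update z_t -> z_{t_prev}.
    x0 = (z - (1.0 - a_t).sqrt() * eps) / a_t.sqrt()
    return a_prev.sqrt() * x0 + (1.0 - a_prev).sqrt() * eps

@torch.no_grad()
def remove_by_attention_guidance(z, timesteps, alpha_bar, text_emb,
                                 token_idx, scale=1.0):
    """Steer DDIM sampling so the model stops attending to the target object."""
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
        with torch.enable_grad():
            z_in = z.detach().requires_grad_(True)
            # Guidance loss: total attention mass on the object's token.
            loss = attention_to_token(z_in, t, text_emb, token_idx).sum()
            grad = torch.autograd.grad(loss, z_in)[0]
        eps = unet_eps(z, t, text_emb)
        # Classifier-guidance-style correction: bias the noise estimate so the
        # next latent attends less to the object and background fills in.
        eps = eps + scale * (1.0 - alpha_bar[t]).sqrt() * grad
        z = ddim_step(z, eps, alpha_bar[t], alpha_bar[t_prev])
    return z

# Toy usage; shapes mimic a 64x64 Stable Diffusion latent.
alpha_bar = torch.linspace(0.999, 0.01, 50)
out = remove_by_attention_guidance(
    z=torch.randn(1, 4, 64, 64),
    timesteps=list(range(49, -1, -1)),
    alpha_bar=alpha_bar,
    text_emb=torch.randn(1, 77, 768),
    token_idx=5)
```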
Related papers
- MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling [64.09238330331195]
We propose a novel Multi-Modal Auto-Regressive (MMAR) probabilistic modeling framework.
Unlike the discretization line of methods, MMAR takes in continuous-valued image tokens to avoid information loss.
We show that MMAR achieves far superior performance compared with other joint multi-modal models.
arXiv Detail & Related papers (2024-10-14T17:57:18Z)
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z)
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
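The attention-feature interpolation in the DiffUHaul entry above can be sketched with a simple blending schedule. The linear ramp and the `warmup` fraction below are assumptions for illustration, not the paper's actual schedule:

```python
import torch

def blend_attention(attn_src: torch.Tensor, attn_tgt: torch.Tensor,
                    step: int, num_steps: int, warmup: float = 0.3):
    # Hypothetical linear hand-over: early denoising steps keep the source
    # image's attention (original appearance), later steps switch to the
    # target layout, fusing the two smoothly.
    w = min(step / max(warmup * num_steps, 1.0), 1.0)
    return (1.0 - w) * attn_src + w * attn_tgt

# Toy maps: batch x heads x queries x text tokens.
a = blend_attention(torch.rand(1, 8, 4096, 77), torch.rand(1, 8, 4096, 77),
                    step=3, num_steps=50)
```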
- Paint by Inpaint: Learning to Add Image Objects by Removing Them First [8.399234415641319]
We train a diffusion model to invert the inpainting process, effectively adding objects to images.
We provide detailed descriptions of the removed objects and use a Large Language Model to convert these descriptions into diverse natural-language instructions.
arXiv Detail & Related papers (2024-04-28T15:07:53Z)
- Fill in the ____ (a Diffusion-based Image Inpainting Pipeline) [0.0]
Inpainting is the process of taking an image and generating lost or intentionally occluded portions.
Modern inpainting techniques have shown remarkable ability in generating sensible completions.
This work addresses a critical gap in existing models: the ability to prompt and control exactly what is generated.
arXiv Detail & Related papers (2024-03-24T05:26:55Z)
- BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion [61.90969199199739]
BrushNet is a novel plug-and-play dual-branch model engineered to embed pixel-level masked image features into any pre-trained diffusion model.
Experiments demonstrate BrushNet's superior performance over existing models across seven key metrics, including image quality, mask region preservation, and textual coherence.
arXiv Detail & Related papers (2024-03-11T17:59:31Z)
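A rough sketch of the dual-branch idea in the BrushNet entry above: a small side branch encodes the masked image and mask, then injects a zero-initialized residual into a frozen pre-trained UNet's feature maps. All layer shapes and names here are assumptions, not BrushNet's actual architecture:

```python
import torch
import torch.nn as nn

class MaskedImageBranch(nn.Module):
    """Hypothetical plug-and-play branch: encodes masked latents + mask and
    adds a zero-initialized residual to a frozen UNet feature map, so the
    pre-trained model's behavior is untouched at the start of training."""
    def __init__(self, channels: int = 320):
        super().__init__()
        self.encode = nn.Conv2d(4 + 1, channels, 3, padding=1)
        self.project = nn.Conv2d(channels, channels, 1)
        nn.init.zeros_(self.project.weight)   # start as a no-op
        nn.init.zeros_(self.project.bias)

    def forward(self, unet_feat, masked_latents, mask):
        x = torch.cat([masked_latents, mask], dim=1)
        return unet_feat + self.project(torch.relu(self.encode(x)))

feat = torch.randn(1, 320, 64, 64)      # one frozen UNet feature map
masked = torch.randn(1, 4, 64, 64)      # VAE latents of the masked image
mask = torch.zeros(1, 1, 64, 64); mask[..., 16:48, 16:48] = 1.0
out = MaskedImageBranch()(feat, masked, mask)
```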
- MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning [59.988458964353754]
Text-to-image diffusion models allow seamless generation of personalized images from scant reference photos.
Existing approaches perturb user images in an imperceptible way to render them "unlearnable" to malicious uses.
We propose MetaCloak, which solves the bi-level poisoning problem with a meta-learning framework.
arXiv Detail & Related papers (2023-11-22T03:31:31Z)
- SuperInpaint: Learning Detail-Enhanced Attentional Implicit Representation for Super-resolutional Image Inpainting [26.309834304515544]
We introduce a challenging image restoration task, referred to as SuperInpaint.
This task aims to reconstruct missing regions in low-resolution images and generate completed images with arbitrarily higher resolutions.
We propose a detail-enhanced attentional implicit representation that achieves SuperInpaint with a single model.
arXiv Detail & Related papers (2023-07-26T20:28:58Z)
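The "implicit representation" in the SuperInpaint entry above typically means a coordinate-based decoder: an MLP maps (local feature, continuous coordinate) pairs to RGB, so the completed image can be rendered at any output resolution. The sketch below is a generic illustration under that assumption, not SuperInpaint's actual network:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImplicitDecoderSketch(nn.Module):
    def __init__(self, feat_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim + 2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3))

    def forward(self, feat_map, out_h, out_w):
        b = feat_map.shape[0]
        ys = torch.linspace(-1, 1, out_h)
        xs = torch.linspace(-1, 1, out_w)
        gy, gx = torch.meshgrid(ys, xs, indexing="ij")
        grid = torch.stack([gx, gy], -1).unsqueeze(0).expand(b, -1, -1, -1)
        # Sample low-res features at each continuous target coordinate ...
        feats = F.grid_sample(feat_map, grid, align_corners=True)
        feats = feats.permute(0, 2, 3, 1)   # b, out_h, out_w, feat_dim
        # ... and decode (feature, coordinate) pairs to RGB.
        return self.mlp(torch.cat([feats, grid], dim=-1))

rgb = ImplicitDecoderSketch()(torch.randn(1, 64, 32, 32), 128, 128)  # any size
```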
- Inst-Inpaint: Instructing to Remove Objects with Diffusion Models [18.30057229657246]
In this work, we are interested in an image inpainting algorithm that estimates which object should be removed based on natural language input and removes it simultaneously.
We present a novel inpainting framework, Inst-Inpaint, that can remove objects from images based on the instructions given as text prompts.
arXiv Detail & Related papers (2023-04-06T17:29:50Z)
- High-Resolution Image Inpainting with Iterative Confidence Feedback and Guided Upsampling [122.06593036862611]
Existing image inpainting methods often produce artifacts when dealing with large holes in real applications.
We propose an iterative inpainting method with a feedback mechanism.
Experiments show that our method significantly outperforms existing methods in both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2020-05-24T13:23:45Z)
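The feedback mechanism in the entry above can be pictured as the loop below; `inpaint_fn` and `confidence_fn` are hypothetical stand-ins for the paper's generator and confidence prediction, and the acceptance threshold is an assumption:

```python
import numpy as np

def iterative_inpaint(image, mask, inpaint_fn, confidence_fn,
                      max_iters=4, thresh=0.5):
    # Fill the hole, keep only pixels the model is confident about, and feed
    # the shrunken hole back for another pass.
    for _ in range(max_iters):
        if not mask.any():
            break
        filled = inpaint_fn(image, mask)
        conf = confidence_fn(filled, mask)            # per-pixel, in [0, 1]
        accept = mask & (conf >= thresh)
        image = np.where(accept[..., None], filled, image)
        mask = mask & ~accept                         # the hole shrinks
    return image

# Toy run with random stand-ins for the learned components.
img = np.random.rand(64, 64, 3)
hole = np.zeros((64, 64), dtype=bool); hole[20:40, 20:40] = True
out = iterative_inpaint(img, hole,
                        lambda im, m: np.random.rand(*im.shape),
                        lambda im, m: np.random.rand(*m.shape))
```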