Shape-guided Object Inpainting
- URL: http://arxiv.org/abs/2204.07845v1
- Date: Sat, 16 Apr 2022 17:19:11 GMT
- Title: Shape-guided Object Inpainting
- Authors: Yu Zeng, Zhe Lin, Vishal M. Patel
- Abstract summary: This work studies a new image inpainting task, i.e. shape-guided object inpainting.
We propose a new data preparation method and a novel Contextual Object Generator (CogNet) for the object inpainting task.
Experiments demonstrate that the proposed method can generate realistic objects that fit the context in terms of both visual appearance and semantic meanings.
- Score: 84.18768707298105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous works on image inpainting mainly focus on inpainting background or
partially missing objects, while the problem of inpainting an entire missing
object remains unexplored. This work studies a new image inpainting task, i.e.
shape-guided object inpainting. Given an incomplete input image, the goal is to
fill in the hole by generating an object based on the context and implicit
guidance given by the hole shape. Since previous methods for image inpainting
are mainly designed for background inpainting, they are not suitable for this
task. Therefore, we propose a new data preparation method and a novel
Contextual Object Generator (CogNet) for the object inpainting task. On the
data side, we incorporate object priors into training data by using object
instances as holes. The CogNet has a two-stream architecture that combines the
standard bottom-up image completion process with a top-down object generation
process. A predictive class embedding module bridges the two streams by
predicting the class of the missing object from the bottom-up features, from
which a semantic object map is derived as the input of the top-down stream.
Experiments demonstrate that the proposed method can generate realistic objects
that fit the context in terms of both visual appearance and semantic meanings.
Code can be found at the project page: https://zengxianyu.github.io/objpaint
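The abstract describes a two-stream design bridged by a predictive class embedding module: a bottom-up completion stream reads the masked image, the class of the missing object is predicted from its features, and a semantic object map derived from that prediction drives a top-down object-generation stream. Below is a minimal PyTorch sketch of that idea, written only to make the data flow concrete; all module names, layer sizes, and the pooling/broadcast details (PredictiveClassEmbedding, CogNetSketch, feat_dim, embed_dim) are illustrative assumptions, not the paper's actual architecture (the official code is at the project page above).

```python
# Hypothetical sketch of the CogNet two-stream idea described in the abstract.
# All module names and layer choices are assumptions for illustration only.
import torch
import torch.nn as nn


class PredictiveClassEmbedding(nn.Module):
    """Predicts the class of the missing object from bottom-up features and
    turns the prediction into a semantic object map for the top-down stream."""

    def __init__(self, feat_dim: int, num_classes: int, embed_dim: int):
        super().__init__()
        self.classifier = nn.Linear(feat_dim, num_classes)
        self.class_embed = nn.Embedding(num_classes, embed_dim)

    def forward(self, bottom_up_feat: torch.Tensor, hole_mask: torch.Tensor):
        # Pool bottom-up features inside the hole to predict the object class.
        pooled = (bottom_up_feat * hole_mask).sum(dim=(2, 3)) / \
            hole_mask.sum(dim=(2, 3)).clamp(min=1.0)
        class_logits = self.classifier(pooled)            # (B, num_classes)
        class_id = class_logits.argmax(dim=1)             # (B,)
        emb = self.class_embed(class_id)                  # (B, embed_dim)
        # Broadcast the class embedding over the hole to form a semantic object map.
        semantic_map = emb[:, :, None, None] * hole_mask  # (B, embed_dim, H, W)
        return class_logits, semantic_map


class CogNetSketch(nn.Module):
    """Two-stream generator: bottom-up completion + top-down object generation."""

    def __init__(self, num_classes: int = 80, feat_dim: int = 64, embed_dim: int = 32):
        super().__init__()
        # Bottom-up stream: standard image-completion encoder over image + mask.
        self.bottom_up = nn.Sequential(
            nn.Conv2d(4, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
        )
        self.bridge = PredictiveClassEmbedding(feat_dim, num_classes, embed_dim)
        # Top-down stream: generates object features from the semantic object map.
        self.top_down = nn.Sequential(
            nn.Conv2d(embed_dim, feat_dim, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Conv2d(feat_dim * 2, 3, 3, padding=1)

    def forward(self, image: torch.Tensor, hole_mask: torch.Tensor):
        # hole_mask is 1 inside the missing-object region, 0 elsewhere.
        masked = image * (1.0 - hole_mask)
        bu_feat = self.bottom_up(torch.cat([masked, hole_mask], dim=1))
        class_logits, semantic_map = self.bridge(bu_feat, hole_mask)
        td_feat = self.top_down(semantic_map)
        out = torch.tanh(self.decoder(torch.cat([bu_feat, td_feat], dim=1)))
        # Keep the known background; only the hole region is filled in.
        return image * (1.0 - hole_mask) + out * hole_mask, class_logits
```

Under the paper's data-preparation scheme, the hole mask at training time would come from a ground-truth object instance mask rather than a random shape, which is how the hole shape carries the object prior that guides generation; the sketch above simply takes the mask as input and leaves that choice to the data pipeline.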
Related papers
- Improving Text-guided Object Inpainting with Semantic Pre-inpainting [95.17396565347936]
We decompose the typical single-stage object inpainting into two cascaded processes: semantic pre-inpainting and high-fidelity object generation.
To achieve this, we cascade a Transformer-based semantic inpainter and an object inpainting diffusion model, leading to a novel CAscaded Transformer-Diffusion framework.
arXiv Detail & Related papers (2024-09-12T17:55:37Z) - In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation [50.79940712523551]
We present lazy visual grounding, a two-stage approach of unsupervised object mask discovery followed by object grounding.
Our model requires no additional training yet shows great performance on five public datasets.
arXiv Detail & Related papers (2024-08-09T09:28:35Z) - Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model [81.96954332787655]
We introduce Diffree, a Text-to-Image (T2I) model that facilitates text-guided object addition with only text control.
In experiments, Diffree adds new objects with a high success rate while maintaining background consistency, spatial appropriateness, and object relevance and quality.
arXiv Detail & Related papers (2024-07-24T03:58:58Z) - FaithFill: Faithful Inpainting for Object Completion Using a Single Reference Image [6.742568054626032]
FaithFill is a diffusion-based inpainting approach for realistic generation of missing object parts.
We demonstrate that FaithFill produces faithful generation of the object's missing parts, together with background/scene preservation, from a single reference image.
arXiv Detail & Related papers (2024-06-12T04:45:33Z) - Salient Object-Aware Background Generation using Text-Guided Diffusion Models [4.747826159446815]
We present a model for adapting inpainting diffusion models to the salient object outpainting task using Stable Diffusion and ControlNet architectures.
Our proposed approach reduces object expansion by 3.6x on average with no degradation in standard visual metrics across multiple datasets.
arXiv Detail & Related papers (2024-04-15T22:13:35Z) - DreamCom: Finetuning Text-guided Inpainting Model for Image Composition [24.411003826961686]
We propose DreamCom by treating image composition as text-guided image inpainting customized for a certain object.
Specifically, we finetune a pretrained text-guided image inpainting model on a few reference images containing the same object.
In practice, the inserted object may be adversely affected by the background, so we propose masked attention mechanisms to avoid negative background interference.
arXiv Detail & Related papers (2023-09-27T09:23:50Z) - Inst-Inpaint: Instructing to Remove Objects with Diffusion Models [18.30057229657246]
In this work, we are interested in an image inpainting algorithm that estimates which object should be removed based on natural language input and removes it simultaneously.
We present a novel inpainting framework, Inst-Inpaint, that can remove objects from images based on the instructions given as text prompts.
arXiv Detail & Related papers (2023-04-06T17:29:50Z) - Context-Aware Image Inpainting with Learned Semantic Priors [100.99543516733341]
We introduce pretext tasks that are semantically meaningful for estimating the missing contents.
We propose a context-aware image inpainting model, which adaptively integrates global semantics and local features.
arXiv Detail & Related papers (2021-06-14T08:09:43Z) - Holistic 3D Scene Understanding from a Single Image with Implicit Representation [112.40630836979273]
We present a new pipeline for holistic 3D scene understanding from a single image.
We propose an image-based local structured implicit network to improve the object shape estimation.
We also refine 3D object pose and scene layout via a novel implicit scene graph neural network.
arXiv Detail & Related papers (2021-03-11T02:52:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.