Instance-Aware Image Completion
- URL: http://arxiv.org/abs/2210.12350v3
- Date: Fri, 26 May 2023 09:06:06 GMT
- Title: Instance-Aware Image Completion
- Authors: Jinoh Cho, Minguk Kang, Vibhav Vineet and Jaesik Park
- Abstract summary: We propose a novel image completion model, dubbed ImComplete, that hallucinates the missing instance that harmonizes well with - and thus preserves - the original context.
ImComplete first adopts a transformer architecture that considers the visible instances and the location of the missing region.
Then, ImComplete completes the semantic segmentation masks within the missing region, providing pixel-level semantic and structural guidance.
- Score: 15.64981939298373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image completion is a task that aims to fill in the missing region of a
masked image with plausible contents. However, existing image completion
methods tend to fill in the missing region with the surrounding texture instead
of hallucinating a visual instance that is suitable in accordance with the
context of the scene. In this work, we propose a novel image completion model,
dubbed ImComplete, that hallucinates the missing instance that harmonizes well
with - and thus preserves - the original context. ImComplete first adopts a
transformer architecture that considers the visible instances and the location
of the missing region. Then, ImComplete completes the semantic segmentation
masks within the missing region, providing pixel-level semantic and structural
guidance. Finally, the image synthesis blocks generate photo-realistic content.
We perform a comprehensive evaluation of the results in terms of visual quality
(LPIPS and FID) and contextual preservation scores (CLIPscore and object
detection accuracy) with COCO-panoptic and Visual Genome datasets. Experimental
results show the superiority of ImComplete on various natural images.
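The three stages named in the abstract (instance prediction, segmentation completion, image synthesis) can be read as a simple data flow. The following is a minimal sketch of that flow only, not the authors' implementation: the function names, the argument order, the convention that the mask is 1 inside the missing region, and the final compositing step are all assumptions made for illustration, and the three stage callables are hypothetical placeholders for the learned networks.

```python
import numpy as np


def mask_image(image, mask):
    """Zero out the missing region (assumed convention: mask == 1 inside the hole)."""
    return image * (1.0 - mask[..., None])


def composite(masked_image, generated, mask):
    """Keep visible pixels as they are and take only the hole from the generated image."""
    m = mask[..., None]
    return m * generated + (1.0 - m) * masked_image


def complete(masked_image, mask, predict_instance, complete_segmentation, synthesize):
    """Chain the three stages described in the abstract.

    All three callables are hypothetical stand-ins for learned models.
    """
    # Stage 1: decide which instance is missing from the visible context and hole location.
    instance_class = predict_instance(masked_image, mask)
    # Stage 2: fill in the semantic segmentation map inside the hole.
    seg_map = complete_segmentation(masked_image, mask, instance_class)
    # Stage 3: synthesize pixels guided by the completed segmentation map.
    generated = synthesize(masked_image, mask, seg_map)
    return composite(masked_image, generated, mask)


# Toy usage with stub stages, just to exercise the data flow.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
mask = np.zeros((8, 8))
mask[2:5, 2:5] = 1.0
masked = mask_image(image, mask)

result = complete(
    masked,
    mask,
    predict_instance=lambda im, m: 1,
    complete_segmentation=lambda im, m, c: (m * c).astype(int),
    synthesize=lambda im, m, s: np.full_like(im, 0.5),
)
```

With these stubs, pixels outside the hole are returned unchanged and pixels inside it come from the stub synthesizer, which is the only behavior the sketch is meant to show.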
Related papers
- Structure-preserving Image Translation for Depth Estimation in Colonoscopy Video [1.0485739694839669]
We propose a pipeline of structure-preserving synthetic-to-real (sim2real) image translation.
This allows us to generate large quantities of realistic-looking synthetic images for supervised depth estimation.
We also propose a dataset of hand-picked sequences from clinical colonoscopies to improve the image translation process.
arXiv Detail & Related papers (2024-08-19T17:02:16Z)
- CompNVS: Novel View Synthesis with Scene Completion [83.19663671794596]
We propose a generative pipeline performing on a sparse grid-based neural scene representation to complete unobserved scene parts.
We process encoded image features in 3D space with a geometry completion network and a subsequent texture inpainting network to extrapolate the missing area.
Photorealistic image sequences can be finally obtained via consistency-relevant differentiable rendering.
arXiv Detail & Related papers (2022-07-23T09:03:13Z)
- Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid [102.24539566851809]
Restoring reasonable and realistic content for arbitrary missing regions in images is an important yet challenging task.
Recent image inpainting models have made significant progress in generating vivid visual details, but they can still lead to texture blurring or structural distortions.
We propose the Semantic Pyramid Network (SPN) motivated by the idea that learning multi-scale semantic priors can greatly benefit the recovery of locally missing content in images.
arXiv Detail & Related papers (2021-12-08T04:33:33Z)
- Context-Aware Image Inpainting with Learned Semantic Priors [100.99543516733341]
We introduce pretext tasks that are semantically meaningful to estimating the missing contents.
We propose a context-aware image inpainting model, which adaptively integrates global semantics and local features.
arXiv Detail & Related papers (2021-06-14T08:09:43Z)
- Image Inpainting Guided by Coherence Priors of Semantics and Textures [62.92586889409379]
We introduce coherence priors between the semantics and textures which make it possible to concentrate on completing separate textures in a semantic-wise manner.
We also propose two coherence losses to constrain the consistency between the semantics and the inpainted image in terms of the overall structure and detailed textures.
arXiv Detail & Related papers (2020-12-15T02:59:37Z)
- Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed Scenes [54.836331922449666]
We propose a Semantic Guidance and Evaluation Network (SGE-Net) to update the structural priors and the inpainted image.
It uses a semantic segmentation map as guidance at each scale of inpainting, under which location-dependent inferences are re-evaluated.
Experiments on real-world images of mixed scenes demonstrated the superiority of our proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-15T17:49:20Z)
- Grounded and Controllable Image Completion by Incorporating Lexical Semantics [111.47374576372813]
Lexical Semantic Image Completion (LSIC) may have potential applications in art, design, and heritage conservation.
We advocate generating results faithful to both visual and lexical semantic context.
One major challenge for LSIC comes from modeling and aligning the structure of visual-semantic context.
arXiv Detail & Related papers (2020-02-29T16:54:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.