Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid
- URL: http://arxiv.org/abs/2112.04107v2
- Date: Mon, 5 Jun 2023 10:07:34 GMT
- Title: Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid
- Authors: Wendong Zhang, Yunbo Wang, Bingbing Ni, Xiaokang Yang
- Abstract summary: Restoring reasonable and realistic content for arbitrary missing regions in images is an important yet challenging task.
Recent image inpainting models have made significant progress in generating vivid visual details, but they can still lead to texture blurring or structural distortions.
We propose the Semantic Pyramid Network (SPN) motivated by the idea that learning multi-scale semantic priors can greatly benefit the recovery of locally missing content in images.
- Score: 102.24539566851809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Restoring reasonable and realistic content for arbitrary missing regions in
images is an important yet challenging task. Although recent image inpainting
models have made significant progress in generating vivid visual details, they
can still lead to texture blurring or structural distortions due to contextual
ambiguity when dealing with more complex scenes. To address this issue, we
propose the Semantic Pyramid Network (SPN) motivated by the idea that learning
multi-scale semantic priors from specific pretext tasks can greatly benefit the
recovery of locally missing content in images. SPN consists of two components.
First, it distills semantic priors from a pretext model into a multi-scale
feature pyramid, achieving a consistent understanding of the global context and
local structures. Within the prior learner, we present an optional module for
variational inference to realize probabilistic image inpainting driven by
various learned priors. The second component of SPN is a fully context-aware
image generator, which adaptively and progressively refines low-level visual
representations at multiple scales with the (stochastic) prior pyramid. We
train the prior learner and the image generator as a unified model without any
post-processing. Our approach achieves the state of the art on multiple
datasets, including Places2, Paris StreetView, CelebA, and CelebA-HQ, under
both deterministic and probabilistic inpainting setups.
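To make the two components concrete, here is a minimal PyTorch sketch of the idea, assuming a SPADE-style feature modulation; the module names, channel widths, and modulation scheme are illustrative assumptions, not the authors' released code (the optional variational module is omitted):

```python
# Minimal sketch of a semantic-pyramid inpainting model (hypothetical,
# not the authors' implementation). A prior learner produces multi-scale
# semantic features; the generator refines its activations with them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PriorLearner(nn.Module):
    """Distills a multi-scale semantic pyramid from the masked input."""
    def __init__(self, in_ch=4, widths=(64, 128, 256)):
        super().__init__()
        self.stages = nn.ModuleList()
        ch = in_ch
        for w in widths:
            self.stages.append(nn.Sequential(
                nn.Conv2d(ch, w, 3, stride=2, padding=1),
                nn.ReLU(inplace=True)))
            ch = w

    def forward(self, x):
        pyramid = []
        for stage in self.stages:
            x = stage(x)
            pyramid.append(x)
        return pyramid[::-1]  # coarsest prior first

class ModulatedBlock(nn.Module):
    """Refines generator features with one pyramid level via a
    SPADE-like modulation (our reading of 'adaptive refinement')."""
    def __init__(self, feat_ch, prior_ch):
        super().__init__()
        self.norm = nn.InstanceNorm2d(feat_ch, affine=False)
        self.gamma = nn.Conv2d(prior_ch, feat_ch, 3, padding=1)
        self.beta = nn.Conv2d(prior_ch, feat_ch, 3, padding=1)

    def forward(self, feat, prior):
        prior = F.interpolate(prior, size=feat.shape[-2:], mode='nearest')
        return self.norm(feat) * (1 + self.gamma(prior)) + self.beta(prior)

# Usage: masked RGB + mask -> prior pyramid -> modulate generator features.
learner = PriorLearner()
x = torch.randn(1, 4, 64, 64)              # masked image concat mask
pyramid = learner(x)                       # [256@8x8, 128@16x16, 64@32x32]
block = ModulatedBlock(feat_ch=32, prior_ch=256)
feat = torch.randn(1, 32, 8, 8)            # coarsest generator features
print(block(feat, pyramid[0]).shape)       # torch.Size([1, 32, 8, 8])
```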
Related papers
- Large Spatial Model: End-to-end Unposed Images to Semantic 3D [79.94479633598102]
Large Spatial Model (LSM) processes unposed RGB images directly into semantic radiance fields.
LSM simultaneously estimates geometry, appearance, and semantics in a single feed-forward operation.
It can generate versatile label maps by interacting with language at novel viewpoints.
arXiv Detail & Related papers (2024-10-24T17:54:42Z)
- Unbiased Multi-Modality Guidance for Image Inpainting [27.286351511243502]
We develop an end-to-end multi-modality guided transformer network for image inpainting.
Within each transformer block, the proposed spatial-aware attention module can learn the multi-modal structural features efficiently.
Our method enriches semantically consistent context in an image based on discriminative information from multiple modalities.
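One plausible way to wire such multi-modal guidance, sketched in PyTorch (the cross-attention fusion and all names here are assumptions, not the paper's code): RGB features query auxiliary-modality features such as edges or segmentation.

```python
# Hypothetical sketch: RGB features attend to an auxiliary modality
# (e.g. edge or segmentation features) via cross-attention.
import torch
import torch.nn as nn

class MultiModalGuidance(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rgb_feat, aux_feat):
        # rgb_feat, aux_feat: (B, C, H, W) -> token sequences (B, H*W, C)
        B, C, H, W = rgb_feat.shape
        q = rgb_feat.flatten(2).transpose(1, 2)
        kv = aux_feat.flatten(2).transpose(1, 2)
        fused, _ = self.attn(q, kv, kv)
        fused = self.norm(q + fused)            # residual fusion
        return fused.transpose(1, 2).reshape(B, C, H, W)

guide = MultiModalGuidance()
rgb = torch.randn(2, 64, 16, 16)
edges = torch.randn(2, 64, 16, 16)
print(guide(rgb, edges).shape)  # torch.Size([2, 64, 16, 16])
```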
arXiv Detail & Related papers (2022-08-25T03:13:43Z)
- Boosting Image Outpainting with Semantic Layout Prediction [18.819765707811904]
We train a GAN to extend regions in the semantic segmentation domain instead of the image domain.
Another GAN model is trained to synthesize real images based on the extended semantic layouts.
Our approach can handle semantic clues more easily and hence works better in complex scenarios.
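A minimal sketch of this two-stage wiring (both networks below are stand-in placeholders rather than the paper's GANs): first extend the semantic layout, then condition an image generator on it.

```python
# Hypothetical two-stage outpainting pipeline: a layout GAN extends a
# semantic segmentation map, then an image GAN renders pixels from it.
import torch
import torch.nn as nn

def conv_net(in_ch, out_ch):
    # Stand-in generator; real models would be full GANs.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(32, out_ch, 3, padding=1))

num_classes = 20
layout_gan = conv_net(num_classes + 1, num_classes)  # +1 for the mask
image_gan = conv_net(num_classes + 3, 3)             # layout + partial RGB

seg = torch.randn(1, num_classes, 64, 64)   # layout of the known region
mask = torch.zeros(1, 1, 64, 64)
mask[..., :, 32:] = 1                        # right half to be outpainted
extended = layout_gan(torch.cat([seg, mask], dim=1)).softmax(dim=1)
partial_rgb = torch.randn(1, 3, 64, 64) * (1 - mask)
out_rgb = image_gan(torch.cat([extended, partial_rgb], dim=1))
print(out_rgb.shape)  # torch.Size([1, 3, 64, 64])
```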
arXiv Detail & Related papers (2021-10-18T13:09:31Z)
- Harnessing the Conditioning Sensorium for Improved Image Translation [2.9631016562930546]
Multi-modal domain translation typically refers to synthesizing a novel image that inherits certain localized attributes from a 'content' image.
We propose a new approach to learning disentangled 'content' and 'style' representations from scratch.
We define 'content' based on conditioning information extracted by off-the-shelf pre-trained models.
We then train our style extractor and image decoder with an easy-to-optimize set of reconstruction objectives.
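A minimal sketch of this recipe (the backbone choice, code sizes, and loss below are illustrative assumptions; a real setup would load pretrained weights for the conditioning network):

```python
# Hypothetical sketch: 'content' comes from a frozen off-the-shelf
# backbone, while a small style encoder and a decoder are trained with
# a reconstruction loss.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

backbone = models.resnet18(weights=None)   # stand-in; load real weights
content_net = nn.Sequential(*list(backbone.children())[:6])
for p in content_net.parameters():         # conditioning net stays frozen
    p.requires_grad = False

style_net = nn.Sequential(                 # learned global style code
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 64))
decoder = nn.Sequential(
    nn.Conv2d(128 + 64, 64, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 3, 3, padding=1))

img = torch.randn(1, 3, 64, 64)
content = content_net(img)                          # (1, 128, 8, 8)
style = style_net(img)                              # (1, 64)
style_map = style[:, :, None, None].expand(-1, -1, *content.shape[-2:])
recon = decoder(torch.cat([content, style_map], dim=1))
loss = F.mse_loss(F.interpolate(recon, size=img.shape[-2:]), img)
loss.backward()
```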
arXiv Detail & Related papers (2021-10-13T02:07:43Z)
- Context-Aware Image Inpainting with Learned Semantic Priors [100.99543516733341]
We introduce pretext tasks that are semantically meaningful for estimating the missing contents.
We propose a context-aware image inpainting model, which adaptively integrates global semantics and local features.
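A minimal sketch of such pretext-driven prior learning (the pretext network and loss below are illustrative assumptions): the inpainting encoder, which sees only the masked image, is trained to match features of a frozen pretext model that sees the full image.

```python
# Hypothetical sketch of prior distillation from a pretext model: the
# inpainting encoder's features are regressed onto features of a frozen
# pretext network (e.g. one trained for segmentation) on the full image.
import torch
import torch.nn as nn
import torch.nn.functional as F

pretext = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1),
                        nn.ReLU(inplace=True))   # frozen, pretrained
for p in pretext.parameters():
    p.requires_grad = False

student = nn.Sequential(nn.Conv2d(4, 64, 3, stride=2, padding=1),
                        nn.ReLU(inplace=True))   # sees masked input

img = torch.randn(2, 3, 64, 64)
mask = (torch.rand(2, 1, 64, 64) > 0.4).float()
masked = torch.cat([img * mask, mask], dim=1)
distill_loss = F.mse_loss(student(masked), pretext(img))
distill_loss.backward()
```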
arXiv Detail & Related papers (2021-06-14T08:09:43Z)
- Bridging Composite and Real: Towards End-to-end Deep Image Matting [88.79857806542006]
We study the roles of semantics and details for image matting.
We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders.
Comprehensive empirical studies have demonstrated that GFM outperforms state-of-the-art methods.
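A minimal sketch of the shared-encoder, dual-decoder layout (layer sizes and the fusion rule are our assumptions, not the GFM release):

```python
# Hypothetical sketch of a shared-encoder / dual-decoder matting net:
# a 'glance' decoder predicts coarse semantics, a 'focus' decoder
# predicts detail in the transition region; the outputs are fused.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualDecoderMatting(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.glance = nn.Conv2d(64, 3, 3, padding=1)  # fg / bg / transition
        self.focus = nn.Conv2d(64, 1, 3, padding=1)   # alpha in transition

    def forward(self, x):
        h = self.encoder(x)
        tri = F.softmax(self.glance(h), dim=1)        # coarse trimap
        alpha = torch.sigmoid(self.focus(h))          # fine alpha matte
        # Fuse: foreground probability where confident, detailed alpha
        # inside the predicted transition region.
        fused = tri[:, 0:1] + tri[:, 2:3] * alpha
        return F.interpolate(fused, scale_factor=4, mode='bilinear',
                             align_corners=False)

net = DualDecoderMatting()
print(net(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 1, 64, 64])
```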
arXiv Detail & Related papers (2020-10-30T10:57:13Z)
- Free-Form Image Inpainting via Contrastive Attention Network [64.05544199212831]
In image inpainting tasks, masks of arbitrary shape can appear anywhere in an image, forming complex patterns.
It is difficult for encoders to capture powerful representations under such complex conditions.
We propose a self-supervised Siamese inference network to improve the robustness and generalization.
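A minimal sketch of such Siamese self-supervised training (the masking scheme and InfoNCE-style loss are illustrative assumptions): two differently masked views of the same image are pulled together in representation space.

```python
# Hypothetical sketch: an inpainting encoder is trained so that two
# differently masked views of one image map to similar representations.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())

def random_mask(x):
    m = (torch.rand(x.shape[0], 1, *x.shape[2:]) > 0.3).float()
    return x * m

def info_nce(z1, z2, tau=0.1):
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                 # (B, B) similarity matrix
    labels = torch.arange(z1.size(0))          # positives on the diagonal
    return F.cross_entropy(logits, labels)

imgs = torch.randn(8, 3, 64, 64)
loss = info_nce(encoder(random_mask(imgs)), encoder(random_mask(imgs)))
loss.backward()
```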
arXiv Detail & Related papers (2020-10-29T14:46:05Z)
- Structural-analogy from a Single Image Pair [118.61885732829117]
In this paper, we explore the capabilities of neural networks to understand image structure given only a single pair of images, A and B.
We generate an image that keeps the appearance and style of B, but has a structural arrangement that corresponds to A.
Our method can be used to generate high-quality imagery in other conditional generation tasks using only images A and B.
arXiv Detail & Related papers (2020-04-05T14:51:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.