Delving Globally into Texture and Structure for Image Inpainting
- URL: http://arxiv.org/abs/2209.08217v1
- Date: Sat, 17 Sep 2022 02:19:26 GMT
- Title: Delving Globally into Texture and Structure for Image Inpainting
- Authors: Haipeng Liu, Yang Wang, Meng Wang, Yong Rui
- Abstract summary: Image inpainting has achieved remarkable progress and inspired abundant methods, where the critical bottleneck is how to fill the masked regions with semantically consistent high-frequency structure and low-frequency texture information.
In this paper, we delve globally into texture and structure information to better capture the semantics for image inpainting.
Our model is orthogonal to popular approaches such as Convolutional Neural Networks (CNNs), attention mechanisms, and Transformer models, from the perspective of texture and structure information for image inpainting.
- Score: 20.954875933730808
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image inpainting has achieved remarkable progress and inspired abundant
methods, where the critical bottleneck is how to fill the masked regions with
semantically consistent high-frequency structure and low-frequency texture
information. Deep models are powerful at capturing such information, yet they
are constrained to local spatial regions. In this paper, we delve globally
into texture and structure information to better capture the semantics for
image inpainting. Unlike existing methods confined to independent local
patches, the texture information of each patch is reconstructed from all other
patches across the whole image, to match the coarsely filled information,
especially the structure information over the masked regions. Unlike current
decoder-only transformers operating at the pixel level for image inpainting,
our model adopts a transformer pipeline with both an encoder and a decoder. On
one hand, the encoder captures the texture semantic correlations of all
patches across the image via a self-attention module. On the other hand, an
adaptive patch vocabulary is dynamically established in the decoder for the
filled patches over the masked regions. Building on these, a structure-texture
matching attention module anchored on the known regions combines the best of
both worlds for progressive inpainting via a probabilistic diffusion process.
Our model is orthogonal to popular approaches such as Convolutional Neural
Networks (CNNs), attention mechanisms, and Transformer models, from the
perspective of texture and structure information for image inpainting.
Extensive experiments on standard benchmarks validate its superiority. Our
code is available at https://github.com/htyjers/DGTS-Inpainting.
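For concreteness, here is a minimal, hypothetical sketch of the structure-texture matching attention described above: masked-region patches (queries, from the coarsely filled structure features) attend only to known-region patches (keys/values, from the texture features), so masked patches are reconstructed from patches across the whole image. All module and variable names are illustrative assumptions, not the authors' implementation; see the repository linked above for the official code.

```python
# Illustrative sketch only -- names are assumptions, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructureTextureMatchingAttention(nn.Module):
    """Masked-region patches borrow texture from known-region patches."""
    def __init__(self, dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)  # queries: coarsely filled structure
        self.to_k = nn.Linear(dim, dim)  # keys: encoder texture features
        self.to_v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, structure, texture, known_mask):
        # structure: (B, N, D) coarse features for all N patches
        # texture:   (B, N, D) texture features for all N patches
        # known_mask: (B, N) bool, True where a patch is unmasked
        q = self.to_q(structure)
        k, v = self.to_k(texture), self.to_v(texture)
        attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, N, N)
        # Anchor attention on the known regions: only unmasked patches
        # may serve as keys for reconstruction.
        attn = attn.masked_fill(~known_mask[:, None, :], float("-inf"))
        return F.softmax(attn, dim=-1) @ v

# Toy usage: 64 patches of dimension 128, the first half known.
B, N, D = 2, 64, 128
module = StructureTextureMatchingAttention(D)
known = torch.zeros(B, N, dtype=torch.bool)
known[:, : N // 2] = True
out = module(torch.randn(B, N, D), torch.randn(B, N, D), known)
print(out.shape)  # torch.Size([2, 64, 128])
```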
Related papers
- Dense Feature Interaction Network for Image Inpainting Localization [28.028361409524457]
Inpainting can be used to conceal or alter image content in malicious image manipulation.
Existing methods mostly rely on a basic encoder-decoder structure, which often results in a high number of false positives.
In this paper, we describe a new method for inpainting detection based on a Dense Feature Interaction Network (DeFI-Net).
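The summary does not detail the architecture; as a hedged illustration of what "dense feature interaction" could mean, the sketch below lets every pyramid level see resized features from all other levels before predicting a per-pixel localization mask. This is an assumption, not DeFI-Net's actual design.

```python
# Assumed interpretation of dense multi-level interaction, not DeFI-Net.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseFeatureInteraction(nn.Module):
    def __init__(self, channels=(64, 128, 256)):
        super().__init__()
        total = sum(channels)
        # One fusion conv per level, fed by features from every level.
        self.fuse = nn.ModuleList(nn.Conv2d(total, c, 1) for c in channels)
        self.head = nn.Conv2d(channels[0], 1, 1)  # per-pixel tamper logit

    def forward(self, feats):
        # feats: list of (B, C_i, H_i, W_i) pyramid features, fine to coarse
        fused = []
        for i, f in enumerate(feats):
            size = f.shape[-2:]
            # Resize every level to this level's resolution, then fuse.
            allf = torch.cat(
                [F.interpolate(g, size=size, mode="bilinear",
                               align_corners=False) for g in feats], dim=1)
            fused.append(self.fuse[i](allf))
        # Predict the inpainted-region mask from the finest fused level.
        return torch.sigmoid(self.head(fused[0]))

feats = [torch.randn(1, 64, 64, 64), torch.randn(1, 128, 32, 32),
         torch.randn(1, 256, 16, 16)]
mask = DenseFeatureInteraction()(feats)
print(mask.shape)  # torch.Size([1, 1, 64, 64])
```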
arXiv Detail & Related papers (2024-08-05T02:35:13Z)
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown significant performance in natural language processing.
In this paper, we design a novel attention mechanism whose complexity is linearly related to the resolution, derived via a Taylor expansion; based on this attention, we design a network called $T$-former for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
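As a hedged illustration of attention linearized by a first-order Taylor expansion of the softmax kernel (exp(q·k) ≈ 1 + q·k), the sketch below aggregates key/value statistics once so the cost grows linearly with the number of tokens. The normalization choice and single-head layout are assumptions, not T-former's exact formulation.

```python
# First-order Taylor approximation of softmax attention -- an assumed
# stand-in for T-former's linear attention, not its exact formulation.
import torch
import torch.nn.functional as F

def taylor_linear_attention(q, k, v):
    # q, k, v: (B, N, D)
    # L2-normalize so 1 + q.k >= 0, keeping the weights a valid distribution.
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    n = k.shape[1]
    kv = k.transpose(-2, -1) @ v                # (B, D, D), computed once
    num = v.sum(dim=1, keepdim=True) + q @ kv   # sum_j v_j + q.(sum_j k_j v_j^T)
    den = n + q @ k.sum(dim=1).unsqueeze(-1)    # sum_j (1 + q.k_j)
    return num / den                            # (B, N, D)

q, k, v = (torch.randn(2, 4096, 64) for _ in range(3))
out = taylor_linear_attention(q, k, v)
print(out.shape)  # torch.Size([2, 4096, 64]); no 4096x4096 attention matrix
```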
arXiv Detail & Related papers (2023-05-12T04:10:42Z)
- Semantic Image Translation for Repairing the Texture Defects of Building Models [16.764719266178655]
We introduce a novel approach for synthesizing facade texture images that authentically reflect the architectural style from a structured label map.
Our proposed method is also capable of synthesizing texture images with specific styles for facades that lack pre-existing textures.
arXiv Detail & Related papers (2023-03-30T14:38:53Z)
- Reference-Guided Texture and Structure Inference for Image Inpainting [25.775006005766222]
We build a benchmark dataset containing 10K pairs of input and reference images for reference-guided inpainting.
We adopt an encoder-decoder structure to infer the texture and structure features of the input image.
A feature alignment module is further designed to refine these features of the input image with the guidance of a reference image.
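One plausible realization of such a feature alignment module is cross-attention from input-image features to reference-image features with a residual update, sketched below; the paper's actual module may differ.

```python
# Assumed cross-attention realization of reference-guided alignment.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReferenceAlignment(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)

    def forward(self, x_feat, ref_feat):
        # x_feat: (B, N, D) input-image features (queries)
        # ref_feat: (B, M, D) reference-image features (keys/values)
        q = self.q(x_feat)
        k, v = self.kv(ref_feat).chunk(2, dim=-1)
        attn = F.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return x_feat + attn @ v  # residual refinement guided by the reference

align = ReferenceAlignment(128)
out = align(torch.randn(1, 256, 128), torch.randn(1, 256, 128))
print(out.shape)  # torch.Size([1, 256, 128])
```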
arXiv Detail & Related papers (2022-07-29T06:26:03Z)
- Modeling Image Composition for Complex Scene Generation [77.10533862854706]
We present a method that achieves state-of-the-art results on layout-to-image generation tasks.
After compressing RGB images into patch tokens, we propose the Transformer with Focal Attention (TwFA) for exploring dependencies of object-to-object, object-to-patch and patch-to-patch.
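A hedged sketch of attention restricted by a token connectivity mask, one way to realize focal attention over mixed object and patch tokens: a boolean matrix states which token pairs (object-object, object-patch, patch-patch) may interact. The mask construction below is a toy assumption, not TwFA's actual scheme.

```python
# Toy connectivity mask -- an assumption about focal attention, not TwFA.
import torch
import torch.nn.functional as F

def masked_attention(q, k, v, allow):
    # q, k, v: (B, T, D); allow: (T, T) bool, True where attention is allowed
    attn = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    attn = attn.masked_fill(~allow, float("-inf"))
    return F.softmax(attn, dim=-1) @ v

n_obj, n_patch, dim = 8, 64, 32
T = n_obj + n_patch
allow = torch.zeros(T, T, dtype=torch.bool)
allow[:n_obj, :n_obj] = True  # object-to-object: objects see all objects
allow[n_obj:, :n_obj] = True  # patch-to-object: patches see all objects
allow.fill_diagonal_(True)    # patch-to-patch here is limited to self
q, k, v = (torch.randn(1, T, dim) for _ in range(3))
print(masked_attention(q, k, v, allow).shape)  # torch.Size([1, 72, 32])
```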
arXiv Detail & Related papers (2022-06-02T08:34:25Z)
- Context-Aware Image Inpainting with Learned Semantic Priors [100.99543516733341]
We introduce pretext tasks that are semantically meaningful for estimating the missing contents.
We propose a context-aware image inpainting model, which adaptively integrates global semantics and local features.
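As a hedged illustration of adaptively integrating global semantics with local features, the sketch below blends a pooled semantic vector into a local feature map through a learned per-pixel gate; the gating form is an assumption, not the paper's exact mechanism.

```python
# Assumed gated fusion of global semantics and local features.
import torch
import torch.nn as nn

class SemanticLocalFusion(nn.Module):
    def __init__(self, c: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * c, c, 3, padding=1),
                                  nn.Sigmoid())

    def forward(self, local_feat, global_sem):
        # local_feat: (B, C, H, W); global_sem: (B, C) pooled semantic vector
        g = global_sem[:, :, None, None].expand_as(local_feat)
        a = self.gate(torch.cat([local_feat, g], dim=1))  # per-pixel weight
        return a * g + (1 - a) * local_feat               # adaptive blend

fuse = SemanticLocalFusion(64)
out = fuse(torch.randn(2, 64, 32, 32), torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```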
arXiv Detail & Related papers (2021-06-14T08:09:43Z)
- Texture Transform Attention for Realistic Image Inpainting [6.275013056564918]
We propose a Texture Transform Attention network that better reconstructs missing regions with fine details.
Texture Transform Attention is used to create a new reassembled texture map using fine textures and coarse semantics.
We evaluate our model end-to-end with the publicly available datasets CelebA-HQ and Places2.
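A hedged sketch of the reassembly idea: similarities computed on coarse semantic features decide which fine-texture features each location copies, so fine details are rearranged under semantic guidance. The soft-copy formulation below is an assumption, not the paper's exact module.

```python
# Soft patch reassembly -- an assumed stand-in for the paper's module.
import torch
import torch.nn.functional as F

def reassemble_texture(coarse, fine, temperature=0.1):
    # coarse: (B, N, D) coarse semantic features; fine: (B, N, C) fine textures
    c = F.normalize(coarse, dim=-1)
    sim = c @ c.transpose(-2, -1)              # semantic similarity (B, N, N)
    w = F.softmax(sim / temperature, dim=-1)   # low temperature ~= hard copy
    return w @ fine                            # (B, N, C) reassembled textures

coarse, fine = torch.randn(1, 256, 64), torch.randn(1, 256, 192)
print(reassemble_texture(coarse, fine).shape)  # torch.Size([1, 256, 192])
```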
arXiv Detail & Related papers (2020-12-08T06:28:51Z)
- Free-Form Image Inpainting via Contrastive Attention Network [64.05544199212831]
In image inpainting tasks, masks of arbitrary shape can appear anywhere in an image, forming complex patterns.
It is difficult for encoders to learn powerful representations under such complex conditions.
We propose a self-supervised Siamese inference network to improve the robustness and generalization.
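A hedged sketch of the Siamese self-supervised idea: two differently masked views of the same image are encoded with shared weights and pulled together by an InfoNCE-style contrastive loss, encouraging mask-robust representations. The toy encoder and the loss choice are assumptions, not the paper's design.

```python
# Toy Siamese contrastive setup -- an assumption, not the paper's network.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.2):
    # z1, z2: (B, D) features of two masked views; matching rows are positives.
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau          # (B, B) similarity matrix
    labels = torch.arange(z1.shape[0])  # the diagonal holds the positives
    return F.cross_entropy(logits, labels)

# Shared-weight (Siamese) encoder applied to two random-mask views.
encoder = torch.nn.Sequential(torch.nn.Flatten(),
                              torch.nn.Linear(3 * 64 * 64, 128))
img = torch.randn(8, 3, 64, 64)
m1 = (torch.rand(8, 1, 64, 64) > 0.3).float()
m2 = (torch.rand(8, 1, 64, 64) > 0.3).float()
loss = info_nce(encoder(img * m1), encoder(img * m2))
print(loss.item())
```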
arXiv Detail & Related papers (2020-10-29T14:46:05Z)
- Texture Memory-Augmented Deep Patch-Based Image Inpainting [121.41395272974611]
We propose a new deep inpainting framework where texture generation is guided by a texture memory of patch samples extracted from unmasked regions.
The framework has a novel design that allows texture memory retrieval to be trained end-to-end with the deep inpainting network.
The proposed method shows superior performance both qualitatively and quantitatively on three challenging image benchmarks.
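A hedged sketch of texture-memory retrieval: patch features from unmasked regions form a memory bank, and each masked patch retrieves texture as a similarity-weighted blend, which keeps retrieval differentiable and hence trainable end-to-end with the inpainting network. Details below are assumptions.

```python
# Soft retrieval from a texture memory -- an assumed stand-in.
import torch
import torch.nn.functional as F

def retrieve_texture(query, memory_keys, memory_values, tau=0.05):
    # query: (B, Nq, D) masked-patch features
    # memory_keys: (B, Nm, D) unmasked-patch features
    # memory_values: (B, Nm, C) corresponding texture patches (flattened)
    q = F.normalize(query, dim=-1)
    k = F.normalize(memory_keys, dim=-1)
    w = F.softmax(q @ k.transpose(-2, -1) / tau, dim=-1)  # sharp ~= nearest patch
    return w @ memory_values                               # (B, Nq, C)

q = torch.randn(1, 32, 64)  # 32 masked patches to fill
keys, vals = torch.randn(1, 200, 64), torch.randn(1, 200, 48)
print(retrieve_texture(q, keys, vals).shape)  # torch.Size([1, 32, 48])
```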
arXiv Detail & Related papers (2020-09-28T12:09:08Z)
- Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed Scenes [54.836331922449666]
We propose a Semantic Guidance and Evaluation Network (SGE-Net) to update the structural priors and the inpainted image.
It utilizes a semantic segmentation map as guidance at each scale of inpainting, under which location-dependent inferences are re-evaluated.
Experiments on real-world images of mixed scenes demonstrated the superiority of our proposed method over state-of-the-art approaches.
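A hedged sketch of per-scale semantic guidance: the segmentation map is resized and injected at every decoder scale so inpainting features are conditioned on the class layout at each resolution. SGE-Net's evaluation and re-inference steps are omitted; this only shows the guidance pathway, with assumed names.

```python
# Assumed per-scale injection of segmentation guidance, not SGE-Net itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticGuidedDecoder(nn.Module):
    def __init__(self, chans=(256, 128, 64), n_classes=21):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Conv2d(c + n_classes, c, 3, padding=1) for c in chans)

    def forward(self, feats, seg_logits):
        # feats: list of (B, C_i, H_i, W_i) decoder features, coarse to fine
        # seg_logits: (B, n_classes, H, W) segmentation guidance
        out = []
        for blk, f in zip(self.blocks, feats):
            s = F.interpolate(seg_logits, size=f.shape[-2:], mode="bilinear",
                              align_corners=False)
            out.append(F.relu(blk(torch.cat([f, s], dim=1))))
        return out

dec = SemanticGuidedDecoder()
feats = [torch.randn(1, 256, 16, 16), torch.randn(1, 128, 32, 32),
         torch.randn(1, 64, 64, 64)]
outs = dec(feats, torch.randn(1, 21, 64, 64))
print([o.shape for o in outs])
```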
arXiv Detail & Related papers (2020-03-15T17:49:20Z)