Learning to Incorporate Structure Knowledge for Image Inpainting
- URL: http://arxiv.org/abs/2002.04170v2
- Date: Wed, 12 Feb 2020 03:12:04 GMT
- Title: Learning to Incorporate Structure Knowledge for Image Inpainting
- Authors: Jie Yang, Zhiquan Qi, Yong Shi
- Abstract summary: This paper develops a multi-task learning framework that attempts to incorporate image structure knowledge to assist image inpainting.
The primary idea is to train a shared generator to simultaneously complete the corrupted image and its corresponding structures.
We also introduce a structure embedding scheme that explicitly embeds the learned structure features into the inpainting process.
- Score: 20.93448933499842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper develops a multi-task learning framework that attempts to
incorporate image structure knowledge to assist image inpainting, which is
not well explored in previous works. The primary idea is to train a shared
generator to simultaneously complete the corrupted image and its corresponding
structures (edge and gradient), thus implicitly encouraging the generator to
exploit relevant structure knowledge while inpainting. We also introduce a
structure embedding scheme that explicitly embeds the learned structure
features into the inpainting process, providing possible preconditions for
image completion. Specifically, a novel pyramid structure loss is proposed to
supervise structure learning and embedding. Moreover, an attention mechanism
is developed to further exploit the recurrent structures and patterns in the
image to refine the generated structures and contents. Through multi-task
learning, structure embedding, and attention, our framework takes advantage
of the structure knowledge and outperforms several state-of-the-art methods
on benchmark datasets both quantitatively and qualitatively.
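The abstract does not give the loss formulation, but a pyramid structure loss of the kind described can be read as multi-scale supervision of the predicted edge and gradient maps. Below is a minimal PyTorch sketch under that reading; the function name, the L1 distance, and the bilinear downsampling of ground truth are assumptions for illustration, not the paper's exact definition.

```python
import torch.nn.functional as F

def pyramid_structure_loss(pred_pyramid, gt_edge, gt_grad):
    """Hypothetical multi-scale structure supervision: each pyramid level's
    predicted edge/gradient maps are compared (L1) against the ground truth
    resized to that level's resolution. All tensors are NCHW."""
    loss = 0.0
    for pred_edge, pred_grad in pred_pyramid:  # coarse-to-fine predictions
        size = pred_edge.shape[-2:]
        loss = loss + F.l1_loss(
            pred_edge,
            F.interpolate(gt_edge, size=size, mode="bilinear", align_corners=False),
        )
        loss = loss + F.l1_loss(
            pred_grad,
            F.interpolate(gt_grad, size=size, mode="bilinear", align_corners=False),
        )
    return loss
```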
Related papers
- Learning Correlation Structures for Vision Transformers [93.22434535223587]
We introduce a new attention mechanism, dubbed structural self-attention (StructSA).
We generate attention maps by recognizing space-time structures of key-query correlations via convolution.
This effectively leverages rich structural patterns in images and videos such as scene layouts, object motion, and inter-object relations.
arXiv Detail & Related papers (2024-04-05T07:13:28Z) - ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided Code-Vision Representation [82.88378582161717]
State-of-the-art vision-language models (VLMs) still have limited performance in structural knowledge extraction.
We present ViStruct, a training framework to learn VLMs for effective visual structural knowledge extraction.
arXiv Detail & Related papers (2023-11-22T09:23:34Z) - Structural and Statistical Texture Knowledge Distillation for Semantic Segmentation [72.67912031720358]
We propose a novel Structural and Statistical Texture Knowledge Distillation (SSTKD) framework for semantic segmentation.
For structural texture knowledge, we introduce a Contourlet Decomposition Module (CDM) that decomposes low-level features.
For statistical texture knowledge, we propose a Denoised Texture Intensity Equalization Module (DTIEM) to adaptively extract and enhance statistical texture knowledge.
arXiv Detail & Related papers (2023-05-06T06:01:11Z) - Joint Language Semantic and Structure Embedding for Knowledge Graph Completion [66.15933600765835]
We propose to jointly embed the semantics in the natural language descriptions of knowledge triplets together with their structure information.
Our method embeds knowledge graphs for the completion task via fine-tuning pre-trained language models.
Our experiments on a variety of knowledge graph benchmarks have demonstrated the state-of-the-art performance of our method.
arXiv Detail & Related papers (2022-09-19T02:41:02Z) - Unsupervised Structure-Consistent Image-to-Image Translation [6.282068591820945]
The Swapping Autoencoder achieved state-of-the-art performance in deep image manipulation and image-to-image translation.
We improve this work by introducing a simple yet effective auxiliary module based on gradient reversal layers (a generic sketch of such a layer follows this list).
The auxiliary module's loss forces the generator to learn to reconstruct an image with an all-zero texture code.
arXiv Detail & Related papers (2022-08-24T13:47:15Z) - Keys to Better Image Inpainting: Structure and Texture Go Hand in Hand [28.32208483559088]
We claim that the performance of inpainting algorithms can be better judged by the generated structures and textures.
In this paper, we propose a novel inpainting network combining the advantages of the two designs.
Our model achieves remarkable visual quality, matching state-of-the-art performance in both structure generation and repeating texture synthesis.
arXiv Detail & Related papers (2022-08-05T20:42:13Z) - Reference-Guided Texture and Structure Inference for Image Inpainting [25.775006005766222]
We build a benchmark dataset containing 10K pairs of input and reference images for reference-guided inpainting.
We adopt an encoder-decoder structure to infer the texture and structure features of the input image.
A feature alignment module is further designed to refine these features of the input image with the guidance of a reference image.
arXiv Detail & Related papers (2022-07-29T06:26:03Z) - Image Inpainting via Conditional Texture and Structure Dual Generation [26.97159780261334]
We propose a novel two-stream network for image inpainting, which models structure-constrained texture synthesis and texture-guided structure reconstruction.
To enhance global consistency, a Bi-directional Gated Feature Fusion (Bi-GFF) module is designed to exchange and combine structure and texture information (a generic gated-fusion sketch follows this list).
Experiments on the CelebA, Paris StreetView and Places2 datasets demonstrate the superiority of the proposed method.
arXiv Detail & Related papers (2021-08-22T15:44:37Z) - Retinal Image Segmentation with a Structure-Texture Demixing Network [62.69128827622726]
Structure and texture information are mixed together in a retinal image, and distinguishing the two is difficult.
Existing methods handle texture and structure jointly, which may bias models toward recognizing textures and thus result in inferior segmentation performance.
We propose a segmentation strategy that seeks to separate structure and texture components and significantly improve the performance.
arXiv Detail & Related papers (2020-07-15T12:19:03Z) - Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed Scenes [54.836331922449666]
We propose a Semantic Guidance and Evaluation Network (SGE-Net) to update the structural priors and the inpainted image.
It utilizes a semantic segmentation map as guidance at each scale of inpainting, under which location-dependent inferences are re-evaluated.
Experiments on real-world images of mixed scenes demonstrate the superiority of our proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-15T17:49:20Z)
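For the Unsupervised Structure-Consistent Image-to-Image Translation entry above, a gradient reversal layer is the standard construction from domain-adversarial training: identity in the forward pass, negated (and optionally scaled) gradient in the backward pass. A minimal PyTorch sketch, with the class name and the `lambd` scaling factor chosen here for illustration:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) the gradient
    in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd=1.0):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the incoming gradient; lambd itself gets no gradient.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    """Convenience wrapper: insert between the generator and the auxiliary head."""
    return GradReverse.apply(x, lambd)
```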
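Likewise, the Bi-GFF module in the Conditional Texture and Structure Dual Generation entry is described only as exchanging and combining structure and texture features. A generic bi-directional gated fusion, not the paper's exact design, might look like:

```python
import torch
import torch.nn as nn

class BiGatedFusion(nn.Module):
    """Hypothetical bi-directional gated fusion: each stream receives a
    gated residual contribution from the other stream."""

    def __init__(self, channels):
        super().__init__()
        self.gate_t = nn.Sequential(nn.Conv2d(2 * channels, channels, 3, padding=1), nn.Sigmoid())
        self.gate_s = nn.Sequential(nn.Conv2d(2 * channels, channels, 3, padding=1), nn.Sigmoid())

    def forward(self, f_texture, f_structure):
        both = torch.cat([f_texture, f_structure], dim=1)
        f_t = f_texture + self.gate_t(both) * f_structure
        f_s = f_structure + self.gate_s(both) * f_texture
        return f_t, f_s
```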
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.