Unbiased Multi-Modality Guidance for Image Inpainting
- URL: http://arxiv.org/abs/2208.11844v1
- Date: Thu, 25 Aug 2022 03:13:43 GMT
- Title: Unbiased Multi-Modality Guidance for Image Inpainting
- Authors: Yongsheng Yu, Dawei Du, Libo Zhang, Tiejian Luo
- Abstract summary: We develop an end-to-end multi-modality guided transformer network for image inpainting.
Within each transformer block, the proposed spatial-aware attention module can learn the multi-modal structural features efficiently.
Our method enriches semantically consistent context in an image based on discriminative information from multiple modalities.
- Score: 27.286351511243502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image inpainting is an ill-posed problem to recover missing or damaged image
content based on incomplete images with masks. Previous works usually predict
the auxiliary structures (e.g., edges, segmentation and contours) to help fill
visually realistic patches in a multi-stage fashion. However, imprecise
auxiliary priors may yield biased inpainted results. Moreover, some methods are
time-consuming because they are implemented as multiple stages of complex
neural networks. To solve these issues, we develop an end-to-end multi-modality
guided transformer network, including one inpainting branch and two auxiliary
branches for semantic segmentation and edge textures. Within each transformer
block, the proposed multi-scale spatial-aware attention module can learn the
multi-modal structural features efficiently via auxiliary denormalization.
Different from previous methods relying on direct guidance from biased priors,
our method enriches semantically consistent context in an image based on
discriminative interplay information from multiple modalities. Comprehensive
experiments on several challenging image inpainting datasets show that our
method achieves state-of-the-art performance to deal with various
regular/irregular masks efficiently.
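The paper releases no code in this listing, so the following is only a rough sketch of what the "auxiliary denormalization" inside the attention module might look like: the inpainting-branch features are instance-normalized, then re-modulated by spatially varying scale/shift maps predicted from the auxiliary (segmentation/edge) features, in the spirit of SPADE-style denormalization. All function names, tensor shapes, and the 1x1-convolution parameterization are assumptions, not the authors' implementation.

```python
import numpy as np

def auxiliary_denormalization(x, aux, w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """Normalize inpainting features x, then re-modulate them with
    scale/shift maps predicted from auxiliary structural features.

    x:   (C, H, W) inpainting-branch feature map
    aux: (A, H, W) auxiliary features (e.g. segmentation + edge maps)
    w_gamma, w_beta: (C, A) 1x1-conv weights; b_gamma, b_beta: (C,) biases
    """
    # Per-channel instance normalization of the inpainting features.
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / (sigma + eps)

    # Spatially varying scale and shift predicted from the auxiliary
    # modalities (a 1x1 convolution, written here as an einsum over channels).
    gamma = np.einsum('ca,ahw->chw', w_gamma, aux) + b_gamma[:, None, None]
    beta = np.einsum('ca,ahw->chw', w_beta, aux) + b_beta[:, None, None]

    # Denormalization: structural guidance re-injects spatial statistics.
    return gamma * x_norm + beta
```

The key design point of such schemes is that the auxiliary branches never overwrite the inpainting features directly; they only steer the feature statistics, which limits the damage an imprecise prior can do.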
Related papers
- Dense Feature Interaction Network for Image Inpainting Localization [28.028361409524457]
Inpainting can be used to conceal or alter image content in malicious image manipulation.
Existing methods mostly rely on a basic encoder-decoder structure, which often results in a high number of false positives.
In this paper, we describe a new method for inpainting detection based on a Dense Feature Interaction Network (DeFI-Net)
arXiv Detail & Related papers (2024-08-05T02:35:13Z) - PC-GANs: Progressive Compensation Generative Adversarial Networks for
Pan-sharpening [50.943080184828524]
We propose a novel two-step model for pan-sharpening that sharpens the MS image through the progressive compensation of the spatial and spectral information.
The whole model is composed of triple GANs, and based on the specific architecture, a joint compensation loss function is designed to enable the triple GANs to be trained simultaneously.
arXiv Detail & Related papers (2022-07-29T03:09:21Z) - MAT: Mask-Aware Transformer for Large Hole Image Inpainting [79.67039090195527]
We present a novel model for large hole inpainting, which unifies the merits of transformers and convolutions.
Experiments demonstrate the state-of-the-art performance of the new model on multiple benchmark datasets.
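As a minimal illustration of the "mask-aware" idea behind MAT, the sketch below restricts scaled dot-product attention to tokens whose content is known, so hole tokens contribute nothing to aggregation. This is a simplified toy version, not the paper's actual module; names and shapes are assumptions.

```python
import numpy as np

def mask_aware_attention(q, k, v, valid):
    """Scaled dot-product attention that aggregates only from tokens whose
    content is known (valid == 1); hole tokens receive zero weight.

    q, k, v: (N, D) token embeddings; valid: (N,) 0/1 mask (1 = known pixel)
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                        # (N, N) logits
    scores = np.where(valid[None, :] == 1, scores, -1e9)  # hide hole tokens
    scores -= scores.max(axis=-1, keepdims=True)          # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Because the masked logits underflow to zero after the softmax, the output is exactly independent of the values stored at hole positions.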
arXiv Detail & Related papers (2022-03-29T06:36:17Z) - Multi-scale Sparse Representation-Based Shadow Inpainting for Retinal
OCT Images [0.261990490798442]
Inpainting shadowed regions cast by superficial blood vessels in retinal optical coherence tomography (OCT) images is critical for accurate and robust machine analysis and clinical diagnosis.
Traditional sequence-based approaches, which propagate neighboring information to gradually fill in the missing regions, are cost-effective.
Deep learning-based methods such as encoder-decoder networks have shown promising results in natural image inpainting tasks.
We propose a novel multi-scale shadow inpainting framework for OCT images by synergically applying sparse representation and deep learning.
arXiv Detail & Related papers (2022-02-23T09:37:14Z) - Adaptive Image Inpainting [43.02281823557039]
Inpainting methods have shown significant improvements by using deep neural networks, yet artifacts remain.
The problem is rooted in the encoder layers' ineffectiveness in building a complete and faithful embedding of the missing regions.
We propose a distillation based approach for inpainting, where we provide direct feature level supervision for the encoder layers.
arXiv Detail & Related papers (2022-01-01T12:16:01Z) - Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid [102.24539566851809]
Restoring reasonable and realistic content for arbitrary missing regions in images is an important yet challenging task.
Recent image inpainting models have made significant progress in generating vivid visual details, but they can still lead to texture blurring or structural distortions.
We propose the Semantic Pyramid Network (SPN) motivated by the idea that learning multi-scale semantic priors can greatly benefit the recovery of locally missing content in images.
arXiv Detail & Related papers (2021-12-08T04:33:33Z) - Image Inpainting with Edge-guided Learnable Bidirectional Attention Maps [85.67745220834718]
We present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes.
Our Edge-LBAM method contains dual procedures, including structure-aware mask-updating guided by predicted edges.
Extensive experiments show that our Edge-LBAM is effective in generating coherent image structures and preventing color discrepancy and blurriness.
arXiv Detail & Related papers (2021-04-25T07:25:16Z) - Attention-Based Multimodal Image Matching [16.335191345543063]
We propose an attention-based approach for multimodal image patch matching using a Transformer encoder.
Our encoder is shown to efficiently aggregate multiscale image embeddings while emphasizing task-specific appearance-invariant image cues.
This is the first successful application of the Transformer encoder architecture to the multimodal image patch matching task.
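A toy sketch of the matching step described above: per-scale patch embeddings (e.g. pooled Transformer-encoder outputs at several resolutions) are aggregated by averaging, then compared with cosine similarity. This is an illustrative assumption about the pipeline, not the paper's actual scoring head.

```python
import numpy as np

def multiscale_match_score(pyr_a, pyr_b):
    """Score two patches given their per-scale embeddings.

    pyr_a, pyr_b: (S, D) arrays, one D-dim embedding per scale S.
    Returns cosine similarity in [-1, 1]; higher means more likely a match.
    """
    # Aggregate multiscale embeddings by simple averaging.
    a = np.mean(pyr_a, axis=0)
    b = np.mean(pyr_b, axis=0)
    # Cosine similarity of the aggregated descriptors.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(a @ b)
```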
arXiv Detail & Related papers (2021-03-20T21:14:24Z) - Free-Form Image Inpainting via Contrastive Attention Network [64.05544199212831]
In image inpainting tasks, masks of arbitrary shape can appear anywhere in an image, forming complex patterns.
It is difficult for encoders to capture powerful representations of the missing regions under such complex conditions.
We propose a self-supervised Siamese inference network to improve the robustness and generalization.
arXiv Detail & Related papers (2020-10-29T14:46:05Z) - Guidance and Evaluation: Semantic-Aware Image Inpainting for Mixed
Scenes [54.836331922449666]
We propose a Semantic Guidance and Evaluation Network (SGE-Net) to update the structural priors and the inpainted image.
It utilizes semantic segmentation map as guidance in each scale of inpainting, under which location-dependent inferences are re-evaluated.
Experiments on real-world images of mixed scenes demonstrated the superiority of our proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2020-03-15T17:49:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.