Related papers: Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness

URL: http://arxiv.org/abs/2104.13743v1
Date: Wed, 28 Apr 2021 13:17:47 GMT
Title: Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness
Authors: Manyu Zhu, Dongliang He, Xin Li, Chao Li, Fu Li, Xiao Liu, Errui Ding and Zhaoxiang Zhang
Abstract summary: Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial. We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase. Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
Score: 66.55719330810547
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial. Though U-shaped encoder-decoder frameworks have proven successful, most of them share a common drawback of mask unawareness in feature extraction: all convolution windows (or regions), including those with various shapes of missing pixels, are treated equally and filtered with fixed learned kernels. To this end, we propose a novel mask-aware inpainting solution. Firstly, a Mask-Aware Dynamic Filtering (MADF) module is designed to effectively learn multi-scale features for missing regions in the encoding phase. Specifically, filters for each convolution window are generated from features of the corresponding region of the mask. The second fold of mask awareness is achieved by adopting Point-wise Normalization (PN) in our decoding phase, considering that the statistics of features at masked points differ from those at unmasked points. The proposed PN tackles this issue by dynamically assigning a point-wise scaling factor and bias. Lastly, our model is designed as an end-to-end cascaded refinement model. Supervision information such as reconstruction loss, perceptual loss and total variation loss is incrementally leveraged to boost the inpainting results from coarse to fine. Effectiveness of the proposed framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets: Places2, CelebA and Paris StreetView.
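
To make three ideas from the abstract concrete, here is a minimal NumPy sketch of a point-wise scale and bias applied after normalization, a total variation term, and the compositing of known pixels with predicted ones. It is not the authors' implementation. The function names, the (C, H, W) layout, the per-channel normalization, the anisotropic form of total variation and the mask convention (1 = known pixel, 0 = hole) are all assumptions made for illustration. In the paper the scale and bias are produced by the network; here they are plain inputs.

```python
import numpy as np


def pointwise_modulate(features, gamma, beta, eps=1e-5):
    """Normalize each channel, then apply a per-point scale and bias.

    features, gamma, beta: arrays of shape (C, H, W). In a real model gamma
    and beta would be predicted by the network (assumption: here they are
    supplied by the caller).
    """
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)
    return gamma * normalized + beta


def total_variation(image):
    """Anisotropic total variation: mean absolute difference between
    vertically and horizontally adjacent pixels (one common definition,
    assumed here)."""
    dh = np.abs(image[..., 1:, :] - image[..., :-1, :]).mean()
    dw = np.abs(image[..., :, 1:] - image[..., :, :-1]).mean()
    return dh + dw


def composite(image, prediction, mask):
    """Keep known pixels (mask == 1) and fill holes (mask == 0) with the
    prediction. The mask convention is an assumption."""
    return mask * image + (1.0 - mask) * prediction


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(3, 4, 4))
    mask = np.ones((1, 4, 4))
    mask[:, 1:3, 1:3] = 0.0  # a square hole in the middle

    # Hypothetical choice: a larger scale and a non-zero bias inside the hole.
    gamma = np.where(mask == 0, 1.5, 1.0)
    beta = np.where(mask == 0, 0.1, 0.0)

    out = pointwise_modulate(feats, gamma, beta)
    print(out.shape, total_variation(out))
```

Because the scale and bias have one value per spatial position, masked and unmasked points can be shifted and rescaled differently, which is the motivation the abstract gives for PN.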

Related papers

ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework. We introduce a data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise. We demonstrate our strategy's superiority in downstream tasks compared to random masking.
arXiv Detail & Related papers (2024-07-17T22:04:00Z)
Mask2Anomaly: Mask Transformer for Universal Open-set Segmentation [29.43462426812185]
We propose a paradigm change by shifting from a per-pixel classification to a mask classification. Our mask-based method, Mask2Anomaly, demonstrates the feasibility of integrating a mask-classification architecture. By comprehensive quantitative and qualitative evaluation, we show Mask2Anomaly achieves new state-of-the-art results.
arXiv Detail & Related papers (2023-09-08T20:07:18Z)
Unmasking Anomalies in Road-Scene Segmentation [18.253109627901566]
Anomaly segmentation is a critical task for driving applications. We propose a paradigm change by shifting from a per-pixel classification to a mask classification. Mask2Anomaly demonstrates the feasibility of integrating an anomaly detection method in a mask-classification architecture.
arXiv Detail & Related papers (2023-07-25T08:23:10Z)
MixMask: Revisiting Masking Strategy for Siamese ConvNets [23.946791390657875]
This work introduces a novel filling-based masking approach, termed MixMask. The proposed method replaces erased areas with content from a different image, effectively countering the information depletion seen in traditional masking methods. We empirically validate our framework's enhanced performance in areas such as linear probing, semi-supervised and supervised finetuning, object detection and segmentation.
arXiv Detail & Related papers (2022-10-20T17:54:03Z)
Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models. Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask. We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
arXiv Detail & Related papers (2022-06-07T06:42:44Z)
What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum. Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks. We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z)
FT-TDR: Frequency-guided Transformer and Top-Down Refinement Network for Blind Face Inpainting [77.78305705925376]
Blind face inpainting refers to the task of reconstructing visual contents without explicitly indicating the corrupted regions in a face image. We propose a novel two-stage blind face inpainting method named Frequency-guided Transformer and Top-Down Refinement Network (FT-TDR) to tackle these challenges.
arXiv Detail & Related papers (2021-08-10T03:12:01Z)
Image Inpainting with Edge-guided Learnable Bidirectional Attention Maps [85.67745220834718]
We present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes. Our Edge-LBAM method contains dual procedures, including structure-aware mask-updating guided by predicted edges. Extensive experiments show that our Edge-LBAM is effective in generating coherent image structures and preventing color discrepancy and blurriness.
arXiv Detail & Related papers (2021-04-25T07:25:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.