Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness
- URL: http://arxiv.org/abs/2104.13743v1
- Date: Wed, 28 Apr 2021 13:17:47 GMT
- Title: Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness
- Authors: Manyu Zhu, Dongliang He, Xin Li, Chao Li, Fu Li, Xiao Liu, Errui Ding
and Zhaoxiang Zhang
- Abstract summary: Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.
We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase.
Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
- Score: 66.55719330810547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inpainting arbitrary missing regions is challenging because learning valid
features for various masked regions is nontrivial. Although U-shaped
encoder-decoder frameworks have proven successful, most of them share a common
drawback: feature extraction is unaware of the mask, because all convolution
windows (or regions), including those covering various shapes of missing
pixels, are treated equally and filtered with fixed learned kernels. To
this end, we propose a novel mask-aware inpainting solution. First, a
Mask-Aware Dynamic Filtering (MADF) module is designed to effectively learn
multi-scale features for missing regions in the encoding phase. Specifically,
the filters for each convolution window are generated from features of the
corresponding region of the mask. The second fold of mask awareness is achieved
by adopting Point-wise Normalization (PN) in the decoding phase, since the
statistics of features at masked points differ from those at unmasked points.
PN tackles this issue by dynamically assigning a point-wise scaling factor and
bias. Lastly, our model is designed as an end-to-end cascaded refinement
network: supervision signals such as reconstruction loss, perceptual loss, and
total variation loss are leveraged incrementally to refine the inpainting
results from coarse to fine. The effectiveness of the proposed framework is
validated both quantitatively and qualitatively via extensive experiments on
three public datasets: Places2, CelebA, and Paris StreetView.
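As a rough illustration of the Point-wise Normalization idea described in the abstract (each spatial point receives its own scaling factor and bias, conditioned on the mask, so masked and unmasked points are normalized differently), here is a minimal pure-Python sketch. All function and variable names are illustrative assumptions, not the authors' code; the actual PN operates on multi-channel feature maps with learned predictors rather than the hand-written `scale_fn`/`bias_fn` stand-ins used here.

```python
def pointwise_normalize(features, mask, scale_fn, bias_fn, eps=1e-5):
    """Normalize a single-channel H x W feature map, then modulate each
    point with a mask-dependent scale and bias.

    features: list of lists of floats (H x W feature map)
    mask:     list of lists of floats (1.0 = valid pixel, 0.0 = missing)
    scale_fn, bias_fn: callables mapping a mask value to a float;
        stand-ins for the learned point-wise predictors in the paper.
    """
    h, w = len(features), len(features[0])
    n = h * w
    # Standard normalization statistics over the whole map.
    mean = sum(v for row in features for v in row) / n
    var = sum((v - mean) ** 2 for row in features for v in row) / n
    std = (var + eps) ** 0.5
    # Point-wise modulation: scale and bias depend on the mask at (i, j).
    return [
        [scale_fn(mask[i][j]) * (features[i][j] - mean) / std
         + bias_fn(mask[i][j])
         for j in range(w)]
        for i in range(h)
    ]

# Toy usage: unmasked points keep unit scale; the masked point is damped.
feats = [[1.0, 2.0], [3.0, 4.0]]
mask = [[1.0, 1.0], [0.0, 1.0]]
out = pointwise_normalize(
    feats, mask,
    scale_fn=lambda m: 1.0 if m > 0 else 0.5,
    bias_fn=lambda m: 0.0,
)
```

The key design point this sketch captures is that, unlike batch or instance normalization, the affine parameters are not shared across the map: they vary per point as a function of the mask.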
Related papers
- ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework.
We introduce a data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise.
We demonstrate our strategy's superiority in downstream tasks compared to random masking.
arXiv Detail & Related papers (2024-07-17T22:04:00Z)
- Mask2Anomaly: Mask Transformer for Universal Open-set Segmentation [29.43462426812185]
We propose a paradigm change by shifting from a per-pixel classification to a mask classification.
Our mask-based method, Mask2Anomaly, demonstrates the feasibility of integrating a mask-classification architecture.
By comprehensive qualitative and quantitative evaluation, we show Mask2Anomaly achieves new state-of-the-art results.
arXiv Detail & Related papers (2023-09-08T20:07:18Z)
- Unmasking Anomalies in Road-Scene Segmentation [18.253109627901566]
Anomaly segmentation is a critical task for driving applications.
We propose a paradigm change by shifting from a per-pixel classification to a mask classification.
Mask2Anomaly demonstrates the feasibility of integrating an anomaly detection method in a mask-classification architecture.
arXiv Detail & Related papers (2023-07-25T08:23:10Z)
- Layered Depth Refinement with Mask Guidance [61.10654666344419]
We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
arXiv Detail & Related papers (2022-06-07T06:42:44Z)
- What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum.
Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks.
We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z)
- FT-TDR: Frequency-guided Transformer and Top-Down Refinement Network for Blind Face Inpainting [77.78305705925376]
Blind face inpainting refers to the task of reconstructing visual contents without explicitly indicating the corrupted regions in a face image.
We propose a novel two-stage blind face inpainting method named Frequency-guided Transformer and Top-Down Refinement Network (FT-TDR) to tackle these challenges.
arXiv Detail & Related papers (2021-08-10T03:12:01Z)
- Image Inpainting with Edge-guided Learnable Bidirectional Attention Maps [85.67745220834718]
We present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes.
Our Edge-LBAM method contains dual procedures, including structure-aware mask-updating guided by predicted edges.
Extensive experiments show that our Edge-LBAM is effective in generating coherent image structures and preventing color discrepancy and blurriness.
arXiv Detail & Related papers (2021-04-25T07:25:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.