Layered Depth Refinement with Mask Guidance
- URL: http://arxiv.org/abs/2206.03048v1
- Date: Tue, 7 Jun 2022 06:42:44 GMT
- Title: Layered Depth Refinement with Mask Guidance
- Authors: Soo Ye Kim, Jianming Zhang, Simon Niklaus, Yifei Fan, Simon Chen, Zhe
Lin, Munchurl Kim
- Abstract summary: We formulate a novel problem of mask-guided depth refinement that utilizes a generic mask to refine the depth prediction of SIDE models.
Our framework performs layered refinement and inpainting/outpainting, decomposing the depth map into two separate layers signified by the mask and the inverse mask.
We empirically show that our method is robust to different types of masks and initial depth predictions, accurately refining depth values in inner and outer mask boundary regions.
- Score: 61.10654666344419
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Depth maps are used in a wide range of applications from 3D rendering to 2D
image effects such as Bokeh. However, those predicted by single image depth
estimation (SIDE) models often fail to capture isolated holes in objects and/or
have inaccurate boundary regions. Meanwhile, high-quality masks are much easier
to obtain, using commercial auto-masking tools or off-the-shelf methods of
segmentation and matting or even by manual editing. Hence, in this paper, we
formulate a novel problem of mask-guided depth refinement that utilizes a
generic mask to refine the depth prediction of SIDE models. Our framework
performs layered refinement and inpainting/outpainting, decomposing the depth
map into two separate layers signified by the mask and the inverse mask. As
datasets with both depth and mask annotations are scarce, we propose a
self-supervised learning scheme that uses arbitrary masks and RGB-D datasets.
We empirically show that our method is robust to different types of masks and
initial depth predictions, accurately refining depth values in inner and outer
mask boundary regions. We further analyze our model with an ablation study and
demonstrate results on real applications. More information can be found at
https://sooyekim.github.io/MaskDepth/ .
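The layered refinement idea in the abstract can be illustrated with a minimal NumPy sketch: split the depth map into a layer signified by the mask and one signified by the inverse mask, complete each layer, and recomposite. This is not the paper's learned network; `mean_fill` below is a toy stand-in for the learned inpainting/outpainting module.

```python
import numpy as np

def layered_refine(depth, mask, fill_fn):
    """Sketch of mask-guided layered depth refinement.

    depth:   (H, W) float array from a SIDE model.
    mask:    (H, W) bool array marking the foreground object.
    fill_fn: placeholder for a learned inpainting/outpainting model.
    """
    fg = np.where(mask, depth, np.nan)    # layer signified by the mask
    bg = np.where(~mask, depth, np.nan)   # layer signified by the inverse mask
    fg_full = fill_fn(fg)                 # outpaint foreground beyond the mask
    bg_full = fill_fn(bg)                 # inpaint background behind the object
    # recomposite: the mask selects the refined foreground layer,
    # the inverse mask the completed background layer
    return np.where(mask, fg_full, bg_full)

def mean_fill(layer):
    """Toy completion: fill missing (NaN) pixels with the layer's mean depth."""
    filled = layer.copy()
    filled[np.isnan(filled)] = np.nanmean(layer)
    return filled
```

Because each layer is completed independently, the recomposited depth can be sharp exactly at the mask boundary instead of blurring foreground and background values together there.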
Related papers
- DiffSTR: Controlled Diffusion Models for Scene Text Removal [5.790630195329777]
Scene Text Removal (STR) aims to prevent unauthorized use of text in images.
STR faces several challenges, including boundary artifacts, inconsistent texture and color, and preserving correct shadows.
We introduce a ControlNet diffusion model, treating STR as an inpainting task.
We develop a mask pretraining pipeline to condition our diffusion model.
arXiv Detail & Related papers (2024-10-29T04:20:21Z)
- ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders [53.3185750528969]
Masked AutoEncoders (MAE) have emerged as a robust self-supervised framework.
We introduce a data-independent method, termed ColorMAE, which generates different binary mask patterns by filtering random noise.
We demonstrate our strategy's superiority in downstream tasks compared to random masking.
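Generating binary masks by filtering random noise, as ColorMAE's summary describes, can be sketched as follows. The specific filter (a box blur) and quantile threshold are assumptions for illustration, not the paper's actual color-noise filters.

```python
import numpy as np

def noise_mask(h, w, ratio=0.75, kernel=4, seed=0):
    """Data-independent binary mask from low-pass-filtered random noise
    (ColorMAE-style sketch; filter and threshold here are assumptions)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((h, w))
    # Box low-pass filter: average each kernel x kernel neighborhood
    # (a stand-in for the paper's noise-filtering step).
    pad = np.pad(noise, kernel // 2, mode="edge")
    smooth = np.zeros_like(noise)
    for dy in range(kernel):
        for dx in range(kernel):
            smooth += pad[dy:dy + h, dx:dx + w]
    smooth /= kernel * kernel
    # Threshold at a quantile so roughly `ratio` of positions are masked.
    thresh = np.quantile(smooth, 1.0 - ratio)
    return smooth >= thresh  # True = masked patch
```

Low-pass filtering makes the masked positions spatially correlated, producing blob-like patterns rather than the salt-and-pepper layout of purely random masking.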
arXiv Detail & Related papers (2024-07-17T22:04:00Z)
- MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders [93.87585467898252]
We design MonoMAE, a monocular 3D detector inspired by Masked Autoencoders.
MonoMAE consists of two novel designs. The first is depth-aware masking that selectively masks certain parts of non-occluded object queries.
The second is lightweight query completion that works with the depth-aware masking to learn to reconstruct and complete the masked object queries.
arXiv Detail & Related papers (2024-05-13T12:32:45Z)
- Mask Hierarchical Features For Self-Supervised Learning [23.140060988999352]
This paper shows that masking deep hierarchical features is an efficient self-supervised method, denoted MaskDeep.
We mask a portion of the patches in the representation space and then utilize the sparse visible patches to reconstruct a high-level semantic image representation.
Trained on ResNet50 with 200 epochs, MaskDeep achieves state-of-the-art results of 71.2% Top1 accuracy linear classification on ImageNet.
arXiv Detail & Related papers (2023-04-01T04:14:57Z)
- MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling with Informative-Preserved Reconstruction and Self-Distilled Consistency [120.9499803967496]
We propose a novel informative-preserved reconstruction, which explores local statistics to discover and preserve the representative structured points.
Our method can concentrate on modeling regional geometry and enjoy less ambiguity for masked reconstruction.
By combining informative-preserved reconstruction on masked areas and consistency self-distillation from unmasked areas, a unified framework called MM-3DScene is yielded.
arXiv Detail & Related papers (2022-12-20T01:53:40Z)
- Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z)
- Masked Face Inpainting Through Residual Attention UNet [0.7868449549351486]
This paper proposes a blind masked-face inpainting method using a residual attention UNet.
A residual block feeds information to the next layer and directly into layers about two hops away, mitigating the vanishing gradient problem.
Experiments on the publicly available CelebA dataset show the feasibility and robustness of our proposed model.
arXiv Detail & Related papers (2022-09-19T08:49:53Z)
- A Deep Learning Framework to Reconstruct Face under Mask [0.0]
The purpose of this work is to extract the mask region from a masked image and rebuild the area that has been detected.
This problem is complex because it is difficult to determine the gender of a face hidden behind a mask, which can confuse the network and cause it to reconstruct a male face as female or vice versa.
To solve this complex task, we split the problem into three phases: landmark detection, object detection for the targeted mask area, and inpainting the addressed mask region.
arXiv Detail & Related papers (2022-03-23T15:23:24Z)
- High-Accuracy RGB-D Face Recognition via Segmentation-Aware Face Depth Estimation and Mask-Guided Attention Network [16.50097148165777]
Deep learning approaches have achieved highly accurate face recognition by training the models with very large face image datasets.
Unlike the availability of large 2D face image datasets, there is a lack of large 3D face datasets available to the public.
This paper proposes two CNN models to improve the RGB-D face recognition task.
arXiv Detail & Related papers (2021-12-22T07:46:23Z)
- Image Inpainting by End-to-End Cascaded Refinement with Mask Awareness [66.55719330810547]
Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial.
We propose a novel mask-aware inpainting solution that learns multi-scale features for missing regions in the encoding phase.
Our framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets.
arXiv Detail & Related papers (2021-04-28T13:17:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.