Facial Depth and Normal Estimation using Single Dual-Pixel Camera
- URL: http://arxiv.org/abs/2111.12928v1
- Date: Thu, 25 Nov 2021 05:59:27 GMT
- Title: Facial Depth and Normal Estimation using Single Dual-Pixel Camera
- Authors: Minjun Kang, Jaesung Choe, Hyowon Ha, Hae-Gon Jeon, Sunghoon Im, In So Kweon
- Abstract summary: We introduce a DP-oriented Depth/Normal network that reconstructs the 3D facial geometry.
The accompanying dataset contains corresponding ground-truth 3D models, including depth maps and surface normals in metric scale.
The method achieves state-of-the-art performance over recent DP-based depth/normal estimation methods.
- Score: 81.02680586859105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many mobile manufacturers recently have adopted Dual-Pixel (DP) sensors in
their flagship models for faster auto-focus and aesthetic image captures.
Despite their advantages, research on their usage for 3D facial understanding
has been limited due to the lack of datasets and algorithmic designs that
exploit parallax in DP images. This is because the baseline of sub-aperture
images is extremely narrow and parallax exists in the defocus blur region. In
this paper, we introduce a DP-oriented Depth/Normal network that reconstructs
the 3D facial geometry. For this purpose, we collect a DP facial dataset with more
than 135K images of 101 persons, captured with our multi-camera structured-light
system. It contains the corresponding ground-truth 3D models, including
depth maps and surface normals in metric scale. Our dataset allows the proposed
matching network to be generalized for 3D facial depth/normal estimation. The
proposed network consists of two novel modules: Adaptive Sampling Module and
Adaptive Normal Module, which are specialized in handling the defocus blur in
DP images. Finally, the proposed method achieves state-of-the-art performance
over recent DP-based depth/normal estimation methods. We also demonstrate the
applicability of the estimated depth/normal to face spoofing and relighting.
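As a rough illustration of the narrow-baseline matching problem the abstract describes, the sketch below builds a small cost volume between the left/right DP sub-aperture images over sub-pixel disparity hypotheses and regresses a soft-argmin disparity. This is an assumption-laden toy in PyTorch, not the authors' DP-oriented Depth/Normal network or its Adaptive Sampling/Normal Modules; the hypothesis range, temperature, and function names are invented for illustration.

```python
# Toy narrow-baseline DP matcher (illustrative only, not the paper's network).
import torch
import torch.nn.functional as F

def warp_horizontal(img: torch.Tensor, disparity: float) -> torch.Tensor:
    """Shift an image horizontally by a (possibly sub-pixel) disparity."""
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    xs = xs + 2.0 * disparity / max(w - 1, 1)   # pixel shift -> normalized coords
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
    return F.grid_sample(img, grid, align_corners=True)

def dp_soft_disparity(left: torch.Tensor, right: torch.Tensor,
                      hypotheses=(-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0),
                      temperature: float = 0.1) -> torch.Tensor:
    """Soft-argmin disparity from DP sub-aperture images (narrow baseline)."""
    costs = [
        (left - warp_horizontal(right, d)).abs().mean(1, keepdim=True)
        for d in hypotheses
    ]
    cost_volume = torch.cat(costs, dim=1)                       # (B, D, H, W)
    weights = torch.softmax(-cost_volume / temperature, dim=1)  # low cost -> high weight
    disp_values = torch.tensor(hypotheses).view(1, -1, 1, 1)
    return (weights * disp_values).sum(dim=1, keepdim=True)     # (B, 1, H, W)

if __name__ == "__main__":
    left = torch.rand(1, 1, 64, 64)
    right = warp_horizontal(left, 1.0)           # synthetic 1-pixel DP parallax
    print(dp_soft_disparity(left, right).shape)  # torch.Size([1, 1, 64, 64])
```

In the actual DP setting, parallax exists mainly in the defocus-blurred region (as the abstract notes), which the paper's Adaptive Sampling and Adaptive Normal Modules are designed to handle, unlike the uniform photometric cost used here.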
Related papers
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images.
arXiv Detail & Related papers (2024-08-26T04:56:41Z)
- Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation [74.28509379811084]
Metric3D v2 is a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image.
We propose solutions for both metric depth estimation and surface normal estimation.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
arXiv Detail & Related papers (2024-03-22T02:30:46Z)
- Depth Map Denoising Network and Lightweight Fusion Network for Enhanced 3D Face Recognition [61.27785140017464]
We introduce an innovative Depth map denoising network (DMDNet) based on the Denoising Implicit Image Function (DIIF) to reduce noise.
We further design a powerful recognition network called Lightweight Depth and Normal Fusion network (LDNFNet) to learn unique and complementary features between different modalities.
arXiv Detail & Related papers (2024-01-01T10:46:42Z)
- GEDepth: Ground Embedding for Monocular Depth Estimation [4.95394574147086]
This paper proposes a novel ground embedding module to decouple camera parameters from pictorial cues.
A ground attention mechanism is designed in the module to optimally combine ground depth with residual depth.
Experiments reveal that our approach achieves state-of-the-art results on popular benchmarks.
arXiv Detail & Related papers (2023-09-18T17:56:06Z)
- ARAI-MVSNet: A multi-view stereo depth estimation network with adaptive depth range and depth interval [19.28042366225802]
Multi-View Stereo(MVS) is a fundamental problem in geometric computer vision.
We present a novel multi-stage coarse-to-fine framework to achieve adaptive all-pixel depth range and depth interval.
Our model achieves state-of-the-art performance and yields competitive generalization ability.
arXiv Detail & Related papers (2023-08-17T14:52:11Z)
- CrossDTR: Cross-view and Depth-guided Transformers for 3D Object Detection [10.696619570924778]
We propose Cross-view and Depth-guided Transformers for 3D Object Detection, CrossDTR.
Our method surpasses existing multi-camera methods by 10 percent in pedestrian detection and about 3 percent in overall mAP and NDS metrics.
arXiv Detail & Related papers (2022-09-27T16:23:12Z)
- Deep and Shallow Covariance Feature Quantization for 3D Facial Expression Recognition [7.773399781313892]
We propose a multi-modal 2D + 3D feature-based method for facial expression recognition.
We extract shallow features from the 3D images, and deep features using Convolutional Neural Networks (CNN) from the transformed 2D images.
High classification performance has been achieved on the BU-3DFE and Bosphorus datasets.
arXiv Detail & Related papers (2021-05-12T14:48:39Z)
- Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration [77.1056200937214]
We study the formation of the DP pair which links the blur and the depth information.
We propose an end-to-end DDDNet (DP-based Depth and Deblur Network) to jointly estimate the depth and restore the image.
arXiv Detail & Related papers (2020-12-01T06:53:57Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections (a minimal projection sketch follows this entry).
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
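For context on the "camera projection operators" referenced in the entry above, the snippet below is a minimal pinhole-projection sketch in NumPy with a made-up toy calibration; it is an assumption for illustration, not code from the paper.

```python
# Minimal pinhole projection of 3D joints into one calibrated view (toy example).
import numpy as np

def project_points(points_3d: np.ndarray, K: np.ndarray,
                   R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Project Nx3 world points to Nx2 pixel coordinates for one camera."""
    cam = points_3d @ R.T + t           # world -> camera coordinates
    uvw = cam @ K.T                     # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]     # perspective divide

if __name__ == "__main__":
    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.array([0.0, 0.0, 3.0])              # camera 3 m from origin
    joints = np.array([[0.0, 0.0, 0.0], [0.2, -0.1, 0.1]])   # toy 3D pose
    print(project_points(joints, K, R, t))                    # per-view 2D points
```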
This list is automatically generated from the titles and abstracts of the papers on this site.