Depth Map Denoising Network and Lightweight Fusion Network for Enhanced
3D Face Recognition
- URL: http://arxiv.org/abs/2401.00719v1
- Date: Mon, 1 Jan 2024 10:46:42 GMT
- Title: Depth Map Denoising Network and Lightweight Fusion Network for Enhanced
3D Face Recognition
- Authors: Ruizhuo Xu, Ke Wang, Chao Deng, Mei Wang, Xi Chen, Wenhui Huang,
Junlan Feng, Weihong Deng
- Abstract summary: We introduce an innovative Depth map denoising network (DMDNet) based on the Denoising Implicit Image Function (DIIF) to reduce noise.
We further design a powerful recognition network called Lightweight Depth and Normal Fusion network (LDNFNet) to learn unique and complementary features between different modalities.
- Score: 61.27785140017464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the increasing availability of consumer depth sensors, 3D face
recognition (FR) has attracted more and more attention. However, the data
acquired by these sensors are often coarse and noisy, making them impractical
to use directly. In this paper, we introduce an innovative Depth map denoising
network (DMDNet) based on the Denoising Implicit Image Function (DIIF) to
reduce noise and enhance the quality of facial depth images for low-quality 3D
FR. After generating clean depth faces using DMDNet, we further design a
powerful recognition network called Lightweight Depth and Normal Fusion network
(LDNFNet), which incorporates a multi-branch fusion block to learn unique and
complementary features between different modalities such as depth and normal
images. Comprehensive experiments conducted on four distinct low-quality
databases demonstrate the effectiveness and robustness of our proposed methods.
Furthermore, when combining DMDNet and LDNFNet, we achieve state-of-the-art
results on the Lock3DFace database.
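The abstract names depth and normal images as the two modalities that LDNFNet fuses. As an illustration of how a normal image can be derived from a depth image, here is a minimal sketch using finite differences. It is a generic technique and not necessarily the authors' preprocessing. The function name `depth_to_normals` is hypothetical, and the sketch assumes unit pixel spacing and ignores camera intrinsics.

```python
import numpy as np


def depth_to_normals(depth: np.ndarray) -> np.ndarray:
    """Estimate per-pixel unit surface normals from a depth map (assumed sketch)."""
    depth = depth.astype(np.float64)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth)
    # For a surface z = f(x, y), an unnormalized normal is (-dz/dx, -dz/dy, 1).
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth)))
    # Scale every pixel's vector to unit length; the z component is 1, so the norm is never 0.
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    return normals


# Usage: a plane tilted along x (z = x) has the same normal at every pixel.
depth = np.tile(np.arange(4, dtype=np.float64), (4, 1))
normals = depth_to_normals(depth)
```

The resulting H x W x 3 array can be stored as a three-channel image and fed to a network alongside the single-channel depth map.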
Related papers
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images.
arXiv Detail & Related papers (2024-08-26T04:56:41Z)
- Pyramid Deep Fusion Network for Two-Hand Reconstruction from RGB-D Images [11.100398985633754]
We propose an end-to-end framework for recovering dense meshes for both hands.
Our framework employs ResNet50 and PointNet++ to derive features from RGB and point cloud.
We also introduce a novel pyramid deep fusion network (PDFNet) to aggregate features at different scales.
arXiv Detail & Related papers (2023-07-12T09:33:21Z)
- ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for
Sparse View Synthesis [99.06490355990354]
We propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels.
Our approach can considerably enhance model performance in sparse view conditions, achieving improvements of up to 94% in PSNR and 31% in LPIPS, with further gains in SSIM.
arXiv Detail & Related papers (2023-05-18T15:18:01Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour
Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses depth-based RGB-D SOD methods on multiple datasets, but also predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
- Facial Depth and Normal Estimation using Single Dual-Pixel Camera [81.02680586859105]
We introduce a dual-pixel (DP)-oriented Depth/Normal network that reconstructs the 3D facial geometry.
The accompanying dataset contains the corresponding ground-truth 3D models, including depth maps and surface normals in metric scale.
The network achieves state-of-the-art performance compared with recent DP-based depth/normal estimation methods.
arXiv Detail & Related papers (2021-11-25T05:59:27Z)
- Robust super-resolution depth imaging via a multi-feature fusion deep
network [2.351601888896043]
Light detection and ranging (LIDAR) via single-photon sensitive detector (SPAD) arrays is an emerging technology that enables the acquisition of depth images at high frame rates.
We develop a deep network built specifically to take advantage of the multiple features that can be extracted from a camera's histogram data.
We apply the network to a range of 3D data, demonstrating denoising and a four-fold resolution enhancement of depth.
arXiv Detail & Related papers (2020-11-20T14:24:12Z)
- Self-supervised Depth Denoising Using Lower- and Higher-quality RGB-D
sensors [8.34403807284064]
We propose a self-supervised depth denoising approach to denoise and refine depth coming from a low-quality sensor.
We record simultaneous RGB-D sequences with unsynchronized lower- and higher-quality cameras and solve the challenging problem of aligning sequences both temporally and spatially.
We then learn a deep neural network to denoise the lower-quality depth using the matched higher-quality data as a source of supervision signal.
arXiv Detail & Related papers (2020-09-10T11:18:11Z)
- 3dDepthNet: Point Cloud Guided Depth Completion Network for Sparse Depth
and Single Color Image [42.13930269841654]
Our network offers a novel 3D-to-2D coarse-to-fine dual densification design that is both accurate and lightweight.
Experiments on the KITTI dataset show our network achieves state-of-the-art accuracy while being more efficient.
arXiv Detail & Related papers (2020-03-20T10:19:32Z)
- Depth Completion Using a View-constrained Deep Prior [73.21559000917554]
Recent work has shown that the structure of convolutional neural networks (CNNs) induces a strong prior that favors natural images.
This prior, known as a deep image prior (DIP), is an effective regularizer in inverse problems such as image denoising and inpainting.
We extend the concept of the DIP to depth images. Given color images and noisy, incomplete target depth maps, we reconstruct a restored depth map by using the CNN structure as a prior.
arXiv Detail & Related papers (2020-01-21T21:56:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.