Deformable Model-Driven Neural Rendering for High-Fidelity 3D
Reconstruction of Human Heads Under Low-View Settings
- URL: http://arxiv.org/abs/2303.13855v2
- Date: Thu, 17 Aug 2023 09:33:47 GMT
- Title: Deformable Model-Driven Neural Rendering for High-Fidelity 3D
Reconstruction of Human Heads Under Low-View Settings
- Authors: Baixin Xu, Jiarui Zhang, Kwan-Yee Lin, Chen Qian and Ying He
- Abstract summary: Reconstructing 3D human heads in low-view settings presents technical challenges.
We propose geometry decomposition and adopt a two-stage, coarse-to-fine training strategy.
Our method outperforms existing neural rendering approaches in terms of reconstruction accuracy and novel view synthesis under low-view settings.
- Score: 20.07788905506271
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstructing 3D human heads in low-view settings presents technical
challenges, mainly due to the pronounced risk of overfitting with limited views
and high-frequency signals. To address this, we propose geometry decomposition
and adopt a two-stage, coarse-to-fine training strategy, allowing for
progressively capturing high-frequency geometric details. We represent 3D human
heads using the zero level-set of a combined signed distance field, comprising
a smooth template, a non-rigid deformation, and a high-frequency displacement
field. The template captures features that are independent of both identity and
expression and is co-trained with the deformation network across multiple
individuals with sparse and randomly selected views. The displacement field,
capturing individual-specific details, undergoes separate training for each
person. Our network training does not require 3D supervision or object masks.
Experimental results demonstrate the effectiveness and robustness of our
geometry decomposition and two-stage training strategy. Our method outperforms
existing neural rendering approaches in terms of reconstruction accuracy and
novel view synthesis under low-view settings. Moreover, the pre-trained
template serves as a good initialization for our model when encountering
unseen individuals.
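The decomposition described in the abstract can be illustrated with a toy numerical sketch. The following is not the authors' code: the template, deformation, and displacement functions below are hypothetical stand-ins for the learned networks, with a unit sphere playing the role of the smooth template. It only shows the structural idea that the surface is the zero level-set of a template SDF evaluated at non-rigidly deformed points, plus a high-frequency displacement term.

```python
import numpy as np

def template_sdf(x):
    """Smooth, identity- and expression-independent template: a unit sphere."""
    return np.linalg.norm(x, axis=-1) - 1.0

def deformation(x):
    """Stand-in for the learned non-rigid deformation network (small smooth warp)."""
    return 0.05 * np.sin(3.0 * x)

def displacement(x):
    """Stand-in for the per-person high-frequency displacement field."""
    return 0.01 * np.sin(20.0 * x).sum(axis=-1)

def combined_sdf(x):
    """Combined field whose zero level-set is the reconstructed surface."""
    return template_sdf(x + deformation(x)) + displacement(x)

# Points inside the surface yield negative values, points outside positive ones.
pts = np.array([[0.0, 0.0, 0.0],   # inside
                [2.0, 0.0, 0.0]])  # outside
print(combined_sdf(pts))
```

In the paper's coarse-to-fine strategy, the template and deformation components would be trained first across individuals, with the displacement term added afterwards per person; the sketch above only evaluates a fixed composition.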
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments consistently demonstrates our method's superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- MultiPly: Reconstruction of Multiple People from Monocular Video in the Wild [32.6521941706907]
We present MultiPly, a novel framework to reconstruct multiple people in 3D from monocular in-the-wild videos.
We first define a layered neural representation for the entire scene, composited by individual human and background models.
We learn the layered neural representation from videos via our layer-wise differentiable volume rendering.
arXiv Detail & Related papers (2024-06-03T17:59:57Z)
- Preface: A Data-driven Volumetric Prior for Few-shot Ultra High-resolution Face Synthesis [0.0]
NeRFs have enabled highly realistic synthesis of human faces including complex appearance and reflectance effects of hair and skin.
We propose a novel human face prior that enables the synthesis of ultra high-resolution novel views of subjects that are not part of the prior's training distribution.
arXiv Detail & Related papers (2023-09-28T21:21:44Z)
- Learning Neural Parametric Head Models [7.679586286000453]
We propose a novel 3D morphable model for complete human heads based on hybrid neural fields.
We capture a person's identity in a canonical space as a signed distance field (SDF), and model facial expressions with a neural deformation field.
Our representation achieves high-fidelity local detail by introducing an ensemble of local fields centered around facial anchor points.
arXiv Detail & Related papers (2022-12-06T05:24:42Z)
- NeuralReshaper: Single-image Human-body Retouching with Deep Neural Networks [50.40798258968408]
We present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks.
Our approach follows a fit-then-reshape pipeline, which first fits a parametric 3D human model to a source human image.
To deal with the lack of paired training data, we introduce a novel self-supervised strategy to train our network.
arXiv Detail & Related papers (2022-03-20T09:02:13Z)
- Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887]
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes.
Our method combines the strength of both neural network-based and conventional scene synthesis approaches.
arXiv Detail & Related papers (2021-08-30T19:45:07Z)
- H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction [27.66008315400462]
Recent learning approaches that implicitly represent surface geometry have shown impressive results in the problem of multi-view 3D reconstruction.
We tackle these limitations for the specific problem of few-shot full 3D head reconstruction.
We learn a shape model of 3D heads from thousands of incomplete raw scans using implicit representations.
arXiv Detail & Related papers (2021-07-26T23:04:18Z)
- Neural Actor: Neural Free-view Synthesis of Human Actors with Pose Control [80.79820002330457]
We propose a new method for high-quality synthesis of humans from arbitrary viewpoints and under arbitrary controllable poses.
Our method achieves better quality than the state of the art on playback as well as novel pose synthesis, and can even generalize well to new poses that starkly differ from the training poses.
arXiv Detail & Related papers (2021-06-03T17:40:48Z)
- Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation [75.59415852802958]
Shape-My-Face (SMF) is a powerful encoder-decoder architecture based on an improved point cloud encoder, a novel visual attention mechanism, graph convolutional decoders with skip connections, and a specialized mouth model.
Our model provides topologically-sound meshes with minimal supervision, offers faster training time, has orders of magnitude fewer trainable parameters, is more robust to noise, and can generalize to previously unseen datasets.
arXiv Detail & Related papers (2020-12-16T20:02:36Z) - Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people, given an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.