MonoNPHM: Dynamic Head Reconstruction from Monocular Videos
- URL: http://arxiv.org/abs/2312.06740v2
- Date: Wed, 29 May 2024 20:30:06 GMT
- Title: MonoNPHM: Dynamic Head Reconstruction from Monocular Videos
- Authors: Simon Giebenhain, Tobias Kirschstein, Markos Georgopoulos, Martin Rünz, Lourdes Agapito, Matthias Nießner
- Abstract summary: We present Monocular Neural Parametric Head Models (MonoNPHM) for dynamic 3D head reconstructions from monocular RGB videos.
We constrain predicted color values to be correlated with the underlying geometry such that gradients from RGB effectively influence latent geometry codes during inverse rendering.
- Score: 47.504979561265536
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Monocular Neural Parametric Head Models (MonoNPHM) for dynamic 3D head reconstructions from monocular RGB videos. To this end, we propose a latent appearance space that parameterizes a texture field on top of a neural parametric model. We constrain predicted color values to be correlated with the underlying geometry such that gradients from RGB effectively influence latent geometry codes during inverse rendering. To increase the representational capacity of our expression space, we augment our backward deformation field with hyper-dimensions, thus improving color and geometry representation in topologically challenging expressions. Using MonoNPHM as a learned prior, we approach the task of 3D head reconstruction using signed distance field-based volumetric rendering. By numerically inverting our backward deformation field, we incorporate a landmark loss using facial anchor points that are closely tied to our canonical geometry representation. To evaluate the task of dynamic face reconstruction from monocular RGB videos, we record 20 challenging Kinect sequences under casual conditions. MonoNPHM outperforms all baselines by a significant margin, and makes an important step towards easily accessible neural parametric face models through RGB tracking.
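The pipeline the abstract describes can be pictured as one differentiable ray march: observed-space samples are warped backward into a canonical space (augmented with hyper-dimensions), a canonical SDF and texture field are queried there, and the signed distance is converted to density for volume rendering. The sketch below illustrates that flow with toy stand-in fields; every architecture, dimension, and constant is an assumption for illustration, not the authors' implementation, and the SDF-to-density mapping follows the common VolSDF-style Laplace CDF rather than the paper's exact choice.

```python
import torch

def backward_deform(x_obs, z_exp):
    # Observed-space point -> canonical point, augmented with hyper-dimensions
    # (a tiny closed-form stand-in for the learned deformation MLP).
    offset = 0.05 * torch.tanh(x_obs + z_exp[:3])
    hyper = 0.01 * z_exp[3:5].expand(x_obs.shape[0], 2)   # extra canonical coords
    return torch.cat([x_obs - offset, hyper], dim=-1)

def sdf_field(x_can, z_geo):
    # Canonical geometry: a sphere whose radius depends on the geometry code.
    return x_can[:, :3].norm(dim=-1) - (0.5 + 0.1 * torch.tanh(z_geo[0]))

def color_field(x_can, normal, z_app):
    # Color conditioned on the SDF normal, so RGB gradients reach the geometry.
    return torch.sigmoid(x_can[:, :3] + normal + z_app[:3])

def render_ray(o, d, z_geo, z_exp, z_app, n=64, alpha=50.0, beta=0.02):
    t = torch.linspace(0.05, 2.0, n)
    x = (o + t[:, None] * d).requires_grad_(True)         # samples along the ray
    x_can = backward_deform(x, z_exp)
    sdf = sdf_field(x_can, z_geo)
    normal = torch.autograd.grad(sdf.sum(), x, create_graph=True)[0]
    rgb = color_field(x_can, normal, z_app)
    # VolSDF-style Laplace CDF maps signed distance to volume density.
    dens = alpha * torch.where(sdf > 0, 0.5 * torch.exp(-sdf / beta),
                               1.0 - 0.5 * torch.exp(sdf / beta))
    trans = torch.exp(-torch.cumsum(dens * (t[1] - t[0]), dim=0))  # transmittance
    w = trans * (1.0 - torch.exp(-dens * (t[1] - t[0])))           # sample weights
    return (w[:, None] * rgb).sum(dim=0)                           # composited RGB

z_geo, z_exp, z_app = (torch.randn(8, requires_grad=True) for _ in range(3))
c = render_ray(torch.tensor([0., 0., -1.5]), torch.tensor([0., 0., 1.]),
               z_geo, z_exp, z_app)
((c - torch.tensor([0.5, 0.4, 0.3])) ** 2).sum().backward()  # photometric loss
print(z_geo.grad.norm() > 0)  # RGB gradients reached the geometry code
```

Because the color field is conditioned on the SDF gradient, the photometric loss at the end produces a non-trivial gradient on the geometry code, which is the geometry-color coupling the abstract emphasizes.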
Related papers
- Normal-guided Detail-Preserving Neural Implicit Functions for High-Fidelity 3D Surface Reconstruction [6.4279213810512665]
Current methods for learning neural implicit representations from RGB or RGBD images produce 3D surfaces with missing parts and details.
This paper demonstrates that training neural representations with first-order differential properties, i.e. surface normals, leads to highly accurate 3D surface reconstruction.
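As a rough illustration of such first-order supervision, the sketch below fits a small SDF network to surface points while penalizing the mismatch between the autograd gradient of the SDF and ground-truth normals, alongside a standard eikonal term. The network size, loss weights, and toy sphere data are assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Small SDF network; architecture and loss weights are assumed for illustration.
sdf = nn.Sequential(nn.Linear(3, 128), nn.Softplus(beta=100),
                    nn.Linear(128, 128), nn.Softplus(beta=100),
                    nn.Linear(128, 1))
opt = torch.optim.Adam(sdf.parameters(), lr=1e-3)

# Toy supervision: points on the unit sphere, whose normal equals the position.
pts = F.normalize(torch.randn(256, 3), dim=-1)
normals_gt = pts.clone()

for step in range(200):
    p = pts.clone().requires_grad_(True)
    d = sdf(p)
    g = torch.autograd.grad(d.sum(), p, create_graph=True)[0]
    loss_surf = d.abs().mean()                          # zero the level set
    loss_normal = (g - normals_gt).norm(dim=-1).mean()  # first-order supervision
    free = (torch.rand(256, 3) * 2 - 1).requires_grad_(True)
    gf = torch.autograd.grad(sdf(free).sum(), free, create_graph=True)[0]
    loss_eik = ((gf.norm(dim=-1) - 1) ** 2).mean()      # eikonal regularizer
    loss = loss_surf + 0.5 * loss_normal + 0.1 * loss_eik
    opt.zero_grad(); loss.backward(); opt.step()
```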
arXiv Detail & Related papers (2024-06-07T11:48:47Z)
- MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video [14.678582015968916]
We introduce MorpheuS, a framework for dynamic 360° surface reconstruction from a casually captured RGB-D video.
Our approach models the target scene as a canonical field that encodes its geometry and appearance.
We leverage a view-dependent diffusion prior and distill knowledge from it to achieve realistic completion of unobserved regions.
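Distilling knowledge from a diffusion prior is, in spirit, a score-distillation update. The sketch below shows a generic DreamFusion-style SDS step on a directly optimized image, with a stand-in epsilon-prediction network and a toy noise schedule; it only illustrates the distillation mechanism and is not MorpheuS's actual prior or formulation.

```python
import torch
import torch.nn as nn

prior = nn.Conv2d(3, 3, 3, padding=1)             # stand-in eps-prediction net
image = nn.Parameter(torch.rand(1, 3, 64, 64))    # differentiable novel view

alphas_bar = torch.linspace(0.9999, 0.01, 1000)   # toy noise schedule (assumed)
t = torch.randint(100, 900, (1,))
a = alphas_bar[t].view(1, 1, 1, 1)
noise = torch.randn_like(image)
noisy = a.sqrt() * image + (1 - a).sqrt() * noise # forward diffusion
with torch.no_grad():
    eps_pred = prior(noisy)                       # prior's denoising guess
# SDS surrogate: its gradient w.r.t. the image is (eps_pred - noise),
# pushing the rendering toward what the prior considers plausible.
sds_grad = (eps_pred - noise).detach()
(sds_grad * image).sum().backward()
print(image.grad.abs().mean())
```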
arXiv Detail & Related papers (2023-12-01T18:55:53Z)
- DynamicSurf: Dynamic Neural RGB-D Surface Reconstruction with an Optimizable Feature Grid [7.702806654565181]
DynamicSurf is a model-free neural implicit surface reconstruction method for high-fidelity 3D modelling of non-rigid surfaces from monocular RGB-D video.
We learn a neural deformation field that maps a canonical representation of the surface geometry to the current frame.
We demonstrate it can optimize sequences of varying frames with a $6\times$ speedup over pure MLP-based approaches.
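One plausible reading of a neural deformation field driven by an optimizable feature grid is sketched below: a learnable 3D feature volume is trilinearly sampled at each query point and decoded by a small MLP into an offset toward the canonical surface. Grid resolution, feature width, and the time conditioning are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

grid = nn.Parameter(torch.zeros(1, 16, 32, 32, 32))   # learnable feature volume
decoder = nn.Sequential(nn.Linear(16 + 3 + 1, 64), nn.ReLU(), nn.Linear(64, 3))

def deform_to_canonical(x, t):
    # x: (N, 3) points in [-1, 1]^3 for the current frame; t: frame time.
    g = x.view(1, -1, 1, 1, 3)                         # layout grid_sample expects
    feat = F.grid_sample(grid, g, align_corners=True)  # (1, 16, N, 1, 1), trilinear
    feat = feat.view(16, -1).t()                       # (N, 16) per-point features
    t_col = torch.full((x.shape[0], 1), float(t))
    return x + decoder(torch.cat([feat, x, t_col], dim=-1))  # canonical coords

x_frame = torch.rand(1024, 3) * 2 - 1
print(deform_to_canonical(x_frame, t=0.5).shape)       # torch.Size([1024, 3])
```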
arXiv Detail & Related papers (2023-11-14T13:39:01Z)
- DiViNeT: 3D Reconstruction from Disparate Views via Neural Template Regularization [7.488962492863031]
We present a volume rendering-based neural surface reconstruction method that takes as few as three disparate RGB images as input.
Our key idea is to regularize the reconstruction, which is severely ill-posed given the significant gaps left between the sparse views.
Our approach achieves the best reconstruction quality among existing methods in the presence of such sparse views.
arXiv Detail & Related papers (2023-06-07T18:05:14Z)
- Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos [47.94545609011594]
We propose a method to learn a high-quality implicit 3D head avatar from a monocular RGB video captured in the wild.
Our hybrid pipeline combines the geometry prior and dynamic tracking of a 3DMM with a neural radiance field to achieve fine-grained control and photorealism.
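A minimal way to picture such a 3DMM/NeRF hybrid is a radiance field conditioned on per-frame tracked 3DMM expression parameters, as sketched below; the positional encoding, expression dimensionality, and MLP are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

def posenc(x, n_freq=6):
    # Standard NeRF positional encoding of 3D points -> (N, 3 * 2 * n_freq).
    freqs = 2.0 ** torch.arange(n_freq) * torch.pi
    ang = x[..., None] * freqs                       # (N, 3, n_freq)
    return torch.cat([ang.sin(), ang.cos()], dim=-1).flatten(1)

n_exp = 10                                           # assumed 3DMM expression dims
field = nn.Sequential(nn.Linear(36 + n_exp, 128), nn.ReLU(),
                      nn.Linear(128, 4))             # -> (r, g, b, density)

x = torch.rand(4096, 3)                              # points sampled along rays
expr = torch.randn(n_exp).expand(4096, n_exp)        # per-frame tracked expression
out = field(torch.cat([posenc(x), expr], dim=-1))
rgb, sigma = out[:, :3].sigmoid(), out[:, 3:].relu()
```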
arXiv Detail & Related papers (2023-04-04T01:10:04Z)
- NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects [63.04781030984006]
Dynamic Neural Radiance Field (NeRF) is a powerful algorithm capable of rendering photo-realistic novel view images from a monocular RGB video of a dynamic scene.
We address a limitation of such models on moving specular surfaces by reformulating the neural radiance field function to be conditioned on surface position and orientation in the observation space.
We evaluate our model based on the novel view synthesis quality with a self-collected dataset of different moving specular objects in realistic environments.
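The reformulation described above can be sketched as a color branch that receives observation-space surface position and orientation; below, the orientation is taken from the negated, normalized density gradient. The toy networks and dimensions are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

density_net = nn.Sequential(nn.Linear(3, 64), nn.Softplus(), nn.Linear(64, 1))
color_net = nn.Sequential(nn.Linear(3 + 3 + 3, 64), nn.ReLU(),
                          nn.Linear(64, 3), nn.Sigmoid())

x_obs = torch.rand(512, 3, requires_grad=True)       # observation-space samples
view_dir = F.normalize(torch.randn(512, 3), dim=-1)

sigma = density_net(x_obs)
# Surface orientation from the (negated, normalized) density gradient.
normal = -F.normalize(torch.autograd.grad(sigma.sum(), x_obs,
                                          create_graph=True)[0], dim=-1)
# Color conditioned on observation-space position and orientation.
rgb = color_net(torch.cat([x_obs, normal, view_dir], dim=-1))
```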
arXiv Detail & Related papers (2023-03-25T11:03:53Z)
- NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos [82.74918564737591]
We present a method for learning 3D geometry and physics parameters of a dynamic scene from only a monocular RGB video input.
Experiments show that our method achieves superior mesh and video reconstruction of dynamic scenes compared to competing Neural Field approaches.
arXiv Detail & Related papers (2022-10-22T04:57:55Z)
- Learning Dynamic View Synthesis With Few RGBD Cameras [60.36357774688289]
We propose to utilize RGBD cameras to synthesize free-viewpoint videos of dynamic indoor scenes.
We generate point clouds from RGBD frames and then render them into free-viewpoint videos via a neural renderer.
We introduce a simple Regional Depth-Inpainting module that adaptively inpaints missing depth values to render complete novel views.
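The first stage, generating point clouds from RGBD frames, is plain pinhole unprojection; a sketch with assumed intrinsics follows (the neural rendering and depth-inpainting stages are not reproduced here).

```python
import torch

H, W = 480, 640
fx = fy = 525.0
cx, cy = W / 2, H / 2                           # assumed pinhole intrinsics

depth = torch.rand(H, W) * 3.0                  # stand-in depth map (meters)
rgb = torch.rand(H, W, 3)                       # stand-in color image

v, u = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                      torch.arange(W, dtype=torch.float32), indexing="ij")
z = depth
x = (u - cx) / fx * z                           # back-project pixel -> camera
y = (v - cy) / fy * z
points = torch.stack([x, y, z], dim=-1).reshape(-1, 3)
colors = rgb.reshape(-1, 3)
valid = points[:, 2] > 0                        # drop missing-depth pixels
points, colors = points[valid], colors[valid]
print(points.shape)                             # colored point cloud
```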
arXiv Detail & Related papers (2022-04-22T03:17:35Z)
- Learning Deformable Tetrahedral Meshes for 3D Reconstruction [78.0514377738632]
3D shape representations that accommodate learning-based 3D reconstruction are an open problem in machine learning and computer graphics.
Previous work on neural 3D reconstruction demonstrated benefits, but also limitations, of point cloud, voxel, surface mesh, and implicit function representations.
We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes volumetric tetrahedral meshes for the reconstruction problem.
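The DefTet parameterization can be pictured as fixed tetrahedral connectivity with bounded, learnable vertex offsets plus a per-tetrahedron occupancy; the single hand-built tetrahedron, the offset bound, and the toy objective below are assumptions used only to make the idea concrete.

```python
import torch

verts = torch.tensor([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
tets = torch.tensor([[0, 1, 2, 3]])                     # fixed connectivity
offsets = torch.zeros_like(verts, requires_grad=True)   # learned deformation
occupancy = torch.zeros(len(tets), requires_grad=True)  # learned per-tet logit

def tet_volume(v, t):
    # Signed volume of each tetrahedron: det([b-a, c-a, d-a]) / 6.
    a, b, c, d = (v[t[:, i]] for i in range(4))
    return torch.det(torch.stack([b - a, c - a, d - a], dim=-2)) / 6.0

deformed = verts + 0.1 * torch.tanh(offsets)    # bounded offsets keep tets valid
vol = tet_volume(deformed, tets)
loss = (vol - 0.2).pow(2).sum() + occupancy.sigmoid().mean()  # toy objectives
loss.backward()                      # gradients reach both offsets and occupancy
```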
arXiv Detail & Related papers (2020-11-03T02:57:01Z)