Dual-Space NeRF: Learning Animatable Avatars and Scene Lighting in
Separate Spaces
- URL: http://arxiv.org/abs/2208.14851v1
- Date: Wed, 31 Aug 2022 13:35:04 GMT
- Title: Dual-Space NeRF: Learning Animatable Avatars and Scene Lighting in
Separate Spaces
- Authors: Yihao Zhi, Shenhan Qian, Xinhao Yan, Shenghua Gao
- Abstract summary: We propose a dual-space NeRF that models the scene lighting and the human body with two MLPs in two separate spaces.
To bridge these two spaces, previous methods mostly rely on the linear blend skinning (LBS) algorithm.
We propose to use barycentric mapping, which directly generalizes to unseen poses and surprisingly achieves better results than LBS with neural blending weights.
- Score: 28.99602069185613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modeling the human body in a canonical space is a common practice for
capture and animation. But when a neural radiance field (NeRF) is involved,
learning a static NeRF in the canonical space is not enough, because the
lighting on the body changes as the person moves even though the scene
lighting is constant. Previous methods alleviate the inconsistency of lighting
by learning a per-frame embedding, but this operation does not generalize to
unseen poses. Given that the lighting condition is static in the world space
while the human body is consistent in the canonical space, we propose a
dual-space NeRF that models the scene lighting and the human body with two MLPs
in two separate spaces. To bridge these two spaces, previous methods mostly
rely on the linear blend skinning (LBS) algorithm. However, the blending
weights for LBS of a dynamic neural field are intractable and thus are usually
memorized with another MLP, which does not generalize to novel poses. Although
it is possible to borrow the blending weights of a parametric mesh such as
SMPL, the interpolation operation introduces more artifacts. In this paper, we
propose to use barycentric mapping, which directly generalizes to unseen
poses and surprisingly achieves better results than LBS with neural blending
weights. Quantitative and qualitative results on the Human3.6M and the
ZJU-MoCap datasets show the effectiveness of our method.
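To make the comparison concrete, below is a minimal sketch of the two space-bridging strategies the abstract contrasts: forward LBS with externally supplied blending weights (e.g., predicted by an MLP or borrowed from SMPL) and a barycentric mapping through a posed template mesh. This is an illustrative toy, not the authors' implementation; the mesh, bone transforms, weights, and all function names are placeholder assumptions.

```python
# Toy sketch (not the paper's code): LBS with supplied blending weights
# vs. barycentric mapping through a posed template mesh.
import numpy as np

def lbs(x_c, bone_transforms, weights):
    """Forward linear blend skinning: x_p = (sum_j w_j * G_j) * x_c."""
    x_h = np.append(x_c, 1.0)                                    # homogeneous point
    blended = np.einsum("j,jkl->kl", weights, bone_transforms)   # weighted 4x4
    return (blended @ x_h)[:3]

def barycentric_coords(p, a, b, c):
    """Barycentric coordinates of p's projection onto triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return np.array([1.0 - v - w, v, w])

def barycentric_map(x_world, posed_verts, canon_verts, faces):
    """Carry a world-space point to canonical space via the nearest mesh face.
    The nearest face is picked by centroid distance here; a real implementation
    would use exact point-to-triangle distance and clamp the coordinates."""
    centroids = posed_verts[faces].mean(axis=1)
    f = faces[np.argmin(np.linalg.norm(centroids - x_world, axis=1))]
    coords = barycentric_coords(x_world, *posed_verts[f])
    return coords @ canon_verts[f]        # same coords, canonical-space triangle

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "mesh": 4 vertices, 2 faces, in posed and canonical space.
    posed_verts = rng.normal(size=(4, 3))
    canon_verts = posed_verts + np.array([0.0, 0.1, 0.0])
    faces = np.array([[0, 1, 2], [1, 2, 3]])
    # Toy bones: two rigid 4x4 transforms and per-point blending weights
    # (in the paper's setting such weights come from an MLP or from SMPL).
    bones = np.stack([np.eye(4), np.eye(4)])
    bones[1, :3, 3] = [0.0, 0.0, 0.2]
    weights = np.array([0.7, 0.3])
    x = np.array([0.1, 0.2, 0.3])
    print("LBS:        ", lbs(x, bones, weights))
    print("Barycentric:", barycentric_map(x, posed_verts, canon_verts, faces))
```

In the paper's dual-space setting, the body MLP would be queried in canonical space and the lighting MLP in world space; the sketch above only illustrates how a sampled point can be carried between the two spaces.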
Related papers
- NECA: Neural Customizable Human Avatar [36.69012172745299]
We introduce NECA, an approach capable of learning a versatile human representation from monocular or sparse-view videos.
The core of our approach is to represent humans in complementary dual spaces and predict disentangled neural fields of geometry, albedo, shadow, and external lighting.
arXiv Detail & Related papers (2024-03-15T14:23:06Z) - Relightable Neural Actor with Intrinsic Decomposition and Pose Control [80.06094206522668]
We propose Relightable Neural Actor, a new video-based method for learning a pose-driven neural human model that can be relit.
For training, our method solely requires a multi-view recording of the human under a known, but static lighting condition.
To evaluate our approach in real-world scenarios, we collect a new dataset with four identities recorded under different light conditions, indoors and outdoors.
arXiv Detail & Related papers (2023-12-18T14:30:13Z) - VINECS: Video-based Neural Character Skinning [82.39776643541383]
We propose a fully automated approach for creating a fully rigged character with pose-dependent skinning weights.
We show that our approach outperforms the state of the art while not relying on dense 4D scans.
arXiv Detail & Related papers (2023-07-03T08:35:53Z) - DiFaReli: Diffusion Face Relighting [13.000032155650835]
We present a novel approach to single-view face relighting in the wild.
Handling non-diffuse effects, such as global illumination or cast shadows, has long been a challenge in face relighting.
We achieve state-of-the-art performance on the standard Multi-PIE benchmark and can photorealistically relight in-the-wild images.
arXiv Detail & Related papers (2023-04-19T08:03:20Z) - MoDA: Modeling Deformable 3D Objects from Casual Videos [84.29654142118018]
We propose neural dual quaternion blend skinning (NeuDBS) to achieve 3D point deformation without skin-collapsing artifacts.
To register 2D pixels across different frames, we establish correspondences through canonical feature embeddings that encode 3D points within the canonical space.
Our approach can reconstruct 3D models for humans and animals with better qualitative and quantitative performance than state-of-the-art methods.
arXiv Detail & Related papers (2023-04-17T13:49:04Z) - DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human
Avatars [7.777410338143783]
We present an approach for creating realistic rigged full-body avatars from a single RGB image.
Our method uses neural textures combined with the SMPL-X body model to achieve photo-realistic quality of avatars.
In the experiments, our approach achieves state-of-the-art rendering quality and good generalization to new poses and viewpoints.
arXiv Detail & Related papers (2023-03-16T15:04:10Z) - SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks [54.94737477860082]
We present an end-to-end trainable framework that takes raw 3D scans of a clothed human and turns them into an animatable avatar.
SCANimate does not rely on a customized mesh template or surface mesh registration.
Our method can be applied to pose-aware appearance modeling to generate a fully textured avatar.
arXiv Detail & Related papers (2021-04-07T17:59:58Z) - STAR: Sparse Trained Articulated Human Body Regressor [62.71047277943326]
We introduce STAR, which is quantitatively and qualitatively superior to SMPL.
SMPL has a huge number of parameters resulting from its use of global blend shapes.
SMPL factors pose-dependent deformations from body shape while, in reality, people with different shapes deform differently.
We show that the shape space of SMPL is not rich enough to capture the variation in the human population.
arXiv Detail & Related papers (2020-08-19T16:27:55Z) - Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A
Geometric Approach [76.10879433430466]
We propose to estimate 3D human pose from multi-view images and a few IMUs attached to the person's limbs.
It operates by first detecting 2D poses from the two signals and then lifting them to 3D space.
The simple two-step approach reduces the error of the state-of-the-art by a large margin on a public dataset.
arXiv Detail & Related papers (2020-03-25T00:26:54Z)