NECA: Neural Customizable Human Avatar
- URL: http://arxiv.org/abs/2403.10335v1
- Date: Fri, 15 Mar 2024 14:23:06 GMT
- Title: NECA: Neural Customizable Human Avatar
- Authors: Junjin Xiao, Qing Zhang, Zhan Xu, Wei-Shi Zheng
- Abstract summary: We introduce NECA, an approach capable of learning a versatile human representation from monocular or sparse-view videos.
The core of our approach is to represent humans in complementary dual spaces and to predict disentangled neural fields of geometry, albedo, shadow, and external lighting.
- Score: 36.69012172745299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human avatars have become a novel type of 3D asset with various applications. Ideally, a human avatar should be fully customizable to accommodate different settings and environments. In this work, we introduce NECA, an approach capable of learning a versatile human representation from monocular or sparse-view videos, enabling granular customization across aspects such as pose, shadow, shape, lighting and texture. The core of our approach is to represent humans in complementary dual spaces and to predict disentangled neural fields of geometry, albedo, shadow, and external lighting, from which we derive realistic renderings with high-frequency details via volumetric rendering. Extensive experiments demonstrate the advantage of our method over state-of-the-art methods in photorealistic rendering, as well as in various editing tasks such as novel pose synthesis and relighting. The code is available at https://github.com/iSEE-Laboratory/NECA.
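The pipeline the abstract describes (disentangled per-point fields composited by volume rendering) can be illustrated with a short, self-contained sketch. Everything below is an assumption made for illustration: the toy fields, the single-light Lambertian shading, and all names are placeholders, not the authors' implementation.

```python
import numpy as np

# Minimal sketch of rendering with disentangled fields (geometry, albedo,
# shadow, lighting) composited by standard volume rendering. The "fields"
# here are toy closed-form functions standing in for neural networks.

def density_field(x):   # geometry: a soft sphere of radius 0.5
    return 20.0 * np.exp(-8.0 * np.maximum(np.linalg.norm(x, axis=-1) - 0.5, 0.0))

def albedo_field(x):    # per-point base color in [0, 1]
    return 0.5 + 0.5 * np.sin(4.0 * x)

def shadow_field(x):    # per-point shadow/visibility scalar in [0, 1]
    return np.clip(0.5 + x[..., 1], 0.0, 1.0)

def render_ray(origin, direction, light_rgb, n_samples=64, t_near=0.0, t_far=2.0):
    """Alpha-composite shaded samples along one ray (standard volume rendering)."""
    t = np.linspace(t_near, t_far, n_samples)
    delta = t[1] - t[0]
    pts = origin + t[:, None] * direction                 # (n_samples, 3)
    sigma = density_field(pts)                            # volume density
    rgb = albedo_field(pts) * shadow_field(pts)[:, None] * light_rgb  # shaded color
    alpha = 1.0 - np.exp(-sigma * delta)                  # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)           # composited pixel color

pixel = render_ray(np.array([0.0, 0.0, -1.5]), np.array([0.0, 0.0, 1.0]),
                   light_rgb=np.array([1.0, 0.9, 0.8]))
print(pixel)
```

Because the factors are kept separate, relighting amounts to swapping `light_rgb` and retexturing to swapping `albedo_field`, which is the kind of granular customization the abstract claims.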
Related papers
- Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars that requires significantly fewer training views and images.
Our avatar learning is free of additional annotations such as Splat masks, and can be trained with variable backgrounds while efficiently inferring full-resolution images even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z)
- Reality's Canvas, Language's Brush: Crafting 3D Avatars from Monocular Video [14.140380599168628]
ReCaLaB is a pipeline that learns high-fidelity 3D human avatars from just a single RGB video.
A pose-conditioned NeRF is optimized to volumetrically represent a human subject in canonical T-pose.
An image-conditioned diffusion model then helps animate the appearance and pose of the 3D avatar to create video sequences with previously unseen human motion.
arXiv Detail & Related papers (2023-12-08T01:53:06Z)
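The pose-conditioned, canonical-space NeRF summarized above follows a common pattern in animatable avatars: warp each posed point back to the canonical T-pose, then query a canonical field there. Below is a minimal sketch of that pattern using inverse linear blend skinning; the bone transforms, skinning weights, and field are invented placeholders, not ReCaLaB's actual model.

```python
import numpy as np

# Toy sketch: map an observation-space point to canonical space by inverting
# the blended bone transform (assumes the skinning weights at the posed point
# are known), then query a pose-conditioned canonical field there.

def inverse_lbs(x_obs, bone_transforms, skin_weights):
    """Undo linear blend skinning for one point (4x4 homogeneous transforms)."""
    blended = sum(w * T for w, T in zip(skin_weights, bone_transforms))
    return (np.linalg.inv(blended) @ np.append(x_obs, 1.0))[:3]

def canonical_field(x_can, pose_code):
    """Placeholder pose-conditioned field returning (density, rgb)."""
    feat = np.concatenate([x_can, pose_code])
    density = float(np.exp(-np.sum(feat ** 2)))   # toy density
    rgb = 0.5 + 0.5 * np.tanh(feat[:3])           # toy view-independent color
    return density, rgb

# One bone rotated 30 degrees about z; the second bone is the identity.
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
rot_z = np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1.0]])
x_canonical = inverse_lbs(np.array([0.3, 0.1, 0.0]), [rot_z, np.eye(4)], [0.7, 0.3])
print(canonical_field(x_canonical, pose_code=np.zeros(4)))
```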
- Animatable 3D Gaussian: Fast and High-Quality Reconstruction of Multiple Human Avatars [18.55354901614876]
We propose Animatable 3D Gaussian, which learns human avatars from input images and poses.
On both novel view synthesis and novel pose synthesis tasks, our method achieves higher reconstruction quality than InstantAvatar with less training time.
Our method can be easily extended to multi-human scenes and achieve comparable novel view synthesis results on a scene with ten people in only 25 seconds of training.
arXiv Detail & Related papers (2023-11-27T08:17:09Z)
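Animatable Gaussian-splatting avatars keep an explicit set of 3D Gaussians and deform them with the body pose, typically via linear blend skinning, before rasterizing. A toy sketch of just the deformation step follows; the Gaussians, skinning weights, and bone transforms are made up, and the paper's actual parametrization may differ.

```python
import numpy as np

# Toy sketch: pose explicit 3D Gaussian centers with linear blend skinning.

rng = np.random.default_rng(0)
means = rng.normal(scale=0.2, size=(100, 3))     # canonical Gaussian centers
colors = rng.uniform(size=(100, 3))              # per-Gaussian RGB
weights = rng.dirichlet(np.ones(2), size=100)    # skinning weights for 2 bones

def pose_gaussians(means, weights, bone_transforms):
    """Deform canonical centers into the posed space via LBS."""
    h = np.concatenate([means, np.ones((len(means), 1))], axis=1)  # homogeneous
    posed = np.zeros_like(means)
    for b, T in enumerate(bone_transforms):
        posed += weights[:, b:b + 1] * (h @ T.T)[:, :3]
    return posed

c, s = np.cos(0.4), np.sin(0.4)
bend = np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1.0]])
posed_means = pose_gaussians(means, weights, [np.eye(4), bend])
print(posed_means.shape)  # (100, 3): centers ready for splatting/rasterization
```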
- FLARE: Fast Learning of Animatable and Relightable Mesh Avatars [64.48254296523977]
Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems.
We introduce FLARE, a technique that enables the creation of animatable and relightable avatars from a single monocular video.
arXiv Detail & Related papers (2023-10-26T16:13:00Z)
- Learning Locally Editable Virtual Humans [37.95173373011365]
We propose a novel hybrid representation and end-to-end trainable network architecture to model fully editable neural avatars.
At the core of our work lies a representation that combines the modeling power of neural fields with the ease of use and inherent 3D consistency of skinned meshes.
Our method generates diverse, detailed avatars and achieves better model-fitting performance than state-of-the-art methods.
arXiv Detail & Related papers (2023-04-28T23:06:17Z)
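The hybrid representation described above (neural fields anchored to a skinned mesh) is commonly realized as local feature codes stored on mesh vertices, gathered at a query point and decoded by a small network. The sketch below assumes that pattern; the vertices, feature codes, and decoder are illustrative placeholders, not the paper's architecture.

```python
import numpy as np

# Toy sketch: editable local feature codes on mesh vertices, blended for a
# query point and decoded by a stand-in for the neural field.

rng = np.random.default_rng(1)
vertices = rng.uniform(-1, 1, size=(500, 3))   # posed mesh vertices
features = rng.normal(size=(500, 16))          # editable local codes

def query_features(x, k=4):
    """Blend features of the k nearest vertices with inverse-distance weights."""
    d = np.linalg.norm(vertices - x, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-8)
    w /= w.sum()
    return w @ features[idx]

def decode(feat, x):
    """Toy decoder standing in for the neural field (e.g. color at x)."""
    return np.tanh(feat[:3] + x)

x = np.array([0.1, 0.2, 0.0])
print(decode(query_features(x), x))
```

Because the codes live on the mesh, editing the features around chosen vertices changes the avatar locally, and the mesh's skinning carries the edit consistently through new poses.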
- One-shot Implicit Animatable Avatars with Model-based Priors [31.385051428938585]
ELICIT is a novel method for learning human-specific neural radiance fields from a single image.
ELICIT outperforms strong avatar-creation baselines when only a single image is available.
arXiv Detail & Related papers (2022-12-05T18:24:06Z)
- Neural Novel Actor: Learning a Generalized Animatable Neural Representation for Human Actors [98.24047528960406]
We propose a new method for learning a generalized animatable neural representation from a sparse set of multi-view imagery of multiple persons.
The learned representation can be used to synthesize novel-view images of an arbitrary person from a sparse set of cameras, and to further animate the person under user pose control.
arXiv Detail & Related papers (2022-08-25T07:36:46Z)
- DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)
- PINA: Learning a Personalized Implicit Neural Avatar from a Single RGB-D Video Sequence [60.46092534331516]
We present a novel method to learn Personalized Implicit Neural Avatars (PINA) from a short RGB-D sequence.
PINA does not require complete scans, nor does it require a prior learned from large datasets of clothed humans.
We propose a method to learn the shape and non-rigid deformations via a pose-conditioned implicit surface and a deformation field.
arXiv Detail & Related papers (2022-03-03T15:04:55Z)
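The split PINA's summary describes, a deformation field plus a pose-conditioned implicit surface, composes two functions: deform the observed point into canonical space, then evaluate a canonical signed distance there. Both functions below are toy placeholders, not PINA's learned model.

```python
import numpy as np

# Toy sketch: non-rigid deformation into canonical space, followed by a
# canonical SDF query. Real versions of both would be learned networks.

def deformation_field(x_obs, pose_code):
    """Toy non-rigid deformation: small pose-dependent offset."""
    return x_obs + 0.05 * np.sin(pose_code[:3] + x_obs)

def canonical_sdf(x_can):
    """Toy canonical surface: signed distance to a sphere of radius 0.4."""
    return np.linalg.norm(x_can) - 0.4

pose = np.array([0.2, -0.1, 0.3, 0.0])
x = np.array([0.5, 0.0, 0.0])
sdf_value = canonical_sdf(deformation_field(x, pose))
print(sdf_value)  # < 0 inside the surface, > 0 outside, 0 on the surface
```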
- Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting [149.1673041605155]
We address the problem of jointly estimating albedo, normals, depth and 3D spatially-varying lighting from a single image.
Most existing methods formulate the task as image-to-image translation, ignoring the 3D properties of the scene.
We propose a unified, learning-based inverse rendering framework that formulates 3D spatially-varying lighting.
arXiv Detail & Related papers (2021-09-13T15:29:03Z)
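Inverse rendering inverts a forward image-formation model; for a Lambertian scene, that model factors each pixel into albedo times shading computed from normals and lighting. Below is a minimal sketch of the forward direction, with a single global light standing in for the paper's 3D spatially-varying lighting; all values are made up.

```python
import numpy as np

# Toy Lambertian forward model: image = albedo * shading(normals, light).
# Inverse rendering recovers albedo, normals, and lighting from `image`.

h, w = 4, 4
albedo = np.full((h, w, 3), 0.6)                       # diffuse reflectance
normals = np.zeros((h, w, 3))
normals[..., 2] = 1.0                                  # all facing the camera
light_dir = np.array([0.3, 0.5, 0.8])
light_dir /= np.linalg.norm(light_dir)                 # unit light direction
light_rgb = np.array([1.0, 0.95, 0.9])

shading = np.clip(normals @ light_dir, 0.0, None)[..., None] * light_rgb
image = albedo * shading                               # rendered pixels
print(image[0, 0])
```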