Animatable 3D Gaussian: Fast and High-Quality Reconstruction of Multiple
Human Avatars
- URL: http://arxiv.org/abs/2311.16482v2
- Date: Wed, 29 Nov 2023 11:02:47 GMT
- Title: Animatable 3D Gaussian: Fast and High-Quality Reconstruction of Multiple
Human Avatars
- Authors: Yang Liu, Xiang Huang, Minghan Qin, Qinwei Lin, Haoqian Wang
- Abstract summary: We propose Animatable 3D Gaussian, which learns human avatars from input images and poses.
On both novel view synthesis and novel pose synthesis tasks, our method outperforms existing methods in terms of training time, rendering speed, and reconstruction quality.
Our method can be easily extended to multi-human scenes and achieve comparable novel view synthesis results on a scene with ten people in only 25 seconds of training.
- Score: 19.90509999642198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural radiance fields are capable of reconstructing high-quality drivable
human avatars but are expensive to train and render. To reduce consumption, we
propose Animatable 3D Gaussian, which learns human avatars from input images
and poses. We extend 3D Gaussians to dynamic human scenes by modeling a set of
skinned 3D Gaussians and a corresponding skeleton in canonical space and
deforming 3D Gaussians to posed space according to the input poses. We
introduce hash-encoded shape and appearance to speed up training and propose
time-dependent ambient occlusion to achieve high-quality reconstructions in
scenes containing complex motions and dynamic shadows. On both novel view
synthesis and novel pose synthesis tasks, our method outperforms existing
methods in terms of training time, rendering speed, and reconstruction quality.
Our method can be easily extended to multi-human scenes and achieve comparable
novel view synthesis results on a scene with ten people in only 25 seconds of
training.
Related papers
- NECA: Neural Customizable Human Avatar [36.69012172745299]
We introduce NECA, an approach capable of learning versatile human representation from monocular or sparse-view videos.
The core of our approach is to represent humans in complementary dual spaces and predict disentangled neural fields of geometry, albedo, shadow, as well as an external lighting.
arXiv Detail & Related papers (2024-03-15T14:23:06Z) - GVA: Reconstructing Vivid 3D Gaussian Avatars from Monocular Videos [56.40776739573832]
We present a novel method that facilitates the creation of vivid 3D Gaussian avatars from monocular video inputs (GVA)
Our innovation lies in addressing the intricate challenges of delivering high-fidelity human body reconstructions.
We introduce a pose refinement technique to improve hand and foot pose accuracy by aligning normal maps and silhouettes.
arXiv Detail & Related papers (2024-02-26T14:40:15Z) - Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images.
Our avatars learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds while inferring full-resolution images efficiently even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z) - 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting [32.63571465495127]
We introduce an approach that creates animatable human avatars from monocular videos using 3D Gaussian Splatting (3DGS)
We learn a non-rigid network to reconstruct animatable clothed human avatars that can be trained within 30 minutes and rendered at real-time frame rates (50+ FPS)
Experimental results show that our method achieves comparable and even better performance compared to state-of-the-art approaches on animatable avatar creation from a monocular input.
arXiv Detail & Related papers (2023-12-14T18:54:32Z) - ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering [62.81677824868519]
We propose an animatable Gaussian splatting approach for photorealistic rendering of dynamic humans in real-time.
We parameterize the clothed human as animatable 3D Gaussians, which can be efficiently splatted into image space to generate the final rendering.
We benchmark ASH with competing methods on pose-controllable avatars, demonstrating that our method outperforms existing real-time methods by a large margin and shows comparable or even better results than offline methods.
arXiv Detail & Related papers (2023-12-10T17:07:37Z) - GauHuman: Articulated Gaussian Splatting from Monocular Human Videos [58.553979884950834]
GauHuman is a 3D human model with Gaussian Splatting for both fast training (1 2 minutes) and real-time rendering (up to 189 FPS)
GauHuman encodes Gaussian Splatting in the canonical space and transforms 3D Gaussians from canonical space to posed space with linear blend skinning (LBS)
Experiments on ZJU_Mocap and MonoCap datasets demonstrate that GauHuman achieves state-of-the-art performance quantitatively and qualitatively with fast training and real-time rendering speed.
arXiv Detail & Related papers (2023-12-05T18:59:14Z) - Human Gaussian Splatting: Real-time Rendering of Animatable Avatars [8.719797382786464]
This work addresses the problem of real-time rendering of photorealistic human body avatars learned from multi-view videos.
We propose an animatable human model based on 3D Gaussian Splatting, that has recently emerged as a very efficient alternative to neural radiance fields.
Our method achieves 1.5 dB PSNR improvement over the state-of-the-art on THuman4 dataset while being able to render in real-time (80 fps for 512x512 resolution)
arXiv Detail & Related papers (2023-11-28T12:05:41Z) - Efficient Meshy Neural Fields for Animatable Human Avatars [87.68529918184494]
Efficiently digitizing high-fidelity animatable human avatars from videos is a challenging and active research topic.
Recent rendering-based neural representations open a new way for human digitization with their friendly usability and photo-varying reconstruction quality.
We present EMA, a method that Efficiently learns Meshy neural fields to reconstruct animatable human Avatars.
arXiv Detail & Related papers (2023-03-23T00:15:34Z) - HDHumans: A Hybrid Approach for High-fidelity Digital Humans [107.19426606778808]
HDHumans is the first method for HD human character synthesis that jointly produces an accurate and temporally coherent 3D deforming surface.
Our method is carefully designed to achieve a synergy between classical surface deformation and neural radiance fields (NeRF)
arXiv Detail & Related papers (2022-10-21T14:42:11Z) - Learning Compositional Radiance Fields of Dynamic Human Heads [13.272666180264485]
We propose a novel compositional 3D representation that combines the best of previous methods to produce both higher-resolution and faster results.
Differentiable volume rendering is employed to compute photo-realistic novel views of the human head and upper body.
Our approach achieves state-of-the-art results for synthesizing novel views of dynamic human heads and the upper body.
arXiv Detail & Related papers (2020-12-17T22:19:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.