AvatarMAV: Fast 3D Head Avatar Reconstruction Using Motion-Aware Neural Voxels
- URL: http://arxiv.org/abs/2211.13206v3
- Date: Wed, 3 May 2023 06:47:11 GMT
- Title: AvatarMAV: Fast 3D Head Avatar Reconstruction Using Motion-Aware Neural Voxels
- Authors: Yuelang Xu, Lizhen Wang, Xiaochen Zhao, Hongwen Zhang, Yebin Liu
- Abstract summary: We propose AvatarMAV, a fast 3D head avatar reconstruction method using Motion-Aware Neural Voxels.
AvatarMAV is the first to model both the canonical appearance and the decoupled expression motion with neural voxels for head avatars.
The proposed AvatarMAV can recover photo-realistic head avatars in just 5 minutes, which is significantly faster than state-of-the-art facial reenactment methods.
- Score: 33.085274792188756
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: With NeRF widely used for facial reenactment, recent methods can
recover a photo-realistic 3D head avatar from just a monocular video.
Unfortunately, the training process of NeRF-based methods is quite
time-consuming, as the MLP they use is inefficient and requires too many
iterations to converge. To overcome this problem, we propose AvatarMAV, a fast
3D head avatar reconstruction method using Motion-Aware Neural Voxels.
AvatarMAV is the first to model both the canonical appearance and the
decoupled expression motion with neural voxels for head avatars. In
particular, the motion-aware neural voxels are generated from the weighted
concatenation of multiple 4D tensors. The 4D tensors correspond one-to-one
with the 3DMM expression basis and share the same weights as the 3DMM
expression coefficients. Benefiting from this novel representation, the
proposed AvatarMAV can recover photo-realistic head avatars in just 5 minutes
(implemented in pure PyTorch), which is significantly faster than
state-of-the-art facial reenactment methods. Project page:
https://www.liuyebin.com/avatarmav.
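To make the representation concrete, below is a minimal PyTorch sketch of one plausible reading of the "weighted concatenation" described in the abstract: each of K learnable 4D voxel tensors (one per 3DMM expression basis) is scaled by its tracked 3DMM expression coefficient and concatenated along the channel axis, then trilinearly sampled at 3D query points. All shapes, names, and the sampling details are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

# Hypothetical dimensions: K expression bases, C feature channels per basis,
# and a D x H x W voxel grid. The real grid resolution is an assumption.
K, C, D, H, W = 10, 4, 32, 32, 32
motion_voxels = torch.randn(K, C, D, H, W)  # learnable 4D tensors, one per basis

def motion_feature_grid(expr_coeffs: torch.Tensor) -> torch.Tensor:
    """Weighted concatenation: scale each basis voxel grid by its 3DMM
    expression coefficient, then stack the results along the channel axis."""
    weighted = expr_coeffs.view(K, 1, 1, 1, 1) * motion_voxels  # (K, C, D, H, W)
    return weighted.reshape(1, K * C, D, H, W)                  # (1, K*C, D, H, W)

def query_motion(points: torch.Tensor, expr_coeffs: torch.Tensor) -> torch.Tensor:
    """Trilinearly sample the motion grid at 3D query points in [-1, 1]^3."""
    grid = motion_feature_grid(expr_coeffs)
    coords = points.view(1, 1, 1, -1, 3)  # grid_sample expects (N, 1, 1, P, 3) for 5D input
    feats = F.grid_sample(grid, coords, align_corners=True)     # (1, K*C, 1, 1, P)
    return feats.view(K * C, -1).t()                            # (P, K*C)

# Example: sample motion features for 1024 ray points under one expression.
expr = torch.randn(K)               # 3DMM expression coefficients from a face tracker
pts = torch.rand(1024, 3) * 2 - 1   # query points in normalized grid coordinates
features = query_motion(pts, expr)
print(features.shape)               # torch.Size([1024, 40])
```

In the paper's pipeline these sampled features would presumably be decoded by a small MLP into a per-point motion that warps queries into the canonical appearance space; because the expensive per-point computation is a grid lookup rather than a deep MLP evaluation, this kind of representation converges in far fewer iterations.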
Related papers
- One2Avatar: Generative Implicit Head Avatar For Few-shot User Adaptation [31.310769289315648]
This paper introduces a novel approach to creating a high-quality head avatar from only a single image or a few images per user.
We learn a generative model for 3D animatable photo-realistic head avatars from a multi-view dataset of expressions from 2407 subjects.
Our method demonstrates compelling results and outperforms existing state-of-the-art methods for few-shot avatar adaptation.
arXiv Detail & Related papers (2024-02-19T07:48:29Z)
- PSAvatar: A Point-based Shape Model for Real-Time Head Avatar Animation with 3D Gaussian Splatting [17.78639236586134]
PSAvatar is a novel framework for animatable head avatar creation.
It employs 3D Gaussians for fine-detail representation and high-fidelity rendering.
We show that PSAvatar can reconstruct high-fidelity head avatars of a variety of subjects and the avatars can be animated in real-time.
arXiv Detail & Related papers (2024-01-23T16:40:47Z)
- Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars that requires significantly fewer training views and images.
Avatar learning requires no additional annotations such as Splat masks, can handle variable backgrounds during training, and infers full-resolution images efficiently even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z)
- 3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting [32.63571465495127]
We introduce an approach that creates animatable human avatars from monocular videos using 3D Gaussian Splatting (3DGS).
We learn a non-rigid deformation network to reconstruct animatable clothed human avatars that can be trained within 30 minutes and rendered at real-time frame rates (50+ FPS).
Experimental results show that our method achieves comparable and even better performance compared to state-of-the-art approaches on animatable avatar creation from a monocular input.
arXiv Detail & Related papers (2023-12-14T18:54:32Z)
- DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars [48.50728107738148]
DiffusionAvatars synthesizes a high-fidelity 3D head avatar of a person, offering intuitive control over both pose and expression.
For coarse guidance of the expression and head pose, we render a neural parametric head model (NPHM) from the target viewpoint.
We condition DiffusionAvatars directly on the expression codes obtained from NPHM via cross-attention.
arXiv Detail & Related papers (2023-11-30T15:43:13Z)
- OPHAvatars: One-shot Photo-realistic Head Avatars [0.0]
Given a portrait, our method synthesizes a coarse talking-head video using driving keypoint features.
With rendered images of the coarse avatar, our method updates the low-quality images with a blind face restoration model.
After several iterations, our method can synthesize a photo-realistic animatable 3D neural head avatar.
arXiv Detail & Related papers (2023-07-18T11:24:42Z)
- DreamWaltz: Make a Scene with Complex 3D Animatable Avatars [68.49935994384047]
We present DreamWaltz, a novel framework for generating and animating complex 3D avatars given text guidance and parametric human body prior.
For animation, our method learns an animatable 3D avatar representation from abundant image priors of a diffusion model conditioned on various poses.
arXiv Detail & Related papers (2023-05-21T17:59:39Z)
- OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering [81.55960827071661]
Controllability, generalizability, and efficiency are the major objectives of constructing face avatars represented by neural implicit fields.
We propose One-shot Talking face Avatar (OTAvatar), which constructs face avatars via a generalized controllable tri-plane rendering solution.
arXiv Detail & Related papers (2023-03-26T09:12:03Z)
- AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars [37.43588165101838]
AvatarCLIP is a zero-shot text-driven framework for 3D avatar generation and animation.
We take advantage of the powerful vision-language model CLIP for supervising neural human generation.
By leveraging the priors learned in the motion VAE, a CLIP-guided reference-based motion synthesis method is proposed for the animation of the generated 3D avatar.
arXiv Detail & Related papers (2022-05-17T17:59:19Z)
- DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)
- HVTR: Hybrid Volumetric-Textural Rendering for Human Avatars [65.82222842213577]
We propose a novel neural rendering pipeline, which synthesizes virtual human avatars from arbitrary poses efficiently and at high quality.
First, we learn to encode articulated human motions on a dense UV manifold of the human body surface.
We then leverage the encoded information on the UV manifold to construct a 3D volumetric representation.
arXiv Detail & Related papers (2021-12-19T17:34:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.