OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering
- URL: http://arxiv.org/abs/2303.14662v1
- Date: Sun, 26 Mar 2023 09:12:03 GMT
- Title: OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering
- Authors: Zhiyuan Ma, Xiangyu Zhu, Guojun Qi, Zhen Lei, Lei Zhang
- Abstract summary: Controllability, generalizability, and efficiency are the major objectives of constructing face avatars represented by neural implicit fields.
We propose the One-shot Talking face Avatar (OTAvatar), which constructs face avatars via a generalized, controllable tri-plane rendering solution.
- Score: 81.55960827071661
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Controllability, generalizability, and efficiency are the major
objectives of constructing face avatars represented by neural implicit fields.
However, existing methods have not managed to accommodate all three
requirements simultaneously: they either focus on static portraits,
restricting the representation to a specific subject, or suffer from
substantial computational cost, limiting their flexibility. In this paper, we
propose the One-shot Talking face Avatar (OTAvatar), which constructs face
avatars via a generalized, controllable tri-plane rendering solution, so that
each personalized avatar can be constructed from only one portrait as the
reference. Specifically, OTAvatar first inverts a portrait image to a
motion-free identity code. Second, the identity code and a motion code
modulate an efficient CNN that generates a tri-plane-formulated volume
encoding the subject in the desired motion. Finally, volume rendering produces
an image from any viewpoint. The core of our solution is a novel
decoupling-by-inverting strategy that disentangles identity and motion in the
latent code via optimization-based inversion. Benefiting from the efficient
tri-plane representation, we achieve controllable rendering of a generalized
face avatar at 35 FPS on an A100 GPU. Experiments show promising
cross-identity reenactment on subjects outside the training set, along with
better 3D consistency.
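
To make the described pipeline concrete, below is a minimal PyTorch sketch of a tri-plane forward pass: a fused identity code and motion code is expanded into three orthogonal feature planes, and 3D query points are projected onto each plane to gather features for volume rendering. All module and variable names here (TriPlaneGenerator, sample_triplane, the plane sizes) are hypothetical illustrations, not the authors' released code; the single linear layer merely stands in for the modulated CNN mentioned in the abstract.

```python
# Hypothetical sketch of a tri-plane avatar forward pass (illustrative
# names and sizes; not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriPlaneGenerator(nn.Module):
    """Maps an identity code plus a motion code to three feature planes."""
    def __init__(self, code_dim=512, plane_channels=16, plane_size=32):
        super().__init__()
        self.plane_channels = plane_channels
        self.plane_size = plane_size
        # Stand-in for the modulated CNN in the abstract: one linear layer
        # expands the fused codes into 3 feature planes (XY, XZ, YZ).
        self.net = nn.Linear(
            2 * code_dim, 3 * plane_channels * plane_size * plane_size)

    def forward(self, identity_code, motion_code):
        code = torch.cat([identity_code, motion_code], dim=-1)
        planes = self.net(code)
        return planes.view(-1, 3, self.plane_channels,
                           self.plane_size, self.plane_size)

def sample_triplane(planes, xyz):
    """Project 3D points in [-1, 1]^3 onto each plane and sum the features."""
    b = planes.shape[0]
    # Drop one coordinate per plane: XY, XZ, and YZ projections.
    projections = [xyz[..., [0, 1]], xyz[..., [0, 2]], xyz[..., [1, 2]]]
    feats = 0
    for i, uv in enumerate(projections):
        grid = uv.reshape(b, -1, 1, 2)                  # (B, N, 1, 2)
        f = F.grid_sample(planes[:, i], grid,
                          mode='bilinear', align_corners=False)
        feats = feats + f.squeeze(-1).permute(0, 2, 1)  # (B, N, C)
    return feats

gen = TriPlaneGenerator()
identity_code = torch.randn(1, 512)  # from inverting the reference portrait
motion_code = torch.randn(1, 512)    # from the driving expression/pose
planes = gen(identity_code, motion_code)
points = torch.rand(1, 4096, 3) * 2 - 1     # samples along camera rays
features = sample_triplane(planes, points)  # (1, 4096, 16)
# In a full system, `features` would be decoded by a small MLP into density
# and color, then alpha-composited along each ray (volume rendering).
```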
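The decoupling-by-inverting idea can likewise be sketched as plain latent optimization against the single reference portrait, with the generator frozen. The `render` callable, the loss, and the motion-code regularizer below are illustrative assumptions rather than the paper's exact objective.

```python
# Illustrative sketch of decoupling-by-inverting: with the generator frozen,
# optimize latent codes to reconstruct the one reference portrait while
# keeping motion information out of the identity code.
import torch
import torch.nn.functional as F

def invert_identity(render, reference, code_dim=512, steps=200, lr=1e-2):
    """Recover a motion-free identity code from a single portrait.

    render(identity_code, motion_code) -> image tensor (1, 3, H, W),
    a frozen generator-plus-renderer (placeholder).
    reference: the reference portrait, shape (1, 3, H, W).
    """
    identity_code = torch.zeros(1, code_dim, requires_grad=True)
    motion_code = torch.zeros(1, code_dim, requires_grad=True)
    optimizer = torch.optim.Adam([identity_code, motion_code], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        reconstruction = render(identity_code, motion_code)
        loss = F.l1_loss(reconstruction, reference)
        # Assumed regularizer: penalize the motion code so frame-specific
        # variation is absorbed there rather than in the identity code.
        loss = loss + 1e-3 * motion_code.pow(2).sum()
        loss.backward()
        optimizer.step()
    # The recovered identity code is then reused for every frame, while a
    # new motion code (e.g., from a driving video) animates the avatar.
    return identity_code.detach()
```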
Related papers
- Generalizable and Animatable Gaussian Head Avatar [50.34788590904843]
We propose the Generalizable and Animatable Gaussian Head Avatar (GAGAvatar) for one-shot animatable head avatar reconstruction.
We generate the parameters of 3D Gaussians from a single image in a single forward pass.
Our method exhibits superior performance compared to previous methods in terms of reconstruction quality and expression accuracy.
arXiv Detail & Related papers (2024-10-10T14:29:00Z)
- RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models [56.13752698926105]
We present RodinHD, which can generate high-fidelity 3D avatars from a portrait image.
We first identify an overlooked problem of catastrophic forgetting that arises when fitting triplanes sequentially on many avatars.
We optimize the guiding effect of the portrait image by computing a finer-grained hierarchical representation that captures rich 2D texture cues and injecting them into the 3D diffusion model at multiple layers via cross-attention.
When trained on 46K avatars with a noise schedule optimized for triplanes, the resulting model can generate 3D avatars with notably better details than previous methods and can generalize to in-the-wild portrait input.
arXiv Detail & Related papers (2024-07-09T15:14:45Z)
- GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image [89.70322127648349]
We propose a generic avatar editing approach that can be universally applied to various 3DMM driving volumetric head avatars.
To achieve this goal, we design a novel expression-aware modification generative model that lifts 2D edits from a single image to a consistent 3D modification field.
arXiv Detail & Related papers (2024-04-02T17:58:35Z)
- One2Avatar: Generative Implicit Head Avatar For Few-shot User Adaptation [31.310769289315648]
This paper introduces a novel approach to creating high-quality head avatars using only a single image or a few images per user.
We learn a generative model for 3D animatable, photo-realistic head avatars from a multi-view dataset of expressions from 2407 subjects.
Our method demonstrates compelling results and outperforms existing state-of-the-art methods for few-shot avatar adaptation.
arXiv Detail & Related papers (2024-02-19T07:48:29Z)
- GPAvatar: Generalizable and Precise Head Avatar from Image(s) [71.555405205039]
GPAvatar is a framework that reconstructs 3D head avatars from one or several images in a single forward pass.
The proposed method achieves faithful identity reconstruction, precise expression control, and multi-view consistency.
arXiv Detail & Related papers (2024-01-18T18:56:34Z)
- AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation [14.062402203105712]
AvatarBooth is a novel method for generating high-quality 3D avatars using text prompts or specific images.
Our key contribution is the precise avatar generation control by using dual fine-tuned diffusion models.
We present a multi-resolution rendering strategy that facilitates coarse-to-fine supervision of 3D avatar generation.
arXiv Detail & Related papers (2023-06-16T14:18:51Z)
- Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.
We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z)