Related papers: LatentAvatar: Learning Latent Expression Code for Expressive Neural Head Avatar

URL: http://arxiv.org/abs/2305.01190v2
Date: Wed, 3 May 2023 06:41:43 GMT
Title: LatentAvatar: Learning Latent Expression Code for Expressive Neural Head Avatar
Authors: Yuelang Xu, Hongwen Zhang, Lizhen Wang, Xiaochen Zhao, Han Huang, Guojun Qi, Yebin Liu
Abstract summary: We present LatentAvatar, an expressive neural head avatar driven by latent expression codes. LatentAvatar is able to capture challenging expressions and the subtle movement of teeth and even eyeballs.
Score: 60.363572621347565
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Existing approaches to animatable NeRF-based head avatars are either built upon face templates or use the expression coefficients of templates as the driving signal. Despite the promising progress, their performances are heavily bound by the expression power and the tracking accuracy of the templates. In this work, we present LatentAvatar, an expressive neural head avatar driven by latent expression codes. Such latent expression codes are learned in an end-to-end and self-supervised manner without templates, enabling our method to get rid of expression and tracking issues. To achieve this, we leverage a latent head NeRF to learn the person-specific latent expression codes from a monocular portrait video, and further design a Y-shaped network to learn the shared latent expression codes of different subjects for cross-identity reenactment. By optimizing the photometric reconstruction objectives in NeRF, the latent expression codes are learned to be 3D-aware while faithfully capturing the high-frequency detailed expressions. Moreover, by learning a mapping between the latent expression code learned in shared and person-specific settings, LatentAvatar is able to perform expressive reenactment between different subjects. Experimental results show that our LatentAvatar is able to capture challenging expressions and the subtle movement of teeth and even eyeballs, which outperforms previous state-of-the-art solutions in both quantitative and qualitative comparisons. Project page: https://www.liuyebin.com/latentavatar.
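As a rough illustration of the Y-shaped idea described in the abstract, the sketch below assumes one shared encoder feeding one decoder branch per subject, trained with a mean-squared photometric loss. The layer sizes, the toy linear layers, and the names `encode`, `decode`, and `photometric_loss` are all hypothetical and not taken from the paper. The actual method renders through a head NeRF, which is replaced here by a plain linear decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a flattened toy image and a small latent expression code.
PIXELS, CODE = 64, 8


def linear(in_dim, out_dim):
    """Random weights for a toy linear layer (stand-in for a real network)."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))


# Stem of the "Y": one encoder shared by all subjects (an assumption).
W_enc = linear(PIXELS, CODE)
# Branches of the "Y": one decoder per subject (an assumption).
W_dec = {"A": linear(CODE, PIXELS), "B": linear(CODE, PIXELS)}


def encode(image):
    """Map a flattened image to a shared latent expression code."""
    return np.tanh(image @ W_enc)


def decode(code, subject):
    """Decode a latent code with the branch belonging to `subject`."""
    return code @ W_dec[subject]


def photometric_loss(pred, target):
    """Mean squared pixel error between a prediction and its target."""
    return float(np.mean((pred - target) ** 2))


frame_a = rng.random(PIXELS)          # toy frame of subject A
code = encode(frame_a)                # shared latent expression code
recon_a = decode(code, "A")           # self-reconstruction (training signal)
reenact_b = decode(code, "B")         # cross-identity reenactment
loss = photometric_loss(recon_a, frame_a)
```

Because both branches read the same code, a code extracted from subject A can be decoded by subject B's branch, which is the intuition behind cross-identity reenactment in this toy setting.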

Related papers

ScaffoldAvatar: High-Fidelity Gaussian Avatars with Patch Expressions [49.34398022152462]
We propose to couple locally-defined facial expressions with 3D Gaussian splatting to enable creating ultra-high fidelity, expressive and photorealistic 3D head avatars. In particular, we leverage a patch-based geometric 3D face model to extract patch expressions and learn how to translate these into local dynamic skin appearance and motion. We employ color-based densification and progressive training to obtain high-quality results and faster convergence for high resolution 3K training images.
arXiv Detail & Related papers (2025-07-14T17:59:03Z)
SEREP: Semantic Facial Expression Representation for Robust In-the-Wild Capture and Retargeting [4.083283519300837]
We propose SEREP (Semantic Expression Representation), a model that disentangles expression from identity at the semantic level. We train a model to predict expression from monocular images using a novel semi-supervised scheme that relies on domain adaptation. Our experiments show that SEREP outperforms state-of-the-art methods, capturing challenging expressions and transferring them to novel identities.
arXiv Detail & Related papers (2024-12-18T22:12:28Z)
FLIER: Few-shot Language Image Models Embedded with Latent Representations [2.443383032451177]
We propose a Few-shot Language Image model Embedded with latent Representations (FLIER) for image recognition. We first generate images and corresponding latent representations via Stable Diffusion with the textual inputs from GPT-3. With latent representations as "models-understandable pixels", we introduce a flexible convolutional neural network with two convolutional layers to be the latent encoder.
arXiv Detail & Related papers (2024-10-10T06:27:46Z)
GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations [54.94362657501809]
We propose a new method to generate highly dynamic and deformable human head avatars from multi-view imagery in real-time. At the core of our method is a hierarchical representation of head models that makes it possible to capture the complex dynamics of facial expressions and head movements. We train this coarse-to-fine facial avatar model along with the head pose as a learnable parameter in an end-to-end framework.
arXiv Detail & Related papers (2024-09-18T13:05:43Z)
DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars [48.50728107738148]
DiffusionAvatars synthesizes a high-fidelity 3D head avatar of a person, offering intuitive control over both pose and expression. For coarse guidance of the expression and head pose, we render a neural parametric head model (NPHM) from the target viewpoint. We condition DiffusionAvatars directly on the expression codes obtained from NPHM via cross-attention.
arXiv Detail & Related papers (2023-11-30T15:43:13Z)
BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis [7.485318043174123]
We introduce BakedAvatar, a novel representation for real-time neural head avatar synthesis. Our approach extracts layered meshes from learned isosurfaces of the head and computes expression-, pose-, and view-dependent appearances. Experimental results demonstrate that our representation generates photorealistic results of comparable quality to other state-of-the-art methods.
arXiv Detail & Related papers (2023-11-09T17:05:53Z)
MA-NeRF: Motion-Assisted Neural Radiance Fields for Face Synthesis from Sparse Images [21.811067296567252]
We propose a novel framework that can reconstruct a high-fidelity drivable face avatar and handle unseen expressions. At the core of our implementation are a structured displacement feature and a semantic-aware learning module. Our method achieves much better results than current state-of-the-art methods.
arXiv Detail & Related papers (2023-06-17T13:49:56Z)
Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image. We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z)
HQ3DAvatar: High Quality Controllable 3D Head Avatar [65.70885416855782]
This paper presents a novel approach to building highly photorealistic digital head avatars. Our method learns a canonical space via an implicit function parameterized by a neural network. At test time, our method is driven by a monocular RGB video.
arXiv Detail & Related papers (2023-03-25T13:56:33Z)
I M Avatar: Implicit Morphable Head Avatars from Videos [68.13409777995392]
We propose IMavatar, a novel method for learning implicit head avatars from monocular videos. Inspired by the fine-grained control mechanisms afforded by conventional 3DMMs, we represent the expression- and pose-related deformations via learned blendshapes and skinning fields. We show quantitatively and qualitatively that our method improves geometry and covers a more complete expression space compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-12-14T15:30:32Z)
VariTex: Variational Neural Face Textures [0.0]
VariTex is a method that learns a variational latent feature space of neural face textures. To generate images of complete human heads, we propose an additive decoder that generates plausible additional details such as hair. The resulting method can generate geometrically consistent images of novel identities allowing fine-grained control over head pose, face shape, and facial expressions.
arXiv Detail & Related papers (2021-04-13T07:47:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.