FlexAvatar: Learning Complete 3D Head Avatars with Partial Supervision
- URL: http://arxiv.org/abs/2512.15599v1
- Date: Wed, 17 Dec 2025 17:09:52 GMT
- Title: FlexAvatar: Learning Complete 3D Head Avatars with Partial Supervision
- Authors: Tobias Kirschstein, Simon Giebenhain, Matthias Nießner
- Abstract summary: We introduce FlexAvatar, a method for creating high-quality and complete 3D head avatars from a single image. Our training procedure yields a smooth latent avatar space that facilitates identity interpolation and flexible fitting to an arbitrary number of input observations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce FlexAvatar, a method for creating high-quality and complete 3D head avatars from a single image. A core challenge lies in the limited availability of multi-view data and the tendency of monocular training to yield incomplete 3D head reconstructions. We identify the root cause of this issue as the entanglement between driving signal and target viewpoint when learning from monocular videos. To address this, we propose a transformer-based 3D portrait animation model with learnable data source tokens, so-called bias sinks, which enables unified training across monocular and multi-view datasets. This design leverages the strengths of both data sources during inference: strong generalization from monocular data and full 3D completeness from multi-view supervision. Furthermore, our training procedure yields a smooth latent avatar space that facilitates identity interpolation and flexible fitting to an arbitrary number of input observations. In extensive evaluations on single-view, few-shot, and monocular avatar creation tasks, we verify the efficacy of FlexAvatar. Many existing methods struggle with view extrapolation, while FlexAvatar generates complete 3D head avatars with realistic facial animations. Website: https://tobias-kirschstein.github.io/flexavatar/
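The abstract does not give implementation details for the bias sinks, but the idea of learnable data source tokens can be illustrated with a minimal, hypothetical PyTorch sketch: one learnable token per training data source (e.g. monocular vs. multi-view) is prepended to the transformer's input sequence, so source-specific bias can be absorbed by that token instead of being entangled with the driving signal. All class and parameter names below are illustrative and not taken from the paper.

```python
# Minimal sketch (not the authors' code): learnable per-data-source tokens
# prepended to a transformer's input sequence, so that monocular and
# multi-view batches can share one model while the source-specific bias is
# absorbed by a dedicated token rather than by the driving-signal tokens.
import torch
import torch.nn as nn

class DataSourceConditionedTransformer(nn.Module):
    def __init__(self, dim=512, num_layers=6, num_heads=8, num_sources=2):
        super().__init__()
        # One learnable token per data source (e.g. 0 = monocular, 1 = multi-view).
        self.source_tokens = nn.Parameter(torch.randn(num_sources, 1, dim) * 0.02)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)

    def forward(self, tokens, source_id):
        # tokens: (B, N, dim) image / driving-signal tokens
        # source_id: (B,) integer id of the dataset each sample came from
        src = self.source_tokens[source_id]   # (B, 1, dim) selected source token
        x = torch.cat([src, tokens], dim=1)   # prepend the source token
        x = self.encoder(x)
        return x[:, 1:]                       # drop the source token again

# Usage idea: at inference, one could always select the multi-view token to
# favor view-complete outputs, regardless of which data the input resembles.
```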
Related papers
- LiftAvatar: Kinematic-Space Completion for Expression-Controlled 3D Gaussian Avatar Animation [9.736861648552408]
We present LiftAvatar, a new paradigm that completes sparse monocular observations in kinematic space.
It uses the completed signals to drive high-fidelity avatar animation.
arXiv Detail & Related papers (2026-03-02T17:46:32Z) - Generalizable and Animatable 3D Full-Head Gaussian Avatar from a Single Image [9.505520774467263]
Building 3D animatable head avatars from a single image is an important yet challenging problem.
Existing methods generally collapse under large camera pose variations, compromising the realism of 3D avatars.
We propose a new framework to tackle the novel setting of one-shot 3D full-head animatable avatar reconstruction in a single feed-forward pass.
arXiv Detail & Related papers (2026-01-19T06:56:58Z) - Avat3r: Large Animatable Gaussian Reconstruction Model for High-fidelity 3D Head Avatars [60.0866477932976]
We present Avat3r, which regresses a high-quality and animatable 3D head avatar from just a few input images.
We make Large Reconstruction Models animatable and learn a powerful prior over 3D human heads from a large multi-view video dataset.
We increase robustness by feeding input images with different expressions to our model during training, enabling the reconstruction of 3D head avatars from inconsistent inputs.
arXiv Detail & Related papers (2025-02-27T16:00:11Z) - FAGhead: Fully Animate Gaussian Head from Monocular Videos [2.9979421496374683]
FAGhead is a method that enables fully controllable human portraits from monocular videos.
It builds on traditional 3D morphable meshes (3DMM) and optimizes neutral 3D Gaussians to reconstruct complex expressions.
To effectively handle avatar edges, it introduces alpha rendering to supervise the alpha value of each pixel.
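A minimal sketch of per-pixel alpha supervision of the kind mentioned above, assuming a rendered alpha map and a foreground mask are available; this illustrates the general idea, not FAGhead's released loss.

```python
# Minimal sketch (assumption, not FAGhead's code): supervise the renderer's
# per-pixel alpha with a foreground mask, e.g. via an L1 penalty.
import torch
import torch.nn.functional as F

def alpha_supervision_loss(rendered_alpha, foreground_mask):
    """rendered_alpha, foreground_mask: (B, 1, H, W) tensors in [0, 1]."""
    return F.l1_loss(rendered_alpha, foreground_mask)
```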
arXiv Detail & Related papers (2024-06-27T10:40:35Z) - GPAvatar: Generalizable and Precise Head Avatar from Image(s) [71.555405205039]
GPAvatar is a framework that reconstructs 3D head avatars from one or several images in a single forward pass.
The proposed method achieves faithful identity reconstruction, precise expression control, and multi-view consistency.
arXiv Detail & Related papers (2024-01-18T18:56:34Z) - OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering [81.55960827071661]
Controllability, generalizability, and efficiency are the major objectives of constructing face avatars represented by neural implicit fields.
We propose One-shot Talking Face Avatar (OTAvatar), which constructs face avatars via a generalized, controllable tri-plane rendering solution.
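As a rough illustration of tri-plane rendering in general (not OTAvatar's specific architecture), the sketch below samples features for 3D query points by projecting them onto three axis-aligned feature planes and summing the bilinearly interpolated features; a small MLP would then decode these features into color and density. Shapes and names are assumptions.

```python
# Generic tri-plane feature sampling sketch: project each 3D point onto the
# XY, XZ, and YZ planes, bilinearly sample features, and sum them.
import torch
import torch.nn.functional as F

def sample_triplane(planes, points):
    """planes: (B, 3, C, H, W) feature planes; points: (B, N, 3) in [-1, 1]."""
    B, N, _ = points.shape
    coords = [points[..., [0, 1]], points[..., [0, 2]], points[..., [1, 2]]]
    feats = 0
    for i, uv in enumerate(coords):
        grid = uv.view(B, N, 1, 2)                      # (B, N, 1, 2)
        f = F.grid_sample(planes[:, i], grid,           # -> (B, C, N, 1)
                          mode="bilinear", align_corners=True)
        feats = feats + f.squeeze(-1).permute(0, 2, 1)  # (B, N, C)
    return feats  # in practice, decoded by a small MLP into color/density
```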
arXiv Detail & Related papers (2023-03-26T09:12:03Z) - PointAvatar: Deformable Point-based Head Avatars from Videos [103.43941945044294]
PointAvatar is a deformable point-based representation that disentangles the source color into intrinsic albedo and normal-dependent shading.
We show that our method is able to generate animatable 3D avatars using monocular videos from multiple sources.
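The albedo/shading disentanglement described for PointAvatar can be summarized as color = albedo × shading(normal); the sketch below uses an illustrative shading MLP and is not the authors' implementation.

```python
# Minimal sketch (illustrative names, not PointAvatar's code) of splitting a
# point's color into intrinsic albedo and a normal-dependent shading term.
import torch
import torch.nn as nn

class ShadingMLP(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),   # scalar shading per point
        )

    def forward(self, albedo, normals):
        # albedo: (N, 3) per-point RGB reflectance; normals: (N, 3) unit vectors
        shading = self.net(normals)               # (N, 1), normal-dependent
        return albedo * shading                   # shaded per-point color
```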
arXiv Detail & Related papers (2022-12-16T10:05:31Z) - DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)