SelfieAvatar: Real-time Head Avatar reenactment from a Selfie Video
- URL: http://arxiv.org/abs/2601.18851v1
- Date: Mon, 26 Jan 2026 14:26:16 GMT
- Title: SelfieAvatar: Real-time Head Avatar reenactment from a Selfie Video
- Authors: Wei Liang, Hui Yu, Derui Ding, Rachael E. Jack, Philippe G. Schyns,
- Abstract summary: This study introduces a method for detailed head avatar reenactment using a selfie video. A detailed reconstruction model is proposed, incorporating mixed loss functions for foreground reconstruction and avatar image generation.
- Score: 8.770698303337428
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Head avatar reenactment focuses on creating animatable personal avatars from monocular videos, serving as a foundational element for applications like social signal understanding, gaming, human-machine interaction, and computer vision. Recent advances in 3D Morphable Model (3DMM)-based facial reconstruction methods have achieved remarkably high-fidelity face estimation. However, on the one hand, they struggle to capture the entire head, including non-facial regions and background details in real time, which is an essential aspect for producing realistic, high-fidelity head avatars. On the other hand, recent approaches leveraging generative adversarial networks (GANs) for head avatar generation from videos can achieve high-quality reenactments but encounter limitations in reproducing fine-grained head details, such as wrinkles and hair textures. In addition, existing methods generally rely on large amounts of training data and rarely focus on using only a simple selfie video to achieve avatar reenactment. To address these challenges, this study introduces a method for detailed head avatar reenactment using a selfie video. The approach combines 3DMMs with a StyleGAN-based generator. A detailed reconstruction model is proposed, incorporating mixed loss functions for foreground reconstruction and avatar image generation during adversarial training to recover high-frequency details. Qualitative and quantitative evaluations on self-reenactment and cross-reenactment tasks demonstrate that the proposed method achieves superior head avatar reconstruction with rich and intricate textures compared to existing approaches.
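The abstract describes a mixed objective that couples foreground reconstruction with adversarial avatar image generation, but gives no concrete losses or weights. The PyTorch sketch below is therefore only an illustration of that general recipe; the function name, the mask-based foreground term, the non-saturating GAN term, and all weighting coefficients are assumptions, not details from the paper.

```python
import torch.nn.functional as F

def mixed_avatar_loss(generated, target, fg_mask, disc_logits_fake,
                      lambda_fg=1.0, lambda_photo=0.1, lambda_adv=0.01):
    """Illustrative mixed objective: a masked foreground reconstruction term,
    a full-image photometric term, and a non-saturating adversarial term.
    The name and all weights are assumptions, not the paper's values."""
    # L1 reconstruction restricted to the segmented foreground (head) region
    fg_loss = F.l1_loss(generated * fg_mask, target * fg_mask)
    # Photometric term over the whole generated avatar image
    photo_loss = F.l1_loss(generated, target)
    # Non-saturating generator loss on the discriminator's logits for fakes
    adv_loss = F.softplus(-disc_logits_fake).mean()
    return lambda_fg * fg_loss + lambda_photo * photo_loss + lambda_adv * adv_loss
```

In adversarial training a loss like this would typically be accompanied by the discriminator's own objective and often a perceptual term (e.g., LPIPS); those are omitted here to keep the sketch short.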
Related papers
- Generalizable and Animatable Gaussian Head Avatar [50.34788590904843]
We propose Generalizable and Animatable Gaussian head Avatar (GAGAvatar) for one-shot animatable head avatar reconstruction.
We generate the parameters of 3D Gaussians from a single image in a single forward pass.
Our method exhibits superior performance compared to previous methods in terms of reconstruction quality and expression accuracy.
arXiv Detail & Related papers (2024-10-10T14:29:00Z)
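GAGAvatar's summary above mentions predicting the parameters of 3D Gaussians from a single image in a single forward pass. As a rough, generic illustration of what such a prediction head can look like (the architecture, dimensions, and activations below follow common 3D Gaussian splatting conventions and are not taken from the paper):

```python
import torch
import torch.nn as nn

class GaussianParamHead(nn.Module):
    """Toy sketch: map image features to per-point 3D Gaussian parameters
    (position, rotation quaternion, scale, opacity, RGB) in one forward pass.
    Architecture and sizes are illustrative, not GAGAvatar's."""
    def __init__(self, feat_dim=256, num_gaussians=4096):
        super().__init__()
        self.num_gaussians = num_gaussians
        # 3 pos + 4 quat + 3 scale + 1 opacity + 3 color = 14 params each
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, num_gaussians * 14),
        )

    def forward(self, image_features):  # (B, feat_dim)
        raw = self.mlp(image_features).view(-1, self.num_gaussians, 14)
        pos, quat, scale, opacity, color = raw.split([3, 4, 3, 1, 3], dim=-1)
        return {
            "positions": pos,
            "rotations": nn.functional.normalize(quat, dim=-1),  # unit quaternions
            "scales": torch.exp(scale),           # keep scales positive
            "opacities": torch.sigmoid(opacity),  # in (0, 1)
            "colors": torch.sigmoid(color),       # in (0, 1)
        }
```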
- GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations [54.94362657501809]
We propose a new method to generate highly dynamic and deformable human head avatars from multi-view imagery in real time.
At the core of our method is a hierarchical representation of head models that allows to capture the complex dynamics of facial expressions and head movements.
We train this coarse-to-fine facial avatar model along with the head pose as a learnable parameter in an end-to-end framework.
arXiv Detail & Related papers (2024-09-18T13:05:43Z)
- VOODOO XP: Expressive One-Shot Head Reenactment for VR Telepresence [14.010324388059866]
VOODOO XP is a 3D-aware one-shot head reenactment method that can generate highly expressive facial expressions from any input driver video and a single 2D portrait.
We show our solution on a monocular video setting and an end-to-end VR telepresence system for two-way communication.
arXiv Detail & Related papers (2024-05-25T12:33:40Z)
- One2Avatar: Generative Implicit Head Avatar For Few-shot User Adaptation [31.310769289315648]
This paper introduces a novel approach to creating a high-quality head avatar using only a single image or a few images per user.
We learn a generative model for 3D animatable photo-realistic head avatars from a multi-view dataset of expressions from 2407 subjects.
Our method demonstrates compelling results and outperforms existing state-of-the-art methods for few-shot avatar adaptation.
arXiv Detail & Related papers (2024-02-19T07:48:29Z)
- GPAvatar: Generalizable and Precise Head Avatar from Image(s) [71.555405205039]
GPAvatar is a framework that reconstructs 3D head avatars from one or several images in a single forward pass.
The proposed method achieves faithful identity reconstruction, precise expression control, and multi-view consistency.
arXiv Detail & Related papers (2024-01-18T18:56:34Z)
- Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.
We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z)
- HeadSculpt: Crafting 3D Head Avatars with Text [143.14548696613886]
We introduce a versatile pipeline dubbed HeadSculpt for crafting 3D head avatars from textual prompts.
We first equip the diffusion model with 3D awareness by leveraging landmark-based control and a learned textual embedding.
We propose a novel identity-aware editing score distillation strategy to optimize a textured mesh with a high-resolution differentiable rendering technique.
arXiv Detail & Related papers (2023-06-05T16:53:58Z)
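HeadSculpt's summary above refers to an identity-aware editing score distillation strategy. Plain score distillation sampling (SDS), which such strategies build on, perturbs a differentiable render with noise, asks a text-conditioned diffusion model to denoise it, and pushes the render toward the model's prediction. A schematic PyTorch step, where `diffusion_eps` is a hypothetical noise-prediction function and the weighting `w` is one common choice:

```python
import torch

def sds_step(rendered, diffusion_eps, text_emb, alphas_cumprod):
    """Schematic score distillation sampling (SDS) on a rendered image.
    `diffusion_eps(x_t, t, text_emb)` is a hypothetical noise predictor;
    HeadSculpt's identity-aware variant modifies the score term itself."""
    B = rendered.shape[0]
    t = torch.randint(20, 980, (B,), device=rendered.device)  # random timestep
    alpha_bar = alphas_cumprod[t].view(B, 1, 1, 1)
    noise = torch.randn_like(rendered)
    # Forward-diffuse the render to timestep t
    x_t = alpha_bar.sqrt() * rendered + (1 - alpha_bar).sqrt() * noise
    with torch.no_grad():
        eps_pred = diffusion_eps(x_t, t, text_emb)
    w = 1 - alpha_bar                        # a common weighting choice
    grad = w * (eps_pred - noise)            # score residual, no graph
    # Dot-product trick: backprop delivers exactly `grad` to the render
    return (grad.detach() * rendered).sum()
```

The last line makes autograd route `grad` into whatever mesh or texture parameters produced `rendered`, which is how the render itself gets optimized.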
- High-fidelity Facial Avatar Reconstruction from Monocular Video with Generative Priors [29.293166730794606]
We propose a new method for NeRF-based facial avatar reconstruction that utilizes a 3D-aware generative prior.
Compared with existing works, we obtain superior novel view synthesis results and faithful face reenactment performance.
arXiv Detail & Related papers (2022-11-28T04:49:46Z)
- Head2HeadFS: Video-based Head Reenactment with Few-shot Learning [64.46913473391274]
Head reenactment is a challenging task, which aims at transferring the entire head pose from a source person to a target.
We propose head2headFS, a novel, easily adaptable pipeline for head reenactment.
Our video-based rendering network is fine-tuned under a few-shot learning strategy, using only a few samples.
arXiv Detail & Related papers (2021-03-30T10:19:41Z)
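Head2HeadFS's summary above mentions fine-tuning a video-based rendering network under a few-shot strategy. A generic sketch of such an adaptation loop (the optimizer, step count, and L1 objective are illustrative assumptions, not details from the paper):

```python
import torch

def few_shot_finetune(renderer, conditions, frames, steps=200, lr=1e-4):
    """Generic few-shot adaptation: fine-tune a pretrained rendering network
    on a handful of (conditioning input, target frame) pairs from the target
    person. Hyperparameters are illustrative, not head2headFS's."""
    optimizer = torch.optim.Adam(renderer.parameters(), lr=lr)
    for _ in range(steps):
        for cond, frame in zip(conditions, frames):
            pred = renderer(cond)
            loss = torch.nn.functional.l1_loss(pred, frame)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return renderer
```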
- Head2Head++: Deep Facial Attributes Re-Targeting [6.230979482947681]
We leverage the 3D geometry of faces and Generative Adversarial Networks (GANs) to design a novel deep learning architecture for the task of facial and head reenactment.
We manage to capture the complex non-rigid facial motion from the driving monocular performances and synthesise temporally consistent videos.
Our system performs end-to-end reenactment at near real-time speed (18 fps).
arXiv Detail & Related papers (2020-06-17T23:38:37Z)