Learning Semantic Facial Descriptors for Accurate Face Animation
- URL: http://arxiv.org/abs/2501.17718v1
- Date: Wed, 29 Jan 2025 15:40:42 GMT
- Title: Learning Semantic Facial Descriptors for Accurate Face Animation
- Authors: Lei Zhu, Yuanqi Chen, Xiaohang Liu, Thomas H. Li, Ge Li,
- Abstract summary: We introduce the semantic facial descriptors in learnable disentangled vector space to address the dilemma.
We obtain basis vector coefficients by employing an encoder on the source and driving faces, leading to effective facial descriptors in the identity and motion subspaces.
Our approach successfully addresses the issue of model-based methods' limitations in high-fidelity identity and the challenges faced by model-free methods in accurate motion transfer.
- Score: 43.370084532812044
- License:
- Abstract: Face animation is a challenging task. Existing model-based methods (utilizing 3DMMs or landmarks) often result in a model-like reconstruction effect, which doesn't effectively preserve identity. Conversely, model-free approaches face challenges in attaining a decoupled and semantically rich feature space, thereby making accurate motion transfer difficult to achieve. We introduce the semantic facial descriptors in learnable disentangled vector space to address the dilemma. The approach involves decoupling the facial space into identity and motion subspaces while endowing each of them with semantics by learning complete orthogonal basis vectors. We obtain basis vector coefficients by employing an encoder on the source and driving faces, leading to effective facial descriptors in the identity and motion subspaces. Ultimately, these descriptors can be recombined as latent codes to animate faces. Our approach successfully addresses the issue of model-based methods' limitations in high-fidelity identity and the challenges faced by model-free methods in accurate motion transfer. Extensive experiments are conducted on three challenging benchmarks (i.e. VoxCeleb, HDTF, CelebV). Comprehensive quantitative and qualitative results demonstrate that our model outperforms SOTA methods with superior identity preservation and motion transfer.
Related papers
- OSDFace: One-Step Diffusion Model for Face Restoration [72.5045389847792]
Diffusion models have demonstrated impressive performance in face restoration.
We propose OSDFace, a novel one-step diffusion model for face restoration.
Results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics.
arXiv Detail & Related papers (2024-11-26T07:07:48Z) - Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models [69.50286698375386]
We propose a novel approach that better harnesses diffusion models for face-swapping.
We introduce a mask shuffling technique during inpainting training, which allows us to create a so-called universal model for swapping.
Ours is a relatively unified approach and so it is resilient to errors in other off-the-shelf models.
arXiv Detail & Related papers (2024-09-11T13:43:53Z) - ScanTalk: 3D Talking Heads from Unregistered Scans [13.003073077799835]
We present textbfScanTalk, a novel framework capable of animating 3D faces in arbitrary topologies including scanned data.
Our approach relies on the DiffusionNet architecture to overcome the fixed topology constraint, offering promising avenues for more flexible and realistic 3D animations.
arXiv Detail & Related papers (2024-03-16T14:58:58Z) - High-Fidelity Face Swapping with Style Blending [16.024260677867076]
We propose an innovative end-to-end framework for high-fidelity face swapping.
First, we introduce a StyleGAN-based facial attributes encoder that extracts essential features from faces and inverts them into a latent style code.
Second, we introduce an attention-based style blending module to effectively transfer Face IDs from source to target.
arXiv Detail & Related papers (2023-12-17T23:22:37Z) - ImFace++: A Sophisticated Nonlinear 3D Morphable Face Model with Implicit Neural Representations [25.016000421755162]
This paper presents a novel 3D morphable face model, named ImFace++, to learn a sophisticated and continuous space with implicit neural representations.
ImFace++ first constructs two explicitly disentangled deformation fields to model complex shapes associated with identities and expressions.
A refinement displacement field within the template space is further incorporated, enabling fine-grained learning of individual-specific facial details.
arXiv Detail & Related papers (2023-12-07T03:53:53Z) - Decaf: Monocular Deformation Capture for Face and Hand Interactions [77.75726740605748]
This paper introduces the first method that allows tracking human hands interacting with human faces in 3D from single monocular RGB videos.
We model hands as articulated objects inducing non-rigid face deformations during an active interaction.
Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system.
arXiv Detail & Related papers (2023-09-28T17:59:51Z) - GaFET: Learning Geometry-aware Facial Expression Translation from
In-The-Wild Images [55.431697263581626]
We introduce a novel Geometry-aware Facial Expression Translation framework, which is based on parametric 3D facial representations and can stably decoupled expression.
We achieve higher-quality and more accurate facial expression transfer results compared to state-of-the-art methods, and demonstrate applicability of various poses and complex textures.
arXiv Detail & Related papers (2023-08-07T09:03:35Z) - CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior [27.989344587876964]
Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness.
We propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook.
We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2023-01-06T05:04:32Z) - Neural Face Models for Example-Based Visual Speech Synthesis [2.2817442144155207]
We present a marker-less approach for facial motion capture based on multi-view video.
We learn a neural representation of facial expressions, which is used to seamlessly facial performances during the animation procedure.
arXiv Detail & Related papers (2020-09-22T07:35:33Z) - InterFaceGAN: Interpreting the Disentangled Face Representation Learned
by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models.
We first find that GANs learn various semantics in some linear subspaces of the latent space.
We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection.
arXiv Detail & Related papers (2020-05-18T18:01:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.