Pose-Controllable 3D Facial Animation Synthesis using Hierarchical
Audio-Vertex Attention
- URL: http://arxiv.org/abs/2302.12532v1
- Date: Fri, 24 Feb 2023 09:36:31 GMT
- Title: Pose-Controllable 3D Facial Animation Synthesis using Hierarchical
Audio-Vertex Attention
- Authors: Bin Liu, Xiaolin Wei, Bo Li, Junjie Cao, Yu-Kun Lai
- Abstract summary: A novel pose-controllable 3D facial animation synthesis method is proposed by utilizing hierarchical audio-vertex attention.
The proposed method can produce more realistic facial expressions and head posture movements.
- Score: 52.63080543011595
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing audio-driven 3D facial animation methods suffer from
a lack of detailed facial expression and head pose, resulting in an
unsatisfactory human-robot interaction experience. In this paper, a novel
pose-controllable 3D facial animation synthesis method is proposed by utilizing
hierarchical audio-vertex attention. To synthesize realistic and detailed
expressions, a hierarchical decomposition strategy is proposed to encode the
audio signal into both a global latent feature and a local vertex-wise control
feature. The local and global audio features are then combined with vertex
spatial features, and a graph convolutional neural network predicts the final,
consistent facial animation by fusing the intrinsic spatial topology of the
face model with the corresponding semantic features of the audio. To
accomplish pose-controllable animation, we introduce a novel pose attribute
augmentation method by utilizing the 2D talking face technique. Experimental
results indicate that the proposed method can produce more realistic facial
expressions and head posture movements. Qualitative and quantitative
experiments show that the proposed method achieves competitive performance
against state-of-the-art methods.
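
The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no layer sizes, attention details, or training procedure, so the toy mesh, feature dimensions, random weights, and every function name below (`normalized_adjacency`, `fuse_features`, `graph_conv`) are assumptions made only for illustration. The sketch concatenates a global audio feature (shared by all vertices) and a local per-vertex audio feature with each vertex position, then applies one standard graph convolution over the mesh connectivity to produce per-vertex 3D offsets.

```python
import math
import random


def normalized_adjacency(num_vertices, edges):
    """Return D^-1/2 (A + I) D^-1/2 as a dense list-of-lists matrix."""
    a = [[0.0] * num_vertices for _ in range(num_vertices)]
    for i in range(num_vertices):
        a[i][i] = 1.0  # self-loop so each vertex keeps its own feature
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0  # mesh edges are undirected
    deg = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_vertices)]
        for i in range(num_vertices)
    ]


def matmul(x, y):
    """Plain dense matrix product for list-of-lists matrices."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def fuse_features(vertex_xyz, global_audio, local_audio):
    """Per vertex: [xyz | local audio feature | global audio feature]."""
    return [
        list(pos) + list(local) + list(global_audio)
        for pos, local in zip(vertex_xyz, local_audio)
    ]


def graph_conv(a_hat, features, weight):
    """One linear graph convolution layer: A_hat @ X @ W."""
    return matmul(matmul(a_hat, features), weight)


# Toy mesh (hypothetical): a unit square split into two triangles.
vertex_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

# Hypothetical audio features for one frame: one global vector, one local
# scalar per vertex (for example, larger values near the mouth region).
global_audio = [0.3, -0.1]
local_audio = [[0.0], [0.5], [0.5], [0.0]]

# Random, untrained weights mapping 6 fused channels to a 3D offset.
rng = random.Random(0)
weight = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(6)]

a_hat = normalized_adjacency(len(vertex_xyz), edges)
fused = fuse_features(vertex_xyz, global_audio, local_audio)
offsets = graph_conv(a_hat, fused, weight)  # one 3D offset per vertex

for index, offset in enumerate(offsets):
    print(index, [round(value, 4) for value in offset])
```

In a trained model the weights would be learned and several such layers would typically be stacked with nonlinearities; the sketch only shows how audio semantics and mesh topology can enter the same per-vertex computation.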
Related papers
- Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs [67.27840327499625]
We present a multimodal learning-based method to simultaneously synthesize co-speech facial expressions and upper-body gestures for digital characters.
Our approach learns from sparse face landmarks and upper-body joints, estimated directly from video data, to generate plausible emotive character motions.
arXiv Detail & Related papers (2024-06-26T04:53:11Z)
- NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior [5.819784482811377]
We propose a novel method, NeRFFaceSpeech, which can produce high-quality 3D-aware talking heads.
Our method can craft a 3D-consistent facial feature space corresponding to a single image.
We also introduce LipaintNet that can replenish the lacking information in the inner-mouth area.
arXiv Detail & Related papers (2024-05-09T13:14:06Z)
- Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior [29.120669908374424]
We introduce a novel audio-driven talking head synthesis framework, called Talk3D.
It can faithfully reconstruct its plausible facial geometries by effectively adopting the pre-trained 3D-aware generative prior.
Compared to existing methods, our method excels in generating realistic facial geometries even under extreme head poses.
arXiv Detail & Related papers (2024-03-29T12:49:40Z)
- FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models [85.16273912625022]
We introduce FaceTalk, a novel generative approach designed for synthesizing high-fidelity 3D motion sequences of talking human heads from an audio signal.
To the best of our knowledge, this is the first work to propose a generative approach for realistic and high-quality motion synthesis of human heads.
arXiv Detail & Related papers (2023-12-13T19:01:07Z)
- DF-3DFace: One-to-Many Speech Synchronized 3D Face Animation with Diffusion [68.85904927374165]
We propose DF-3DFace, a diffusion-driven speech-to-3D face mesh synthesis method.
It captures the complex one-to-many relationships between speech and 3D face based on diffusion.
It simultaneously achieves more realistic facial animation than the state-of-the-art methods.
arXiv Detail & Related papers (2023-08-23T04:14:55Z)
- Parametric Implicit Face Representation for Audio-Driven Facial Reenactment [52.33618333954383]
We propose a novel audio-driven facial reenactment framework that is both controllable and can generate high-quality talking heads.
Specifically, our parametric implicit representation parameterizes the implicit representation with interpretable parameters of 3D face models.
Our method can generate more realistic results than previous methods with greater fidelity to the identities and talking styles of speakers.
arXiv Detail & Related papers (2023-06-13T07:08:22Z)
- Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation [12.552355581481999]
We first present a live system that generates personalized photorealistic talking-head animation driven only by audio signals at over 30 fps.
The first stage is a deep neural network that extracts deep audio features along with a manifold projection to project the features to the target person's speech space.
In the second stage, we learn facial dynamics and motions from the projected audio features.
In the final stage, we generate conditional feature maps from previous predictions and send them with a candidate image set to an image-to-image translation network to synthesize photorealistic renderings.
arXiv Detail & Related papers (2021-09-22T08:47:43Z)
- MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement [142.9900055577252]
We propose a generic audio-driven facial animation approach that achieves highly realistic motion synthesis results for the entire face.
Our approach ensures highly accurate lip motion while also producing plausible animation of the parts of the face that are uncorrelated with the audio signal, such as eye blinks and eyebrow motion.
arXiv Detail & Related papers (2021-04-16T17:05:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.