Parametric Implicit Face Representation for Audio-Driven Facial
Reenactment
- URL: http://arxiv.org/abs/2306.07579v1
- Date: Tue, 13 Jun 2023 07:08:22 GMT
- Title: Parametric Implicit Face Representation for Audio-Driven Facial
Reenactment
- Authors: Ricong Huang, Peiwen Lai, Yipeng Qin, Guanbin Li
- Abstract summary: We propose a novel audio-driven facial reenactment framework that is both controllable and can generate high-quality talking heads.
Specifically, our parametric implicit representation parameterizes the implicit representation with interpretable parameters of 3D face models.
Our method can generate more realistic results than previous methods with greater fidelity to the identities and talking styles of speakers.
- Score: 52.33618333954383
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Audio-driven facial reenactment is a crucial technique that has a range of
applications in film-making, virtual avatars and video conferences. Existing
works either employ explicit intermediate face representations (e.g., 2D facial
landmarks or 3D face models) or implicit ones (e.g., Neural Radiance Fields),
thus suffering from the trade-offs between interpretability and expressive
power, hence between controllability and quality of the results. In this work,
we break these trade-offs with our novel parametric implicit face
representation and propose a novel audio-driven facial reenactment framework
that is both controllable and can generate high-quality talking heads.
Specifically, our parametric implicit representation parameterizes the implicit
representation with interpretable parameters of 3D face models, thereby taking
the best of both explicit and implicit methods. In addition, we propose several
new techniques to improve the three components of our framework, including i)
incorporating contextual information into the audio-to-expression parameter
encoding; ii) using conditional image synthesis to parameterize the implicit
representation and implementing it with an innovative tri-plane structure for
efficient learning; iii) formulating facial reenactment as a conditional image
inpainting problem and proposing a novel data augmentation technique to improve
model generalizability. Extensive experiments demonstrate that our method can
generate more realistic results than previous methods with greater fidelity to
the identities and talking styles of speakers.
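To make the three components above more concrete, here is a minimal PyTorch sketch of how components i) and ii) might fit together. Everything below is an illustrative assumption on our part, not the authors' released code: the module names (ContextualAudioEncoder, TriPlaneGenerator, sample_triplane), the dimensions, and the use of a Transformer encoder for contextual audio encoding are all hypothetical stand-ins.
```python
# Hypothetical sketch only: names, dimensions, and layer choices are our
# assumptions, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextualAudioEncoder(nn.Module):
    """Component i): map a window of audio features to 3DMM expression
    parameters, letting each frame attend to its temporal context."""
    def __init__(self, audio_dim=80, expr_dim=64, width=256):
        super().__init__()
        self.proj = nn.Linear(audio_dim, width)
        layer = nn.TransformerEncoderLayer(d_model=width, nhead=4,
                                           batch_first=True)
        self.context = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(width, expr_dim)

    def forward(self, audio):                 # audio: (B, T, audio_dim)
        h = self.context(self.proj(audio))    # contextual per-frame features
        return self.head(h)                   # (B, T, expr_dim)

class TriPlaneGenerator(nn.Module):
    """Component ii): conditional synthesis of three axis-aligned feature
    planes from interpretable 3D face model parameters."""
    def __init__(self, param_dim=64, feat_dim=32, res=64):
        super().__init__()
        self.feat_dim, self.res = feat_dim, res
        self.fc = nn.Linear(param_dim, 3 * feat_dim * res * res)

    def forward(self, params):                # params: (B, param_dim)
        planes = self.fc(params)
        return planes.view(-1, 3, self.feat_dim, self.res, self.res)

def sample_triplane(planes, pts):
    """Query the implicit representation at 3D points in [-1, 1]^3 by
    projecting onto the xy/xz/yz planes and summing bilinear samples."""
    B = planes.shape[0]
    feats = 0
    for i, axes in enumerate(([0, 1], [0, 2], [1, 2])):
        grid = pts[..., axes].view(B, -1, 1, 2)            # (B, N, 1, 2)
        f = F.grid_sample(planes[:, i], grid, align_corners=True)
        feats = feats + f.squeeze(-1).transpose(1, 2)      # (B, N, C)
    return feats

if __name__ == "__main__":
    enc, gen = ContextualAudioEncoder(), TriPlaneGenerator()
    expr = enc(torch.randn(2, 25, 80))        # 25 audio frames per clip
    planes = gen(expr[:, -1])                 # condition on the current frame
    pts = torch.rand(2, 1024, 3) * 2 - 1      # random query points
    print(sample_triplane(planes, pts).shape) # torch.Size([2, 1024, 32])
```
In this reading, the tri-plane structure is what makes the implicit representation efficient to learn and query: each 3D point costs three bilinear 2D lookups rather than a full 3D feature grid or a deep MLP evaluation. Component iii), the conditional inpainting head that renders these features into the masked face region, is not sketched here.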
Related papers
- Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior [29.120669908374424]
We introduce a novel audio-driven talking head synthesis framework, called Talk3D.
It faithfully reconstructs plausible facial geometries by effectively adopting a pre-trained 3D-aware generative prior.
Compared to existing methods, our method excels in generating realistic facial geometries even under extreme head poses.
arXiv Detail & Related papers (2024-03-29T12:49:40Z)
- FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models [85.16273912625022]
We introduce FaceTalk, a novel generative approach designed for synthesizing high-fidelity 3D motion sequences of talking human heads from audio signal.
To the best of our knowledge, this is the first work to propose a generative approach for realistic and high-quality motion synthesis of human heads.
arXiv Detail & Related papers (2023-12-13T19:01:07Z)
- GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance [83.43852715997596]
GSmoothFace is a novel two-stage generalized talking face generation model guided by a fine-grained 3D face model.
It can synthesize smooth lip dynamics while preserving the speaker's identity.
Both quantitative and qualitative experiments confirm the superiority of our method in terms of realism, lip synchronization, and visual quality.
arXiv Detail & Related papers (2023-12-12T16:00:55Z)
- Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications [20.842799581850617]
We consider the task of animating 3D facial geometry from a speech signal.
Existing works are primarily deterministic, focusing on learning a one-to-one mapping from speech signals to 3D face meshes on small datasets with limited speakers.
arXiv Detail & Related papers (2023-11-30T01:14:43Z)
- Realistic Speech-to-Face Generation with Speech-Conditioned Latent Diffusion Model with Face Prior [13.198105709331617]
We propose a novel speech-to-face generation framework that leverages a Speech-Conditioned Latent Diffusion Model (SCLDM).
This is the first work to harness the exceptional modeling capabilities of diffusion models for speech-to-face generation.
We show that our method can produce more realistic face images while preserving the identity of the speaker better than state-of-the-art methods.
arXiv Detail & Related papers (2023-10-05T07:44:49Z)
- One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field [81.07651217942679]
Talking head generation aims to generate faces that maintain the identity information of the source image and imitate the motion of the driving image.
We propose HiDe-NeRF, which achieves high-fidelity and free-view talking-head synthesis.
arXiv Detail & Related papers (2023-04-11T09:47:35Z)
- Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention [52.63080543011595]
A novel pose-controllable 3D facial animation synthesis method is proposed by utilizing hierarchical audio-vertex attention.
The proposed method can produce more realistic facial expressions and head posture movements.
arXiv Detail & Related papers (2023-02-24T09:36:31Z)
- Everything's Talkin': Pareidolia Face Reenactment [119.49707201178633]
Pareidolia Face Reenactment is defined as animating a static illusory face to move in tandem with a human face in the video.
Because pareidolia face reenactment differs substantially from traditional human face reenactment, two new challenges arise: shape variance and texture variance.
We propose a novel Parametric Unsupervised Reenactment Algorithm to tackle these two challenges.
arXiv Detail & Related papers (2021-04-07T11:19:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.