Video-Driven Animation of Neural Head Avatars
- URL: http://arxiv.org/abs/2403.04380v1
- Date: Thu, 7 Mar 2024 10:13:48 GMT
- Title: Video-Driven Animation of Neural Head Avatars
- Authors: Wolfgang Paier and Paul Hinzer and Anna Hilsmann and Peter Eisert
- Abstract summary: We present a new approach for video-driven animation of high-quality neural 3D head models.
We introduce an LSTM-based animation network capable of translating person-independent expression features into personalized animation parameters.
- Score: 3.5229503563299915
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present a new approach for video-driven animation of high-quality neural
3D head models, addressing the challenge of person-independent animation from
video input. Typically, high-quality generative models are learned for specific
individuals from multi-view video footage, resulting in person-specific latent
representations that drive the generation process. In order to achieve
person-independent animation from video input, we introduce an LSTM-based
animation network capable of translating person-independent expression features
into personalized animation parameters of person-specific 3D head models. Our
approach combines the advantages of personalized head models (high quality and
realism) with the convenience of video-driven animation employing multi-person
facial performance capture. We demonstrate the effectiveness of our approach
through high-quality animations synthesized from different source videos, as
well as through an ablation study.
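The core idea above, an LSTM that translates person-independent expression features into personalized animation parameters, can be sketched as follows. This is a minimal illustration only: the feature and parameter dimensions, the single-layer cell, and the linear output head are assumptions, not the authors' architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMAnimationNet:
    """Sketch of an LSTM mapping a sequence of person-independent
    expression features to personalized animation parameters.
    Sizes and the linear head are illustrative assumptions."""

    def __init__(self, feat_dim, hidden_dim, param_dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_dim)
        # Stacked weights for the four LSTM gates (input, forget, cell, output).
        self.W = rng.uniform(-s, s, (4 * hidden_dim, feat_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)
        # Linear head: hidden state -> animation parameters of the head model.
        self.W_out = rng.uniform(-s, s, (param_dim, hidden_dim))
        self.b_out = np.zeros(param_dim)
        self.hidden_dim = hidden_dim

    def forward(self, features):
        """features: (T, feat_dim) sequence -> (T, param_dim) parameters."""
        H = self.hidden_dim
        h = np.zeros(H)  # hidden state
        c = np.zeros(H)  # cell state
        params = []
        for x_t in features:
            z = self.W @ np.concatenate([x_t, h]) + self.b
            i = sigmoid(z[0:H])        # input gate
            f = sigmoid(z[H:2 * H])    # forget gate
            g = np.tanh(z[2 * H:3 * H])  # candidate cell update
            o = sigmoid(z[3 * H:4 * H])  # output gate
            c = f * c + i * g
            h = o * np.tanh(c)
            params.append(self.W_out @ h + self.b_out)
        return np.stack(params)
```

The recurrence lets the predicted animation parameters depend on the temporal context of the expression features rather than on single frames, which is the motivation for an LSTM over a per-frame regressor.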
Related papers
- MMHead: Towards Fine-grained Multi-modal 3D Facial Animation [68.04052669266174]
We construct a large-scale multi-modal 3D facial animation dataset, MMHead.
MMHead consists of 49 hours of 3D facial motion sequences, speech audios, and rich hierarchical text annotations.
Based on the MMHead dataset, we establish benchmarks for two new tasks: text-induced 3D talking head animation and text-to-3D facial motion generation.
arXiv Detail & Related papers (2024-10-10T09:37:01Z) - Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters [24.615066741391125]
We propose a holistic solution to automatically animate virtual human faces.
A deep learning model was first trained to retarget the facial expression from input face images to virtual human faces.
A practical toolkit was developed using Unity 3D, making it compatible with the most popular VR applications.
arXiv Detail & Related papers (2024-02-21T11:35:20Z) - GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians [51.46168990249278]
We present an efficient approach to creating realistic human avatars with dynamic 3D appearances from a single video.
GaussianAvatar is validated on both a public dataset and our collected dataset.
arXiv Detail & Related papers (2023-12-04T18:55:45Z) - Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation [27.700371215886683]
Diffusion models have become the mainstream in visual generation research, owing to their robust generative capabilities.
In this paper, we propose a novel framework tailored for character animation.
By expanding the training data, our approach can animate arbitrary characters, yielding superior results in character animation compared to other image-to-video methods.
arXiv Detail & Related papers (2023-11-28T12:27:15Z) - MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model [74.84435399451573]
This paper studies the human image animation task, which aims to generate a video of a certain reference identity following a particular motion sequence.
Existing animation works typically employ the frame-warping technique to animate the reference image towards the target motion.
We introduce MagicAnimate, a diffusion-based framework that aims at enhancing temporal consistency, preserving reference image faithfully, and improving animation fidelity.
arXiv Detail & Related papers (2023-11-27T18:32:31Z) - Audio-Driven 3D Facial Animation from In-the-Wild Videos [16.76533748243908]
Given an arbitrary audio clip, audio-driven 3D facial animation aims to generate lifelike lip motions and facial expressions for a 3D head.
Existing methods typically rely on training their models using limited public 3D datasets that contain a restricted number of audio-3D scan pairs.
We propose a novel method that leverages in-the-wild 2D talking-head videos to train our 3D facial animation model.
arXiv Detail & Related papers (2023-06-20T13:53:05Z) - AnimeCeleb: Large-Scale Animation CelebFaces Dataset via Controllable 3D Synthetic Models [19.6347170450874]
We present a large-scale animation celebfaces dataset (AnimeCeleb) via controllable synthetic animation models.
To facilitate the data generation process, we build a semi-automatic pipeline based on an open 3D software.
This leads to constructing a large-scale animation face dataset that includes multi-pose and multi-style animation faces with rich annotations.
arXiv Detail & Related papers (2021-11-15T10:00:06Z) - MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement [142.9900055577252]
We propose a generic audio-driven facial animation approach that achieves highly realistic motion synthesis results for the entire face.
Our approach ensures highly accurate lip motion, as well as plausible animation of the parts of the face that are uncorrelated with the audio signal, such as eye blinks and eyebrow motion.
arXiv Detail & Related papers (2021-04-16T17:05:40Z) - Going beyond Free Viewpoint: Creating Animatable Volumetric Video of Human Performances [7.7824496657259665]
We present an end-to-end pipeline for the creation of high-quality animatable volumetric video content of human performances.
Semantic enrichment and geometric animation ability are achieved by establishing temporal consistency in the 3D data.
For pose editing, we exploit the captured data as much as possible and kinematically deform the captured frames to fit a desired pose.
arXiv Detail & Related papers (2020-09-02T09:46:12Z) - Audio- and Gaze-driven Facial Animation of Codec Avatars [149.0094713268313]
We describe the first approach to animate Codec Avatars in real-time using audio and/or eye tracking.
Our goal is to display expressive conversations between individuals that exhibit important social signals.
arXiv Detail & Related papers (2020-08-11T22:28:48Z) - Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose [67.31838207805573]
We propose a deep neural network model that takes an audio signal A of a source person and a short video V of a target person as input.
It outputs a synthesized high-quality talking face video with a personalized head pose.
Our method can generate high-quality talking face videos with more distinguishing head movement effects than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-24T10:02:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.