Style Transfer for 2D Talking Head Animation
- URL: http://arxiv.org/abs/2303.09799v2
- Date: Wed, 22 Mar 2023 16:34:57 GMT
- Title: Style Transfer for 2D Talking Head Animation
- Authors: Trong-Thang Pham, Nhat Le, Tuong Do, Hung Nguyen, Erman Tjiputra,
Quang D. Tran, Anh Nguyen
- Abstract summary: We present a new method to generate talking head animation with learnable style references.
Our framework can reconstruct 2D talking head animation based on a single input image and an audio stream.
- Score: 11.740847190449314
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Audio-driven talking head animation is a challenging research topic with many
real-world applications. Recent works have focused on creating photo-realistic
2D animation, while learning different talking or singing styles remains an
open problem. In this paper, we present a new method to generate talking head
animation with learnable style references. Given a set of style reference
frames, our framework can reconstruct 2D talking head animation based on a
single input image and an audio stream. Our method first produces facial
landmark motion from the audio stream and constructs intermediate style
patterns from the style reference images. We then feed both outputs into a
style-aware image generator to generate photo-realistic, high-fidelity 2D
animation. In practice, our framework can extract the style information of a
specific character and transfer it to any new static image for talking head
animation. Extensive experimental results show that our method outperforms
recent state-of-the-art approaches both qualitatively and quantitatively.
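The pipeline described in the abstract can be summarized as a short sketch. The following Python pseudocode is a minimal illustration only; the module names (audio_to_landmarks, style_encoder, style_aware_generator) are hypothetical placeholders for the paper's components, not a published interface.

    # Minimal sketch of the described pipeline. All module names are
    # hypothetical placeholders, not an API released with the paper.
    import numpy as np

    def animate_talking_head(source_image, audio_stream, style_reference_frames,
                             audio_to_landmarks, style_encoder,
                             style_aware_generator):
        """Reconstruct 2D talking-head animation from one image and an audio stream."""
        # 1) Predict facial landmark motion from the audio stream.
        landmark_motion = audio_to_landmarks(audio_stream)    # (T, num_landmarks, 2)

        # 2) Build intermediate style patterns from the style reference frames.
        style_pattern = style_encoder(style_reference_frames)

        # 3) Render each frame with the style-aware image generator.
        frames = [style_aware_generator(source_image, landmarks, style_pattern)
                  for landmarks in landmark_motion]
        return np.stack(frames)                               # (T, H, W, 3)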
Related papers
- FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations [65.64014682930164]
Sketch animations offer a powerful medium for visual storytelling, from simple flip-book doodles to professional studio productions.
We present FlipSketch, a system that brings back the magic of flip-book animation -- just draw your idea and describe how you want it to move!
arXiv Detail & Related papers (2024-11-16T14:53:03Z) - Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial
Animation [41.489700112318864]
Speech-driven 3D facial animation aims to synthesize vivid facial animations that accurately synchronize with speech and match the unique speaking style.
We introduce an innovative speaking style disentanglement method, which enables arbitrary-subject speaking style encoding.
We also propose a novel framework called Mimic to learn disentangled representations of the speaking style and content from facial motions.
arXiv Detail & Related papers (2023-12-18T01:49:42Z) - AnimateZero: Video Diffusion Models are Zero-Shot Image Animators [63.938509879469024]
We propose AnimateZero to unveil the pre-trained text-to-video diffusion model AnimateDiff.
For appearance control, we borrow intermediate latents and their features from the text-to-image (T2I) generation.
For temporal control, we replace the global temporal attention of the original T2V model with our proposed positional-corrected window attention.
arXiv Detail & Related papers (2023-12-06T13:39:35Z) - Unsupervised Learning of Style-Aware Facial Animation from Real Acting
Performances [3.95944314850151]
We present a novel approach for text/speech-driven animation of a photo-realistic head model based on blend-shape geometry, dynamic textures, and neural rendering.
Our animation method is based on a conditional CNN that transforms text or speech into a sequence of animation parameters.
For realistic real-time rendering, we train a U-Net that refines pixelization-based renderings by computing improved colors and a foreground matte.
arXiv Detail & Related papers (2023-06-16T17:58:04Z) - StyleTalk: One-shot Talking Head Generation with Controllable Speaking
Styles [43.12918949398099]
We propose a one-shot style-controllable talking face generation framework.
We aim to attain a speaking style from an arbitrary reference speaking video.
We then drive the one-shot portrait to speak with the reference speaking style and another piece of audio.
arXiv Detail & Related papers (2023-01-03T13:16:24Z) - Language-Guided Face Animation by Recurrent StyleGAN-based Generator [87.56260982475564]
We study a novel task, language-guided face animation, that aims to animate a static face image with the help of languages.
We propose a recurrent motion generator to extract a series of semantic and motion information from the language and feed it along with visual information to a pre-trained StyleGAN to generate high-quality frames.
arXiv Detail & Related papers (2022-08-11T02:57:30Z) - MeshTalk: 3D Face Animation from Speech using Cross-Modality
Disentanglement [142.9900055577252]
We propose a generic audio-driven facial animation approach that achieves highly realistic motion synthesis results for the entire face.
Our approach ensures highly accurate lip motion while also producing plausible animation of the parts of the face that are uncorrelated with the audio signal, such as eye blinks and eyebrow motion.
arXiv Detail & Related papers (2021-04-16T17:05:40Z) - MakeItTalk: Speaker-Aware Talking-Head Animation [49.77977246535329]
We present a method that generates expressive talking heads from a single facial image with audio as the only input.
Based on this intermediate representation, our method is able to synthesize photorealistic videos of entire talking heads with full range of motion.
arXiv Detail & Related papers (2020-04-27T17:56:15Z) - First Order Motion Model for Image Animation [90.712718329677]
Image animation consists of generating a video sequence so that an object in a source image is animated according to the motion of a driving video.
Our framework addresses this problem without using any annotation or prior information about the specific object to animate.
arXiv Detail & Related papers (2020-02-29T07:08:56Z)