Learning to Generate Customized Dynamic 3D Facial Expressions
- URL: http://arxiv.org/abs/2007.09805v2
- Date: Tue, 21 Jul 2020 16:18:15 GMT
- Title: Learning to Generate Customized Dynamic 3D Facial Expressions
- Authors: Rolandos Alexandros Potamias, Jiali Zheng, Stylianos Ploumpis, Giorgos
Bouritsas, Evangelos Ververas, Stefanos Zafeiriou
- Abstract summary: We study 3D image-to-video translation with a particular focus on 4D facial expressions.
We employ a deep mesh encoder-decoder architecture to synthesize realistic, high-resolution facial expressions.
We trained our model using a high-resolution dataset with 4D scans of six facial expressions from 180 subjects.
- Score: 47.5220752079009
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in deep learning have significantly pushed the
state-of-the-art in photorealistic video animation given a single image. In
this paper, we extrapolate those advances to the 3D domain, by studying 3D
image-to-video translation with a particular focus on 4D facial expressions.
Although 3D facial generative models have been widely explored during the past
years, 4D animation remains relatively unexplored. To this end, in this study
we employ a deep mesh encoder-decoder-like architecture to synthesize realistic
high-resolution facial expressions by using a single neutral frame along with
an expression identification. In addition, processing 3D meshes remains a
non-trivial task compared to data that live on grid-like structures, such as
images. Given the recent progress in mesh processing with graph convolutions,
we make use of a recently introduced learnable operator which acts directly on
the mesh structure by taking advantage of local vertex orderings. In order to
generalize to 4D facial expressions across subjects, we trained our model using
a high-resolution dataset with 4D scans of six facial expressions from 180
subjects. Experimental results demonstrate that our approach preserves the
subject's identity information even for unseen subjects and generates high
quality expressions. To the best of our knowledge, this is the first study
tackling the problem of 4D facial expression synthesis.
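The abstract's "learnable operator which acts directly on the mesh structure by taking advantage of local vertex orderings" is consistent with spiral convolution operators for meshes. Below is a minimal PyTorch sketch of such a spiral-convolution encoder-decoder conditioned on an expression label; the class names (SpiralConv, ExpressionMeshAE), layer widths, spiral length, and conditioning scheme are illustrative assumptions rather than the paper's exact configuration, and mesh down/up-sampling between levels is omitted.

```python
# Minimal sketch, assuming PyTorch and a precomputed (V, L) long tensor of
# spiral-ordered neighbour indices for a fixed mesh template.
import torch
import torch.nn as nn

class SpiralConv(nn.Module):
    """Learnable operator over a fixed local vertex ordering (a 'spiral')."""
    def __init__(self, in_ch, out_ch, spiral_idx):
        super().__init__()
        self.register_buffer("spiral_idx", spiral_idx)        # (V, L)
        self.fc = nn.Linear(in_ch * spiral_idx.shape[1], out_ch)

    def forward(self, x):                                     # x: (B, V, in_ch)
        B, V, C = x.shape
        L = self.spiral_idx.shape[1]
        nbrs = x[:, self.spiral_idx.reshape(-1), :]           # gather spirals
        return torch.relu(self.fc(nbrs.reshape(B, V, L * C)))

class ExpressionMeshAE(nn.Module):
    """Encode a neutral mesh, concatenate an expression embedding, decode offsets."""
    def __init__(self, spiral_idx, n_expr=6, latent=128):
        super().__init__()
        self.enc1 = SpiralConv(3, 32, spiral_idx)
        self.enc2 = SpiralConv(32, 64, spiral_idx)
        self.to_code = nn.Linear(64, latent)                  # per-vertex code
        self.expr_emb = nn.Embedding(n_expr, latent)          # expression label
        self.dec1 = SpiralConv(2 * latent, 64, spiral_idx)
        self.dec2 = SpiralConv(64, 32, spiral_idx)
        self.out = nn.Linear(32, 3)                           # per-vertex offset

    def forward(self, verts, expr_id):                        # verts: (B, V, 3)
        z = self.to_code(self.enc2(self.enc1(verts)))         # (B, V, latent)
        e = self.expr_emb(expr_id)[:, None, :].expand_as(z)   # broadcast label
        h = self.dec2(self.dec1(torch.cat([z, e], dim=-1)))
        return verts + self.out(h)                            # expressive mesh
```

A faithful implementation would interleave mesh pooling/unpooling between the spiral layers and decode one mesh per frame to obtain a 4D sequence; this sketch maps the neutral input to a single expressive frame.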
Related papers
- 4-LEGS: 4D Language Embedded Gaussian Splatting [12.699978393733309]
We show how to lift spatio-temporal features to a 4D representation based on 3D Gaussian Splatting.
This enables an interactive interface where the user can temporally localize events in the video from text prompts.
We demonstrate our system on public 3D video datasets of people and animals performing various actions.
arXiv Detail & Related papers (2024-10-14T17:00:53Z)
- Segment Any 4D Gaussians [69.53172192552508]
We propose Segment Any 4D Gaussians (SA4D) to segment anything in the 4D digital world based on 4D Gaussians.
SA4D achieves precise, high-quality segmentation within seconds in 4D Gaussians and shows the ability to remove, recolor, compose, and render high-quality anything masks.
arXiv Detail & Related papers (2024-07-05T13:44:15Z)
- AnimateMe: 4D Facial Expressions via Diffusion Models [72.63383191654357]
Recent advances in diffusion models have enhanced the capabilities of generative models in 2D animation.
We employ Graph Neural Networks (GNNs) as denoising diffusion models in a novel approach, formulating the diffusion process directly on the mesh space.
This facilitates the generation of facial deformations through a mesh-diffusion-based model.
arXiv Detail & Related papers (2024-03-25T21:40:44Z)
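Since the AnimateMe summary above formulates denoising diffusion directly on mesh space with a GNN denoiser, a generic sketch may make this concrete. The code below shows a DDPM-style training step over per-vertex coordinates with a simple graph-convolutional denoiser; the names (GraphDenoiser, diffusion_training_step), noise schedule, and network depth are assumptions for illustration, not AnimateMe's actual architecture.

```python
# Hedged sketch: DDPM-style diffusion on mesh vertices with a plain-PyTorch
# graph denoiser; `adj_norm` is an assumed precomputed (V, V) normalized
# mesh adjacency matrix.
import torch
import torch.nn as nn

class GraphDenoiser(nn.Module):
    """Predicts the noise added to per-vertex 3D coordinates at timestep t."""
    def __init__(self, adj_norm, hidden=64):
        super().__init__()
        self.register_buffer("adj", adj_norm)             # (V, V)
        self.inp = nn.Linear(3 + 1, hidden)               # xyz + scalar timestep
        self.gc1 = nn.Linear(hidden, hidden)
        self.gc2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 3)

    def forward(self, x_t, t):                            # x_t: (B, V, 3), t: (B,)
        B, V, _ = x_t.shape
        t_feat = (t.float() / 1000.0)[:, None, None].expand(B, V, 1)
        h = torch.relu(self.inp(torch.cat([x_t, t_feat], dim=-1)))
        h = torch.relu(self.gc1(self.adj @ h))            # aggregate neighbours
        h = torch.relu(self.gc2(self.adj @ h))
        return self.out(h)                                # predicted noise

def diffusion_training_step(model, x0, betas, opt):
    """One DDPM step: noise clean vertex data x0, regress the added noise."""
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)        # betas: 1D schedule
    t = torch.randint(0, len(betas), (x0.shape[0],))
    a = alphas_bar[t][:, None, None]
    eps = torch.randn_like(x0)
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * eps          # forward noising
    loss = ((model(x_t, t) - eps) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

Sampling would run the standard reverse process from Gaussian noise; the identity and expression conditioning that 4D facial generation requires is omitted here for brevity.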
- Comp4D: LLM-Guided Compositional 4D Scene Generation [65.5810466788355]
We present Comp4D, a novel framework for Compositional 4D Generation.
Unlike conventional methods that generate a singular 4D representation of the entire scene, Comp4D innovatively constructs each 4D object within the scene separately.
Our method employs a compositional score distillation technique guided by the pre-defined trajectories.
arXiv Detail & Related papers (2024-03-25T17:55:52Z)
- 4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency [118.15258850780417]
This work introduces 4DGen, a novel framework for grounded 4D content creation.
We identify static 3D assets and monocular video sequences as key components in constructing the 4D content.
Our pipeline facilitates conditional 4D generation, enabling users to specify geometry (3D assets) and motion (monocular videos).
arXiv Detail & Related papers (2023-12-28T18:53:39Z)
- DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation [18.511092587156657]
We present a novel self-supervised method for learning dense 3D facial geometry from face videos.
We also propose a strategy to learn pixel-level uncertainties to perceive more reliable rigid-motion pixels for geometry learning.
We develop a 3D-aware cross-modal (i.e., appearance and depth) attention mechanism to capture facial geometries in a coarse-to-fine manner.
arXiv Detail & Related papers (2023-05-10T14:58:33Z)
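To make the "3D-aware cross-modal (i.e., appearance and depth) attention" in the DaGAN++ summary above concrete, here is a minimal sketch in which appearance tokens attend to depth tokens via standard multi-head attention; the module name, feature shapes, and residual scheme are assumptions, and the paper's coarse-to-fine scheduling is not reproduced.

```python
# Minimal sketch, assuming (B, C, H, W) feature maps from hypothetical
# appearance and depth encoders.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Appearance features attend to depth features over spatial positions."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, appearance, depth):
        B, C, H, W = appearance.shape
        q = appearance.flatten(2).transpose(1, 2)         # (B, H*W, C) queries
        kv = depth.flatten(2).transpose(1, 2)             # keys/values from depth
        fused, _ = self.attn(q, kv, kv)                   # depth-aware aggregation
        fused = self.norm(fused + q)                      # residual + norm
        return fused.transpose(1, 2).reshape(B, C, H, W)

# Example with random stand-in features:
fused = CrossModalAttention(dim=128)(torch.randn(2, 128, 32, 32),
                                     torch.randn(2, 128, 32, 32))
```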
- AniFaceGAN: Animatable 3D-Aware Face Image Generation for Video Avatars [71.00322191446203]
2D generative models often suffer from undesirable artifacts when rendering images from different camera viewpoints.
Recently, 3D-aware GANs extend 2D GANs for explicit disentanglement of camera pose by leveraging 3D scene representations.
We propose an animatable 3D-aware GAN for multiview consistent face animation generation.
arXiv Detail & Related papers (2022-10-12T17:59:56Z)
- Generating Multiple 4D Expression Transitions by Learning Face Landmark Trajectories [26.63401369410327]
In the real world, people show more complex expressions and switch from one expression to another.
We propose a new model that generates transitions between different expressions, and synthesizes long and composed 4D expressions.
arXiv Detail & Related papers (2022-07-29T19:33:56Z)
- 3D to 4D Facial Expressions Generation Guided by Landmarks [35.61963927340274]
Given one input 3D neutral face, can we generate dynamic 3D (4D) facial expressions from it?
We first propose a mesh encoder-decoder architecture (Expr-ED) that exploits a set of 3D landmarks to generate an expressive 3D face from its neutral counterpart.
We extend it to 4D by modeling the temporal dynamics of facial expressions using a manifold-valued GAN.
arXiv Detail & Related papers (2021-05-16T15:52:29Z)