4D Facial Expression Diffusion Model
- URL: http://arxiv.org/abs/2303.16611v2
- Date: Mon, 15 Apr 2024 13:29:47 GMT
- Title: 4D Facial Expression Diffusion Model
- Authors: Kaifeng Zou, Sylvain Faisan, Boyang Yu, Sébastien Valette, Hyewon Seo
- Abstract summary: We introduce a generative framework for generating 3D facial expression sequences.
It is composed of two tasks: (1) learning a generative model over a set of 3D landmark sequences, and (2) generating 3D mesh sequences of an input facial mesh driven by the generated landmark sequences.
Experiments show that our model learns to generate realistic, high-quality expressions solely from a relatively small dataset, improving over state-of-the-art methods.
- Score: 3.507793603897647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Facial expression generation is one of the most challenging and long-sought aspects of character animation, with many interesting applications. This challenging task has traditionally relied heavily on digital craftspersons and remains largely unexplored by automatic methods. In this paper, we introduce a generative framework for generating 3D facial expression sequences (i.e. 4D faces) that can be conditioned on different inputs to animate an arbitrary 3D face mesh. It is composed of two tasks: (1) learning a generative model over a set of 3D landmark sequences, and (2) generating 3D mesh sequences of an input facial mesh driven by the generated landmark sequences. The generative model is based on a Denoising Diffusion Probabilistic Model (DDPM), which has achieved remarkable success in generative tasks of other domains. While it can be trained unconditionally, its reverse process can still be conditioned on various signals. This allows us to efficiently develop several downstream tasks involving various forms of conditional generation, using expression labels, text, partial sequences, or simply a facial geometry. To obtain the full mesh deformation, we then develop a landmark-guided encoder-decoder that applies the geometric deformation encoded in the landmarks to a given facial mesh. Experiments show that our model learns to generate realistic, high-quality expressions solely from a relatively small dataset, improving over state-of-the-art methods. Videos and qualitative comparisons with other methods can be found at \url{https://github.com/ZOUKaifeng/4DFM}.
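To illustrate the kind of conditional reverse process the abstract describes, below is a minimal PyTorch sketch: it denoises a sequence of 3D landmarks step by step, with an optional expression-label embedding injected as the conditioning signal. All module names, dimensions, and the GRU backbone are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): conditional reverse step of a
# DDPM over 3D landmark sequences, assuming T frames with K landmarks each.
# EpsNet, its hyperparameters, and the GRU backbone are hypothetical choices.
import torch
import torch.nn as nn

T_FRAMES, K_LANDMARKS, N_STEPS = 60, 68, 1000

class EpsNet(nn.Module):
    """Predicts the noise added to a landmark sequence, optionally
    conditioned on an expression-label embedding."""
    def __init__(self, n_labels=12, d_model=256):
        super().__init__()
        self.in_proj = nn.Linear(K_LANDMARKS * 3, d_model)
        self.t_embed = nn.Embedding(N_STEPS, d_model)
        self.label_embed = nn.Embedding(n_labels, d_model)
        self.backbone = nn.GRU(d_model, d_model, batch_first=True)
        self.out_proj = nn.Linear(d_model, K_LANDMARKS * 3)

    def forward(self, x_t, t, label=None):
        # x_t: (B, T_FRAMES, K_LANDMARKS*3) noisy landmark sequence, t: (B,)
        h = self.in_proj(x_t) + self.t_embed(t)[:, None, :]
        if label is not None:                      # optional conditioning signal
            h = h + self.label_embed(label)[:, None, :]
        h, _ = self.backbone(h)
        return self.out_proj(h)

@torch.no_grad()
def reverse_step(model, x_t, t, betas, label=None):
    """One ancestral sampling step x_t -> x_{t-1} of the reverse process."""
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    eps = model(x_t, t, label)
    coef = betas[t] / torch.sqrt(1.0 - alpha_bar[t])
    mean = (x_t - coef[:, None, None] * eps) / torch.sqrt(alphas[t])[:, None, None]
    noise = torch.randn_like(x_t) if t[0] > 0 else torch.zeros_like(x_t)
    return mean + torch.sqrt(betas[t])[:, None, None] * noise
```

Conditioning on text, partial sequences, or an input facial geometry would follow the same pattern, swapping the label embedding for the corresponding encoding; the second stage (landmark-guided mesh deformation) is a separate encoder-decoder not sketched here.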
Related papers
- MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present a family of generative pre-trained auto-regressive models that address 3D mesh generation with modern large language model approaches.
MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
arXiv Detail & Related papers (2024-05-31T14:35:35Z)
- ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling [96.87575334960258]
ID-to-3D is a method to generate identity- and text-guided 3D human heads with disentangled expressions.
Results achieve an unprecedented level of identity-consistent and high-quality texture and geometry generation.
arXiv Detail & Related papers (2024-05-26T13:36:45Z)
- AnimateMe: 4D Facial Expressions via Diffusion Models [72.63383191654357]
Recent advances in diffusion models have enhanced the capabilities of generative models in 2D animation.
We employ Graph Neural Networks (GNNs) as denoising diffusion models in a novel approach, formulating the diffusion process directly in mesh space.
This facilitates the generation of facial deformations through a mesh-diffusion-based model.
arXiv Detail & Related papers (2024-03-25T21:40:44Z)
- CGOF++: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields [52.14985242487535]
We propose a new conditional 3D face synthesis framework, which enables 3D controllability over generated face images.
At its core is a conditional Generative Occupancy Field (cGOF++) that effectively enforces the shape of the generated face to conform to a given 3D Morphable Model (3DMM) mesh.
Experiments validate the effectiveness of the proposed method and show more precise 3D controllability than state-of-the-art 2D-based controllable face synthesis methods.
arXiv Detail & Related papers (2022-11-23T19:02:50Z)
- Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars [36.4402388864691]
3D-aware generative adversarial networks (GANs) synthesize high-fidelity and multi-view-consistent facial images using only collections of single-view 2D imagery.
Recent efforts incorporate 3D Morphable Face Models (3DMM) to describe deformation in generative radiance fields either explicitly or implicitly.
We propose a novel 3D GAN framework for unsupervised learning of generative, high-quality and 3D-consistent facial avatars from unstructured 2D images.
arXiv Detail & Related papers (2022-11-21T06:40:46Z)
- Controllable 3D Generative Adversarial Face Model via Disentangling Shape and Appearance [63.13801759915835]
3D face modeling has been an active area of research in computer vision and computer graphics.
This paper proposes a new 3D face generative model that can decouple identity and expression.
arXiv Detail & Related papers (2022-08-30T13:40:48Z)
- Generating Multiple 4D Expression Transitions by Learning Face Landmark Trajectories [26.63401369410327]
In the real world, people show more complex expressions, and switch from one expression to another.
We propose a new model that generates transitions between different expressions, and synthesizes long and composed 4D expressions.
arXiv Detail & Related papers (2022-07-29T19:33:56Z)
- Topologically Consistent Multi-View Face Inference Using Volumetric Sampling [25.001398662643986]
ToFu is a geometry inference framework that can produce topologically consistent meshes across identities and expressions.
A novel progressive mesh generation network embeds the topological structure of the face in a feature volume.
These high-quality assets are readily usable by production studios for avatar creation, animation and physically-based skin rendering.
arXiv Detail & Related papers (2021-10-06T17:55:08Z)
- 3D to 4D Facial Expressions Generation Guided by Landmarks [35.61963927340274]
Given one input 3D neutral face, can we generate dynamic 3D (4D) facial expressions from it?
We first propose a mesh encoder-decoder architecture (Expr-ED) that exploits a set of 3D landmarks to generate an expressive 3D face from its neutral counterpart.
We extend it to 4D by modeling the temporal dynamics of facial expressions using a manifold-valued GAN.
arXiv Detail & Related papers (2021-05-16T15:52:29Z)
- OSTeC: One-Shot Texture Completion [86.23018402732748]
We propose an unsupervised approach for one-shot 3D facial texture completion.
The proposed approach rotates an input image in 3D and fills in the unseen regions by reconstructing the rotated image in a 2D face generator.
We frontalize the target image by projecting the completed texture into the generator.
arXiv Detail & Related papers (2020-12-30T23:53:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.