Controllable 3D Face Generation with Conditional Style Code Diffusion
- URL: http://arxiv.org/abs/2312.13941v2
- Date: Thu, 11 Jan 2024 08:30:50 GMT
- Title: Controllable 3D Face Generation with Conditional Style Code Diffusion
- Authors: Xiaolong Shen, Jianxin Ma, Chang Zhou, Zongxin Yang
- Abstract summary: TEx-Face (TExt & Expression-to-Face) addresses the challenges of controllable 3D face generation by dividing the task into three components, i.e., 3D GAN Inversion, Conditional Style Code Diffusion, and 3D Face Decoding.
Experiments conducted on FFHQ, CelebA-HQ, and CelebA-Dialog demonstrate the promising performance of our TEx-Face.
- Score: 51.24656496304069
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating photorealistic 3D faces from given conditions is a challenging
task. Existing methods often rely on time-consuming one-by-one optimization
approaches, which are inefficient for modeling content drawn from the same
distribution, e.g., faces. Additionally, an ideal controllable 3D face
generation model should consider both facial attributes and expressions. Thus,
we propose a novel approach called TEx-Face (TExt & Expression-to-Face) that addresses these
challenges by dividing the task into three components, i.e., 3D GAN Inversion,
Conditional Style Code Diffusion, and 3D Face Decoding. For 3D GAN inversion,
we introduce two methods which aim to enhance the representation of style codes
and alleviate 3D inconsistencies. Furthermore, we design a style code denoiser
to incorporate multiple conditions into the style code and propose a data
augmentation strategy to address the issue of insufficient paired
visual-language data. Extensive experiments conducted on FFHQ, CelebA-HQ, and
CelebA-Dialog demonstrate the promising performance of our TEx-Face in
achieving the efficient and controllable generation of photorealistic 3D faces.
The code will be available at https://github.com/sxl142/TEx-Face.
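At its core, the method runs diffusion over style codes rather than pixels: a denoiser learns to remove noise from a latent w while attending to fused text and expression conditions, and a frozen 3D-aware generator decodes the result. The PyTorch sketch below is a minimal illustration, assuming concatenation-based condition fusion, illustrative dimensions, and a plain DDPM epsilon-prediction objective; it is not the authors' released architecture.

```python
# Minimal sketch of a conditional style-code denoiser (illustrative only).
import torch
import torch.nn as nn


class StyleCodeDenoiser(nn.Module):
    """Predicts the noise added to a style code w, given a diffusion
    timestep and a fused condition embedding (e.g., text + expression)."""

    def __init__(self, w_dim=512, cond_dim=512, hidden=1024):
        super().__init__()
        self.time_embed = nn.Sequential(
            nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, hidden))
        self.cond_proj = nn.Linear(cond_dim, hidden)
        self.net = nn.Sequential(
            nn.Linear(w_dim + 2 * hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, w_dim),
        )

    def forward(self, w_noisy, t, cond):
        # w_noisy: (B, w_dim); t: (B, 1) normalized timestep; cond: (B, cond_dim)
        h = torch.cat([w_noisy, self.time_embed(t), self.cond_proj(cond)], dim=-1)
        return self.net(h)  # predicted noise epsilon


def ddpm_training_step(model, w0, cond, alphas_cumprod):
    """One standard DDPM epsilon-prediction step applied to style codes."""
    B, T = w0.shape[0], len(alphas_cumprod)
    t = torch.randint(0, T, (B,))
    a_bar = alphas_cumprod[t].unsqueeze(-1)             # cumulative alpha_t
    eps = torch.randn_like(w0)
    w_t = a_bar.sqrt() * w0 + (1 - a_bar).sqrt() * eps  # forward diffusion
    t_in = (t.float() / T).unsqueeze(-1)
    return nn.functional.mse_loss(model(w_t, t_in, cond), eps)
```

Here `alphas_cumprod` would come from a standard noise schedule, e.g. `torch.cumprod(1 - torch.linspace(1e-4, 0.02, 1000), dim=0)`; at sampling time the denoiser is iterated from Gaussian noise down to a clean style code, which the 3D face decoder then renders.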
Related papers
- Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images [105.92311979305065]
TG-3DFace creates more realistic and aesthetically pleasing 3D faces, improving multi-view consistency (MVIC) by 9% over Latent3D.
The face images rendered by TG-3DFace achieve better FID and CLIP scores than those of text-to-2D face/image generation models.
arXiv Detail & Related papers (2023-08-31T14:26:33Z)
- Fake It Without Making It: Conditioned Face Generation for Accurate 3D Face Reconstruction [5.079602839359523]
We present a method to generate a large-scale synthesised dataset of 250K photorealistic images and their corresponding shape parameters and depth maps, which we call SynthFace.
Our synthesis method conditions Stable Diffusion on depth maps sampled from the FLAME 3D Morphable Model (3DMM) of the human face, allowing us to generate a diverse set of shape-consistent facial images that is designed to be balanced in race and gender.
We propose ControlFace, a deep neural network trained on SynthFace, which achieves competitive performance on the NoW benchmark without requiring 3D supervision or manual 3D asset creation.
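The depth-conditioning idea in SynthFace can be approximated with off-the-shelf tooling. Below is a minimal sketch using the public diffusers depth ControlNet, assuming a FLAME-rendered depth map already saved to disk; the file path and prompt are hypothetical, and the paper's actual conditioning setup may differ.

```python
# Sketch: conditioning Stable Diffusion on a depth map via ControlNet.
import torch
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

# Assumed input: a depth map rendered from a sampled FLAME 3DMM shape
# (hypothetical path; the paper renders these itself).
depth = Image.open("flame_depth_render.png").convert("RGB")

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

image = pipe(
    "a photorealistic studio portrait of a person",
    image=depth,                  # the depth map steers the head geometry
    num_inference_steps=30,
).images[0]
image.save("synthface_style_sample.png")
```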
arXiv Detail & Related papers (2023-07-25T16:42:06Z)
- Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412]
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing.
A corresponding generic 3D GAN inversion framework is still missing, limiting the applications of 3D face reconstruction and semantic editing.
We study the challenging problem of 3D GAN inversion where a latent code is predicted given a single face image to faithfully recover its 3D shapes and detailed textures.
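For flavor, encoder-based inversion can be pictured as a CNN that regresses per-layer latent codes from one image, trained so that a frozen generator reproduces the input. Below is a minimal sketch, assuming a placeholder ResNet backbone and a generator callable; the paper's geometry-aware encoder and losses are more elaborate.

```python
# Sketch of encoder-based GAN inversion (placeholder architecture).
import torch
import torch.nn as nn
from torchvision.models import resnet18


class InversionEncoder(nn.Module):
    """Regresses per-layer w+ style codes from a single face image."""

    def __init__(self, w_dim=512, num_ws=14):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Linear(backbone.fc.in_features, w_dim * num_ws)
        self.backbone, self.w_dim, self.num_ws = backbone, w_dim, num_ws

    def forward(self, img):                          # img: (B, 3, H, W)
        return self.backbone(img).view(-1, self.num_ws, self.w_dim)


def inversion_loss(encoder, generator, img):
    """Pixel reconstruction against a frozen generator; real systems add
    perceptual, identity, and (here) geometry-aware terms."""
    recon = generator(encoder(img))
    return nn.functional.l1_loss(recon, img)
```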
arXiv Detail & Related papers (2022-12-14T18:49:50Z)
- Generating 2D and 3D Master Faces for Dictionary Attacks with a Network-Assisted Latent Space Evolution [68.8204255655161]
A master face is a face image that passes face-based identity authentication for a high percentage of the population.
We optimize these faces for 2D and 3D face verification models.
In 3D, we generate faces using the 2D StyleGAN2 generator and predict a 3D structure using a deep 3D face reconstruction network.
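The search behind a master face can be pictured as an evolution strategy over latent vectors, with fitness equal to the fraction of a verification gallery a candidate passes. The toy loop below makes that concrete; `generate` and `verify` are hypothetical stand-ins for the StyleGAN2 generator and a face verification model, and the paper's network-assisted evolution strategy is considerably more refined.

```python
# Toy latent-space evolution for a high-coverage ("master") face.
import torch


def coverage(z, generate, verify, gallery, thresh=0.5):
    """Fraction of the gallery that the verifier matches to generate(z)."""
    face = generate(z)
    scores = torch.stack([verify(face, g) for g in gallery])
    return (scores > thresh).float().mean().item()


def evolve(generate, verify, gallery, dim=512, pop=32, iters=100, sigma=0.5):
    best_z, best_fit = torch.randn(dim), -1.0
    for _ in range(iters):
        cands = best_z + sigma * torch.randn(pop, dim)   # mutate around best
        fits = [coverage(c, generate, verify, gallery) for c in cands]
        i = max(range(pop), key=fits.__getitem__)
        if fits[i] > best_fit:
            best_z, best_fit = cands[i], fits[i]
    return best_z, best_fit
```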
arXiv Detail & Related papers (2022-11-25T09:15:38Z)
- CGOF++: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields [52.14985242487535]
We propose a new conditional 3D face synthesis framework, which enables 3D controllability over generated face images.
At its core is a conditional Generative Occupancy Field (cGOF++) that effectively enforces the shape of the generated face to conform to a given 3D Morphable Model (3DMM) mesh.
Experiments validate the effectiveness of the proposed method and show more precise 3D controllability than state-of-the-art 2D-based controllable face synthesis methods.
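Conceptually, a conditional occupancy field is an MLP that maps a 3D query point plus 3DMM coefficients to an occupancy value, which lets the generated surface be supervised to agree with the given mesh. A minimal sketch with illustrative layer sizes follows; the cGOF++ formulation and its mesh-consistency losses are more involved.

```python
# Sketch of an occupancy field conditioned on 3DMM coefficients.
import torch
import torch.nn as nn


class ConditionalOccupancyField(nn.Module):
    def __init__(self, cond_dim=100, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, pts, cond):
        # pts: (B, N, 3) query points; cond: (B, cond_dim) 3DMM coefficients
        c = cond.unsqueeze(1).expand(-1, pts.shape[1], -1)
        return torch.sigmoid(self.net(torch.cat([pts, c], dim=-1)))  # (B, N, 1)
```

Training could then sample points on and inside the 3DMM mesh and push predicted occupancy toward 1 there (and toward 0 outside), which is the spirit of enforcing mesh conformity.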
arXiv Detail & Related papers (2022-11-23T19:02:50Z)
- Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields [40.2714783162419]
We propose a new conditional 3D face synthesis framework, which enables 3D controllability over generated face images.
At its core is a conditional Generative Occupancy Field (cGOF) that effectively enforces the shape of the generated face to conform to a given 3D Morphable Model (3DMM) mesh.
Experiments validate the effectiveness of the proposed method, which is able to generate high-fidelity face images.
arXiv Detail & Related papers (2022-06-16T17:58:42Z)
- DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation [56.56575063461169]
DeepFaceFlow is a robust, fast, and highly accurate framework for estimating 3D non-rigid facial flow.
Our framework was trained and tested on two very large-scale facial video datasets.
Given registered pairs of images, our framework generates 3D flow maps at 60 fps.
arXiv Detail & Related papers (2020-05-14T23:56:48Z)