Related papers: High-Fidelity 3D Face Generation from Natural Language Descriptions

URL: http://arxiv.org/abs/2305.03302v1
Date: Fri, 5 May 2023 06:10:15 GMT
Title: High-Fidelity 3D Face Generation from Natural Language Descriptions
Authors: Menghua Wu, Hao Zhu, Linjia Huang, Yiyu Zhuang, Yuanxun Lu, Xun Cao
Abstract summary: We argue the major obstacles lie in 1) the lack of high-quality 3D face data with descriptive text annotation, and 2) the complex mapping relationship between descriptive language space and shape/appearance space. We build the Describe3D dataset, the first large-scale dataset with fine-grained text descriptions for the text-to-3D face generation task. We propose a two-stage framework that first generates a 3D face matching the concrete descriptions, then optimizes the parameters in the 3D shape and texture space with abstract descriptions to refine the 3D face model.
Score: 12.22081892575208
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Synthesizing high-quality 3D face models from natural language descriptions is very valuable for many applications, including avatar creation, virtual reality, and telepresence. However, little research has tapped into this task. We argue the major obstacles lie in 1) the lack of high-quality 3D face data with descriptive text annotation, and 2) the complex mapping relationship between descriptive language space and shape/appearance space. To solve these problems, we build the Describe3D dataset, the first large-scale dataset with fine-grained text descriptions for the text-to-3D face generation task. Then we propose a two-stage framework that first generates a 3D face matching the concrete descriptions, then optimizes the parameters in the 3D shape and texture space with abstract descriptions to refine the 3D face model. Extensive experimental results show that our method can produce a faithful 3D face that conforms to the input descriptions with higher accuracy and quality than previous methods. The code and Describe3D dataset are released at https://github.com/zhuhao-nju/describe3d .

Related papers

Text-based Animatable 3D Avatars with Morphable Model Alignment [19.523681764512357]
We propose a novel framework, Anim3D, for text-based realistic animatable 3DGS avatar generation with morphable model alignment. Our method outperforms existing approaches in terms of synthesis quality, alignment, and animation fidelity.
arXiv Detail & Related papers (2025-04-22T12:29:14Z)
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts. Our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets. We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets. We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images [105.92311979305065]
TG-3DFace creates more realistic and aesthetically pleasing 3D faces, improving multi-view consistency (MVIC) by 9% over Latent3D. The rendered face images generated by TG-3DFace achieve better FID and CLIP scores than text-to-2D face/image generation models.
arXiv Detail & Related papers (2023-08-31T14:26:33Z)
TADA! Text to Animatable Digital Avatars [57.52707683788961]
TADA takes textual descriptions and produces expressive 3D avatars with high-quality geometry and lifelike textures. We derive an optimizable high-resolution body model from SMPL-X with 3D displacements and a texture map. We render normals and RGB images of the generated character and exploit their latent embeddings in the SDS training process.
arXiv Detail & Related papers (2023-08-21T17:59:10Z)
Articulated 3D Head Avatar Generation using Text-to-Image Diffusion Models [107.84324544272481]
The ability to generate diverse 3D articulated head avatars is vital to a plethora of applications, including augmented reality, cinematography, and education. Recent work on text-guided 3D object generation has shown great promise in addressing these needs. We show that our diffusion-based articulated head avatars outperform state-of-the-art approaches for this task.
arXiv Detail & Related papers (2023-07-10T19:15:32Z)
Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation [45.69270771487455]
We propose Fantasia3D, a new method for high-quality text-to-3D content creation. Key to Fantasia3D is the disentangled modeling and learning of geometry and appearance. Our framework is more compatible with popular graphics engines, supporting relighting, editing, and physical simulation of the generated 3D assets.
arXiv Detail & Related papers (2023-03-24T09:30:09Z)
RAFaRe: Learning Robust and Accurate Non-parametric 3D Face Reconstruction from Pseudo 2D&3D Pairs [13.11105614044699]
We propose a robust and accurate non-parametric method for single-view 3D face reconstruction (SVFR). A large-scale pseudo 2D&3D dataset is created by first rendering the detailed 3D faces, then swapping the face in the wild images with the rendered face. Our model outperforms previous methods on FaceScape-wild/lab and MICC benchmarks.
arXiv Detail & Related papers (2023-02-10T19:40:26Z)
3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation [107.46972849241168]
The 3D-TOGO model generates 3D objects in the form of a neural radiance field with good texture. Experiments on the largest 3D object dataset (i.e., ABO) are conducted to verify that 3D-TOGO can better generate high-quality 3D objects.
arXiv Detail & Related papers (2022-12-02T11:31:49Z)
FaceScape: 3D Facial Dataset and Benchmark for Single-View 3D Face Reconstruction [29.920622006999732]
We present a large-scale detailed 3D face dataset, FaceScape, and the corresponding benchmark to evaluate single-view facial 3D reconstruction. By training on FaceScape data, a novel algorithm is proposed to predict elaborate riggable 3D face models from a single image input. We also use FaceScape data to generate the in-the-wild and in-the-lab benchmark to evaluate recent methods of single-view face reconstruction.
arXiv Detail & Related papers (2021-11-01T16:48:34Z)
FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction [39.95272819738226]
We present a novel algorithm that is able to predict elaborate riggable 3D face models from a single image input. The FaceScape dataset provides 18,760 textured 3D faces, captured from 938 subjects, each with 20 specific expressions.
arXiv Detail & Related papers (2020-03-31T07:11:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.