HeadSculpt: Crafting 3D Head Avatars with Text
- URL: http://arxiv.org/abs/2306.03038v2
- Date: Tue, 29 Aug 2023 11:08:59 GMT
- Authors: Xiao Han, Yukang Cao, Kai Han, Xiatian Zhu, Jiankang Deng, Yi-Zhe Song, Tao Xiang, Kwan-Yee K. Wong
- Abstract summary: We introduce a versatile pipeline dubbed HeadSculpt for crafting 3D head avatars from textual prompts.
We first equip the diffusion model with 3D awareness by leveraging landmark-based control and a learned textual embedding.
We propose a novel identity-aware editing score distillation strategy to optimize a textured mesh with a high-resolution differentiable rendering technique.
- Score: 143.14548696613886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, text-guided 3D generative methods have made remarkable advancements
in producing high-quality textures and geometry, capitalizing on the
proliferation of large vision-language and image diffusion models. However,
existing methods still struggle to create high-fidelity 3D head avatars in two
aspects: (1) They rely mostly on a pre-trained text-to-image diffusion model
whilst missing the necessary 3D awareness and head priors. This makes them
prone to inconsistency and geometric distortions in the generated avatars. (2)
They fall short in fine-grained editing. This is primarily due to the inherited
limitations from the pre-trained 2D image diffusion models, which become more
pronounced when it comes to 3D head avatars. In this work, we address these
challenges by introducing a versatile coarse-to-fine pipeline dubbed HeadSculpt
for crafting (i.e., generating and editing) 3D head avatars from textual
prompts. Specifically, we first equip the diffusion model with 3D awareness by
leveraging landmark-based control and a learned textual embedding representing
the back view appearance of heads, enabling 3D-consistent head avatar
generations. We further propose a novel identity-aware editing score
distillation strategy to optimize a textured mesh with a high-resolution
differentiable rendering technique. This enables identity preservation while
following the editing instruction. We showcase HeadSculpt's superior fidelity
and editing capabilities through comprehensive experiments and comparisons with
existing methods.
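As background for the score distillation mentioned in the abstract, the generic score distillation sampling (SDS) gradient popularized by DreamFusion can be written as below. This is only the standard formulation, not HeadSculpt's identity-aware editing variant, which the abstract does not spell out. All symbols are notation assumed here for illustration: \theta denotes the parameters of the 3D representation, g a differentiable renderer, \epsilon_\phi the noise predictor of the pre-trained diffusion model, y the text prompt, and w(t) a timestep-dependent weight.

```latex
% Render an image from the 3D parameters and add noise at timestep t
x = g(\theta), \qquad
x_t = \alpha_t \, x + \sigma_t \, \epsilon, \qquad
\epsilon \sim \mathcal{N}(0, I)

% Generic SDS gradient (the U-Net Jacobian is omitted, as in the standard formulation)
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\,\epsilon}\!\left[
      w(t)\,\bigl(\epsilon_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right]
```

In this reading, the difference between predicted and injected noise acts as a per-pixel update direction that is back-propagated through the renderer to the mesh and texture parameters.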
Related papers
- UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures [80.047065473698]
We propose a novel 3D avatar generation approach termed UltrAvatar with enhanced fidelity of geometry, and superior quality of physically based rendering (PBR) textures without unwanted lighting.
We demonstrate the effectiveness and robustness of the proposed method, outperforming the state-of-the-art methods by a large margin in the experiments.
arXiv Detail & Related papers (2024-01-20T01:55:17Z)
- AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections [78.81539337399391]
We present an animatable 3D-aware GAN that generates portrait images with controllable facial expression, head pose, and shoulder movements.
It is a generative model trained on unstructured 2D image collections without using 3D or video data.
A dual-camera rendering and adversarial learning scheme is proposed to improve the quality of the generated faces.
arXiv Detail & Related papers (2023-09-05T12:44:57Z)
- Articulated 3D Head Avatar Generation using Text-to-Image Diffusion Models [107.84324544272481]
The ability to generate diverse 3D articulated head avatars is vital to a plethora of applications, including augmented reality, cinematography, and education.
Recent work on text-guided 3D object generation has shown great promise in addressing these needs.
We show that our diffusion-based articulated head avatars outperform state-of-the-art approaches for this task.
arXiv Detail & Related papers (2023-07-10T19:15:32Z)
- AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation [14.062402203105712]
AvatarBooth is a novel method for generating high-quality 3D avatars using text prompts or specific images.
Our key contribution is the precise avatar generation control by using dual fine-tuned diffusion models.
We present a multi-resolution rendering strategy that facilitates coarse-to-fine supervision of 3D avatar generation.
arXiv Detail & Related papers (2023-06-16T14:18:51Z)
- DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785]
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars.
We leverage the SMPL model to provide shape and pose guidance for the generation.
We also jointly optimize the losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem.
arXiv Detail & Related papers (2023-04-03T12:11:51Z)
- Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion [66.26780039133122]
This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars.
The memory and processing costs in 3D are prohibitive for producing the rich details required for high-quality avatars.
We can generate highly detailed avatars with realistic hairstyles and facial hair like beards.
arXiv Detail & Related papers (2022-12-12T18:59:40Z)