Text2Avatar: Text to 3D Human Avatar Generation with Codebook-Driven
Body Controllable Attribute
- URL: http://arxiv.org/abs/2401.00711v1
- Date: Mon, 1 Jan 2024 09:39:57 GMT
- Authors: Chaoqun Gong, Yuqin Dai, Ronghui Li, Achun Bao, Jun Li, Jian Yang,
Yachao Zhang, Xiu Li
- Abstract summary: We propose Text2Avatar, which can generate realistic-style 3D avatars based on the coupled text prompts.
To alleviate the scarcity of realistic style 3D human avatar data, we utilize a pre-trained unconditional 3D human avatar generation model.
- Score: 33.330629835556664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating 3D human models directly from text helps reduce the cost and time
of character modeling. However, achieving multi-attribute controllable and
realistic 3D human avatar generation is still challenging due to feature
coupling and the scarcity of realistic 3D human avatar datasets. To address
these issues, we propose Text2Avatar, which can generate realistic-style 3D
avatars based on the coupled text prompts. Text2Avatar leverages a discrete
codebook as an intermediate feature to establish a connection between text and
avatars, enabling the disentanglement of features. Furthermore, to alleviate
the scarcity of realistic style 3D human avatar data, we utilize a pre-trained
unconditional 3D human avatar generation model to obtain a large amount of 3D
avatar pseudo data, which allows Text2Avatar to achieve realistic style
generation. Experimental results demonstrate that our method can generate
realistic 3D avatars from coupled textual data, which is challenging for other
existing methods in this field.
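The codebook mechanism described in the abstract can be illustrated with a toy sketch: a coupled text prompt is parsed into per-attribute choices, and each choice indexes a small discrete codebook whose code vector contributes one slice of the intermediate feature. All attribute names, codebook sizes, and the lookup scheme below are hypothetical assumptions for illustration, not the paper's actual implementation.

```python
# Toy sketch of a discrete-codebook intermediate feature: one small
# codebook per body attribute, so each attribute maps to its own code
# vector and stays disentangled from the others. Names and sizes here
# are illustrative assumptions, not the paper's code.
import random

random.seed(0)

ATTRIBUTES = {
    "sleeve_length": ["short", "long"],
    "pants_length": ["short", "long"],
    "hair_length": ["short", "long"],
}
CODE_DIM = 8

# Each codebook row is the code vector for one discrete attribute value.
codebooks = {
    name: [[random.gauss(0, 1) for _ in range(CODE_DIM)] for _ in values]
    for name, values in ATTRIBUTES.items()
}

def encode_prompt(attrs):
    """Map parsed attribute choices to a concatenated intermediate feature.

    `attrs` maps attribute name -> chosen value, e.g. parsed from a
    coupled prompt like "short sleeves, long pants, long hair".
    """
    feature = []
    for name, values in ATTRIBUTES.items():
        idx = values.index(attrs[name])       # discrete code index
        feature.extend(codebooks[name][idx])  # codebook lookup
    return feature

feat = encode_prompt(
    {"sleeve_length": "short", "pants_length": "long", "hair_length": "long"}
)
print(len(feat))  # 24 = 3 attributes x 8-dim codes
```

Because each attribute owns its own codebook, changing one attribute in the prompt swaps only that attribute's slice of the feature vector while the others stay fixed, which is the disentanglement property the abstract attributes to the codebook.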
Related papers
- WildAvatar: Web-scale In-the-wild Video Dataset for 3D Avatar Creation [55.85887047136534] (2024-07-02)
WildAvatar is a web-scale in-the-wild human avatar creation dataset extracted from YouTube.
We evaluate several state-of-the-art avatar creation methods on the dataset, highlighting unexplored challenges in real-world avatar creation.
- AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text [71.09533176800707] (2023-11-29)
AvatarStudio is a coarse-to-fine generative model that produces explicit textured 3D meshes for animatable human avatars.
By effectively leveraging the synergy between the articulated mesh representation and a DensePose-conditional diffusion model, AvatarStudio can create high-quality avatars.
- AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose [23.76390935089982] (2023-08-07)
We present AvatarVerse, a stable pipeline for generating expressive, high-quality 3D avatars from text descriptions and pose guidance.
To this end, we propose a 3D avatar modeling approach that is not only more expressive but also more stable and of higher quality.
- AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation [14.062402203105712] (2023-06-16)
AvatarBooth is a novel method for generating high-quality 3D avatars from text prompts or specific images.
Our key contribution is precise control over avatar generation via dual fine-tuned diffusion models.
We present a multi-resolution rendering strategy that enables coarse-to-fine supervision of 3D avatar generation.
- DreamWaltz: Make a Scene with Complex 3D Animatable Avatars [68.49935994384047] (2023-05-21)
We present DreamWaltz, a novel framework for generating and animating complex 3D avatars given text guidance and a parametric human body prior.
For animation, our method learns an animatable 3D avatar representation from abundant image priors of a diffusion model conditioned on various poses.
- DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785] (2023-04-03)
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars.
We leverage the SMPL model to provide shape and pose guidance for the generation.
We also jointly optimize losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem.
- Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion [66.26780039133122] (2022-12-12)
This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars.
Memory and processing costs in 3D are prohibitive for producing the rich details required for high-quality avatars.
We can generate highly detailed avatars with realistic hairstyles and facial hair such as beards.
- AvatarGen: a 3D Generative Model for Animatable Human Avatars [108.11137221845352] (2022-08-01)
AvatarGen is the first method that enables not only non-rigid human generation with diverse appearance but also full control over poses and viewpoints.
To model non-rigid dynamics, it introduces a deformation network that learns pose-dependent deformations in the canonical space.
Our method generates animatable human avatars with high-quality appearance and geometry, significantly outperforming previous 3D GANs.
- AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars [37.43588165101838] (2022-05-17)
AvatarCLIP is a zero-shot text-driven framework for 3D avatar generation and animation.
We take advantage of the powerful vision-language model CLIP to supervise neural human generation.
By leveraging the priors learned in a motion VAE, a CLIP-guided reference-based motion synthesis method is proposed to animate the generated 3D avatar.
This list is automatically generated from the titles and abstracts of the papers in this site.