DivAvatar: Diverse 3D Avatar Generation with a Single Prompt
- URL: http://arxiv.org/abs/2402.17292v1
- Date: Tue, 27 Feb 2024 08:10:31 GMT
- Title: DivAvatar: Diverse 3D Avatar Generation with a Single Prompt
- Authors: Weijing Tao, Biwen Lei, Kunhao Liu, Shijian Lu, Miaomiao Cui, Xuansong
Xie, Chunyan Miao
- Abstract summary: DivAvatar is a framework that generates diverse avatars from a single text prompt.
It has two key designs that help achieve generation diversity and visual quality.
Extensive experiments show that DivAvatar is highly versatile in generating avatars of diverse appearances.
- Score: 95.9978722953278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-Avatar generation has recently made significant strides due to
advancements in diffusion models. However, most existing work remains
constrained by limited diversity, producing avatars with subtle differences in
appearance for a given text prompt. We design DivAvatar, a novel framework that
generates diverse avatars, empowering 3D creatives with a multitude of distinct
and richly varied 3D avatars from a single text prompt. Different from most
existing work that exploits scene-specific 3D representations such as NeRF,
DivAvatar finetunes a 3D generative model (i.e., EVA3D), allowing diverse
avatar generation simply by sampling noise at inference time. DivAvatar has
two key designs that help achieve generation diversity and visual quality. The
first is a noise sampling technique during the training phase, which is
critical for generating diverse appearances. The second is a semantic-aware
zoom mechanism paired with a novel depth loss: the former produces appearances
of high textual fidelity by fine-tuning specific body parts separately, while
the latter greatly improves geometry quality by smoothing the generated mesh in
the feature space. Extensive experiments show that DivAvatar is highly versatile
in generating avatars of diverse appearances.
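To make the two key designs more concrete, here is a minimal, hypothetical PyTorch sketch. The `generator` callable stands in for the fine-tuned EVA3D model (its real interface is not shown in this listing), and `depth_smoothness_loss` is a generic total-variation-style stand-in for the paper's depth loss; all names and signatures are assumptions for illustration, not the authors' code.

```python
# Minimal sketch, assuming a PyTorch-style interface. `generator` is a
# hypothetical stand-in for the fine-tuned EVA3D model.
import torch

def sample_diverse_avatars(generator, num_avatars=4, z_dim=512, device="cuda"):
    # Because DivAvatar fine-tunes a generative model rather than optimizing a
    # scene-specific representation such as NeRF, diversity at inference
    # reduces to drawing fresh latent noise for each sample.
    avatars = []
    for _ in range(num_avatars):
        z = torch.randn(1, z_dim, device=device)  # independent Gaussian code
        with torch.no_grad():
            avatars.append(generator(z))          # one distinct avatar per code
    return avatars

def depth_smoothness_loss(depth):
    # Generic total-variation-style stand-in for the paper's depth loss,
    # applied to a rendered depth map of shape (B, 1, H, W): penalizing
    # differences between neighboring pixels discourages jagged geometry.
    dx = (depth[:, :, :, 1:] - depth[:, :, :, :-1]).abs().mean()
    dy = (depth[:, :, 1:, :] - depth[:, :, :-1, :]).abs().mean()
    return dx + dy
```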
Related papers
- DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D
Diffusion [69.67970568012599]
We present DreamWaltz-G, a novel learning framework for animatable 3D avatar generation from text.
The core of this framework lies in score distillation and a hybrid 3D Gaussian avatar representation.
Our framework further supports diverse applications, including human video reenactment and multi-subject scene composition.
arXiv Detail & Related papers (2024-09-25T17:59:45Z)
- GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars [44.8290935585746]
Photo-realistic and controllable 3D avatars are crucial for various applications such as virtual and mixed reality (VR/MR), telepresence, gaming, and film production.
Traditional methods for avatar creation often involve time-consuming scanning and reconstruction processes for each avatar.
We propose a text-conditioned generative model that can generate photo-realistic facial avatars of diverse identities.
arXiv Detail & Related papers (2024-08-24T21:25:22Z)
- Instant 3D Human Avatar Generation using Image Diffusion Models [37.45927867788691]
AvatarPopUp is a method for fast, high quality 3D human avatar generation from different input modalities.
Our approach can produce a 3D model in as few as 2 seconds.
arXiv Detail & Related papers (2024-06-11T17:47:27Z)
- GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image [89.70322127648349]
We propose a generic avatar editing approach that can be universally applied to various 3DMM driving volumetric head avatars.
To achieve this goal, we design a novel expression-aware modification generative model, which lifts 2D editing from a single image to a consistent 3D modification field.
arXiv Detail & Related papers (2024-04-02T17:58:35Z)
- UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures [80.047065473698]
We propose a novel 3D avatar generation approach termed UltrAvatar with enhanced geometric fidelity and superior-quality physically based rendering (PBR) textures free of unwanted lighting.
We demonstrate the effectiveness and robustness of the proposed method, outperforming the state-of-the-art methods by a large margin in the experiments.
arXiv Detail & Related papers (2024-01-20T01:55:17Z)
- AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation [14.062402203105712]
AvatarBooth is a novel method for generating high-quality 3D avatars using text prompts or specific images.
Our key contribution is the precise avatar generation control by using dual fine-tuned diffusion models.
We present a multi-resolution rendering strategy that facilitates coarse-to-fine supervision of 3D avatar generation.
arXiv Detail & Related papers (2023-06-16T14:18:51Z)
- StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation [103.88928334431786]
We present a novel method for generating high-quality, stylized 3D avatars.
We use pre-trained image-text diffusion models for data generation and a Generative Adversarial Network (GAN)-based 3D generation network for training.
Our approach demonstrates superior performance over current state-of-the-art methods in terms of visual quality and diversity of the produced avatars.
arXiv Detail & Related papers (2023-05-30T13:09:21Z)
- DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785]
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars.
We leverage the SMPL model to provide shape and pose guidance for the generation.
We also jointly optimize the losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem (a minimal sketch of this joint objective follows this list).
arXiv Detail & Related papers (2023-04-03T12:11:51Z)
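As a concrete illustration of the joint full-body / zoomed-in-head objective described for DreamAvatar above, here is a minimal hypothetical sketch. The `render_full_body`, `render_head_crop`, and `sds_loss` callables are assumed placeholders (the paper itself uses SMPL-guided rendering with diffusion-based losses), not the authors' actual functions.

```python
def joint_avatar_loss(params, prompt_emb, render_full_body, render_head_crop,
                      sds_loss, head_weight=0.5):
    # Supervise a full-figure render and a tight head crop with the same
    # text-conditioned loss. The dedicated head view gives the optimizer an
    # unambiguous frontal-face signal, which helps suppress the multi-face
    # "Janus" artifact.
    body_view = render_full_body(params)  # render of the whole avatar
    head_view = render_head_crop(params)  # zoomed-in render of the 3D head
    return (sds_loss(body_view, prompt_emb)
            + head_weight * sds_loss(head_view, prompt_emb))
```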
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information provided and is not responsible for any consequences of its use.