Related papers: Text-Guided Generation and Editing of Compositional 3D Avatars

URL: http://arxiv.org/abs/2309.07125v1
Date: Wed, 13 Sep 2023 17:59:56 GMT
Title: Text-Guided Generation and Editing of Compositional 3D Avatars
Authors: Hao Zhang, Yao Feng, Peter Kulits, Yandong Wen, Justus Thies, Michael J. Black
Abstract summary: Our goal is to create a realistic 3D facial avatar with hair and accessories using only a text description. Existing methods either lack realism, produce unrealistic shapes, or do not support editing.
Score: 59.584042376006316
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Our goal is to create a realistic 3D facial avatar with hair and accessories using only a text description. While this challenge has attracted significant recent interest, existing methods either lack realism, produce unrealistic shapes, or do not support editing, such as modifications to the hairstyle. We argue that existing methods are limited because they employ a monolithic modeling approach, using a single representation for the head, face, hair, and accessories. Our observation is that the hair and face, for example, have very different structural qualities that benefit from different representations. Building on this insight, we generate avatars with a compositional model, in which the head, face, and upper body are represented with traditional 3D meshes, and the hair, clothing, and accessories with neural radiance fields (NeRF). The model-based mesh representation provides a strong geometric prior for the face region, improving realism while enabling editing of the person's appearance. By using NeRFs to represent the remaining components, our method is able to model and synthesize parts with complex geometry and appearance, such as curly hair and fluffy scarves. Our novel system synthesizes these high-quality compositional avatars from text descriptions. The experimental results demonstrate that our method, Text-guided generation and Editing of Compositional Avatars (TECA), produces avatars that are more realistic than those of recent methods while being editable because of their compositional nature. For example, our TECA enables the seamless transfer of compositional features like hairstyles, scarves, and other accessories between avatars. This capability supports applications such as virtual try-on.

Related papers

Im2Haircut: Single-view Strand-based Hair Reconstruction for Human Avatars [60.99229760565975]
We present a novel approach for 3D hair reconstruction from single photographs based on a global hair prior combined with local optimization. We exploit this prior to create a Gaussian-splatting-based reconstruction method that creates hairstyles from one or more images.
arXiv Detail & Related papers (2025-09-01T13:38:08Z)
HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars [29.819374818200885]
We present a universal prior model for 3D head avatars with explicit hair compositionality. Our model's inherent compositionality enables seamless transfer of face and hair components between avatars.
arXiv Detail & Related papers (2025-07-25T17:59:53Z)
GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars [44.8290935585746]
Photo-realistic and controllable 3D avatars are crucial for various applications such as virtual and mixed reality (VR/MR), telepresence, gaming, and film production. Traditional methods for avatar creation often involve time-consuming scanning and reconstruction processes for each avatar. We propose a text-conditioned generative model that can generate photo-realistic facial avatars of diverse identities.
arXiv Detail & Related papers (2024-08-24T21:25:22Z)
Learning Disentangled Avatars with Hybrid 3D Representations [102.9632315060652]
We present Disentangled Avatars (DELTA), which models humans with hybrid explicit-implicit 3D representations. In the first application, we consider the disentanglement of the human body and clothing, and in the second, we disentangle the face and hair. We show how these two applications can be easily combined to model full-body avatars.
arXiv Detail & Related papers (2023-09-12T17:59:36Z)
AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation [14.062402203105712]
AvatarBooth is a novel method for generating high-quality 3D avatars using text prompts or specific images. Our key contribution is the precise avatar generation control by using dual fine-tuned diffusion models. We present a multi-resolution rendering strategy that facilitates coarse-to-fine supervision of 3D avatar generation.
arXiv Detail & Related papers (2023-06-16T14:18:51Z)
HeadSculpt: Crafting 3D Head Avatars with Text [143.14548696613886]
We introduce a versatile pipeline dubbed HeadSculpt for crafting 3D head avatars from textual prompts. We first equip the diffusion model with 3D awareness by leveraging landmark-based control and a learned textual embedding. We propose a novel identity-aware editing score distillation strategy to optimize a textured mesh with a high-resolution differentiable rendering technique.
arXiv Detail & Related papers (2023-06-05T16:53:58Z)
Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing. Our method improves upon photo-realism, geometry, and expression accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-04T17:58:40Z)
Capturing and Animation of Body and Clothing from Monocular Video [105.87228128022804]
We present SCARF, a hybrid model combining a mesh-based body with a neural radiance field. Integrating the mesh into the rendering enables us to optimize SCARF directly from monocular videos. We demonstrate that SCARF reconstructs clothing with higher visual quality than existing methods, that the clothing deforms with changing body pose and body shape, and that clothing can be successfully transferred between avatars of different subjects.
arXiv Detail & Related papers (2022-10-04T19:34:05Z)
I M Avatar: Implicit Morphable Head Avatars from Videos [68.13409777995392]
We propose IMavatar, a novel method for learning implicit head avatars from monocular videos. Inspired by the fine-grained control mechanisms afforded by conventional 3DMMs, we represent the expression- and pose-related deformations via learned blendshapes and skinning fields. We show quantitatively and qualitatively that our method improves geometry and covers a more complete expression space compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-12-14T15:30:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
