Text-Conditional Contextualized Avatars For Zero-Shot Personalization
- URL: http://arxiv.org/abs/2304.07410v1
- Date: Fri, 14 Apr 2023 22:00:44 GMT
- Title: Text-Conditional Contextualized Avatars For Zero-Shot Personalization
- Authors: Samaneh Azadi, Thomas Hayes, Akbar Shah, Guan Pang, Devi Parikh, Sonal Gupta
- Abstract summary: We propose a pipeline that enables personalization of image generation with avatars capturing a user's identity in a delightful way.
Our pipeline is zero-shot, avatar texture and style agnostic, and does not require training on the avatar at all.
We show, for the first time, how to leverage large-scale image datasets to learn human 3D pose parameters.
- Score: 47.85747039373798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent large-scale text-to-image generation models have made significant
improvements in the quality, realism, and diversity of the synthesized images
and enable users to control the created content through language. However, the
personalization aspect of these generative models is still challenging and
under-explored. In this work, we propose a pipeline that enables
personalization of image generation with avatars capturing a user's identity in
a delightful way. Our pipeline is zero-shot, avatar texture and style agnostic,
and does not require training on the avatar at all; it is scalable to millions
of users, who can each generate a scene with their avatar. To render the avatar
in a pose faithful to the given text prompt, we propose a novel text-to-3D pose
diffusion model trained on a curated large-scale dataset of in-the-wild human
poses, significantly improving on the performance of SOTA text-to-motion models.
significantly. We show, for the first time, how to leverage large-scale image
datasets to learn human 3D pose parameters and overcome the limitations of
motion capture datasets.
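The abstract describes a text-to-3D pose diffusion model that outputs human pose parameters conditioned on a prompt. As a minimal illustration of that idea (not the authors' code), the sketch below runs a standard DDPM-style reverse process over a 72-dimensional SMPL axis-angle pose vector; `denoise_fn` is a hypothetical stand-in for the learned text-conditional denoiser, and the step count, noise schedule, and embedding size are assumptions chosen for readability.

```python
import numpy as np

POSE_DIM = 72          # 24 SMPL joints x 3 axis-angle parameters
T = 50                 # number of diffusion steps (illustrative)

# Linear noise schedule and derived quantities, as in standard DDPMs.
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def denoise_fn(x_t, t, text_emb):
    """Stand-in for the trained network that predicts the noise added at
    step t, given the noisy pose and the text embedding. A real model
    would be a neural net; this deterministic placeholder just keeps the
    sampling loop runnable."""
    return 0.1 * x_t + 0.01 * text_emb[:POSE_DIM]

def sample_pose(text_emb, rng):
    """Ancestral sampling: start from Gaussian noise, iteratively denoise."""
    x = rng.standard_normal(POSE_DIM)
    for t in reversed(range(T)):
        eps = denoise_fn(x, t, text_emb)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(POSE_DIM) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x  # axis-angle pose parameters to feed into an avatar renderer

rng = np.random.default_rng(0)
text_emb = rng.standard_normal(128)  # placeholder for a CLIP-style embedding
pose = sample_pose(text_emb, rng)
print(pose.shape)
```

In the paper's pipeline, the sampled pose parameters would then drive the (texture- and style-agnostic) avatar before it is composited into the generated scene.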
Related papers
- TEDRA: Text-based Editing of Dynamic and Photoreal Actors [59.480513384611804]
TEDRA is the first method allowing text-based edits of an avatar.
We train a model to create a controllable and high-fidelity digital replica of the real actor.
We modify the dynamic avatar based on a provided text prompt.
arXiv Detail & Related papers (2024-08-28T17:59:02Z)
- GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars [44.8290935585746]
Photo-realistic and controllable 3D avatars are crucial for various applications such as virtual and mixed reality (VR/MR), telepresence, gaming, and film production.
Traditional methods for avatar creation often involve time-consuming scanning and reconstruction processes for each avatar.
We propose a text-conditioned generative model that can generate photo-realistic facial avatars of diverse identities.
arXiv Detail & Related papers (2024-08-24T21:25:22Z)
- MagicMirror: Fast and High-Quality Avatar Generation with a Constrained Search Space [25.24509617548819]
We introduce a novel framework for 3D human avatar generation and personalization, leveraging text prompts.
Key innovations are aimed at overcoming the challenges in photo-realistic avatar synthesis.
arXiv Detail & Related papers (2024-04-01T17:59:11Z)
- Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images.
Our avatar learning requires no additional annotations, such as Splat masks, and the model can be trained with variable backgrounds while inferring full-resolution images efficiently even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z)
- XAGen: 3D Expressive Human Avatars Generation [76.69560679209171]
XAGen is the first 3D generative model for human avatars capable of expressive control over body, face, and hands.
We propose a multi-part rendering technique that disentangles the synthesis of body, face, and hands.
Experiments show that XAGen surpasses state-of-the-art methods in terms of realism, diversity, and expressive control abilities.
arXiv Detail & Related papers (2023-11-22T18:30:42Z)
- AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation [14.062402203105712]
AvatarBooth is a novel method for generating high-quality 3D avatars using text prompts or specific images.
Our key contribution is the precise avatar generation control by using dual fine-tuned diffusion models.
We present a multi-resolution rendering strategy that facilitates coarse-to-fine supervision of 3D avatar generation.
arXiv Detail & Related papers (2023-06-16T14:18:51Z)
- DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785]
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars.
We leverage the SMPL model to provide shape and pose guidance for the generation.
We also jointly optimize the losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem.
arXiv Detail & Related papers (2023-04-03T12:11:51Z)
- AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control [38.959851274747145]
AvatarCraft is a method for creating a 3D human avatar with a specific identity and artistic style that can be easily animated.
We use diffusion models to guide the learning of geometry and texture for a neural avatar based on a single text prompt.
We make the human avatar animatable by deforming the neural implicit field with an explicit warping field.
arXiv Detail & Related papers (2023-03-30T17:59:59Z)
- AvatarGen: a 3D Generative Model for Animatable Human Avatars [108.11137221845352]
AvatarGen is the first method that enables not only non-rigid human generation with diverse appearance but also full control over poses and viewpoints.
To model non-rigid dynamics, it introduces a deformation network to learn pose-dependent deformations in the canonical space.
Our method can generate animatable human avatars with high-quality appearance and geometry modeling, significantly outperforming previous 3D GANs.
arXiv Detail & Related papers (2022-08-01T01:27:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.