Text-Conditional Contextualized Avatars For Zero-Shot Personalization
- URL: http://arxiv.org/abs/2304.07410v1
- Date: Fri, 14 Apr 2023 22:00:44 GMT
- Title: Text-Conditional Contextualized Avatars For Zero-Shot Personalization
- Authors: Samaneh Azadi, Thomas Hayes, Akbar Shah, Guan Pang, Devi Parikh, Sonal Gupta
- Abstract summary: We propose a pipeline that enables personalization of image generation with avatars capturing a user's identity in a delightful way.
Our pipeline is zero-shot, avatar texture and style agnostic, and does not require training on the avatar at all.
We show, for the first time, how to leverage large-scale image datasets to learn human 3D pose parameters.
- Score: 47.85747039373798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent large-scale text-to-image generation models have made significant
improvements in the quality, realism, and diversity of the synthesized images
and enable users to control the created content through language. However, the
personalization aspect of these generative models is still challenging and
under-explored. In this work, we propose a pipeline that enables
personalization of image generation with avatars capturing a user's identity in
a delightful way. Our pipeline is zero-shot, avatar texture and style agnostic,
and does not require training on the avatar at all - it is scalable to millions
of users who can generate a scene with their avatar. To render the avatar in a
pose faithful to the given text prompt, we propose a novel text-to-3D pose
diffusion model trained on a curated large-scale dataset of in-the-wild human
poses, significantly improving on the performance of SOTA text-to-motion
models. We show, for the first time, how to leverage large-scale image
datasets to learn human 3D pose parameters and overcome the limitations of
motion capture datasets.
Related papers
- MagicMirror: Fast and High-Quality Avatar Generation with a Constrained Search Space [25.24509617548819]
We introduce a novel framework for 3D human avatar generation and personalization, leveraging text prompts.
Key innovations are aimed at overcoming the challenges in photo-realistic avatar synthesis.
arXiv Detail & Related papers (2024-04-01T17:59:11Z)
- Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images.
Our avatar learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds while inferring full-resolution images efficiently even on consumer hardware.
arXiv Detail & Related papers (2023-12-22T20:56:46Z)
- XAGen: 3D Expressive Human Avatars Generation [76.69560679209171]
XAGen is the first 3D generative model for human avatars capable of expressive control over body, face, and hands.
We propose a multi-part rendering technique that disentangles the synthesis of body, face, and hands.
Experiments show that XAGen surpasses state-of-the-art methods in terms of realism, diversity, and expressive control abilities.
arXiv Detail & Related papers (2023-11-22T18:30:42Z)
- AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation [14.062402203105712]
AvatarBooth is a novel method for generating high-quality 3D avatars using text prompts or specific images.
Our key contribution is the precise avatar generation control by using dual fine-tuned diffusion models.
We present a multi-resolution rendering strategy that facilitates coarse-to-fine supervision of 3D avatar generation.
arXiv Detail & Related papers (2023-06-16T14:18:51Z)
- StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation [103.88928334431786]
We present a novel method for generating high-quality, stylized 3D avatars.
We use pre-trained image-text diffusion models for data generation and a Generative Adversarial Network (GAN)-based 3D generation network for training.
Our approach demonstrates superior performance over current state-of-the-art methods in terms of visual quality and diversity of the produced avatars.
arXiv Detail & Related papers (2023-05-30T13:09:21Z)
- DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785]
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars.
We leverage the SMPL model to provide shape and pose guidance for the generation.
We also jointly optimize the losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem.
arXiv Detail & Related papers (2023-04-03T12:11:51Z)
- AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control [38.959851274747145]
AvatarCraft is a method for creating a 3D human avatar with a specific identity and artistic style that can be easily animated.
We use diffusion models to guide the learning of geometry and texture for a neural avatar based on a single text prompt.
We make the human avatar animatable by deforming the neural implicit field with an explicit warping field.
arXiv Detail & Related papers (2023-03-30T17:59:59Z)
- X-Avatar: Expressive Human Avatars [33.24502928725897]
We present X-Avatar, a novel avatar model that captures the full expressiveness of digital humans to bring about life-like experiences in telepresence, AR/VR and beyond.
Our method models bodies, hands, facial expressions and appearance in a holistic fashion and can be learned from either full 3D scans or RGB-D data.
arXiv Detail & Related papers (2023-03-08T18:59:39Z)
- AvatarGen: a 3D Generative Model for Animatable Human Avatars [108.11137221845352]
AvatarGen is the first method that enables not only non-rigid human generation with diverse appearance but also full control over poses and viewpoints.
To model non-rigid dynamics, it introduces a deformation network to learn pose-dependent deformations in the canonical space.
Our method can generate animatable human avatars with high-quality appearance and geometry modeling, significantly outperforming previous 3D GANs.
arXiv Detail & Related papers (2022-08-01T01:27:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.