Rodin: A Generative Model for Sculpting 3D Digital Avatars Using
Diffusion
- URL: http://arxiv.org/abs/2212.06135v1
- Date: Mon, 12 Dec 2022 18:59:40 GMT
- Title: Rodin: A Generative Model for Sculpting 3D Digital Avatars Using
Diffusion
- Authors: Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas
Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo
- Abstract summary: This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars.
The memory and processing costs in 3D are prohibitive for producing the rich details required for high-quality avatars.
We can generate highly detailed avatars with realistic hairstyles and facial hair like beards.
- Score: 66.26780039133122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a 3D generative model that uses diffusion models to
automatically generate 3D digital avatars represented as neural radiance
fields. A significant challenge in generating such avatars is that the memory
and processing costs in 3D are prohibitive for producing the rich details
required for high-quality avatars. To tackle this problem, we propose the
roll-out diffusion network (Rodin), which represents a neural radiance field as
multiple 2D feature maps and rolls out these maps into a single 2D feature
plane within which we perform 3D-aware diffusion. The Rodin model brings the
much-needed computational efficiency while preserving the integrity of
diffusion in 3D by using 3D-aware convolution that attends to projected
features in the 2D feature plane according to their original relationship in
3D. We also use latent conditioning to orchestrate the feature generation for
global coherence, leading to high-fidelity avatars and enabling their semantic
editing based on text prompts. Finally, we use hierarchical synthesis to
further enhance details. The 3D avatars generated by our model compare
favorably with those produced by existing generative techniques. We can
generate highly detailed avatars with realistic hairstyles and facial hair like
beards. We also demonstrate 3D avatar generation from image or text as well as
text-guided editability.
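
As a concrete illustration of the roll-out and the 3D-aware convolution described above, here is a minimal PyTorch sketch of one plausible reading: each tri-plane feature map is augmented with features from the other two planes pooled along the shared 3D axis, the augmented planes are mixed by a 2D convolution, and the results are concatenated ("rolled out") into a single 2D feature plane that an ordinary 2D diffusion backbone could denoise. The plane layout, the use of mean pooling, and the module name `Rollout3DAwareConv` are assumptions made for this sketch, not the paper's exact architecture; latent conditioning and hierarchical synthesis are not shown.

```python
# Minimal sketch of tri-plane roll-out plus a "3D-aware" convolution.
# Plane conventions, mean pooling, and channel sizes are illustrative assumptions.
import torch
import torch.nn as nn


class Rollout3DAwareConv(nn.Module):
    """Augments each tri-plane with axis-pooled features from the other planes,
    mixes them with a 2D convolution, then rolls the planes out side by side."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # Each plane sees its own C channels plus C from each of the two other
        # planes (projected along the shared 3D axis): 3C -> C.
        self.mix = nn.Conv2d(3 * channels, channels, kernel_size,
                             padding=kernel_size // 2)

    def forward(self, xy: torch.Tensor, xz: torch.Tensor, yz: torch.Tensor) -> torch.Tensor:
        # Assumed layouts: xy = (B, C, y, x), xz = (B, C, z, x), yz = (B, C, z, y),
        # all with the same square resolution R.
        B, C, R, _ = xy.shape

        # A pixel (y, x) on the xy plane corresponds to a 3D line along z, so the
        # other planes are pooled over z and broadcast back onto (y, x).
        xy_in = torch.cat([
            xy,
            xz.mean(dim=2, keepdim=True).expand(B, C, R, R),  # f(x), constant over y
            yz.mean(dim=2).unsqueeze(3).expand(B, C, R, R),    # f(y), constant over x
        ], dim=1)

        # A pixel (z, x) on the xz plane corresponds to a line along y.
        xz_in = torch.cat([
            xz,
            xy.mean(dim=2, keepdim=True).expand(B, C, R, R),  # f(x), constant over z
            yz.mean(dim=3).unsqueeze(3).expand(B, C, R, R),    # f(z), constant over x
        ], dim=1)

        # A pixel (z, y) on the yz plane corresponds to a line along x.
        yz_in = torch.cat([
            yz,
            xy.mean(dim=3).unsqueeze(2).expand(B, C, R, R),    # f(y), constant over z
            xz.mean(dim=3).unsqueeze(3).expand(B, C, R, R),    # f(z), constant over y
        ], dim=1)

        planes = [self.mix(p) for p in (xy_in, xz_in, yz_in)]
        # "Roll out" the three planes into a single 2D feature plane so a standard
        # 2D diffusion backbone can denoise them jointly.
        return torch.cat(planes, dim=3)  # (B, C, R, 3R)


if __name__ == "__main__":
    B, C, R = 1, 8, 32
    xy, xz, yz = (torch.randn(B, C, R, R) for _ in range(3))
    out = Rollout3DAwareConv(C)(xy, xz, yz)
    print(out.shape)  # torch.Size([1, 8, 32, 96])
```

The key point the sketch tries to capture is that convolution happens entirely in 2D (cheap), while the pooled-and-broadcast cross-plane features let each position attend to what the same 3D location looks like on the other planes.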
Related papers
- DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion [69.67970568012599]
We present DreamWaltz-G, a novel learning framework for animatable 3D avatar generation from text.
The core of this framework lies in Score Distillation and Hybrid 3D Gaussian Avatar representation.
Our framework further supports diverse applications, including human video reenactment and multi-subject scene composition.
arXiv Detail & Related papers (2024-09-25T17:59:45Z)
- RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models [56.13752698926105]
We present RodinHD, which can generate high-fidelity 3D avatars from a portrait image.
We first identify an overlooked problem of catastrophic forgetting that arises when fitting triplanes sequentially on many avatars.
We optimize the guiding effect of the portrait image by computing a finer-grained hierarchical representation that captures rich 2D texture cues, and injecting them into the 3D diffusion model at multiple layers via cross-attention (a simplified sketch of this kind of conditioning appears after this list).
When trained on 46K avatars with a noise schedule optimized for triplanes, the resulting model can generate 3D avatars with notably better details than previous methods and can generalize to in-the-wild
arXiv Detail & Related papers (2024-07-09T15:14:45Z)
- Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models [29.73743772971411]
We propose Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion.
Our key insight is that 2D multi-view diffusion and 3D reconstruction models provide complementary information for each other.
Our proposed framework outperforms state-of-the-art methods and enables the creation of realistic avatars from a single RGB image.
arXiv Detail & Related papers (2024-06-12T17:57:25Z)
- Articulated 3D Head Avatar Generation using Text-to-Image Diffusion Models [107.84324544272481]
The ability to generate diverse 3D articulated head avatars is vital to a plethora of applications, including augmented reality, cinematography, and education.
Recent work on text-guided 3D object generation has shown great promise in addressing these needs.
We show that our diffusion-based articulated head avatars outperform state-of-the-art approaches for this task.
arXiv Detail & Related papers (2023-07-10T19:15:32Z)
- Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models [9.479195068754507]
We propose a 3D generation pipeline that uses diffusion models to generate realistic human digital avatars.
Our method, Chupa, is capable of generating realistic 3D clothed humans with better perceptual quality and identity variety.
arXiv Detail & Related papers (2023-05-19T17:59:18Z)
- DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785]
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars.
We leverage the SMPL model to provide shape and pose guidance for the generation.
We also jointly optimize the losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem.
arXiv Detail & Related papers (2023-04-03T12:11:51Z)
- DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both the 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)
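
The RodinHD entry above mentions injecting portrait-image features into the 3D diffusion model at multiple layers via cross-attention. The following is a minimal, self-contained PyTorch sketch of that general conditioning pattern; the feature dimensions, the number of layers, and the names `CrossAttentionBlock` / `ConditionedDenoiser` are illustrative assumptions rather than RodinHD's actual architecture.

```python
# Minimal sketch of multi-layer cross-attention conditioning on image features.
# All shapes and module names here are assumptions for illustration.
import torch
import torch.nn as nn


class CrossAttentionBlock(nn.Module):
    """One denoiser layer: the (rolled-out) triplane tokens cross-attend to
    tokens derived from the conditioning portrait image."""

    def __init__(self, dim: int, ctx_dim: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, kdim=ctx_dim, vdim=ctx_dim,
                                          batch_first=True)
        self.ff = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim * 4),
                                nn.GELU(), nn.Linear(dim * 4, dim))

    def forward(self, x: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
        # x:   (B, N, dim)      tokens being denoised
        # ctx: (B, M, ctx_dim)  image features used as keys/values
        attended, _ = self.attn(self.norm(x), ctx, ctx)
        x = x + attended
        return x + self.ff(x)


class ConditionedDenoiser(nn.Module):
    """Stack of blocks, each of which re-reads the image condition."""

    def __init__(self, dim: int = 64, ctx_dim: int = 32, depth: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList(CrossAttentionBlock(dim, ctx_dim)
                                    for _ in range(depth))

    def forward(self, x: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            x = block(x, ctx)  # condition injected at every layer
        return x


if __name__ == "__main__":
    x = torch.randn(2, 256, 64)   # noisy latent tokens
    ctx = torch.randn(2, 49, 32)  # e.g. a 7x7 grid of portrait features
    print(ConditionedDenoiser()(x, ctx).shape)  # torch.Size([2, 256, 64])
```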
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information shown and is not responsible for any consequences arising from its use.