DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion
- URL: http://arxiv.org/abs/2409.17145v1
- Date: Wed, 25 Sep 2024 17:59:45 GMT
- Title: DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion
- Authors: Yukun Huang, Jianan Wang, Ailing Zeng, Zheng-Jun Zha, Lei Zhang, Xihui Liu
- Abstract summary: We present DreamWaltz-G, a novel learning framework for animatable 3D avatar generation from text.
The core of this framework lies in Skeleton-guided Score Distillation and a Hybrid 3D Gaussian Avatar representation.
Our framework further supports diverse applications, including human video reenactment and multi-subject scene composition.
- Score: 69.67970568012599
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging pretrained 2D diffusion models and score distillation sampling
(SDS), recent methods have shown promising results for text-to-3D avatar
generation. However, generating high-quality 3D avatars capable of expressive
animation remains challenging. In this work, we present DreamWaltz-G, a novel
learning framework for animatable 3D avatar generation from text. The core of
this framework lies in Skeleton-guided Score Distillation and Hybrid 3D
Gaussian Avatar representation. Specifically, the proposed skeleton-guided
score distillation integrates skeleton controls from 3D human templates into 2D
diffusion models, enhancing the consistency of SDS supervision in terms of view
and human pose. This facilitates the generation of high-quality avatars,
mitigating issues such as multiple faces, extra limbs, and blurring. The
proposed hybrid 3D Gaussian avatar representation builds on the efficient 3D
Gaussians, combining neural implicit fields and parameterized 3D meshes to
enable real-time rendering, stable SDS optimization, and expressive animation.
Extensive experiments demonstrate that DreamWaltz-G is highly effective in
generating and animating 3D avatars, outperforming existing methods in both
visual quality and animation expressiveness. Our framework further supports
diverse applications, including human video reenactment and multi-subject scene
composition.
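For orientation, the score distillation sampling (SDS) objective mentioned in the abstract is commonly written as the gradient below, in the standard form introduced by DreamFusion. The skeleton condition c, and its placement as an extra input to the noise predictor, are assumptions made here to illustrate what "skeleton-guided" distillation could look like; the abstract does not give the paper's exact formulation.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, c,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad x = g(\theta), \quad x_t = \alpha_t\, x + \sigma_t\, \epsilon
```

Here θ denotes the avatar parameters, g a differentiable renderer, y the text prompt, ε Gaussian noise, w(t) a timestep weighting, and ε̂_φ the pretrained 2D diffusion model's noise prediction. Under the assumption above, c would be a skeleton image rendered from the 3D human template at the same camera view and pose as x, which is how the supervision could stay consistent across views and poses.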
Related papers
- DEGAS: Detailed Expressions on Full-Body Gaussian Avatars [13.683836322899953]
We present DEGAS, the first 3D Gaussian Splatting (3DGS)-based modeling method for full-body avatars with rich facial expressions.
We propose to adopt the expression latent space trained solely on 2D portrait images, bridging the gap between 2D talking faces and 3D avatars.
(2024-08-20T06:52:03Z)
- CHASE: 3D-Consistent Human Avatars with Sparse Inputs via Gaussian Splatting and Contrastive Learning [19.763523500564542]
We propose CHASE, which introduces supervision from intrinsic 3D consistency across poses and 3D geometry contrastive learning.
With sparse inputs, CHASE achieves performance comparable to that obtained with full inputs.
Though CHASE is designed for sparse inputs, it surprisingly outperforms current SOTA methods.
(2024-08-19T02:46:23Z)
- iHuman: Instant Animatable Digital Humans From Monocular Videos [16.98924995658091]
We present a fast, simple, yet effective method for creating animatable 3D digital humans from monocular videos.
This work achieves, and illustrates the need for, accurate 3D mesh-type modelling of the human body.
Our method is faster by an order of magnitude (in terms of training time) than its closest competitor.
(2024-07-15T18:51:51Z)
- Deformable 3D Gaussian Splatting for Animatable Human Avatars [50.61374254699761]
We propose ParDy-Human, a fully explicit approach to construct a digital avatar from as little as a single monocular sequence.
ParDy-Human constitutes an explicit model for realistic dynamic human avatars which requires significantly fewer training views and images.
Our avatar learning is free of additional annotations such as Splat masks and can be trained with variable backgrounds while inferring full-resolution images efficiently, even on consumer hardware.
(2023-12-22T20:56:46Z)
- DreamWaltz: Make a Scene with Complex 3D Animatable Avatars [68.49935994384047]
We present DreamWaltz, a novel framework for generating and animating complex 3D avatars given text guidance and parametric human body prior.
For animation, our method learns an animatable 3D avatar representation from abundant image priors of diffusion model conditioned on various poses.
(2023-05-21T17:59:39Z)
- DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785]
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars.
We leverage the SMPL model to provide shape and pose guidance for the generation.
We also jointly optimize the losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem.
(2023-04-03T12:11:51Z)
- Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion [66.26780039133122]
This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars.
The memory and processing costs in 3D are prohibitive for producing the rich details required for high-quality avatars.
We can generate highly detailed avatars with realistic hairstyles and facial hair like beards.
(2022-12-12T18:59:40Z)
- DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both the 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
(2022-03-29T17:59:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.