HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting
- URL: http://arxiv.org/abs/2402.06149v2
- Date: Sat, 21 Dec 2024 01:30:18 GMT
- Title: HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting
- Authors: Zhenglin Zhou, Fan Ma, Hehe Fan, Zongxin Yang, Yi Yang
- Abstract summary: HeadStudio is a framework that generates realistic and animatable avatars from text prompts.
The avatars can render high-quality novel views in real time (≥ 40 fps) at a resolution of 1024.
- Score: 43.978358118034514
- Abstract: Creating digital avatars from textual prompts has long been a desirable yet challenging task. Despite the promising results achieved with 2D diffusion priors, current methods struggle to create high-quality and consistent animated avatars efficiently. Previous animatable head models like FLAME have difficulty in accurately representing detailed texture and geometry. Additionally, high-quality 3D static representations face challenges in semantically driving with dynamic priors. In this paper, we introduce HeadStudio, a novel framework that utilizes 3D Gaussian splatting to generate realistic and animatable avatars from text prompts. Firstly, we associate 3D Gaussians with an animatable head prior model, facilitating semantic animation on high-quality 3D representations. To ensure consistent animation, we further enhance the optimization from initialization, distillation, and regularization to jointly learn the shape, texture, and animation. Extensive experiments demonstrate the efficacy of HeadStudio in generating animatable avatars from textual prompts, exhibiting appealing appearances. The avatars are capable of rendering high-quality real-time (≥ 40 fps) novel views at a resolution of 1024. Moreover, these avatars can be smoothly driven by real-world speech and video. We hope that HeadStudio can enhance digital avatar creation and gain popularity in the community. Code is at: https://github.com/ZhenglinZhou/HeadStudio.
Related papers
- Generating Editable Head Avatars with 3D Gaussian GANs [57.51487984425395]
Traditional 3D-aware generative adversarial networks (GANs) achieve photorealistic and view-consistent 3D head synthesis.
We propose a novel approach that enhances the editability and animation control of 3D head avatars by incorporating 3D Gaussian Splatting (3DGS) as an explicit 3D representation.
Our approach delivers high-quality 3D-aware synthesis with state-of-the-art controllability.
arXiv Detail & Related papers (2024-12-26T10:10:03Z) - DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion [69.67970568012599]
We present DreamWaltz-G, a novel learning framework for animatable 3D avatar generation from text.
The core of this framework lies in Score Distillation and Hybrid 3D Gaussian Avatar representation.
Our framework further supports diverse applications, including human video reenactment and multi-subject scene composition.
arXiv Detail & Related papers (2024-09-25T17:59:45Z) - AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text [71.09533176800707]
AvatarStudio is a coarse-to-fine generative model that generates explicit textured 3D meshes for animatable human avatars.
By effectively leveraging the synergy between the articulated mesh representation and the DensePose-conditional diffusion model, AvatarStudio can create high-quality avatars.
arXiv Detail & Related papers (2023-11-29T18:59:32Z) - HeadSculpt: Crafting 3D Head Avatars with Text [143.14548696613886]
We introduce a versatile pipeline dubbed HeadSculpt for crafting 3D head avatars from textual prompts.
We first equip the diffusion model with 3D awareness by leveraging landmark-based control and a learned textual embedding.
We propose a novel identity-aware editing score distillation strategy to optimize a textured mesh with a high-resolution differentiable rendering technique.
arXiv Detail & Related papers (2023-06-05T16:53:58Z) - DreamWaltz: Make a Scene with Complex 3D Animatable Avatars [68.49935994384047]
We present DreamWaltz, a novel framework for generating and animating complex 3D avatars given text guidance and parametric human body prior.
For animation, our method learns an animatable 3D avatar representation from abundant image priors of diffusion model conditioned on various poses.
arXiv Detail & Related papers (2023-05-21T17:59:39Z) - GANHead: Towards Generative Animatable Neural Head Avatars [31.35233032284164]
GANHead is a novel generative head model that takes advantage of the fine-grained control over the explicit expression parameters.
It represents coarse geometry, fine-grained details and texture via three networks in canonical space.
It achieves superior performance on head avatar generation and raw scan fitting.
arXiv Detail & Related papers (2023-04-08T07:56:21Z)