ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling
- URL: http://arxiv.org/abs/2405.16570v2
- Date: Tue, 28 May 2024 09:36:06 GMT
- Title: ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling
- Authors: Francesca Babiloni, Alexandros Lattas, Jiankang Deng, Stefanos Zafeiriou,
- Abstract summary: ID-to-3D is a method to generate identity- and text-guided 3D human heads with disentangled expressions.
Results achieve an unprecedented level of identity-consistent and high-quality texture and geometry generation.
- Score: 96.87575334960258
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose ID-to-3D, a method to generate identity- and text-guided 3D human heads with disentangled expressions, starting from even a single casually captured in-the-wild image of a subject. The foundation of our approach is anchored in compositionality, alongside the use of task-specific 2D diffusion models as priors for optimization. First, we extend a foundational model with a lightweight expression-aware and ID-aware architecture, and create 2D priors for geometry and texture generation, via fine-tuning only 0.2% of its available training parameters. Then, we jointly leverage a neural parametric representation for the expressions of each subject and a multi-stage generation of highly detailed geometry and albedo texture. This combination of strong face identity embeddings and our neural representation enables accurate reconstruction of not only facial features but also accessories and hair and can be meshed to provide render-ready assets for gaming and telepresence. Our results achieve an unprecedented level of identity-consistent and high-quality texture and geometry generation, generalizing to a ``world'' of unseen 3D identities, without relying on large 3D captured datasets of human assets.
Related papers
- Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance [69.9745497000557]
We introduce Arc2Avatar, the first SDS-based method utilizing a human face foundation model as guidance with just a single image as input.
Our avatars maintain a dense correspondence with a human face mesh template, allowing blendshape-based expression generation.
arXiv Detail & Related papers (2025-01-09T17:04:33Z) - FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis [51.193297565630886]
The challenge of accurately inferring texture remains, particularly in obscured areas such as the back of a person in frontal-view images.
This limitation in texture prediction largely stems from the scarcity of large-scale and diverse 3D datasets.
We propose leveraging extensive 2D fashion datasets to enhance both texture and shape prediction in 3D human digitization.
arXiv Detail & Related papers (2024-10-13T01:25:05Z) - ID-Sculpt: ID-aware 3D Head Generation from Single In-the-wild Portrait Image [57.46195661521239]
Previous text-based methods for generating 3D heads were limited by text descriptions and image-based methods struggled to produce high-quality head geometry.
We propose a novel framework, ID-Sculpt, to generate high-quality 3D heads while preserving their identities.
Extensive experiments demonstrate that we can generate high-quality 3D heads with accurate geometry and texture from a single in-the-wild portrait image.
arXiv Detail & Related papers (2024-06-24T15:11:35Z) - Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z) - Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.
We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z) - Next3D: Generative Neural Texture Rasterization for 3D-Aware Head
Avatars [36.4402388864691]
3D-aware generative adversarial networks (GANs) synthesize high-fidelity and multi-view-consistent facial images using only collections of single-view 2D imagery.
Recent efforts incorporate 3D Morphable Face Model (3DMM) to describe deformation in generative radiance fields either explicitly or implicitly.
We propose a novel 3D GAN framework for unsupervised learning of generative, high-quality and 3D-consistent facial avatars from unstructured 2D images.
arXiv Detail & Related papers (2022-11-21T06:40:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.