Structured 3D Features for Reconstructing Controllable Avatars
- URL: http://arxiv.org/abs/2212.06820v3
- Date: Sat, 15 Apr 2023 19:52:10 GMT
- Title: Structured 3D Features for Reconstructing Controllable Avatars
- Authors: Enric Corona, Mihai Zanfir, Thiemo Alldieck, Eduard Gabriel Bazavan,
Andrei Zanfir, Cristian Sminchisescu
- Abstract summary: We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface.
We show that our S3F model surpasses the previous state-of-the-art on various tasks, including monocular 3D reconstruction, as well as albedo and shading estimation.
- Score: 43.36074729431982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce Structured 3D Features, a model based on a novel implicit 3D
representation that pools pixel-aligned image features onto dense 3D points
sampled from a parametric, statistical human mesh surface. The 3D points have
associated semantics and can move freely in 3D space. This allows for optimal
coverage of the person of interest, beyond just the body shape, which in turn,
additionally helps modeling accessories, hair, and loose clothing. Owing to
this, we present a complete 3D transformer-based attention framework which,
given a single image of a person in an unconstrained pose, generates an
animatable 3D reconstruction with albedo and illumination decomposition, as a
result of a single end-to-end model, trained semi-supervised, and with no
additional postprocessing. We show that our S3F model surpasses the previous
state-of-the-art on various tasks, including monocular 3D reconstruction, as
well as albedo and shading estimation. Moreover, we show that the proposed
methodology allows novel view synthesis, relighting, and re-posing the
reconstruction, and can naturally be extended to handle multiple input images
(e.g. different views of a person, or the same view, in different poses, in
video). Finally, we demonstrate the editing capabilities of our model for 3D
virtual try-on applications.
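The central operation the abstract describes, pooling pixel-aligned image features onto 3D points sampled from a parametric human mesh, can be sketched in a few lines. The snippet below is a minimal illustration under assumptions of ours (a simple pinhole camera, bilinear sampling, and all tensor names and shapes); it is not the paper's implementation, and it omits the point displacement, semantics, and transformer aggregation that S3F adds on top.

```python
import torch
import torch.nn.functional as F

def pool_pixel_aligned_features(feat_map, points_3d, K):
    """Pool a 2D feature vector for each 3D body-surface point.
    feat_map:  (B, C, H, W) pixel-aligned features from an image backbone.
    points_3d: (B, N, 3) points sampled from a parametric human mesh.
    K:         (B, 3, 3) pinhole intrinsics (illustrative camera model).
    Returns:   (B, N, C) per-point features."""
    # Project points into the image plane and apply the perspective divide.
    proj = torch.einsum('bij,bnj->bni', K, points_3d)        # (B, N, 3)
    uv = proj[..., :2] / proj[..., 2:].clamp(min=1e-6)       # pixel coords
    # Normalize to [-1, 1] as expected by grid_sample (x = width first).
    H, W = feat_map.shape[-2:]
    uv = uv / uv.new_tensor([W - 1.0, H - 1.0]) * 2.0 - 1.0
    # Bilinearly sample one feature vector per projected point.
    sampled = F.grid_sample(feat_map, uv.unsqueeze(2),       # (B, C, N, 1)
                            mode='bilinear', align_corners=True)
    return sampled.squeeze(-1).transpose(1, 2)               # (B, N, C)
```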
Related papers
- SYM3D: Learning Symmetric Triplanes for Better 3D-Awareness of GANs [5.84660008137615]
SYM3D is a novel 3D-aware GAN designed to leverage the prevalent symmetry structure found in natural and man-made objects.
We demonstrate its superior performance in capturing detailed geometry and texture, even when trained on only single-view images.
arXiv Detail & Related papers (2024-06-10T16:24:07Z)
- En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data [36.51674664590734]
We present En3D, an enhanced generative scheme for sculpting high-quality 3D human avatars.
Unlike previous works that rely on scarce 3D datasets or limited 2D collections with imbalanced viewing angles and pose priors, our approach aims to develop a zero-shot 3D generative scheme capable of producing 3D humans.
arXiv Detail & Related papers (2024-01-02T12:06:31Z)
- Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture [47.44029968307207]
We propose a novel framework for simultaneous high-fidelity recovery of object shapes and textures from single-view images.
Our approach utilizes the proposed Single-view neural implicit Shape and Radiance field (SSR) representations to leverage both explicit 3D shape supervision and volume rendering.
A distinctive feature of our framework is its ability to generate fine-grained textured meshes while seamlessly integrating rendering capabilities into the single-view 3D reconstruction model.
arXiv Detail & Related papers (2023-11-01T11:46:15Z)
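The SSR entry above couples explicit 3D shape supervision with volume rendering. As a reference point, here is the standard NeRF-style compositing quadrature that "volume rendering" refers to; the function and variable names are illustrative, not taken from the paper.

```python
import torch

def composite_ray(sigmas, colors, deltas):
    """NeRF-style alpha compositing along one ray.
    sigmas: (N,) densities, colors: (N, 3) RGB, deltas: (N,) segment lengths."""
    alphas = 1.0 - torch.exp(-sigmas * deltas)             # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = torch.cumprod(
        torch.cat([alphas.new_ones(1), 1.0 - alphas + 1e-10])[:-1], dim=0)
    weights = alphas * trans                               # per-sample contribution
    return (weights.unsqueeze(-1) * colors).sum(dim=0)     # rendered RGB
```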
- Anything-3D: Towards Single-view Anything Reconstruction in the Wild [61.090129285205805]
We introduce Anything-3D, a methodical framework that ingeniously combines a series of visual-language models and the Segment-Anything object segmentation model.
Our approach employs a BLIP model to generate textual descriptions, utilizes the Segment-Anything model to extract objects of interest, and leverages a text-to-image diffusion model to lift the object into a neural radiance field.
arXiv Detail & Related papers (2023-04-19T16:39:51Z)
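A hedged sketch of the three-stage pipeline just summarized: BLIP captioning, Segment-Anything extraction, then lifting the masked object into a radiance field with a text-to-image diffusion prior. The BLIP and SAM calls follow the public `transformers` and `segment_anything` APIs; `lift_to_nerf` stands in for the paper's diffusion-based lifting stage and is hypothetical, as are the file paths and click coordinates.

```python
import numpy as np
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from segment_anything import sam_model_registry, SamPredictor

image = Image.open("input.jpg").convert("RGB")

# 1. Textual description of the scene (BLIP captioning).
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
blip = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")
inputs = processor(image, return_tensors="pt")
caption = processor.decode(blip.generate(**inputs)[0], skip_special_tokens=True)

# 2. Extract the object of interest with Segment-Anything, prompted by a
#    foreground click (coordinates are placeholders).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")
predictor = SamPredictor(sam)
predictor.set_image(np.array(image))
masks, scores, _ = predictor.predict(point_coords=np.array([[256, 256]]),
                                     point_labels=np.array([1]))

# 3. Lift the masked object into a neural radiance field using a text-to-image
#    diffusion prior; hypothetical stand-in for the paper's lifting stage.
# nerf = lift_to_nerf(image, masks[0], caption)
```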
- Generative Novel View Synthesis with 3D-Aware Diffusion Models [96.78397108732233]
We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image.
Our method makes use of existing 2D diffusion backbones but, crucially, incorporates geometry priors in the form of a 3D feature volume.
In addition to generating novel views, our method has the ability to autoregressively synthesize 3D-consistent sequences.
arXiv Detail & Related papers (2023-04-05T17:15:47Z)
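The autoregressive synthesis mentioned in the entry above can be pictured as conditioning each new view on a window of previously generated frames. The loop below is a schematic sketch; `model.sample` and the pose representation are hypothetical interfaces we assume for illustration, not the paper's API.

```python
import torch

@torch.no_grad()
def synthesize_sequence(model, source_image, camera_trajectory, context_size=4):
    """Autoregressively render a 3D-consistent sequence: each new view is
    sampled conditioned on the input image plus the most recent outputs.
    `model.sample(cond_images, cond_poses, target_pose)` is a hypothetical
    pose-conditioned diffusion sampler."""
    frames, poses = [source_image], [camera_trajectory[0]]
    for target_pose in camera_trajectory[1:]:
        # Condition on a sliding window of the latest frames for consistency.
        new_view = model.sample(cond_images=torch.stack(frames[-context_size:]),
                                cond_poses=torch.stack(poses[-context_size:]),
                                target_pose=target_pose)
        frames.append(new_view)
        poses.append(target_pose)
    return frames
```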
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars [36.4402388864691]
3D-aware generative adversarial networks (GANs) synthesize high-fidelity and multi-view-consistent facial images using only collections of single-view 2D imagery.
Recent efforts incorporate 3D Morphable Face Model (3DMM) to describe deformation in generative radiance fields either explicitly or implicitly.
We propose a novel 3D GAN framework for unsupervised learning of generative, high-quality and 3D-consistent facial avatars from unstructured 2D images.
arXiv Detail & Related papers (2022-11-21T06:40:46Z)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
arXiv Detail & Related papers (2021-08-10T12:19:34Z)
- Towards Realistic 3D Embedding via View Alignment [53.89445873577063]
This paper presents an innovative View Alignment GAN (VA-GAN) that composes new images by embedding 3D models into 2D background images realistically and automatically.
VA-GAN consists of a texture generator and a differential discriminator that are inter-connected and end-to-end trainable.
arXiv Detail & Related papers (2020-07-14T14:45:00Z)
- AUTO3D: Novel view synthesis through unsupervisely learned variational viewpoint and global 3D representation [27.163052958878776]
This paper targets learning-based novel view synthesis from a single or a limited number of 2D images without pose supervision.
We construct an end-to-end trainable conditional variational framework to disentangle the unsupervisedly learned relative pose/rotation from the implicit global 3D representation.
Our system achieves implicit 3D understanding without explicit 3D reconstruction.
arXiv Detail & Related papers (2020-07-13T18:51:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.