Disentangling 3D Attributes from a Single 2D Image: Human Pose, Shape and Garment
- URL: http://arxiv.org/abs/2208.03167v1
- Date: Fri, 5 Aug 2022 13:48:43 GMT
- Title: Disentangling 3D Attributes from a Single 2D Image: Human Pose, Shape and Garment
- Authors: Xue Hu, Xinghui Li, Benjamin Busam, Yiren Zhou, Ales Leonardis, Shanxin Yuan
- Abstract summary: We focus on the challenging task of extracting disentangled 3D attributes only from 2D image data.
Our method learns an embedding with disentangled latent representations of these three image properties.
We show how an implicit shape loss can benefit the model's ability to recover fine-grained reconstruction details.
- Score: 20.17991487155361
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: For visual manipulation tasks, we aim to represent image content with
semantically meaningful features. However, learning implicit representations
from images often lacks interpretability, especially when attributes are
intertwined. We focus on the challenging task of extracting disentangled 3D
attributes only from 2D image data. Specifically, we focus on human appearance
and learn implicit pose, shape and garment representations of dressed humans
from RGB images. Our method learns an embedding with disentangled latent
representations of these three image properties and enables meaningful
re-assembling of features and property control through a 2D-to-3D
encoder-decoder structure. The 3D model is inferred solely from the feature map
in the learned embedding space. To the best of our knowledge, our method is the
first to achieve cross-domain disentanglement for this highly under-constrained
problem. We qualitatively and quantitatively demonstrate our framework's
ability to transfer pose, shape, and garments in 3D reconstruction on virtual
data and show how an implicit shape loss can benefit the model's ability to
recover fine-grained reconstruction details.
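To make the described 2D-to-3D encoder-decoder structure concrete, the sketch below shows one way such a model could be wired up in PyTorch: an image encoder with separate pose, shape, and garment heads, an implicit (occupancy-style) decoder queried at 3D points, and a binary cross-entropy term standing in for an implicit shape loss. All layer sizes, class names, and the specific loss choice are illustrative assumptions and are not taken from the paper.
```python
# Minimal sketch of a 2D-to-3D encoder-decoder with disentangled latents.
# Architecture, dimensions, and loss are illustrative assumptions, not the
# authors' implementation.
import torch
import torch.nn as nn


class DisentangledEncoder(nn.Module):
    """Maps an RGB image to separate pose, shape, and garment codes."""

    def __init__(self, latent_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Separate heads keep the three factors in disjoint latent subspaces,
        # so codes can be re-assembled across images (e.g. garment transfer).
        self.pose_head = nn.Linear(128, latent_dim)
        self.shape_head = nn.Linear(128, latent_dim)
        self.garment_head = nn.Linear(128, latent_dim)

    def forward(self, image):
        feat = self.backbone(image)
        return self.pose_head(feat), self.shape_head(feat), self.garment_head(feat)


class ImplicitDecoder(nn.Module):
    """Predicts occupancy at 3D query points conditioned on the latent codes."""

    def __init__(self, latent_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * latent_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, pose_z, shape_z, garment_z, points):
        # points: (B, N, 3) query locations; latents are broadcast to each point.
        z = torch.cat([pose_z, shape_z, garment_z], dim=-1)
        z = z.unsqueeze(1).expand(-1, points.shape[1], -1)
        return self.mlp(torch.cat([z, points], dim=-1)).squeeze(-1)


if __name__ == "__main__":
    enc, dec = DisentangledEncoder(), ImplicitDecoder()
    img = torch.randn(2, 3, 128, 128)
    pts = torch.rand(2, 1024, 3) * 2 - 1             # query points in [-1, 1]^3
    occ_gt = torch.randint(0, 2, (2, 1024)).float()  # dummy inside/outside labels
    pose_z, shape_z, garment_z = enc(img)
    # Swapping e.g. garment_z between two samples would transfer garments only.
    occ_logits = dec(pose_z, shape_z, garment_z, pts)
    implicit_shape_loss = nn.functional.binary_cross_entropy_with_logits(occ_logits, occ_gt)
    print(implicit_shape_loss.item())
```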
Related papers
- SSR-2D: Semantic 3D Scene Reconstruction from 2D Images [54.46126685716471]
In this work, we explore a central 3D scene modeling task, namely, semantic scene reconstruction without using any 3D annotations.
The key idea of our approach is to design a trainable model that employs both incomplete 3D reconstructions and their corresponding source RGB-D images.
Our method achieves state-of-the-art semantic scene completion performance on two large-scale benchmark datasets, MatterPort3D and ScanNet.
(arXiv, 2023-02-07)
- GAN2X: Non-Lambertian Inverse Rendering of Image GANs [85.76426471872855]
We present GAN2X, a new method for unsupervised inverse rendering that only uses unpaired images for training.
Unlike previous Shape-from-GAN approaches that mainly focus on 3D shapes, we make the first attempt to also recover non-Lambertian material properties by exploiting pseudo-paired data generated by a GAN.
Experiments demonstrate that GAN2X can accurately decompose 2D images into 3D shape, albedo, and specular properties for different object categories, and achieves state-of-the-art performance for unsupervised single-view 3D face reconstruction.
(arXiv, 2022-06-18)
- RiCS: A 2D Self-Occlusion Map for Harmonizing Volumetric Objects [68.85305626324694]
Ray-marching in Camera Space (RiCS) is a new method that represents the self-occlusions of 3D foreground objects as a 2D self-occlusion map.
We show that our representation map not only enhances image quality but also models temporally coherent, complex shadow effects.
(arXiv, 2022-05-14)
- 3D Shape Reconstruction from 2D Images with Disentangled Attribute Flow [61.62796058294777]
Reconstructing 3D shape from a single 2D image is a challenging task.
Most previous methods still struggle to extract semantic attributes for the 3D reconstruction task.
We propose 3DAttriFlow to disentangle and extract semantic attributes through different semantic levels in the input images.
(arXiv, 2022-03-29)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
(arXiv, 2021-08-10)
- Neural Articulated Radiance Field [90.91714894044253]
We present Neural Articulated Radiance Field (NARF), a novel deformable 3D representation for articulated objects learned from images.
Experiments show that the proposed method is efficient and can generalize well to novel poses.
(arXiv, 2021-04-07)
- Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction [33.95791350070165]
Inferring 3D structure of a generic object from a 2D image is a long-standing objective of computer vision.
We take an alternative, semi-supervised learning approach: given a 2D image of a generic object, we decompose it into latent representations of category, shape, and albedo.
We show that the complete shape and albedo modeling enables us to leverage real 2D images in both modeling and model fitting.
(arXiv, 2021-04-02)
- Cycle-Consistent Generative Rendering for 2D-3D Modality Translation [21.962725416347855]
We learn a module that generates a realistic rendering of a 3D object and infers a realistic 3D shape from an image.
By leveraging generative domain translation methods, we are able to define a learning algorithm that requires only weak supervision, with unpaired data.
The resulting model is able to perform 3D shape, pose, and texture inference from 2D images, and can also generate novel textured 3D shapes and renderings.
(arXiv, 2020-11-16)
This list is automatically generated from the titles and abstracts of the papers in this site.