Pixel2ISDF: Implicit Signed Distance Fields based Human Body Model from
Multi-view and Multi-pose Images
- URL: http://arxiv.org/abs/2212.02765v1
- Date: Tue, 6 Dec 2022 05:30:49 GMT
- Authors: Jianchuan Chen, Wentao Yi, Tiantian Wang, Xing Li, Liqian Ma, Yangyu
Fan, Huchuan Lu
- Abstract summary: We focus on reconstructing clothed humans in the canonical space given multiple views and poses of a human as the input.
We learn latent codes on the posed mesh by leveraging multiple input images and then assign the latent codes to the mesh in the canonical space.
Our work for reconstructing the human shape in the canonical pose achieves 3rd place in the WCPA MVP-Human Body Challenge.
- Score: 67.45882013828256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this report, we focus on reconstructing clothed humans in the canonical
space given multiple views and poses of a human as the input. To achieve this,
we utilize the geometric prior of the SMPLX model in the canonical space to
learn the implicit representation for geometry reconstruction. Based on the
observation that the topology between the posed mesh and the mesh in the
canonical space is consistent, we propose to learn latent codes on the posed
mesh by leveraging multiple input images and then assign the latent codes to
the mesh in the canonical space. Specifically, we first leverage normal and
geometry networks to extract the feature vector for each vertex on the SMPLX
mesh. Normal maps are adopted instead of raw RGB images because they generalize
better to unseen images. Then, features for each vertex on the posed mesh from
multiple images are integrated by MLPs. The integrated features acting as the
latent code are anchored to the SMPLX mesh in the canonical space. Finally,
latent code for each 3D point is extracted and utilized to calculate the SDF.
Our work for reconstructing the human shape in the canonical pose achieves 3rd
place in the WCPA MVP-Human Body Challenge.
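The pipeline described above (per-vertex features from multiple views, MLP fusion into latent codes anchored to the canonical SMPLX mesh, then per-point SDF regression) can be sketched as follows. All dimensions, random weights, and the nearest-vertex lookup are simplified stand-ins, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: V SMPLX vertices, K input views, C-dim features.
V, K, C = 10475, 4, 32

# Per-view, per-vertex features (in the paper these come from normal and
# geometry networks applied to each posed image; here they are random stand-ins).
per_view_feats = rng.standard_normal((K, V, C))

def mlp(x, w1, b1, w2, b2):
    """Tiny two-layer MLP with ReLU, standing in for the fusion/SDF networks."""
    h = np.maximum(x @ w1 + b1, 0.0)
    return h @ w2 + b2

# Fusion MLP: maps each view's feature to C dims, then views are averaged,
# giving one latent code per vertex anchored to the canonical SMPLX mesh.
w1, b1 = rng.standard_normal((C, 64)) * 0.1, np.zeros(64)
w2, b2 = rng.standard_normal((64, C)) * 0.1, np.zeros(C)
fused = mlp(per_view_feats, w1, b1, w2, b2).mean(axis=0)   # (V, C)

# SDF head: for a query 3D point, gather the latent code of its nearest
# canonical vertex (a simplification of the paper's feature extraction)
# and regress a scalar signed distance.
canon_verts = rng.standard_normal((V, 3))
sw1, sb1 = rng.standard_normal((C + 3, 64)) * 0.1, np.zeros(64)
sw2, sb2 = rng.standard_normal((64, 1)) * 0.1, np.zeros(1)

def query_sdf(p):
    i = np.argmin(np.linalg.norm(canon_verts - p, axis=1))  # nearest vertex
    x = np.concatenate([fused[i], p])
    return mlp(x, sw1, sb1, sw2, sb2)[0]

sdf_val = query_sdf(np.array([0.1, 0.2, 0.3]))
print(fused.shape, sdf_val)
```

In the real system the networks are trained end-to-end and the latent code for a query point is interpolated from nearby mesh vertices rather than copied from the single nearest one.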
Related papers
- CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images [17.10258463020844]
We present a novel framework for reconstructing animatable human avatars from multiple images, termed CanonicalFusion.
We first predict Linear Blend Skinning (LBS) weight maps and depth maps using a shared-encoder-dual-decoder network, enabling direct canonicalization of the 3D mesh from the predicted depth maps.
We also introduce a forward-skinning-based differentiable rendering scheme to merge the reconstructed results from multiple images.
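The predicted LBS weight maps drive linear blend skinning, where each vertex is deformed by a weighted sum of per-joint rigid transforms (canonicalization applies the inverse of the blended transform). A minimal sketch with toy data, not CanonicalFusion's code:

```python
import numpy as np

def lbs(verts, weights, transforms):
    """Linear blend skinning.
    verts: (V,3), weights: (V,J), transforms: (J,4,4) -> posed verts (V,3)."""
    homo = np.concatenate([verts, np.ones((len(verts), 1))], axis=1)  # (V,4)
    # Blend the per-joint transforms for each vertex, then apply.
    blended = np.einsum('vj,jab->vab', weights, transforms)           # (V,4,4)
    posed = np.einsum('vab,vb->va', blended, homo)
    return posed[:, :3]

# Two joints: identity and a translation by (1,0,0).
T = np.stack([np.eye(4), np.eye(4)])
T[1, 0, 3] = 1.0
verts = np.zeros((3, 3))
weights = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
posed = lbs(verts, weights, T)
print(posed[:, 0])  # x-coords: 0.0, 0.5, 1.0
```

A vertex weighted half-and-half between the two joints moves halfway, which is exactly the blending that the predicted weight maps control per pixel.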
arXiv Detail & Related papers (2024-07-05T08:36:26Z)
- HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos [52.23323966700072]
We present a framework for acquiring human avatars that are attached with high-resolution physically-based material textures and mesh from monocular video.
Our method introduces a novel information fusion strategy to combine the information from the monocular video and synthesize virtual multi-view images.
Experiments show that our approach outperforms previous representations in terms of fidelity, and the explicit triangular-mesh output supports deployment on common renderers.
arXiv Detail & Related papers (2024-05-18T11:49:09Z)
- GALA: Generating Animatable Layered Assets from a Single Scan [20.310367593475508]
We present GALA, a framework that takes as input a single-layer clothed 3D human mesh and decomposes it into complete multi-layered 3D assets.
The outputs can then be combined with other assets to create novel clothed human avatars with any pose.
arXiv Detail & Related papers (2024-01-23T18:59:59Z)
- Weakly-Supervised 3D Reconstruction of Clothed Humans via Normal Maps [1.6462601662291156]
We present a novel deep learning-based approach to the 3D reconstruction of clothed humans using weak supervision via 2D normal maps.
Given a single RGB image or multiview images, our network infers a signed distance function (SDF) discretized on a tetrahedral mesh surrounding the body in a rest pose.
We demonstrate the efficacy of our approach for both network inference and 3D reconstruction.
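An SDF discretized on a tetrahedral mesh can be evaluated at any interior point by barycentric interpolation of the values stored at the tetrahedron's vertices. A toy illustration of that idea on a single tetrahedron (hypothetical node values, not the paper's code):

```python
import numpy as np

def bary_coords(p, tet):
    """Barycentric coordinates of point p w.r.t. tetrahedron tet (4,3)."""
    A = np.vstack([tet.T, np.ones(4)])   # 4x4 system: coords + partition of unity
    b = np.append(p, 1.0)
    return np.linalg.solve(A, b)

tet = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], float)
sdf_at_verts = np.array([-1.0, 1.0, 1.0, 1.0])   # hypothetical node values
p = np.array([0.25, 0.25, 0.25])

lam = bary_coords(p, tet)          # (0.25, 0.25, 0.25, 0.25) at this point
sdf_p = lam @ sdf_at_verts         # 0.25*(-1+1+1+1) = 0.5
print(sdf_p)
```

The interpolated field is piecewise linear over the tetrahedra, which is what makes this discretization convenient for weakly supervised training.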
arXiv Detail & Related papers (2023-11-27T18:06:35Z)
- Neural Capture of Animatable 3D Human from Monocular Video [38.974181971541846]
We present a novel paradigm of building an animatable 3D human representation from a monocular video input, such that it can be rendered in any unseen poses and views.
Our method is based on a dynamic Neural Radiance Field (NeRF) rigged by a mesh-based parametric 3D human model serving as a geometry proxy.
arXiv Detail & Related papers (2022-08-18T09:20:48Z)
- Facial Depth and Normal Estimation using Single Dual-Pixel Camera [81.02680586859105]
We introduce a DP-oriented Depth/Normal network that reconstructs 3D facial geometry.
The accompanying dataset contains corresponding ground-truth 3D models, including depth maps and surface normals in metric scale.
It achieves state-of-the-art performances over recent DP-based depth/normal estimation methods.
arXiv Detail & Related papers (2021-11-25T05:59:27Z)
- A Divide et Impera Approach for 3D Shape Reconstruction from Multiple Views [49.03830902235915]
Estimating the 3D shape of an object from a single or multiple images has gained popularity thanks to the recent breakthroughs powered by deep learning.
This paper proposes to rely on viewpoint-variant reconstructions obtained by merging the visible information from the given views.
To validate the proposed method, we perform a comprehensive evaluation on the ShapeNet reference benchmark in terms of relative pose estimation and 3D shape reconstruction.
arXiv Detail & Related papers (2020-11-17T09:59:32Z)
- SofGAN: A Portrait Image Generator with Dynamic Styling [47.10046693844792]
Generative Adversarial Networks (GANs) have been widely used for portrait image generation.
We propose a SofGAN image generator to decouple the latent space of portraits into two subspaces.
We show that our system can generate high quality portrait images with independently controllable geometry and texture attributes.
arXiv Detail & Related papers (2020-07-07T20:28:47Z)
- Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-view Human Reconstruction [97.3274868990133]
Geo-PIFu is a method to recover a 3D mesh from a monocular color image of a clothed person.
We show that, by both encoding query points and constraining global shape using latent voxel features, the reconstruction we obtain for clothed human meshes exhibits less shape distortion and improved surface details compared to competing methods.
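Conditioning an implicit function on latent voxel features amounts to sampling a 3D feature grid at each continuous query point, typically by trilinear interpolation. A toy sketch of that lookup with random data (illustrative only, not Geo-PIFu's code):

```python
import numpy as np

def trilinear_sample(grid, p):
    """Sample a (D,H,W,C) feature grid at continuous coords p = (z, y, x)."""
    z0, y0, x0 = (int(np.floor(c)) for c in p)
    dz, dy, dx = p[0] - z0, p[1] - y0, p[2] - x0
    out = np.zeros(grid.shape[-1])
    # Accumulate the 8 corner features, weighted by their trilinear weights.
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                w = ((dz if i else 1 - dz) * (dy if j else 1 - dy)
                     * (dx if k else 1 - dx))
                out += w * grid[z0 + i, y0 + j, x0 + k]
    return out

rng = np.random.default_rng(0)
grid = rng.standard_normal((4, 4, 4, 8))   # hypothetical latent voxel grid
q = trilinear_sample(grid, (1.5, 2.0, 0.5))
print(q.shape)
```

Each query point thus receives a smooth blend of its eight surrounding voxel features, which is combined with pixel-aligned image features before decoding occupancy or distance.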
arXiv Detail & Related papers (2020-06-15T01:11:48Z)
- 3D Human Mesh Regression with Dense Correspondence [95.92326689172877]
Estimating 3D mesh of the human body from a single 2D image is an important task with many applications such as augmented reality and Human-Robot interaction.
Prior works reconstructed the 3D mesh from a global image feature extracted by a convolutional neural network (CNN), in which the dense correspondences between the mesh surface and the image pixels are missing.
This paper proposes a model-free 3D human mesh estimation framework, named DecoMR, which explicitly establishes the dense correspondence between the mesh and the local image features in the UV space.
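Establishing dense correspondence means each mesh vertex can gather a local image feature at its own 2D location instead of sharing one global vector, typically via bilinear sampling of a CNN feature map. A toy sketch of that per-vertex lookup (hypothetical UVs and feature map; DecoMR's actual learned transfer to UV space is more involved):

```python
import numpy as np

def bilinear_sample(feat, uv):
    """Sample an (H,W,C) feature map at continuous (u, v) in [0,1]^2."""
    H, W, _ = feat.shape
    x, y = uv[0] * (W - 1), uv[1] * (H - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * feat[y0, x0] + dx * (1 - dy) * feat[y0, x1]
            + (1 - dx) * dy * feat[y1, x0] + dx * dy * feat[y1, x1])

rng = np.random.default_rng(0)
feat_map = rng.standard_normal((16, 16, 8))   # CNN feature map stand-in
vertex_uvs = rng.uniform(size=(100, 2))       # hypothetical per-vertex UVs
vertex_feats = np.stack([bilinear_sample(feat_map, uv) for uv in vertex_uvs])
print(vertex_feats.shape)
```

The resulting per-vertex local features are what give such methods sharper, pixel-aligned geometry than global-feature regression.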
arXiv Detail & Related papers (2020-06-10T08:50:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.