Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People
- URL: http://arxiv.org/abs/2009.14162v1
- Date: Tue, 29 Sep 2020 17:18:00 GMT
- Title: Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People
- Authors: Akin Caliskan, Armin Mustafa, Evren Imre, Adrian Hilton
- Abstract summary: We present a novel method to improve the accuracy of the 3D reconstruction of clothed human shape from a single image.
The accuracy and completeness of reconstruction for clothed people are limited due to the large variation in shape resulting from clothing, hair, body size, pose and camera viewpoint.
- Score: 36.30755368202957
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel method to improve the accuracy of the 3D reconstruction of
clothed human shape from a single image. Recent work has introduced volumetric,
implicit and model-based shape learning frameworks for reconstruction of
objects and people from one or more images. However, the accuracy and
completeness of reconstruction for clothed people are limited due to the large
variation in shape resulting from clothing, hair, body size, pose and camera
viewpoint. This paper introduces two advances to overcome this limitation:
firstly a new synthetic dataset of realistic clothed people, 3DVH; and
secondly, a novel multiple-view loss function for training of monocular
volumetric shape estimation, which is demonstrated to significantly improve
generalisation and reconstruction accuracy. The 3DVH dataset of realistic
clothed 3D human models rendered with diverse natural backgrounds is
demonstrated to allow transfer to reconstruction from real images of people.
Comprehensive comparative performance evaluation on both synthetic and real
images of people demonstrates that the proposed method significantly
outperforms previous state-of-the-art learning-based single-image 3D human
shape estimation approaches, achieving substantial improvements in
reconstruction accuracy, completeness, and quality. An ablation study shows
that this is due
to both the proposed multiple-view training and the new 3DVH dataset. The code
and the dataset can be found at the project website:
https://akincaliskan3d.github.io/MV3DH/.
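The multiple-view loss is described only at a high level in the abstract. As a rough, hypothetical sketch of the idea, the PyTorch snippet below penalises disagreement between occupancy volumes predicted independently from two views of the same person, after resampling them into a common frame; the function names, tensor shapes, and alignment scheme are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch of a multi-view consistency loss for monocular
# volumetric human reconstruction (illustrative, not the paper's code).
import torch
import torch.nn.functional as F

def multi_view_consistency_loss(occ_view1, occ_view2, grid_1_to_2):
    """occ_view1, occ_view2: (B, 1, D, H, W) sigmoid occupancy volumes
    predicted from two images of the same person.
    grid_1_to_2: (B, D, H, W, 3) normalised sampling grid mapping view-2
    voxel coordinates into the view-1 volume."""
    # Resample the view-1 prediction into the view-2 frame so that both
    # volumes describe the same shape in the same coordinates.
    occ1_in_view2 = F.grid_sample(occ_view1, grid_1_to_2, align_corners=True)
    # Penalise disagreement between the two single-view predictions.
    return F.mse_loss(occ1_in_view2, occ_view2)

def total_loss(occ_views, gt_views, grid_1_to_2, consistency_weight=0.5):
    """Per-view supervised occupancy loss plus pairwise multi-view
    consistency (two views shown for brevity)."""
    supervised = sum(F.binary_cross_entropy(pred, gt)
                     for pred, gt in zip(occ_views, gt_views))
    consistency = multi_view_consistency_loss(occ_views[0], occ_views[1],
                                              grid_1_to_2)
    return supervised + consistency_weight * consistency
```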
Related papers
- SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion [35.73448283467723]
SiTH is a novel pipeline that integrates an image-conditioned diffusion model into a 3D mesh reconstruction workflow.
We employ a powerful generative diffusion model to hallucinate unseen back-view appearance based on the input images.
We then leverage skinned body meshes as guidance to recover full-body textured meshes from the input and back-view images.
arXiv Detail & Related papers (2023-11-27T14:22:07Z)
- Refining 3D Human Texture Estimation from a Single Image [3.8761064607384195]
Estimating 3D human texture from a single image is essential in graphics and vision.
We propose a framework that adaptively samples the input via a deformable convolution whose offsets are learned by a deep neural network (see the first sketch after this list).
arXiv Detail & Related papers (2023-03-06T19:53:50Z)
- TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data.
We learn to predict realistic texture of objects from real image collections.
We learn pose estimation from pixel-perfect synthetic data.
arXiv Detail & Related papers (2022-12-25T13:36:32Z)
- ReFu: Refine and Fuse the Unobserved View for Detail-Preserving Single-Image 3D Human Reconstruction [31.782985891629448]
Single-image 3D human reconstruction aims to reconstruct the 3D textured surface of the human body given a single image.
We propose ReFu, a coarse-to-fine approach that refines the projected backside view image and fuses the refined image to predict the final human body.
arXiv Detail & Related papers (2022-11-09T09:14:11Z)
- LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [78.17425779503047]
We propose a novel neural implicit representation for the human body.
It is fully differentiable and optimizable, with disentangled shape and pose latent spaces (see the second sketch after this list).
Our model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
arXiv Detail & Related papers (2021-11-30T04:10:57Z)
- Detailed Avatar Recovery from Single Image [50.82102098057822]
This paper presents a novel framework to recover a detailed avatar from a single image.
We use deep neural networks to refine the 3D shape in a Hierarchical Mesh Deformation framework.
Our method can restore detailed human body shapes with complete textures beyond skinned models.
arXiv Detail & Related papers (2021-08-06T03:51:26Z)
- Multi-person Implicit Reconstruction from a Single Image [37.6877421030774]
We present a new end-to-end learning framework to obtain detailed and spatially coherent reconstructions of multiple people from a single image.
Existing multi-person methods suffer from two main drawbacks: they are often model-based and cannot capture accurate 3D models of people with loose clothing and hair.
arXiv Detail & Related papers (2021-04-19T13:21:55Z)
- Temporal Consistency Loss for High Resolution Textured and Clothed 3D Human Reconstruction from Monocular Video [35.42021156572568]
We present a novel method to learn temporally consistent 3D reconstruction of clothed people from a monocular video.
The proposed advances improve the temporal consistency and accuracy of both the 3D reconstruction and texture prediction from a monocular video.
arXiv Detail & Related papers (2021-04-19T13:04:29Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people, given an input RGB image.
We rely on the recently introduced, expressive full-body statistical 3D human model GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids both second-order differentiation when training the model parameters and expensive state gradient descent at test time.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- SparseFusion: Dynamic Human Avatar Modeling from Sparse RGBD Images [49.52782544649703]
We propose a novel approach to reconstruct 3D human body shapes based on a sparse set of RGBD frames.
The main challenge is how to robustly fuse these sparse frames into a canonical 3D model.
Our framework is flexible, with potential applications going beyond shape reconstruction.
arXiv Detail & Related papers (2020-06-05T18:53:36Z)
- Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images [50.34202789543989]
Deep Fashion3D is the largest collection to date of 3D garment models.
It provides rich annotations including 3D feature lines, 3D body pose and the corresponding multi-view real images.
A novel adaptable template is proposed to enable the learning of all types of clothing in a single network.
arXiv Detail & Related papers (2020-03-28T09:20:04Z)
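As a small illustration for the "Refining 3D Human Texture Estimation from a Single Image" entry above, the first sketch below shows one plausible way to sample the input adaptively with a deformable convolution whose per-location offsets are predicted by a plain convolution. It uses torchvision's stock DeformConv2d; the class name and layer sizes are assumptions, not the paper's implementation.

```python
# Hypothetical sketch: content-adaptive sampling via deformable convolution.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class AdaptiveSampler(nn.Module):
    def __init__(self, in_ch=64, out_ch=64, k=3):
        super().__init__()
        # A plain conv predicts a 2D offset for each of the k*k kernel taps.
        self.offset_net = nn.Conv2d(in_ch, 2 * k * k, kernel_size=k,
                                    padding=k // 2)
        # The deformable conv samples the input at the offset locations.
        self.deform_conv = DeformConv2d(in_ch, out_ch, kernel_size=k,
                                        padding=k // 2)

    def forward(self, feat):
        offsets = self.offset_net(feat)         # (B, 2*k*k, H, W)
        return self.deform_conv(feat, offsets)  # adaptively sampled features

# Usage: refine a feature map with learned, content-dependent sampling.
x = torch.randn(1, 64, 128, 128)
y = AdaptiveSampler()(x)  # -> (1, 64, 128, 128)
```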
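And for the "LatentHuman" entry, the second sketch gives a minimal implicit body model conditioned on disentangled shape and pose latent codes; because the whole function is differentiable, the codes can be optimised directly against raw, even non-watertight, data. Dimensions and names are illustrative assumptions, not the paper's architecture.

```python
# Hypothetical sketch: implicit body model with disentangled latents.
import torch
import torch.nn as nn

class DisentangledImplicitBody(nn.Module):
    def __init__(self, shape_dim=128, pose_dim=128, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + shape_dim + pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # signed distance to the body surface
        )

    def forward(self, xyz, z_shape, z_pose):
        """xyz: (B, N, 3) query points; z_shape, z_pose: (B, D) latents.
        Returns per-point signed distances of shape (B, N, 1)."""
        cond = torch.cat([z_shape, z_pose], dim=-1)            # (B, 2D)
        cond = cond.unsqueeze(1).expand(-1, xyz.shape[1], -1)  # per point
        return self.mlp(torch.cat([xyz, cond], dim=-1))

# The latent codes are ordinary tensors, so they can be fitted by gradient
# descent to observations (e.g. scan points) while the network stays fixed.
model = DisentangledImplicitBody()
pts = torch.randn(2, 1024, 3)
z_s = torch.zeros(2, 128, requires_grad=True)
z_p = torch.zeros(2, 128, requires_grad=True)
sdf = model(pts, z_s, z_p)  # -> (2, 1024, 1)
```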