MVP-Human Dataset for 3D Human Avatar Reconstruction from Unconstrained
Frames
- URL: http://arxiv.org/abs/2204.11184v2
- Date: Wed, 17 May 2023 10:58:37 GMT
- Title: MVP-Human Dataset for 3D Human Avatar Reconstruction from Unconstrained
Frames
- Authors: Xiangyu Zhu, Tingting Liao, Jiangjing Lyu, Xiang Yan, Yunfeng Wang,
Kan Guo, Qiong Cao, Stan Z. Li, and Zhen Lei
- Abstract summary: We present 3D Avatar Reconstruction in the wild (ARwild), which first reconstructs the implicit skinning fields in a multi-level manner.
We contribute a large-scale dataset, MVP-Human, which contains 400 subjects, each of which has 15 scans in different poses.
Overall, benefiting from the specific network architecture and the diverse data, the trained model enables 3D avatar reconstruction from unconstrained frames.
- Score: 59.37430649840777
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider a novel problem of reconstructing a 3D human
avatar from multiple unconstrained frames, independent of assumptions on camera
calibration, capture space, and constrained actions. The problem should be
addressed by a framework that takes multiple unconstrained images as inputs,
and generates a shape-with-skinning avatar in the canonical space, finished in
one feed-forward pass. To this end, we present 3D Avatar Reconstruction in the
wild (ARwild), which first reconstructs the implicit skinning fields in a
multi-level manner, by which the image features from multiple images are
aligned and integrated to estimate a pixel-aligned implicit function that
represents the clothed shape. To enable the training and testing of the new
framework, we contribute a large-scale dataset, MVP-Human (Multi-View and
multi-Pose 3D Human), which contains 400 subjects, each of which has 15 scans
in different poses and 8-view images for each pose, providing 6,000 3D scans
and 48,000 images in total. Overall, benefiting from the specific network
architecture and the diverse data, the trained model enables 3D avatar
reconstruction from unconstrained frames and achieves state-of-the-art
performance.
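The core mechanism described above, aligning pixel-level features from several unconstrained views and integrating them into one implicit function for the clothed shape, can be sketched compactly. The snippet below is a minimal illustration under assumed names and shapes (MultiViewImplicitNet, a project callback, mean-pooling fusion are all hypothetical); the paper's actual multi-level skinning-field architecture is more involved. As a sanity check, the dataset arithmetic is consistent: 400 subjects × 15 poses = 6,000 scans, and 6,000 scans × 8 views = 48,000 images.

```python
# A minimal sketch of multi-view, pixel-aligned implicit-function inference,
# in the spirit of the pipeline described in the abstract. All names, shapes,
# and the mean-pooling fusion are illustrative assumptions.
import torch
import torch.nn as nn

class MultiViewImplicitNet(nn.Module):
    def __init__(self, feat_dim=256, hidden=128):
        super().__init__()
        # Per-view image encoder producing a dense feature map (hypothetical).
        self.encoder = nn.Conv2d(3, feat_dim, kernel_size=7, stride=2, padding=3)
        # MLP mapping a fused, pixel-aligned feature (+ depth) to occupancy.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, images, points, project):
        # images: (V, 3, H, W) unconstrained frames; points: (N, 3) queries in
        # canonical space; project(points, v) -> (N, 2) pixel coordinates
        # normalized to [-1, 1] and (N,) depths for view v (assumed helper).
        feats = self.encoder(images)                   # (V, C, H', W')
        per_view = []
        for v in range(images.shape[0]):
            uv, z = project(points, v)                 # align queries to view v
            grid = uv.view(1, -1, 1, 2)                # (1, N, 1, 2)
            f = torch.nn.functional.grid_sample(
                feats[v:v + 1], grid, align_corners=True
            )[0, :, :, 0].t()                          # (N, C) pixel-aligned features
            per_view.append(torch.cat([f, z.unsqueeze(1)], dim=1))
        fused = torch.stack(per_view).mean(dim=0)      # integrate views (assumed: mean)
        return self.mlp(fused)                         # (N, 1) inside/outside probability
```

Mean pooling is a common fusion choice here because it is invariant to the order and number of views, which matches the "unconstrained frames" setting; a companion skinning sketch follows the related-papers list below.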
Related papers
- HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images [33.298962236215964]
We study the reconstruction of human avatars from a few-shot unconstrained photo album.
For handling dynamic data, we integrate a skinning mechanism with deep marching tetrahedra.
Our framework, called HaveFun, can undertake avatar reconstruction, rendering, and animation.
arXiv Detail & Related papers (2023-11-27T10:01:31Z)
- Generalizable One-shot Neural Head Avatar [90.50492165284724]
We present a method that reconstructs and animates a 3D head avatar from a single-view portrait image.
We propose a framework that not only generalizes to unseen identities based on a single-view image, but also captures characteristic details within and beyond the face area.
arXiv Detail & Related papers (2023-06-14T22:33:09Z)
- DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models [55.71306021041785]
We present DreamAvatar, a text-and-shape guided framework for generating high-quality 3D human avatars.
We leverage the SMPL model to provide shape and pose guidance for the generation.
We also jointly optimize the losses computed from the full body and from the zoomed-in 3D head to alleviate the common multi-face "Janus" problem.
arXiv Detail & Related papers (2023-04-03T12:11:51Z)
- Crowd3D: Towards Hundreds of People Reconstruction from a Single Image [57.58149031283827]
We propose Crowd3D, the first framework to reconstruct the 3D poses, shapes and locations of hundreds of people with global consistency from a single large-scene image.
To deal with a large number of persons and various human sizes, we also design an adaptive human-centric cropping scheme.
arXiv Detail & Related papers (2023-01-23T11:45:27Z)
- Structured 3D Features for Reconstructing Controllable Avatars [43.36074729431982]
We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface.
We show that our S3F model surpasses the previous state-of-the-art on various tasks, including monocular 3D reconstruction, as well as albedo and shading estimation.
arXiv Detail & Related papers (2022-12-13T18:57:33Z)
- DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both the 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)
- Deep3DPose: Realtime Reconstruction of Arbitrarily Posed Human Bodies from Single RGB Images [5.775625085664381]
We introduce an approach that accurately reconstructs 3D human poses and detailed 3D full-body geometric models from single images in realtime.
The key idea of our approach is a novel end-to-end multi-task deep learning framework that uses single images to predict five outputs simultaneously.
We show the system advances the frontier of 3D human body and pose reconstruction from single images by quantitative evaluations and comparisons with state-of-the-art methods.
arXiv Detail & Related papers (2021-06-22T04:26:11Z)
- Multi-person Implicit Reconstruction from a Single Image [37.6877421030774]
We present a new end-to-end learning framework to obtain detailed and spatially coherent reconstructions of multiple people from a single image.
Existing multi-person methods suffer from two main drawbacks: they are often model-based, and they cannot capture accurate 3D models of people with loose clothing and hair.
arXiv Detail & Related papers (2021-04-19T13:21:55Z)
- SparseFusion: Dynamic Human Avatar Modeling from Sparse RGBD Images [49.52782544649703]
We propose a novel approach to reconstruct 3D human body shapes based on a sparse set of RGBD frames.
The main challenge is how to robustly fuse these sparse frames into a canonical 3D model.
Our framework is flexible, with potential applications going beyond shape reconstruction.
arXiv Detail & Related papers (2020-06-05T18:53:36Z)
- ARCH: Animatable Reconstruction of Clothed Humans [27.849315613277724]
ARCH (Animatable Reconstruction of Clothed Humans) is an end-to-end framework for accurate reconstruction of animation-ready 3D clothed humans from a monocular image.
ARCH is a learned pose-aware model that produces detailed 3D rigged full-body human avatars from a single unconstrained RGB image.
arXiv Detail & Related papers (2020-04-08T14:23:08Z)
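Several of the papers above (HaveFun's skinning mechanism, ARCH's rigged avatars) and ARwild itself output skinned avatars in canonical space. The sketch below shows plain linear blend skinning, the standard way per-vertex skinning weights pose a canonical shape; it is an illustrative assumption, not any one paper's exact skinning-field formulation.

```python
# A minimal linear-blend-skinning (LBS) sketch: poses a canonical shape with
# per-vertex skinning weights. Standard LBS is assumed here for illustration.
import numpy as np

def linear_blend_skinning(vertices, weights, bone_transforms):
    """vertices:        (N, 3) canonical (rest-pose) vertex positions
       weights:         (N, J) skinning weights, each row sums to 1
       bone_transforms: (J, 4, 4) rigid transform of each joint/bone
       returns:         (N, 3) posed vertex positions"""
    homo = np.concatenate([vertices, np.ones((len(vertices), 1))], axis=1)  # (N, 4)
    # Blend the per-bone transforms with the skinning weights: (N, 4, 4).
    blended = np.einsum('nj,jab->nab', weights, bone_transforms)
    posed = np.einsum('nab,nb->na', blended, homo)                          # (N, 4)
    return posed[:, :3]

# Tiny usage example with two bones: identity and a 90-degree rotation about z.
if __name__ == '__main__':
    rot = np.eye(4)
    rot[:2, :2] = [[0.0, -1.0], [1.0, 0.0]]
    verts = np.array([[1.0, 0.0, 0.0]])
    w = np.array([[0.5, 0.5]])  # vertex influenced equally by both bones
    print(linear_blend_skinning(verts, w, np.stack([np.eye(4), rot])))
```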