ARCH++: Animation-Ready Clothed Human Reconstruction Revisited
- URL: http://arxiv.org/abs/2108.07845v1
- Date: Tue, 17 Aug 2021 19:27:12 GMT
- Title: ARCH++: Animation-Ready Clothed Human Reconstruction Revisited
- Authors: Tong He, Yuanlu Xu, Shunsuke Saito, Stefano Soatto, Tony Tung
- Abstract summary: We present ARCH++, an image-based method to reconstruct 3D avatars with arbitrary clothing styles.
Our reconstructed avatars are animation-ready and highly realistic, in both the visible regions from input views and the unseen regions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present ARCH++, an image-based method to reconstruct 3D avatars with
arbitrary clothing styles. Our reconstructed avatars are animation-ready and
highly realistic, in both the visible regions from input views and the unseen
regions. While prior work shows great promise of reconstructing animatable
clothed humans with various topologies, we observe that there exist fundamental
limitations resulting in sub-optimal reconstruction quality. In this paper, we
revisit the major steps of image-based avatar reconstruction and address the
limitations with ARCH++. First, we introduce an end-to-end point based geometry
encoder to better describe the semantics of the underlying 3D human body, in
replacement of previous hand-crafted features. Second, in order to address the
occupancy ambiguity caused by topological changes of clothed humans in the
canonical pose, we propose a co-supervising framework with cross-space
consistency to jointly estimate the occupancy in both the posed and canonical
spaces. Last, we use image-to-image translation networks to further refine
detailed geometry and texture on the reconstructed surface, which improves the
fidelity and consistency across arbitrary viewpoints. In the experiments, we
demonstrate improvements over the state of the art on both public benchmarks
and user studies in reconstruction quality and realism.
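The co-supervising idea in the abstract (estimating occupancy jointly in the posed and canonical spaces, tied together by cross-space consistency) can be illustrated with a small sketch. The abstract does not give the loss formulation, so everything below is an assumption made for illustration: the function names, the use of binary cross-entropy for per-space supervision, the squared-difference consistency term, and the `weight` parameter are hypothetical and are not taken from the paper. The sketch also assumes that index i in both arrays refers to the same surface sample, once in each space.

```python
import numpy as np


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted and ground-truth occupancy."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(target * np.log(pred)
                           + (1.0 - target) * np.log(1.0 - pred))))


def co_supervised_loss(occ_posed, occ_canon, gt_posed, gt_canon, weight=1.0):
    """Hypothetical joint loss over corresponding query points.

    occ_posed[i] and occ_canon[i] are the predicted occupancies of the same
    sample, evaluated in the posed space and in the canonical space.
    """
    # Supervise each space against its own ground-truth occupancy.
    supervised = (binary_cross_entropy(occ_posed, gt_posed)
                  + binary_cross_entropy(occ_canon, gt_canon))
    # Assumed consistency term: penalize disagreement between the two spaces.
    consistency = float(np.mean((occ_posed - occ_canon) ** 2))
    return supervised + weight * consistency


# Toy usage: three corresponding samples, with a disagreement on the first one.
gt = np.array([1.0, 0.0, 1.0])
occ_posed = np.array([0.9, 0.2, 0.7])
occ_canon = np.array([0.1, 0.2, 0.7])
loss = co_supervised_loss(occ_posed, occ_canon, gt, gt, weight=1.0)
```

In this toy setup the consistency term is zero whenever the two predictions agree, so only cross-space disagreement adds to the supervised losses.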
Related papers
- Crowd3D++: Robust Monocular Crowd Reconstruction with Upright Space [55.77397543011443]
This paper aims to reconstruct hundreds of people's 3D poses, shapes, and locations from a single image with unknown camera parameters.
Crowd3D is proposed to convert the complex 3D human localization into 2D-pixel localization with robust camera and ground estimation.
Crowd3D++ eliminates the influence of camera parameters and the cropping operation by the proposed canonical upright space and ground-aware normalization transform.
arXiv Detail & Related papers (2024-11-09T16:49:59Z)
- InvertAvatar: Incremental GAN Inversion for Generalized Head Avatars [40.10906393484584]
We propose a novel framework that enhances avatar reconstruction performance using an algorithm designed to increase the fidelity from multiple frames.
Our architecture emphasizes pixel-aligned image-to-image translation, mitigating the need to learn correspondences between observation and canonical spaces.
The proposed paradigm demonstrates state-of-the-art performance on one-shot and few-shot avatar animation tasks.
arXiv Detail & Related papers (2023-12-03T18:59:15Z)
- SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion [35.73448283467723]
SiTH is a novel pipeline that integrates an image-conditioned diffusion model into a 3D mesh reconstruction workflow.
We employ a powerful generative diffusion model to hallucinate unseen back-view appearance based on the input images.
For the latter, we leverage skinned body meshes as guidance to recover full-body texture meshes from the input and back-view images.
arXiv Detail & Related papers (2023-11-27T14:22:07Z)
- AniPixel: Towards Animatable Pixel-Aligned Human Avatar [65.7175527782209]
AniPixel is a novel animatable and generalizable human avatar reconstruction method.
We propose a neural skinning field based on skeleton-driven deformation to establish the target-to-canonical and canonical-to-observation correspondences.
Experiments show that AniPixel renders comparable novel views while delivering better novel pose animation results than state-of-the-art methods.
arXiv Detail & Related papers (2023-02-07T11:04:14Z)
- Few-View Object Reconstruction with Unknown Categories and Camera Poses [80.0820650171476]
This work explores reconstructing general real-world objects from a few images without known camera poses or object categories.
The crux of our work is solving two fundamental 3D vision problems -- shape reconstruction and pose estimation.
Our method FORGE predicts 3D features from each view and leverages them in conjunction with the input images to establish cross-view correspondence.
arXiv Detail & Related papers (2022-12-08T18:59:02Z)
- ReFu: Refine and Fuse the Unobserved View for Detail-Preserving Single-Image 3D Human Reconstruction [31.782985891629448]
Single-image 3D human reconstruction aims to reconstruct the 3D textured surface of the human body given a single image.
We propose ReFu, a coarse-to-fine approach that refines the projected backside view image and fuses the refined image to predict the final human body.
arXiv Detail & Related papers (2022-11-09T09:14:11Z)
- Pixel Codec Avatars [99.36561532588831]
Pixel Codec Avatars (PiCA) is a deep generative model of 3D human faces.
On a single Oculus Quest 2 mobile VR headset, 5 avatars are rendered in realtime in the same scene.
arXiv Detail & Related papers (2021-04-09T23:17:36Z)
- Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People [36.30755368202957]
We present a novel method to improve the accuracy of the 3D reconstruction of clothed human shape from a single image.
The accuracy and completeness for reconstruction of clothed people is limited due to the large variation in shape resulting from clothing, hair, body size, pose and camera viewpoint.
arXiv Detail & Related papers (2020-09-29T17:18:00Z)
- SparseFusion: Dynamic Human Avatar Modeling from Sparse RGBD Images [49.52782544649703]
We propose a novel approach to reconstruct 3D human body shapes based on a sparse set of RGBD frames.
The main challenge is how to robustly fuse these sparse frames into a canonical 3D model.
Our framework is flexible, with potential applications going beyond shape reconstruction.
arXiv Detail & Related papers (2020-06-05T18:53:36Z)
- ARCH: Animatable Reconstruction of Clothed Humans [27.849315613277724]
ARCH (Animatable Reconstruction of Clothed Humans) is an end-to-end framework for accurate reconstruction of animation-ready 3D clothed humans from a monocular image.
ARCH is a learned pose-aware model that produces detailed 3D rigged full-body human avatars from a single unconstrained RGB image.
arXiv Detail & Related papers (2020-04-08T14:23:08Z)