Reconstructing 3D Human Pose by Watching Humans in the Mirror
- URL: http://arxiv.org/abs/2104.00340v1
- Date: Thu, 1 Apr 2021 08:42:51 GMT
- Title: Reconstructing 3D Human Pose by Watching Humans in the Mirror
- Authors: Qi Fang, Qing Shuai, Junting Dong, Hujun Bao, Xiaowei Zhou
- Abstract summary: We introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror.
We develop an optimization-based approach that exploits mirror symmetry constraints for accurate 3D pose reconstruction.
- Score: 41.894948553970245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we introduce the new task of reconstructing 3D human pose from
a single image in which we can see the person and the person's image through a
mirror. Compared to general scenarios of 3D pose estimation from a single view,
the mirror reflection provides an additional view for resolving the depth
ambiguity. We develop an optimization-based approach that exploits mirror
symmetry constraints for accurate 3D pose reconstruction. We also provide a
method to estimate the surface normal of the mirror from vanishing points in
the single image. To validate the proposed approach, we collect a large-scale
dataset named Mirrored-Human, which covers a large variety of human subjects,
poses and backgrounds. The experiments demonstrate that, when trained on
Mirrored-Human with our reconstructed 3D poses as pseudo ground-truth, the
accuracy and generalizability of existing single-view 3D pose estimators can be
largely improved.
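The mirror geometry underlying these constraints is compact enough to sketch. Assuming the mirror plane is written as n·x + d = 0 with unit normal n, reflecting a 3D point across the plane and recovering the normal's direction from a vanishing point (of lines parallel to the normal, e.g. lines joining a person to their reflection) reduce to a few lines of linear algebra. This is a minimal illustration with hypothetical helper names, not the authors' implementation:

```python
import numpy as np

def mirror_normal_from_vanishing_point(K, v):
    """Direction of the mirror normal in camera coordinates, from the
    vanishing point v = (x, y) of lines parallel to that normal.
    K is the 3x3 camera intrinsic matrix."""
    d = np.linalg.inv(K) @ np.array([v[0], v[1], 1.0])
    return d / np.linalg.norm(d)

def reflect_points(points, n, d):
    """Reflect (N, 3) points across the mirror plane n.x + d = 0, ||n|| = 1."""
    dist = points @ n + d                     # signed distance to the plane
    return points - 2.0 * dist[:, None] * n   # move each point across the plane
```

In an optimization of this kind, the mirror symmetry constraint would require the reflected 3D pose `reflect_points(pose, n, d)` to reproject onto the 2D skeleton observed in the mirror, which is what resolves the single-view depth ambiguity.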
Related papers
- Mirror-Aware Neural Humans [21.0548144424571]
We develop a consumer-level 3D motion capture system that starts from off-the-shelf 2D poses and automatically calibrates the camera.
We empirically demonstrate the benefit of learning a body model and accounting for occlusion in challenging mirror scenes.
arXiv Detail & Related papers (2023-09-09T10:43:45Z)
- Few-View Object Reconstruction with Unknown Categories and Camera Poses [80.0820650171476]
This work explores reconstructing general real-world objects from a few images without known camera poses or object categories.
The crux of our work is solving two fundamental 3D vision problems -- shape reconstruction and pose estimation.
Our method FORGE predicts 3D features from each view and leverages them in conjunction with the input images to establish cross-view correspondence.
arXiv Detail & Related papers (2022-12-08T18:59:02Z)
- ReFu: Refine and Fuse the Unobserved View for Detail-Preserving Single-Image 3D Human Reconstruction [31.782985891629448]
Single-image 3D human reconstruction aims to reconstruct the 3D textured surface of the human body given a single image.
We propose ReFu, a coarse-to-fine approach that refines the projected backside view image and fuses the refined image to predict the final human body.
arXiv Detail & Related papers (2022-11-09T09:14:11Z)
- Deep3DPose: Realtime Reconstruction of Arbitrarily Posed Human Bodies from Single RGB Images [5.775625085664381]
We introduce an approach that accurately reconstructs 3D human poses and detailed 3D full-body geometric models from single images in realtime.
The key idea of our approach is a novel end-to-end multi-task deep learning framework that uses single images to predict five outputs simultaneously.
We show the system advances the frontier of 3D human body and pose reconstruction from single images by quantitative evaluations and comparisons with state-of-the-art methods.
arXiv Detail & Related papers (2021-06-22T04:26:11Z)
- Residual Pose: A Decoupled Approach for Depth-based 3D Human Pose Estimation [18.103595280706593]
We leverage recent advances in reliable 2D pose estimation with CNNs to estimate the 3D pose of people from depth images.
Our approach achieves very competitive results both in accuracy and speed on two public datasets.
arXiv Detail & Related papers (2020-11-10T10:08:13Z)
- 3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data [77.57798334776353]
We consider the problem of obtaining dense 3D reconstructions of humans from single and partially occluded views.
We suggest that ambiguities can be modelled more effectively by parametrizing the possible body shapes and poses.
We show that our method outperforms alternative approaches in ambiguous pose recovery on standard benchmarks for 3D humans.
arXiv Detail & Related papers (2020-11-02T13:55:31Z)
- Learning to Detect 3D Reflection Symmetry for Single-View Reconstruction [32.14605731030579]
3D reconstruction from a single RGB image is a challenging problem in computer vision.
Previous methods are usually solely data-driven, which leads to inaccurate 3D shape recovery and limited generalization capability.
We present a geometry-based end-to-end deep learning framework that first detects the mirror plane of reflection symmetry that commonly exists in man-made objects, and then predicts depth maps by finding the intra-image pixel-wise correspondences induced by the symmetry.
arXiv Detail & Related papers (2020-06-17T17:58:59Z)
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)
- Learning Pose-invariant 3D Object Reconstruction from Single-view Images [61.98279201609436]
In this paper, we explore a more realistic setup of learning 3D shape from only single-view images.
The major difficulty lies in insufficient constraints that can be provided by single view images.
We propose an effective adversarial domain confusion method to learn pose-disentangled compact shape space.
arXiv Detail & Related papers (2020-04-03T02:47:35Z)
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
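The coarse view alignment described above starts from back-projecting each per-view depth map into a 3D point cloud. A minimal sketch, assuming a standard pinhole intrinsic matrix K (function and variable names are hypothetical, not from the paper's code):

```python
import numpy as np

def depth_to_points(depth, K):
    """Back-project a depth map of shape (H, W) into camera-space
    3D points of shape (H, W, 3), given the 3x3 intrinsic matrix K."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))        # pixel grids, (H, W)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(float)
    rays = pix @ np.linalg.inv(K).T                       # ray directions, z = 1
    return rays * depth[..., None]                        # scale rays by depth
```

The resulting per-view point clouds can then be brought into a common frame with the estimated camera poses, giving the coarse alignment that the reflectance estimation network refines.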
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.