VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data
- URL: http://arxiv.org/abs/2207.09949v1
- Date: Wed, 20 Jul 2022 14:47:28 GMT
- Title: VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data
- Authors: Jiajun Su, Chunyu Wang, Xiaoxuan Ma, Wenjun Zeng, and Yizhou Wang
- Abstract summary: We introduce VirtualPose, a two-stage learning framework to exploit the hidden "free lunch" specific to this task.
The first stage transforms images to abstract geometry representations (AGR), and then the second maps them to 3D poses.
It addresses the generalization issue from two aspects: (1) the first stage can be trained on diverse 2D datasets to reduce the risk of over-fitting to limited appearance; (2) the second stage can be trained on diverse AGR synthesized from a large number of virtual cameras and poses.
- Score: 69.64723752430244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While monocular 3D pose estimation appears to achieve very accurate
results on the public datasets, the generalization ability of these methods is
largely overlooked. In this work, we perform a systematic evaluation of the
existing methods and find that they incur notably larger errors when tested on
different cameras, human poses, and appearances. To address the problem, we introduce
VirtualPose, a two-stage learning framework to exploit the hidden "free lunch"
specific to this task, i.e., generating an infinite number of poses and cameras for
training models at no cost. To that end, the first stage transforms images to
abstract geometry representations (AGR), and then the second maps them to 3D
poses. It addresses the generalization issue from two aspects: (1) the first
stage can be trained on diverse 2D datasets to reduce the risk of over-fitting
to limited appearance; (2) the second stage can be trained on diverse AGR
synthesized from a large number of virtual cameras and poses. It outperforms
the SOTA methods without using any paired images and 3D poses from the
benchmarks, which paves the way for practical applications. Code is available
at https://github.com/wkom/VirtualPose.
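As a rough illustration of the "free lunch" described in the abstract, the sketch below synthesizes second-stage training pairs by projecting 3D skeletons through randomly sampled virtual cameras. The AGR is simplified here to plain 2D keypoints plus a root depth, and every name and parameter (sample_virtual_camera, project_points, the camera ranges, the 17-joint layout) is a hypothetical stand-in, not the released VirtualPose code.

```python
import numpy as np

def sample_virtual_camera(rng):
    """Sample a hypothetical virtual camera: yaw/pitch rotation, position, focal length."""
    yaw = rng.uniform(0.0, 2.0 * np.pi)
    pitch = rng.uniform(-0.3, 0.3)
    # Rotation about the vertical (y) axis followed by a small tilt about x.
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)],
                   [0, 1, 0],
                   [-np.sin(yaw), 0, np.cos(yaw)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(pitch), -np.sin(pitch)],
                   [0, np.sin(pitch), np.cos(pitch)]])
    R = Rx @ Ry
    t = np.array([0.0, 1.0, rng.uniform(3.0, 8.0)])   # camera a few metres away
    f = rng.uniform(900.0, 1500.0)                     # focal length in pixels
    K = np.array([[f, 0, 512.0], [0, f, 512.0], [0, 0, 1.0]])
    return K, R, t

def project_points(joints_3d, K, R, t):
    """Pinhole projection of (J, 3) world-space joints to (J, 2) pixels plus depths."""
    cam = joints_3d @ R.T + t            # world -> camera coordinates
    uv = cam @ K.T                       # camera -> homogeneous image coordinates
    uv = uv[:, :2] / uv[:, 2:3]
    return uv, cam[:, 2]

def synthesize_agr_pair(joints_3d, rng):
    """One (AGR, 3D pose) training pair; here AGR = 2D keypoints + root depth."""
    K, R, t = sample_virtual_camera(rng)
    uv, depth = project_points(joints_3d, K, R, t)
    agr = {"keypoints_2d": uv, "root_depth": depth[0]}
    target = joints_3d - joints_3d[0]    # root-relative 3D pose as the regression target
    return agr, target

rng = np.random.default_rng(0)
skeleton = rng.normal(size=(17, 3))      # stand-in for a real 3D skeleton (17 joints)
agr, target = synthesize_agr_pair(skeleton, rng)
print(agr["keypoints_2d"].shape, target.shape)   # (17, 2) (17, 3)
```

Because only 3D skeletons and camera parameters are needed, such pairs can be generated in arbitrary quantity, which is the sense in which the second stage never touches paired images and 3D poses from the benchmarks.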
Related papers
- MPL: Lifting 3D Human Pose from Multi-view 2D Poses [75.26416079541723]
We propose combining 2D pose estimation, for which large and rich training datasets exist, and 2D-to-3D pose lifting, using a transformer-based network.
Our experiments demonstrate decreases of up to 45% in MPJPE compared to the 3D poses obtained by triangulating the 2D poses.
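MPJPE (Mean Per-Joint Position Error), the metric quoted here and in several entries below, is simply the average Euclidean distance between predicted and ground-truth joints. The minimal sketch that follows assumes root alignment before comparison, which is a common convention but is not specified in the abstracts themselves.

```python
import numpy as np

def mpjpe(pred, gt, root_index=0):
    """Mean Per-Joint Position Error, in the same units as the inputs (typically mm).
    Both (J, 3) poses are root-aligned first; whether a given paper does so is an assumption."""
    pred = pred - pred[root_index]
    gt = gt - gt[root_index]
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

rng = np.random.default_rng(0)
gt = rng.normal(size=(17, 3)) * 1000.0                        # toy ground-truth pose in mm
print(mpjpe(gt + 15.0, gt))                                   # 0.0: a pure global translation is removed by root alignment
print(mpjpe(gt + rng.normal(scale=20.0, size=(17, 3)), gt))   # per-joint noise yields a nonzero error
```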
arXiv Detail & Related papers (2024-08-20T12:55:14Z)
- Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos [15.532504015622159]
Category-level 3D pose estimation is a fundamentally important problem in computer vision and robotics.
We tackle the problem of learning to estimate the category-level 3D pose only from casually taken object-centric videos.
arXiv Detail & Related papers (2024-07-05T09:43:05Z)
- Implicit Learning of Scene Geometry from Poses for Global Localization [7.077874294016776]
Global visual localization estimates the absolute pose of a camera using a single image, in a previously mapped area.
Many existing approaches directly learn and regress 6 DoF pose from an input image.
We propose to utilize these minimal available labels (i.e., the camera poses) to learn the underlying 3D geometry of the scene.
arXiv Detail & Related papers (2023-12-04T16:51:23Z)
- MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling [59.74064212110042]
MPM can handle multiple tasks including 3D human pose estimation, 3D pose estimation from occluded 2D pose, and 3D pose completion in a single framework.
We conduct extensive experiments and ablation studies on several widely used human pose datasets and achieve state-of-the-art performance on MPI-INF-3DHP.
arXiv Detail & Related papers (2023-06-29T10:30:00Z)
- CameraPose: Weakly-Supervised Monocular 3D Human Pose Estimation by Leveraging In-the-wild 2D Annotations [25.05308239278207]
We present CameraPose, a weakly-supervised framework for 3D human pose estimation from a single image.
By adding a camera parameter branch, any in-the-wild 2D annotations can be fed into our pipeline to boost the training diversity.
We also introduce a refinement network module with confidence-guided loss to further improve the quality of noisy 2D keypoints extracted by 2D pose estimators.
arXiv Detail & Related papers (2023-01-08T05:07:41Z)
- ElePose: Unsupervised 3D Human Pose Estimation by Predicting Camera Elevation and Learning Normalizing Flows on 2D Poses [23.554957518485324]
We propose an unsupervised approach that learns to predict a 3D human pose from a single image.
We estimate the 3D pose that is most likely over random projections, with the likelihood estimated using normalizing flows on 2D poses.
We outperform the state-of-the-art unsupervised human pose estimation methods on the benchmark datasets Human3.6M and MPI-INF-3DHP in many metrics.
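A very loose sketch of the idea in this summary: project a candidate 3D pose under randomly sampled cameras and prefer the candidate whose 2D projections score highest under a learned 2D-pose prior. The normalizing flow is replaced below by a placeholder Gaussian log-density (flow_log_prob), and all names, ranges, and the orthographic camera model are illustrative assumptions, not the authors' code.

```python
import numpy as np

def flow_log_prob(pose_2d):
    """Placeholder for a normalizing-flow log-likelihood over normalized 2D poses.
    A real implementation would evaluate a trained flow; a unit Gaussian stands in here."""
    x = pose_2d - pose_2d.mean(axis=0)
    x = x / (np.linalg.norm(x) + 1e-8)          # scale-normalize the 2D pose
    return -0.5 * float(np.sum(x ** 2))

def random_projection(pose_3d, rng):
    """Orthographic projection after a random rotation about the vertical axis
    plus a random camera elevation (the quantity the method learns to predict)."""
    yaw = rng.uniform(0.0, 2.0 * np.pi)
    elev = rng.uniform(-0.35, 0.35)
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)], [0, 1, 0], [-np.sin(yaw), 0, np.cos(yaw)]])
    Rx = np.array([[1, 0, 0], [0, np.cos(elev), -np.sin(elev)], [0, np.sin(elev), np.cos(elev)]])
    return (pose_3d @ (Rx @ Ry).T)[:, :2]

def score_candidate(pose_3d, rng, n_proj=64):
    """Average 2D-prior likelihood of a 3D candidate over random virtual cameras."""
    return np.mean([flow_log_prob(random_projection(pose_3d, rng)) for _ in range(n_proj)])

rng = np.random.default_rng(0)
candidates = [rng.normal(size=(17, 3)) for _ in range(5)]   # stand-ins for lifted 3D poses
best = max(candidates, key=lambda p: score_candidate(p, rng))
```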
arXiv Detail & Related papers (2021-12-14T01:12:45Z)
- MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision [72.5863451123577]
We show how to train a neural model that can perform accurate 3D pose and camera estimation.
Our method outperforms both classical bundle adjustment and weakly-supervised monocular 3D baselines.
arXiv Detail & Related papers (2021-08-10T18:39:56Z)
- Cascaded deep monocular 3D human pose estimation with evolutionary training data [76.3478675752847]
Deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation.
This paper proposes a novel data augmentation method that is scalable to massive amounts of training data.
Our method synthesizes unseen 3D human skeletons based on a hierarchical human representation and heuristics inspired by prior knowledge.
arXiv Detail & Related papers (2020-06-14T03:09:52Z)
- Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A Geometric Approach [76.10879433430466]
We propose to estimate 3D human pose from multi-view images and a few IMUs attached to the person's limbs.
It operates by first detecting 2D poses from the two signals and then lifting them to 3D space.
The simple two-step approach reduces the error of the state-of-the-art by a large margin on a public dataset.
arXiv Detail & Related papers (2020-03-25T00:26:54Z)