Error Bounds of Projection Models in Weakly Supervised 3D Human Pose
Estimation
- URL: http://arxiv.org/abs/2010.12317v1
- Date: Fri, 23 Oct 2020 11:48:13 GMT
- Title: Error Bounds of Projection Models in Weakly Supervised 3D Human Pose
Estimation
- Authors: Nikolas Klug, Moritz Einfalt, Stephan Brehm, Rainer Lienhart
- Abstract summary: We present a detailed analysis of the most commonly used simplified projection models.
Our results show that both projection models lead to an inherent minimal error between 19.3mm and 54.7mm, even after alignment in position and scale.
We also show how the normalized perspective projection can be replaced to avoid this guaranteed minimal error.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The current state-of-the-art in monocular 3D human pose estimation is heavily
influenced by weakly supervised methods. These allow 2D labels to be used to
learn effective 3D human pose recovery either directly from images or via
2D-to-3D pose uplifting. In this paper we present a detailed analysis of the
most commonly used simplified projection models, which relate the estimated 3D
pose representation to 2D labels: normalized perspective and weak perspective
projections. Specifically, we derive theoretical lower bound errors for those
projection models under the commonly used mean per-joint position error
(MPJPE). Additionally, we show how the normalized perspective projection can be
replaced to avoid this guaranteed minimal error. We evaluate the derived lower
bounds on the most commonly used 3D human pose estimation benchmark datasets.
Our results show that both projection models lead to an inherent minimal error
between 19.3mm and 54.7mm, even after alignment in position and scale. This is
a considerable share when comparing with recent state-of-the-art results. Our
paper thus establishes a theoretical baseline that shows the importance of
suitable projection models in weakly supervised 3D human pose estimation.
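For reference, the two projection models analysed in the paper and the MPJPE metric used for the bounds can be sketched as follows. This is a minimal illustration, not the paper's code; the function names and the toy pose are invented for the example.

```python
import numpy as np

def weak_perspective(joints_3d, scale, trans):
    """Weak perspective projection: all joints are assumed to share one
    depth, so projection reduces to a uniform 2D scale plus translation."""
    return scale * joints_3d[:, :2] + trans

def normalized_perspective(joints_3d):
    """Normalized perspective projection: divide x and y by each joint's
    depth z (camera at the origin, unit focal length)."""
    return joints_3d[:, :2] / joints_3d[:, 2:3]

def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over joints."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

# Toy pose: 4 joints (x, y, z) in metres, roughly 5 m from the camera.
pose = np.array([[ 0.0, 0.8, 5.0],
                 [ 0.2, 0.4, 5.1],
                 [-0.2, 0.4, 4.9],
                 [ 0.0, 0.0, 5.0]])

p_weak  = weak_perspective(pose, scale=1.0 / 5.0, trans=np.zeros(2))
p_persp = normalized_perspective(pose)
# The two projections disagree wherever joints do not share one depth;
# this per-joint depth variation is what drives the inherent error the
# paper lower-bounds under MPJPE.
```

Because the weak perspective model collapses all joint depths to a single scale, no choice of `scale` and `trans` can reproduce the true per-joint perspective division exactly, which is the intuition behind the derived lower bounds.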
Related papers
- TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation [48.08156777874614]
Current methods leverage 3D pseudo-ground-truth (p-GT) and 2D keypoints, leading to robust performance.
With such methods, we observe a paradoxical decline in 3D pose accuracy with increasing 2D accuracy.
We quantify the error induced by current camera models and show that fitting 2D keypoints and p-GT accurately causes incorrect 3D poses.
arXiv Detail & Related papers (2024-04-25T17:09:14Z)
- Personalized 3D Human Pose and Shape Refinement [19.082329060985455]
Regression-based methods have dominated the field of 3D human pose and shape estimation.
We propose to construct dense correspondences between initial human model estimates and the corresponding images.
We show that our approach not only consistently leads to better image-model alignment, but also to improved 3D accuracy.
arXiv Detail & Related papers (2024-03-18T10:13:53Z)
- ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation [54.86887812687023]
Most 3D-HPE methods rely on regression models, which assume a one-to-one mapping between inputs and outputs.
We propose ManiPose, a novel manifold-constrained multi-hypothesis model capable of proposing multiple candidate 3D poses for each 2D input.
Unlike previous multi-hypothesis approaches, our solution is completely supervised and does not rely on complex generative models.
arXiv Detail & Related papers (2023-12-11T13:50:10Z)
- SPGNet: Spatial Projection Guided 3D Human Pose Estimation in Low Dimensional Space [14.81199315166042]
We propose a method for 3D human pose estimation that integrates multi-dimensional re-projection into supervised learning.
On the Human3.6M dataset, our approach outperforms many state-of-the-art methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-06-04T00:51:00Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only.
PONet estimates the 3D orientation of these limbs by taking advantage of the local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z)
- Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows [24.0966076588569]
We propose a normalizing flow based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem.
We evaluate our approach on the two benchmark datasets Human3.6M and MPI-INF-3DHP, outperforming all comparable methods in most metrics.
arXiv Detail & Related papers (2021-07-29T07:33:14Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is lacking training data that have complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
- Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [58.72192168935338]
Generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable.
We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervisions.
Our proposed model employs three consecutive differentiable transformations: forward-kinematics, camera-projection and spatial-map transformation.
arXiv Detail & Related papers (2020-06-24T23:56:33Z)
- Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames.
Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.