POCO: 3D Pose and Shape Estimation with Confidence
- URL: http://arxiv.org/abs/2308.12965v1
- Date: Thu, 24 Aug 2023 17:59:04 GMT
- Title: POCO: 3D Pose and Shape Estimation with Confidence
- Authors: Sai Kumar Dwivedi, Cordelia Schmid, Hongwei Yi, Michael J. Black,
Dimitrios Tzionas
- Abstract summary: We develop POCO, a novel framework for training HPS regressors to estimate not only a 3D human body, but also their confidence.
Specifically, POCO estimates both the 3D body pose and a per-sample variance.
In all cases, training the network to reason about uncertainty helps it learn to more accurately estimate 3D pose.
- Score: 99.91683561240549
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The regression of 3D Human Pose and Shape (HPS) from an image is becoming
increasingly accurate. This makes the results useful for downstream tasks like
human action recognition or 3D graphics. Yet, no regressor is perfect, and
accuracy can be affected by ambiguous image evidence or by poses and appearance
that are unseen during training. Most current HPS regressors, however, do not
report the confidence of their outputs, meaning that downstream tasks cannot
differentiate accurate estimates from inaccurate ones. To address this, we
develop POCO, a novel framework for training HPS regressors to estimate not
only a 3D human body, but also their confidence, in a single feed-forward pass.
Specifically, POCO estimates both the 3D body pose and a per-sample variance.
The key idea is to introduce a Dual Conditioning Strategy (DCS) for regressing
uncertainty that is highly correlated to pose reconstruction quality. The POCO
framework can be applied to any HPS regressor and here we evaluate it by
modifying HMR, PARE, and CLIFF. In all cases, training the network to reason
about uncertainty helps it learn to more accurately estimate 3D pose. While
this was not our goal, the improvement is modest but consistent. Our main
motivation is to provide uncertainty estimates for downstream tasks; we
demonstrate this in two ways: (1) We use the confidence estimates to bootstrap
HPS training. Given unlabelled image data, we take the confident estimates of a
POCO-trained regressor as pseudo ground truth. Retraining with this
automatically-curated data improves accuracy. (2) We exploit uncertainty in
video pose estimation by automatically identifying uncertain frames (e.g. due
to occlusion) and inpainting these from confident frames. Code and models will
be available for research at https://poco.is.tue.mpg.de.
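The abstract describes predicting a per-sample variance next to the pose and then using it to pick confident estimates, either as pseudo ground truth or to repair uncertain video frames. The following is a minimal sketch of those ideas, not the authors' implementation: the abstract does not give the POCO loss or the details of the Dual Conditioning Strategy, so a generic Gaussian negative log-likelihood stands in for the training objective, and plain linear interpolation stands in for inpainting. All function names, the `max_var` threshold, and the array shapes are assumptions made for illustration.

```python
import numpy as np


def gaussian_nll(pred, target, log_var):
    """Per-sample Gaussian NLL, dropping the constant term (assumed loss form).

    pred, target: (N, D) arrays; log_var: (N,) log of one variance per sample.
    """
    d = pred.shape[1]
    sq_err = np.sum((pred - target) ** 2, axis=1)
    # A large predicted variance down-weights the error but pays a log penalty.
    return 0.5 * (sq_err / np.exp(log_var) + d * log_var)


def select_confident(var, max_var):
    """Indices of samples whose predicted variance is at most max_var."""
    return np.flatnonzero(np.asarray(var) <= max_var)


def fill_uncertain_frames(poses, var, max_var):
    """Replace uncertain frames by linear interpolation between confident ones."""
    poses = np.asarray(poses, dtype=float)
    keep = select_confident(var, max_var)
    if keep.size == 0:
        # No confident frame to interpolate from: return the input unchanged.
        return poses.copy()
    t = np.arange(len(poses))
    out = poses.copy()
    for j in range(poses.shape[1]):
        out[:, j] = np.interp(t, keep, poses[keep, j])
    return out


# Hypothetical 3-frame sequence with one pose dimension; frame 1 is uncertain.
poses = np.array([[0.0], [100.0], [2.0]])
var = np.array([0.1, 5.0, 0.1])
print(fill_uncertain_frames(poses, var, max_var=1.0).ravel())
```

The same `select_confident` helper could serve the bootstrapping use case by keeping only low-variance predictions on unlabelled images as pseudo ground truth. Linear interpolation is only reasonable for quantities such as joint positions; rotation parameters would need a rotation-aware scheme.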
Related papers
- TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation [48.08156777874614]
Current methods leverage 3D pseudo-ground-truth (p-GT) and 2D keypoints, leading to robust performance.
With such methods, we observe a paradoxical decline in 3D pose accuracy with increasing 2D accuracy.
We quantify the error induced by current camera models and show that fitting 2D keypoints and p-GT accurately causes incorrect 3D poses.
arXiv Detail & Related papers (2024-04-25T17:09:14Z)
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- On the Calibration of Human Pose Estimation [39.15814732856338]
Calibrated ConfidenceNet (CCNet) is a light-weight post-hoc addition that improves AP by up to 1.4% on off-the-shelf pose estimation frameworks.
Applied to the downstream task of mesh recovery, CCNet facilitates an additional 1.0mm decrease in 3D keypoint error.
arXiv Detail & Related papers (2023-11-28T09:31:09Z)
- "Teaching Independent Parts Separately" (TIPSy-GAN): Improving Accuracy and Stability in Unsupervised Adversarial 2D to 3D Human Pose Estimation [7.294965109944706]
We present TIPSy-GAN, a new approach to improve the accuracy and stability in unsupervised adversarial 2D to 3D human pose estimation.
In our work we demonstrate that the human kinematic skeleton should not be assumed as one spatially codependent structure.
arXiv Detail & Related papers (2022-05-12T09:40:25Z)
- PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision [102.48681650013698]
Existing self-supervised 3D human pose estimation schemes have largely relied on weak supervision to guide the learning.
We propose a novel self-supervised approach that allows us to explicitly generate 2D-3D pose pairs for augmenting supervision.
This is made possible via introducing a reinforcement-learning-based imitator, which is learned jointly with a pose estimator alongside a pose hallucinator.
arXiv Detail & Related papers (2022-03-29T14:45:53Z)
- PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only.
PONet estimates the 3D orientation of these limbs by taking advantage of the local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z)
- Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
arXiv Detail & Related papers (2021-07-08T15:19:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.