Hierarchical Kinematic Probability Distributions for 3D Human Shape and
Pose Estimation from Images in the Wild
- URL: http://arxiv.org/abs/2110.00990v1
- Date: Sun, 3 Oct 2021 11:59:37 GMT
- Title: Hierarchical Kinematic Probability Distributions for 3D Human Shape and
Pose Estimation from Images in the Wild
- Authors: Akash Sengupta, Ignas Budvytis, Roberto Cipolla
- Abstract summary: This paper addresses the problem of 3D human body shape and pose estimation from an RGB image.
We train a deep neural network to estimate a hierarchical matrix-Fisher distribution over relative 3D joint rotation matrices.
We show that our method is competitive with the state-of-the-art in terms of 3D shape and pose metrics on the SSP-3D and 3DPW datasets.
- Score: 25.647676661390282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the problem of 3D human body shape and pose estimation
from an RGB image. This is often an ill-posed problem, since multiple plausible
3D bodies may match the visual evidence present in the input - particularly
when the subject is occluded. Thus, it is desirable to estimate a distribution
over 3D body shape and pose conditioned on the input image instead of a single
3D reconstruction. We train a deep neural network to estimate a hierarchical
matrix-Fisher distribution over relative 3D joint rotation matrices (i.e. body
pose), which exploits the human body's kinematic tree structure, as well as a
Gaussian distribution over SMPL body shape parameters. To further ensure that
the predicted shape and pose distributions match the visual evidence in the
input image, we implement a differentiable rejection sampler to impose a
reprojection loss between ground-truth 2D joint coordinates and samples from
the predicted distributions, projected onto the image plane. We show that our
method is competitive with the state-of-the-art in terms of 3D shape and pose
metrics on the SSP-3D and 3DPW datasets, while also yielding a structured
probability distribution over 3D body shape and pose, with which we can
meaningfully quantify prediction uncertainty and sample multiple plausible 3D
reconstructions to explain a given input image. Code is available at
https://github.com/akashsengupta1997/HierarchicalProbabilistic3DHuman .
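As a concrete illustration of the two ingredients named in the abstract, the sketch below (plain NumPy, not the authors' released PyTorch code) computes the mode of a matrix-Fisher distribution p(R) ∝ exp(tr(Fᵀ R)) over SO(3) for a single joint and draws samples from it with a basic rejection sampler under a uniform proposal. The paper's sampler is more efficient and differentiable, and it conditions each joint's distribution on its kinematic ancestors; all function names here are hypothetical.

```python
# Illustrative sketch only (plain NumPy, not the authors' released PyTorch code):
# the mode of a matrix-Fisher distribution p(R) ∝ exp(tr(Fᵀ R)) over SO(3), plus a
# basic rejection sampler that uses a uniform proposal. The paper's sampler is more
# efficient and differentiable; all function names here are hypothetical.
import numpy as np


def matrix_fisher_mode(F):
    """Mode of p(R) ∝ exp(tr(Fᵀ R)): with F = U S Vᵀ, R* = U diag(1, 1, det(U Vᵀ)) Vᵀ."""
    U, _, Vt = np.linalg.svd(F)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt


def random_rotation(rng):
    """Uniformly distributed rotation matrix, via a normalised 4D Gaussian quaternion."""
    q = rng.standard_normal(4)
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])


def sample_matrix_fisher(F, n_samples, rng):
    """Rejection sampling: accept a uniform rotation R with probability
    exp(tr(Fᵀ R) - (s1 + s2 + s3)), which is valid because tr(Fᵀ R) never exceeds
    the sum of the singular values of F. Slow for highly concentrated F."""
    log_bound = np.linalg.svd(F, compute_uv=False).sum()
    samples = []
    while len(samples) < n_samples:
        R = random_rotation(rng)
        if np.log(rng.uniform()) < np.trace(F.T @ R) - log_bound:
            samples.append(R)
    return np.stack(samples)


# Toy usage: a moderately concentrated distribution around a random mode.
rng = np.random.default_rng(0)
F = 5.0 * random_rotation(rng)          # singular values (5, 5, 5); mode is that rotation
R_mode = matrix_fisher_mode(F)
samples = sample_matrix_fisher(F, 20, rng)
print("tr(R_modeᵀ R) averaged over samples (close to 3 when concentrated):",
      np.mean([np.trace(R_mode.T @ R) for R in samples]))
```

In the full model, one such distribution is predicted for every SMPL joint, with each joint's parameters conditioned on the predictions for its ancestors in the kinematic tree, and samples are pushed through the SMPL function and a camera projection to impose the 2D reprojection loss against ground-truth keypoints.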
Related papers
- Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation [32.30055363306321]
We propose a paradigm for seamlessly unifying different human pose and shape-related tasks and datasets.
Our formulation is centered on the ability, at both training and test time, to query any point of the human volume.
We can naturally exploit differently annotated data sources including mesh, 2D/3D skeleton and dense pose, without having to convert between them.
arXiv Detail & Related papers (2024-07-10T10:44:18Z)
- HuManiFlow: Ancestor-Conditioned Normalising Flows on SO(3) Manifolds for Human Pose and Shape Distribution Estimation [27.14060158187953]
Recent approaches predict a probability distribution over plausible 3D pose and shape parameters conditioned on the image.
We show that these approaches exhibit a trade-off between three key properties: accuracy, sample-input consistency and sample diversity.
Our method, HuManiFlow, predicts simultaneously accurate, consistent and diverse distributions.
arXiv Detail & Related papers (2023-05-11T16:49:19Z)
- Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation [64.874000550443]
A Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed.
The proposed JPMA assembles the multiple hypotheses generated by D3DP into a single 3D pose for practical use (a rough sketch of this joint-wise aggregation idea is given below).
Our method outperforms the state-of-the-art deterministic and probabilistic approaches by 1.5% and 8.9%, respectively.
arXiv Detail & Related papers (2023-03-21T04:00:47Z)
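To make the joint-wise aggregation idea concrete, here is a minimal NumPy sketch of the general idea, not the authors' code: each joint of the fused pose is taken from whichever 3D hypothesis reprojects closest to the corresponding detected 2D keypoint. The intrinsics matrix K and the function names are illustrative assumptions.

```python
# Minimal sketch of joint-wise reprojection-based aggregation (the general idea, not
# the authors' code): every joint of the fused pose is taken from whichever hypothesis
# reprojects closest to the detected 2D keypoint. The intrinsics matrix K and the
# function names are illustrative assumptions.
import numpy as np


def project(points_3d, K):
    """Perspective projection of (J, 3) camera-frame points with intrinsics K."""
    uvw = points_3d @ K.T
    return uvw[:, :2] / uvw[:, 2:3]


def jointwise_aggregate(hypotheses, keypoints_2d, K):
    """hypotheses: (H, J, 3) 3D joint hypotheses; keypoints_2d: (J, 2) detections.
    Returns a single (J, 3) pose assembled joint-by-joint from the hypotheses."""
    n_joints = hypotheses.shape[1]
    errors = np.stack([np.linalg.norm(project(h, K) - keypoints_2d, axis=-1)
                       for h in hypotheses])                 # (H, J) reprojection errors
    best = errors.argmin(axis=0)                             # best hypothesis per joint
    return hypotheses[best, np.arange(n_joints)]


# Toy usage with a hypothetical intrinsics matrix and noisy hypotheses.
rng = np.random.default_rng(0)
K = np.array([[1000.0, 0.0, 512.0],
              [0.0, 1000.0, 512.0],
              [0.0, 0.0, 1.0]])
true_joints = rng.uniform([-1.0, -1.0, 3.0], [1.0, 1.0, 5.0], size=(17, 3))
hypotheses = true_joints[None] + 0.05 * rng.standard_normal((10, 17, 3))
fused = jointwise_aggregate(hypotheses, project(true_joints, K), K)
print("mean 3D joint error of fused pose:", np.linalg.norm(fused - true_joints, axis=-1).mean())
```

Taking the per-joint argmin is only one possible aggregation rule; the reprojection errors could also be converted into soft weights for averaging the hypotheses.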
- DreamFusion: Text-to-3D using 2D Diffusion [52.52529213936283]
Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs, but adapting them to 3D synthesis would require large-scale labelled 3D datasets that do not currently exist.
In this work, we circumvent these limitations by using a pretrained 2D text-to-image diffusion model to perform text-to-3D synthesis.
Our approach requires no 3D training data and no modifications to the image diffusion model, demonstrating the effectiveness of pretrained image diffusion models as priors.
arXiv Detail & Related papers (2022-09-29T17:50:40Z)
- Beyond 3DMM: Learning to Capture High-fidelity 3D Face Shape [77.95154911528365]
3D Morphable Model (3DMM) fitting has widely benefited face analysis due to its strong 3D prior.
Previously reconstructed 3D faces suffer from degraded visual verisimilitude due to the loss of fine-grained geometry.
This paper proposes a complete solution to capture the personalized shape so that the reconstructed shape looks identical to the corresponding person.
arXiv Detail & Related papers (2022-04-09T03:46:18Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net, which comprises a common deep network backbone with two output heads corresponding to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both the pose and joint level (an illustrative sketch of one simple proxy is given below).
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
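The summary does not spell out the uncertainty measures, so the following is only an illustrative proxy under a stated assumption: with two output heads predicting the same set of 3D joints, their disagreement can be read as an uncertainty signal per joint and per pose. This is not necessarily the measure derived in the paper, and the function name is hypothetical.

```python
# Illustrative proxy only: quantify uncertainty as the disagreement between the two
# output heads' 3D joint predictions, per joint and averaged over the pose. Not
# necessarily the paper's derived measure; the function name is hypothetical.
import numpy as np


def head_disagreement_uncertainty(pose_head_a, pose_head_b):
    """Per-joint and pose-level uncertainty proxies from the disagreement between
    two (J, 3) joint predictions produced by the two output heads."""
    per_joint = np.linalg.norm(pose_head_a - pose_head_b, axis=-1)   # (J,)
    return per_joint, per_joint.mean()


# Toy usage: two heads that mostly agree except on one "occluded" joint.
rng = np.random.default_rng(0)
pose_a = rng.standard_normal((17, 3))
pose_b = pose_a + 0.01 * rng.standard_normal((17, 3))
pose_b[10] += 0.3                                   # strong disagreement on joint 10
joint_unc, pose_unc = head_disagreement_uncertainty(pose_a, pose_b)
print("most uncertain joint:", joint_unc.argmax(), "| pose-level uncertainty:", pose_unc)
```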
- Probabilistic Estimation of 3D Human Shape and Pose with a Semantic Local Parametric Model [25.647676661390282]
This paper addresses the problem of 3D human body shape and pose estimation from RGB images.
We present a method that predicts distributions over local body shape in the form of semantic body measurements.
We show that our method outperforms the current state-of-the-art in terms of identity-dependent body shape estimation accuracy.
arXiv Detail & Related papers (2021-11-30T13:50:45Z)
- Probabilistic 3D Human Shape and Pose Estimation from Multiple Unconstrained Images in the Wild [25.647676661390282]
We propose a new task: shape and pose estimation from a group of multiple images of the same human subject.
Our solution predicts distributions over SMPL body shape and pose parameters conditioned on the input images in the group.
We show that the additional body shape information present in multi-image input groups improves 3D human shape estimation metrics (a simple illustrative sketch of fusing per-image shape estimates is given below).
arXiv Detail & Related papers (2021-03-19T18:32:16Z)
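As an illustration of why multiple images help shape estimation, the sketch below fuses per-image Gaussian estimates of the SMPL shape vector with a precision-weighted product of Gaussians. This is a generic fusion rule shown for intuition only; the paper's actual combination scheme may differ, and the diagonal-covariance assumption and function name are mine.

```python
# Illustration of why multiple images help: per-image Gaussian estimates of the SMPL
# shape vector (betas) are fused with a precision-weighted product of Gaussians. This
# is a generic fusion rule shown for intuition; the paper's actual combination scheme
# may differ, and the diagonal-covariance assumption and function name are mine.
import numpy as np


def fuse_shape_gaussians(means, variances):
    """means, variances: (N_images, 10) per-image Gaussian estimates of SMPL betas.
    Returns the fused mean and variance of the product Gaussian (each of shape (10,))."""
    precisions = 1.0 / variances
    fused_var = 1.0 / precisions.sum(axis=0)
    fused_mean = fused_var * (precisions * means).sum(axis=0)
    return fused_mean, fused_var


# Toy usage: five noisy per-image estimates of one underlying body shape.
rng = np.random.default_rng(0)
true_betas = rng.standard_normal(10)
variances = rng.uniform(0.05, 0.2, size=(5, 10))
means = true_betas + np.sqrt(variances) * rng.standard_normal((5, 10))
fused_mean, fused_var = fuse_shape_gaussians(means, variances)
print("single-image mean absolute error:", np.abs(means[0] - true_betas).mean())
print("fused mean absolute error:       ", np.abs(fused_mean - true_betas).mean())
```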
- 3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data [77.57798334776353]
We consider the problem of obtaining dense 3D reconstructions of humans from single and partially occluded views.
We suggest that ambiguities can be modelled more effectively by parametrizing the possible body shapes and poses.
We show that our method outperforms alternative approaches in ambiguous pose recovery on standard benchmarks for 3D humans.
arXiv Detail & Related papers (2020-11-02T13:55:31Z)
- Weakly Supervised Generative Network for Multiple 3D Human Pose Hypotheses [74.48263583706712]
3D human pose estimation from a single image is an inverse problem due to the inherent ambiguity of the missing depth.
We propose a weakly supervised deep generative network to address the inverse problem.
arXiv Detail & Related papers (2020-08-13T09:26:01Z)