HEMlets PoSh: Learning Part-Centric Heatmap Triplets for 3D Human Pose
and Shape Estimation
- URL: http://arxiv.org/abs/2003.04894v3
- Date: Tue, 12 Jan 2021 07:01:23 GMT
- Authors: Kun Zhou, Xiaoguang Han, Nianjuan Jiang, Kui Jia, Jiangbo Lu
- Abstract summary: This work attempts to address the uncertainty of lifting the detected 2D joints to the 3D space by introducing an intermediate state, Part-Centric Heatmap Triplets (HEMlets).
The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part.
A Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image, followed by a volumetric joint-heatmap regression.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating 3D human pose from a single image is a challenging task. This work
attempts to address the uncertainty of lifting the detected 2D joints to the 3D
space by introducing an intermediate state, Part-Centric Heatmap Triplets
(HEMlets), which shortens the gap between the 2D observation and the 3D
interpretation. The HEMlets utilize three joint-heatmaps to represent the
relative depth information of the end-joints for each skeletal body part. In
our approach, a Convolutional Network (ConvNet) is first trained to predict
HEMlets from the input image, followed by a volumetric joint-heatmap
regression. We leverage the integral operation to extract the joint
locations from the volumetric heatmaps, guaranteeing end-to-end learning.
Despite the simplicity of the network design, the quantitative comparisons show
a significant performance improvement over the best-performing prior methods (e.g.
$20\%$ on Human3.6M). The proposed method naturally supports training with
"in-the-wild" images, where only weakly-annotated relative depth information of
skeletal joints is available. This further improves the generalization ability
of our model, as validated by qualitative comparisons on outdoor images.
Leveraging the strength of the HEMlets pose estimation, we further design and
append a shallow yet effective network module to regress the SMPL parameters of
the body pose and shape. We term the entire HEMlets-based human pose and shape
recovery pipeline HEMlets PoSh. Extensive quantitative and qualitative
experiments on the existing human body recovery benchmarks justify the
state-of-the-art results obtained with our HEMlets PoSh approach.
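The two ideas the abstract describes, a per-part relative-depth triplet and the integral (soft-argmax) extraction of joint locations from volumetric heatmaps, can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the function names, the depth threshold `eps`, and the heatmap sizes are all hypothetical.

```python
import numpy as np

def hemlet_polarity(z_parent, z_child, eps=0.05):
    """Classify the relative depth of a body part's two end-joints into one
    of three states; each state selects one heatmap of the triplet.
    The threshold eps is an assumed hyperparameter."""
    dz = z_child - z_parent
    if dz > eps:
        return +1   # child joint is farther from the camera
    if dz < -eps:
        return -1   # child joint is closer to the camera
    return 0        # approximately equal depth

def soft_argmax_3d(volume):
    """Integral operation: the expected joint location under a
    softmax-normalized volumetric heatmap of shape (D, H, W).
    Differentiable, so it supports end-to-end training.
    Returns (x, y, z) in voxel coordinates."""
    d, h, w = volume.shape
    p = np.exp(volume - volume.max())
    p /= p.sum()
    zs, ys, xs = np.meshgrid(np.arange(d), np.arange(h), np.arange(w),
                             indexing="ij")
    return (p * xs).sum(), (p * ys).sum(), (p * zs).sum()

# A heatmap sharply peaked at voxel (z=3, y=5, x=7) recovers that location.
vol = np.full((8, 8, 8), -10.0)
vol[3, 5, 7] = 10.0
x, y, z = soft_argmax_3d(vol)
print(round(x), round(y), round(z))  # 7 5 3
```

Because the expectation is taken over the normalized heatmap rather than via a hard argmax, gradients flow through the coordinate extraction, which is what allows the whole pipeline to be trained end to end.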
Related papers
- ARTS: Semi-Analytical Regressor using Disentangled Skeletal Representations for Human Mesh Recovery from Videos [18.685856290041283]
ARTS surpasses existing state-of-the-art video-based methods in both per-frame accuracy and temporal consistency on popular benchmarks.
A skeleton estimation and disentanglement module is proposed to estimate the 3D skeletons from a video.
The regressor consists of three modules: Temporal Inverse Kinematics (TIK), Bone-guided Shape Fitting (BSF), and Motion-Centric Refinement (MCR).
arXiv Detail & Related papers (2024-10-21T02:06:43Z) - Learnable human mesh triangulation for 3D human pose and shape
estimation [6.699132260402631]
The accuracy of joint rotation and shape estimation has received relatively little attention in skinned multi-person linear model (SMPL)-based human mesh reconstruction from multi-view images.
We propose a two-stage method to resolve the ambiguity of joint rotation and shape reconstruction and the difficulty of network learning.
The proposed method significantly outperforms the previous works in terms of joint rotation and shape estimation, and achieves competitive performance in terms of joint location estimation.
arXiv Detail & Related papers (2022-08-24T01:11:57Z) - KAMA: 3D Keypoint Aware Body Mesh Articulation [79.04090630502782]
We propose an analytical solution to articulate a parametric body model, SMPL, via a set of straightforward geometric transformations.
Our approach offers significantly better alignment to image content when compared to state-of-the-art approaches.
Results on the challenging 3DPW and Human3.6M demonstrate that our approach yields state-of-the-art body mesh fittings.
arXiv Detail & Related papers (2021-04-27T23:01:03Z) - Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D meshes of multiple body parts with large scale differences from a single RGB image.
The main challenge is lacking training data that have complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z) - Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present deep neural network methodology to reconstruct the 3d pose and shape of people, given an input RGB image.
We rely on GHUM, a recently introduced, expressive full-body statistical 3D human model, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z) - Monocular Human Pose and Shape Reconstruction using Part Differentiable
Rendering [53.16864661460889]
Recent regression-based methods succeed in estimating parametric models directly through a deep neural network supervised by 3D ground truth.
In this paper, we introduce body segmentation as critical supervision.
To improve the reconstruction with part segmentation, we propose a part-level differentiable renderer that enables part-based models to be supervised by part segmentation.
arXiv Detail & Related papers (2020-03-24T14:25:46Z) - Learning 3D Human Shape and Pose from Dense Body Parts [117.46290013548533]
We propose a Decompose-and-aggregate Network (DaNet) to learn 3D human shape and pose from dense correspondences of body parts.
Messages from local streams are aggregated to enhance the robust prediction of the rotation-based poses.
Our method is validated on both indoor and real-world datasets including Human3.6M, UP3D, COCO, and 3DPW.
arXiv Detail & Related papers (2019-12-31T15:09:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.