HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D
Human Pose and Shape Estimation
- URL: http://arxiv.org/abs/2011.14672v3
- Date: Mon, 5 Apr 2021 13:57:49 GMT
- Title: HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D
Human Pose and Shape Estimation
- Authors: Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, Cewu Lu
- Abstract summary: We propose a novel hybrid inverse kinematics solution (HybrIK) to bridge the gap between body mesh estimation and 3D keypoint estimation.
HybrIK directly transforms accurate 3D joints to relative body-part rotations for 3D body mesh reconstruction.
We show that HybrIK preserves both the accuracy of 3D pose and the realistic body structure of the parametric human model.
- Score: 39.67289969828706
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model-based 3D pose and shape estimation methods reconstruct a full 3D mesh
for the human body by estimating several parameters. However, learning the
abstract parameters is a highly non-linear process and suffers from image-model
misalignment, leading to mediocre model performance. In contrast, 3D keypoint
estimation methods combine deep CNN networks with a volumetric representation
to achieve pixel-level localization accuracy but may predict unrealistic body
structure. In this paper, we address the above issues by bridging the gap
between body mesh estimation and 3D keypoint estimation. We propose a novel
hybrid inverse kinematics solution (HybrIK). HybrIK directly transforms
accurate 3D joints to relative body-part rotations for 3D body mesh
reconstruction, via the twist-and-swing decomposition. The swing rotation is
analytically solved from the 3D joints, and the twist rotation is derived from
visual cues through a neural network. We show that HybrIK preserves both the
accuracy of 3D pose and the realistic body structure of the parametric human
model, leading to a pixel-aligned 3D body mesh and a more accurate 3D pose than
the pure 3D keypoint estimation methods. Without bells and whistles, the
proposed method surpasses the state-of-the-art methods by a large margin on
various 3D human pose and shape benchmarks. As an illustrative example, HybrIK
outperforms all previous methods by 13.2 mm MPJPE and 21.9 mm PVE on the 3DPW
dataset. Our code is available at https://github.com/Jeff-sjtu/HybrIK.
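The twist-and-swing step described above is the heart of the method: each relative bone rotation is factored into a swing that aligns the rest-pose bone direction with the direction given by the predicted 3D joints (solved in closed form) and a twist about the bone axis whose angle comes from the network. The following is a minimal NumPy sketch of that composition, assuming the two bone directions and the predicted twist angle are already available; the function names are illustrative and do not reflect the actual HybrIK code.

```python
import numpy as np

def rodrigues(axis, angle):
    """Axis-angle to rotation matrix via the Rodrigues formula."""
    axis = axis / (np.linalg.norm(axis) + 1e-8)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def twist_swing_rotation(t, p, twist_angle):
    """Compose a relative bone rotation as R = R_swing @ R_twist.

    t           : rest-pose (template) bone direction, parent-relative
    p           : target bone direction computed from the predicted 3D joints
    twist_angle : rotation about the bone axis; in HybrIK this scalar is
                  predicted by a neural network, here it is just an input
    """
    t = t / (np.linalg.norm(t) + 1e-8)
    p = p / (np.linalg.norm(p) + 1e-8)

    # Swing: the closed-form rotation that aligns t with p.
    swing_axis = np.cross(t, p)
    swing_angle = np.arccos(np.clip(np.dot(t, p), -1.0, 1.0))
    R_swing = rodrigues(swing_axis, swing_angle)

    # Twist: rotation about the template bone direction itself.
    R_twist = rodrigues(t, twist_angle)

    return R_swing @ R_twist

# Example: rest-pose bone along +y, predicted bone tilted towards +x.
R = twist_swing_rotation(np.array([0.0, 1.0, 0.0]),
                         np.array([0.5, 1.0, 0.0]),
                         twist_angle=0.1)
```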
Related papers
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video [23.93644678238666]
We propose a Pose and Mesh Co-Evolution network (PMCE) to recover 3D human motion from a video.
The proposed PMCE outperforms previous state-of-the-art methods in terms of both per-frame accuracy and temporal consistency.
arXiv Detail & Related papers (2023-08-20T16:03:21Z)
- 3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation [28.24765523800196]
We propose 3D-aware Neural Body Fitting (3DNBF) for 3D human pose estimation.
In particular, we propose a generative model of deep features based on a volumetric human representation with Gaussian ellipsoidal kernels emitting 3D pose-dependent feature vectors.
The neural features are trained with contrastive learning to become 3D-aware and hence to overcome the 2D-3D ambiguity.
arXiv Detail & Related papers (2023-08-19T22:41:00Z)
- ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections [71.46546520120162]
Estimating 3D articulated shapes like animal bodies from monocular images is inherently challenging.
We propose ARTIC3D, a self-supervised framework to reconstruct per-instance 3D shapes from a sparse image collection in-the-wild.
We produce realistic animations by fine-tuning the rendered shape and texture under rigid part transformations.
arXiv Detail & Related papers (2023-06-07T17:47:50Z)
- HybrIK-X: Hybrid Analytical-Neural Inverse Kinematics for Whole-body Mesh Recovery [40.88628101598707]
This paper presents a novel hybrid inverse kinematics solution, HybrIK, that integrates 3D keypoint estimation and body mesh recovery.
HybrIK directly transforms accurate 3D joints to body-part rotations via twist-and-swing decomposition.
We further develop a holistic framework, HybrIK-X, which enhances HybrIK with articulated hands and an expressive face.
arXiv Detail & Related papers (2023-04-12T08:29:31Z)
- KAMA: 3D Keypoint Aware Body Mesh Articulation [79.04090630502782]
We propose an analytical solution to articulate a parametric body model, SMPL, via a set of straightforward geometric transformations.
Our approach offers significantly better alignment to image content when compared to state-of-the-art approaches.
Results on the challenging 3DPW and Human3.6M benchmarks demonstrate that our approach yields state-of-the-art body mesh fittings.
arXiv Detail & Related papers (2021-04-27T23:01:03Z)
- An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering [0.0]
Differentiable rendering is a very successful technique for single-view 3D reconstruction.
Current methods use pixel-based losses between a rendered image of the reconstructed 3D object and ground-truth images from given matched viewpoints to optimise the parameters of the 3D shape.
We propose a novel effective loss function that evaluates how well the projections of reconstructed 3D point clouds cover the ground truth object's silhouette.
arXiv Detail & Related papers (2021-03-05T00:02:18Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D mesh of multiple body parts with large scale differences from a single RGB image.
The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.