3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose
Estimation
- URL: http://arxiv.org/abs/2308.10123v1
- Date: Sat, 19 Aug 2023 22:41:00 GMT
- Title: 3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose
Estimation
- Authors: Yi Zhang, Pengliang Ji, Angtian Wang, Jieru Mei, Adam Kortylewski,
Alan Yuille
- Abstract summary: We propose 3D-aware Neural Body Fitting (3DNBF) for 3D human pose estimation.
In particular, we propose a generative model of deep features based on a volumetric human representation with Gaussian ellipsoidal kernels emitting 3D pose-dependent feature vectors.
The neural features are trained with contrastive learning to become 3D-aware and hence to overcome the 2D-3D ambiguity.
- Score: 28.24765523800196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regression-based methods for 3D human pose estimation directly predict the 3D
pose parameters from a 2D image using deep networks. While achieving
state-of-the-art performance on standard benchmarks, their performance degrades
under occlusion. In contrast, optimization-based methods fit a parametric body
model to 2D features in an iterative manner. The localized reconstruction loss
can potentially make them robust to occlusion, but they suffer from the 2D-3D
ambiguity.
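The 2D-3D ambiguity can be seen directly from perspective projection. Below is a minimal sketch (toy coordinates and focal length, purely illustrative) showing that two distinct 3D poses can have identical 2D projections, so 2D features alone cannot pin down depth:

```python
import numpy as np

def project(points_3d, f=1.0):
    """Pinhole projection: (x, y, z) -> (f*x/z, f*y/z)."""
    return f * points_3d[:, :2] / points_3d[:, 2:3]

# Toy 3D joints in camera coordinates (meters); values are illustrative.
pose_near = np.array([[0.2, 0.1, 2.0],
                      [-0.3, 0.4, 2.5]])
# Scaling every joint along its viewing ray (twice as far, twice as
# large) leaves the 2D projection unchanged, so depth is unobservable
# from the 2D keypoints alone.
pose_far = 2.0 * pose_near

print(np.allclose(project(pose_near), project(pose_far)))  # -> True
```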
Motivated by the recent success of generative models in rigid object pose
estimation, we propose 3D-aware Neural Body Fitting (3DNBF) - an approximate
analysis-by-synthesis approach to 3D human pose estimation with SOTA
performance and occlusion robustness. In particular, we propose a generative
model of deep features based on a volumetric human representation with Gaussian
ellipsoidal kernels emitting 3D pose-dependent feature vectors. The neural
features are trained with contrastive learning to become 3D-aware and hence to
overcome the 2D-3D ambiguity.
Experiments show that 3DNBF outperforms other approaches on both occluded and
standard benchmarks. Code is available at https://github.com/edz-o/3DNBF
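A rough NumPy sketch of the analysis-by-synthesis idea described in the abstract. The kernel count, feature dimension, isotropic sigma, and per-pixel blending rule below are our illustrative assumptions, not the paper's exact formulation: each Gaussian kernel center is projected to the image, kernel features are blended per pixel by Gaussian weight, and a pose hypothesis is scored by agreement between rendered and observed features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: K Gaussian kernels attached to a posed body, each emitting
# a pose-dependent feature vector (all sizes are illustrative).
K, D, H, W = 8, 16, 32, 32
centers_3d = rng.normal(size=(K, 3)) + np.array([0.0, 0.0, 5.0])  # in front of camera
features = rng.normal(size=(K, D))
features /= np.linalg.norm(features, axis=1, keepdims=True)
sigma = 2.0  # isotropic stand-in for the ellipsoidal covariance

def project(points, f=30.0, cx=W / 2, cy=H / 2):
    """Pinhole projection of 3D points to pixel coordinates."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return np.stack([f * x / z + cx, f * y / z + cy], axis=1)

# Render an expected feature map: each pixel blends kernel features
# weighted by the projected Gaussian density (occlusion handling omitted).
uv = project(centers_3d)                            # (K, 2)
ys, xs = np.mgrid[0:H, 0:W]
pix = np.stack([xs, ys], axis=-1).reshape(-1, 2)    # (H*W, 2)
d2 = ((pix[:, None, :] - uv[None, :, :]) ** 2).sum(-1)  # (H*W, K)
w = np.exp(-d2 / (2 * sigma**2))
w /= w.sum(axis=1, keepdims=True) + 1e-8
rendered = (w @ features).reshape(H, W, D)

# Analysis-by-synthesis: score a pose hypothesis by how well its rendered
# features match features extracted from the image (here: a noisy copy).
observed = rendered + 0.1 * rng.normal(size=rendered.shape)
score = float(np.sum(rendered * observed)) / (H * W)
```

In the actual method the per-pixel comparison is a learned likelihood over contrastively trained features, and the pose is optimized to maximize this score; the dot-product agreement above is only a stand-in.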
Related papers
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery [84.67823511418334]
This paper presents 3D JOint contrastive learning with TRansformers framework for handling occluded 3D human mesh recovery.
Our method includes an encoder-decoder transformer architecture to fuse 2D and 3D representations for achieving 2D-3D aligned results.
arXiv Detail & Related papers (2023-07-31T02:58:58Z)
- Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation [29.037799937729687]
Learning-based methods have dominated the 3D human pose estimation (HPE) tasks with significantly better performance in most benchmarks than traditional optimization-based methods.
We propose a Zero-shot Diffusion-based Optimization (ZeDO) pipeline for 3D HPE.
Our multi-hypothesis ZeDO achieves state-of-the-art (SOTA) performance on Human3.6M, with a minMPJPE of 51.4 mm.
arXiv Detail & Related papers (2023-07-07T21:03:18Z)
- DiffuPose: Monocular 3D Human Pose Estimation via Denoising Diffusion Probabilistic Model [25.223801390996435]
This paper focuses on reconstructing a 3D pose from a single 2D keypoint detection.
We build a novel diffusion-based framework to effectively sample diverse 3D poses from an off-the-shelf 2D detector.
We evaluate our method on the widely adopted Human3.6M and HumanEva-I datasets.
arXiv Detail & Related papers (2022-12-06T07:22:20Z)
- PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only.
PONet estimates the 3D orientation of the body limbs by taking advantage of local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z)
- Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows [24.0966076588569]
We propose a normalizing flow based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem.
We evaluate our approach on the two benchmark datasets Human3.6M and MPI-INF-3DHP, outperforming all comparable methods in most metrics.
arXiv Detail & Related papers (2021-07-29T07:33:14Z)
- HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation [39.67289969828706]
We propose a novel hybrid inverse kinematics solution (HybrIK) to bridge the gap between body mesh estimation and 3D keypoint estimation.
HybrIK directly transforms accurate 3D joints to relative body-part rotations for 3D body mesh reconstruction.
We show that HybrIK preserves both the accuracy of 3D pose and the realistic body structure of the parametric human model.
arXiv Detail & Related papers (2020-11-30T10:32:30Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is lacking training data that have complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
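As we read the depth-to-scale idea, each joint gets its own projection scale derived from its depth difference to the root, rather than one global scale for the whole body. A minimal sketch (the function name, signature, and toy values are ours, not the paper's API):

```python
import numpy as np

def d2s_project(joints_3d, f=1000.0, root=0):
    """Depth-to-scale (D2S) style projection, illustrative reading:
    instead of a single global scale, each joint j gets its own scale
    f / z_j, where z_j = z_root + (per-joint depth difference)."""
    z_root = joints_3d[root, 2]
    depth_diff = joints_3d[:, 2] - z_root        # per-joint depth offset
    scales = f / (z_root + depth_diff)           # per-joint scale variants
    uv = scales[:, None] * joints_3d[:, :2]      # joint-wise 2D projection
    return uv, scales

# Toy joints in camera coordinates (meters); values are illustrative.
joints = np.array([[0.0, 0.0, 5.0],    # root
                   [0.1, 0.4, 4.8],    # nearer joint -> larger scale
                   [-0.1, 0.6, 5.3]])  # farther joint -> smaller scale
uv, scales = d2s_project(joints)
```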
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people from an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.