Related papers: ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation

ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation

URL: http://arxiv.org/abs/2312.06386v2
Date: Wed, 27 Nov 2024 22:24:02 GMT
Title: ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation
Authors: Cédric Rommel, Victor Letzelter, Nermin Samet, Renaud Marlet, Matthieu Cord, Patrick Pérez, Eduardo Valle,
Abstract summary: ManiPose is a manifold-constrained multi-hypothesis model for human-pose 2D-to-3D lifting.<n>By constraining the outputs to lie on the human pose manifold, ManiPose guarantees the consistency of all hypothetical poses.<n>We showcase the performance of ManiPose on real-world datasets, where it outperforms state-of-the-art models in pose consistency.
Score: 71.2556016049579
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose ManiPose, a manifold-constrained multi-hypothesis model for human-pose 2D-to-3D lifting. We provide theoretical and empirical evidence that, due to the depth ambiguity inherent to monocular 3D human pose estimation, traditional regression models suffer from pose-topology consistency issues, which standard evaluation metrics (MPJPE, P-MPJPE and PCK) fail to assess. ManiPose addresses depth ambiguity by proposing multiple candidate 3D poses for each 2D input, each with its estimated plausibility. Unlike previous multi-hypothesis approaches, ManiPose forgoes generative models, greatly facilitating its training and usage. By constraining the outputs to lie on the human pose manifold, ManiPose guarantees the consistency of all hypothetical poses, in contrast to previous works. We showcase the performance of ManiPose on real-world datasets, where it outperforms state-of-the-art models in pose consistency by a large margin while being very competitive on the MPJPE metric.

Related papers

DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior [82.9526308672547]
We present DPoser-X, a diffusion-based prior model for 3D whole-body human poses.<n>Our approach unifies various pose-centric tasks as inverse problems, solving them through variational diffusion sampling.<n>Our model consistently outperforms state-of-the-art alternatives, establishing a new benchmark for whole-body human pose prior modeling.
arXiv Detail & Related papers (2025-08-01T12:56:39Z)
MPM: A Unified 2D-3D Human Pose Representation via Masked Pose Modeling [59.74064212110042]
mpmcan handle multiple tasks including 3D human pose estimation, 3D pose estimation from cluded 2D pose, and 3D pose completion in a textocbfsingle framework. We conduct extensive experiments and ablation studies on several widely used human pose datasets and achieve state-of-the-art performance on MPI-INF-3DHP.
arXiv Detail & Related papers (2023-06-29T10:30:00Z)
DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion models [5.908471365011943]
We propose emphDiffPose, a conditional diffusion model that predicts multiple hypotheses for a given input image. We show that DiffPose slightly improves upon the state of the art for multi-hypothesis pose estimation for simple poses and outperforms it by a large margin for highly ambiguous poses.
arXiv Detail & Related papers (2022-11-29T18:55:13Z)
PoseGU: 3D Human Pose Estimation with Novel Human Pose Generator and Unbiased Learning [36.609189237732394]
3D pose estimation has recently gained substantial interests in computer vision domain. Existing 3D pose estimation methods have a strong reliance on large size well-annotated 3D pose datasets. We propose PoseGU, a novel human pose generator that generates diverse poses with access only to a small size of seed samples.
arXiv Detail & Related papers (2022-07-07T23:43:53Z)
Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations. We derive suitable measures to quantify prediction uncertainty at both pose and joint level. We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows [24.0966076588569]
We propose a normalizing flow based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem. We evaluate our approach on the two benchmark datasets Human3.6M and MPI-INF-3DHP, outperforming all comparable methods in most metrics.
arXiv Detail & Related papers (2021-07-29T07:33:14Z)
3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data [77.57798334776353]
We consider the problem of obtaining dense 3D reconstructions of humans from single and partially occluded views. We suggest that ambiguities can be modelled more effectively by parametrizing the possible body shapes and poses. We show that our method outperforms alternative approaches in ambiguous pose recovery on standard benchmarks for 3D humans.
arXiv Detail & Related papers (2020-11-02T13:55:31Z)
Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate 3D mesh of multiple body parts with large-scale differences from a single RGB image. The main challenge is lacking training data that have complete 3D annotations of all body parts in 2D images. We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry [62.29762409558553]
Epipolar constraints are at the core of feature matching and depth estimation in multi-person 3D human pose estimation methods. Despite the satisfactory performance of this formulation in sparser crowd scenes, its effectiveness is frequently challenged under denser crowd circumstances. In this paper, we depart from the multi-person 3D pose estimation formulation, and instead reformulate it as crowd pose estimation.
arXiv Detail & Related papers (2020-07-21T17:59:36Z)
Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [58.72192168935338]
Generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable. We propose a novel kinematic-structure-preserved unsupervised 3D pose estimation framework, which is not restrained by any paired or unpaired weak supervisions. Our proposed model employs three consecutive differentiable transformations named as forward-kinematics, camera-projection and spatial-map transformation.
arXiv Detail & Related papers (2020-06-24T23:56:33Z)
Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis [72.34794624243281]
We propose a self-supervised learning framework to disentangle variations from unlabeled video frames. Our differentiable formalization, bridging the representation gap between the 3D pose and spatial part maps, allows us to operate on videos with diverse camera movements.
arXiv Detail & Related papers (2020-04-09T07:55:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.