D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose
Refinement
- URL: http://arxiv.org/abs/2401.03914v1
- Date: Mon, 8 Jan 2024 14:21:02 GMT
- Title: D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose
Refinement
- Authors: Danqi Yan, Qing Gao, Yuepeng Qian, Xinxing Chen, Chenglong Fu, and
Yuquan Leng
- Abstract summary: A Diffusion-based 3D Pose Refiner is proposed to refine the output of any existing 3D pose estimator.
We leverage the architecture of current diffusion models to convert the distribution of noisy 3D poses into that of ground-truth 3D poses.
Experimental results demonstrate the proposed architecture can significantly improve the performance of current sequence-to-sequence 3D pose estimators.
- Score: 3.514184876338779
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Three-dimensional (3D) human pose estimation using a monocular camera has
gained increasing attention due to its ease of implementation and the abundance
of data available from daily life. However, owing to the inherent depth
ambiguity in images, the accuracy of existing monocular camera-based 3D pose
estimation methods remains unsatisfactory, and the estimated 3D poses usually
include much noise. By observing the histogram of this noise, we find each
dimension of the noise follows a certain distribution, which indicates the
possibility for a neural network to learn the mapping between noisy poses and
ground truth poses. In this work, in order to obtain more accurate 3D poses, a
Diffusion-based 3D Pose Refiner (D3PRefiner) is proposed to refine the output
of any existing 3D pose estimator. We first introduce a conditional
multivariate Gaussian distribution to model the distribution of noisy 3D poses,
using paired 2D poses and noisy 3D poses as conditions to achieve greater
accuracy. Additionally, we leverage the architecture of current diffusion
models to convert the distribution of noisy 3D poses into ground truth 3D
poses. To evaluate the effectiveness of the proposed method, two
state-of-the-art sequence-to-sequence 3D pose estimators are used as basic 3D
pose estimation models, and the proposed method is evaluated on different types
of 2D poses and different lengths of the input sequence. Experimental results
demonstrate the proposed architecture can significantly improve the performance
of current sequence-to-sequence 3D pose estimators, with a reduction of at
least 10.3% in the mean per joint position error (MPJPE) and at least 11.0% in
the Procrustes MPJPE (P-MPJPE).
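As a concrete illustration of the refinement idea described in the abstract, the following is a minimal sketch (not the authors' implementation) of conditional diffusion-based pose refinement: a denoiser conditioned on the paired 2D pose sequence predicts the noise in the base estimator's 3D output, and the reverse diffusion chain is run starting from that noisy output. The network architecture, tensor shapes, noise schedule, and the `mpjpe` helper are illustrative assumptions; D3PRefiner's conditional multivariate Gaussian noise model is omitted here.

```python
# Minimal, illustrative sketch (assumed names and shapes, not the paper's code):
# a conditional denoiser refines a noisy 3D pose sequence given the paired 2D poses.
import torch
import torch.nn as nn


class PoseDenoiser(nn.Module):
    """Predicts the noise in a 3D pose sequence, conditioned on 2D poses and the step t."""

    def __init__(self, num_joints: int = 17, hidden: int = 256):
        super().__init__()
        in_dim = num_joints * 3 + num_joints * 2 + 1  # noisy 3D pose + 2D condition + timestep
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, num_joints * 3),
        )

    def forward(self, noisy_3d, cond_2d, t):
        # noisy_3d: (B, F, J*3), cond_2d: (B, F, J*2), t: (B,)
        t_emb = t.float().view(-1, 1, 1).expand(-1, noisy_3d.size(1), 1)
        return self.net(torch.cat([noisy_3d, cond_2d, t_emb], dim=-1))


def ddpm_refine(noisy_pose_3d, pose_2d, model, betas):
    """Standard DDPM reverse chain, started from the base estimator's noisy 3D output."""
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = noisy_pose_3d
    for t in reversed(range(len(betas))):
        t_batch = torch.full((x.size(0),), t, dtype=torch.long)
        eps = model(x, pose_2d, t_batch)                      # predicted noise
        coef = betas[t] / torch.sqrt(1.0 - alpha_bar[t])
        x = (x - coef * eps) / torch.sqrt(alphas[t])          # posterior mean
        if t > 0:                                             # add sampling noise except at t = 0
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x


def mpjpe(pred, gt):
    """Mean per joint position error over (..., J, 3) tensors, the metric quoted in the abstract."""
    return (pred - gt).norm(dim=-1).mean()


# Purely illustrative usage with random tensors (B=2 sequences of F=81 frames, J=17 joints):
# model = PoseDenoiser()
# betas = torch.linspace(1e-4, 2e-2, 50)
# refined = ddpm_refine(torch.randn(2, 81, 51), torch.randn(2, 81, 34), model, betas)
```

For the second reported metric, P-MPJPE, a rigid Procrustes alignment of the prediction to the ground truth would be applied before computing the same per-joint error.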
Related papers
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- 3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation [28.24765523800196]
We propose 3D-aware Neural Body Fitting (3DNBF) for 3D human pose estimation.
In particular, we propose a generative model of deep features based on a volumetric human representation with Gaussian ellipsoidal kernels emitting 3D pose-dependent feature vectors.
The neural features are trained with contrastive learning to become 3D-aware and hence to overcome the 2D-3D ambiguity.
arXiv Detail & Related papers (2023-08-19T22:41:00Z)
- Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation [64.874000550443]
A Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed.
The proposed JPMA assembles multiple hypotheses generated by D3DP into a single 3D pose for practical use.
Our method outperforms the state-of-the-art deterministic and probabilistic approaches by 1.5% and 8.9%, respectively.
arXiv Detail & Related papers (2023-03-21T04:00:47Z)
- DiffPose: Toward More Reliable 3D Pose Estimation [11.6015323757147]
We propose a novel pose estimation framework (DiffPose) that formulates 3D pose estimation as a reverse diffusion process.
Our proposed DiffPose significantly outperforms existing methods on the widely used pose estimation benchmarks Human3.6M and MPI-INF-3DHP.
arXiv Detail & Related papers (2022-11-30T12:22:22Z)
- A generic diffusion-based approach for 3D human pose prediction in the wild [68.00961210467479]
3D human pose forecasting, i.e., predicting a sequence of future human 3D poses given a sequence of past observed ones, is a challenging spatio-temporal task.
We provide a unified formulation in which incomplete elements (whether in the observation or the prediction) are treated as noise, and propose a conditional diffusion model that denoises them and forecasts plausible poses.
We investigate our findings on four standard datasets and obtain significant improvements over the state-of-the-art.
arXiv Detail & Related papers (2022-10-11T17:59:54Z)
- PONet: Robust 3D Human Pose Estimation via Learning Orientations Only [116.1502793612437]
We propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only.
PONet estimates the 3D orientation of the limbs by taking advantage of local image evidence to recover the 3D pose.
We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW.
arXiv Detail & Related papers (2021-12-21T12:48:48Z)
- Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows [24.0966076588569]
We propose a normalizing flow based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem.
We evaluate our approach on the two benchmark datasets Human3.6M and MPI-INF-3DHP, outperforming all comparable methods in most metrics.
arXiv Detail & Related papers (2021-07-29T07:33:14Z)
- Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
arXiv Detail & Related papers (2021-07-08T15:19:36Z)
- Residual Pose: A Decoupled Approach for Depth-based 3D Human Pose Estimation [18.103595280706593]
We leverage recent advances in reliable 2D pose estimation with CNNs to estimate the 3D pose of people from depth images.
Our approach achieves very competitive results both in accuracy and speed on two public datasets.
arXiv Detail & Related papers (2020-11-10T10:08:13Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is lacking training data that have complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)