DiffPose: Toward More Reliable 3D Pose Estimation
- URL: http://arxiv.org/abs/2211.16940v3
- Date: Sun, 9 Apr 2023 06:46:06 GMT
- Title: DiffPose: Toward More Reliable 3D Pose Estimation
- Authors: Jia Gong, Lin Geng Foo, Zhipeng Fan, Qiuhong Ke, Hossein Rahmani, Jun Liu
- Abstract summary: We propose a novel pose estimation framework (DiffPose) that formulates 3D pose estimation as a reverse diffusion process.
Our proposed DiffPose significantly outperforms existing methods on the widely used pose estimation benchmarks Human3.6M and MPI-INF-3DHP.
- Score: 11.6015323757147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular 3D human pose estimation is quite challenging due to the inherent
ambiguity and occlusion, which often lead to high uncertainty and
indeterminacy. On the other hand, diffusion models have recently emerged as an
effective tool for generating high-quality images from noise. Inspired by their
capability, we explore a novel pose estimation framework (DiffPose) that
formulates 3D pose estimation as a reverse diffusion process. We incorporate
novel designs into our DiffPose to facilitate the diffusion process for 3D pose
estimation: a pose-specific initialization of pose uncertainty distributions, a
Gaussian Mixture Model-based forward diffusion process, and a
context-conditioned reverse diffusion process. Our proposed DiffPose
significantly outperforms existing methods on the widely used pose estimation
benchmarks Human3.6M and MPI-INF-3DHP. Project page:
https://gongjia0208.github.io/Diffpose/.
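The abstract frames pose estimation as reversing a diffusion process. As a rough illustration of that idea only, the sketch below applies the standard Gaussian (DDPM-style) forward process to a single 3D pose and then inverts it with the true noise. It does not reproduce DiffPose's Gaussian Mixture Model-based forward process, its pose-specific initialization, or its context-conditioned denoiser. The 17-joint skeleton, the linear noise schedule, the step count, and all function names are assumptions made for this example.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Assumed linear noise schedule; the paper's schedule is not given here."""
    return np.linspace(beta_start, beta_end, num_steps)


def forward_diffuse(pose, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    noise = rng.standard_normal(pose.shape)
    noisy = np.sqrt(alpha_bar[t]) * pose + np.sqrt(1.0 - alpha_bar[t]) * noise
    return noisy, noise


def recover_pose(noisy, t, alpha_bar, predicted_noise):
    """Invert the forward step given a noise estimate.

    In a trained model the estimate would come from a denoising network;
    here the caller supplies it directly.
    """
    return (noisy - np.sqrt(1.0 - alpha_bar[t]) * predicted_noise) / np.sqrt(alpha_bar[t])


rng = np.random.default_rng(0)
num_steps = 50                               # hypothetical number of diffusion steps
betas = linear_beta_schedule(num_steps)
alpha_bar = np.cumprod(1.0 - betas)          # cumulative signal retention per step

pose = rng.standard_normal((17, 3))          # stand-in 3D pose: 17 joints (assumed) x (x, y, z)
noisy, noise = forward_diffuse(pose, num_steps - 1, alpha_bar, rng)

# With the true noise as an oracle "prediction", the clean pose is recovered.
recovered = recover_pose(noisy, num_steps - 1, alpha_bar, noise)
print(np.allclose(recovered, pose))
```

The oracle noise stands in for a learned denoiser, so the example shows only the algebra that links a noisy pose to a clean one, not any result from the paper.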
Related papers
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud [60.47544798202017]
Hand pose estimation is a critical task in various human-computer interaction applications.
This paper proposes HandDiff, a diffusion-based hand pose estimation model that iteratively denoises accurate hand pose conditioned on hand-shaped image-point clouds.
Experimental results demonstrate that the proposed HandDiff significantly outperforms the existing approaches on four challenging hand pose benchmark datasets.
arXiv Detail & Related papers (2024-04-04T02:15:16Z)
- Cameras as Rays: Pose Estimation via Ray Diffusion [54.098613859015856]
Estimating camera poses is a fundamental task for 3D reconstruction and remains challenging given sparsely sampled views.
We propose a distributed representation of camera pose that treats a camera as a bundle of rays.
Our proposed methods, both regression- and diffusion-based, demonstrate state-of-the-art performance on camera pose estimation on CO3D.
arXiv Detail & Related papers (2024-02-22T18:59:56Z)
- Diffusion-based Pose Refinement and Muti-hypothesis Generation for 3D Human Pose Estimaiton [27.708016152889787]
Previous probabilistic models for 3D Human Pose Estimation (3DHPE) aimed to enhance pose accuracy by generating multiple hypotheses.
Most of the hypotheses generated deviate substantially from the true pose.
Compared to deterministic models, the excessive uncertainty in probabilistic models leads to weaker performance in single-hypothesis prediction.
We propose a diffusion-based refinement framework called DRPose, which refines the output of deterministic models by reverse diffusion.
arXiv Detail & Related papers (2024-01-10T04:07:50Z)
- D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose Refinement [3.514184876338779]
A Diffusion-based 3D Pose Refiner is proposed to refine the output of any existing 3D pose estimator.
We leverage the architecture of current diffusion models to convert the distribution of noisy 3D poses into ground truth 3D poses.
Experimental results demonstrate the proposed architecture can significantly improve the performance of current sequence-to-sequence 3D pose estimators.
arXiv Detail & Related papers (2024-01-08T14:21:02Z)
- DiffHPE: Robust, Coherent 3D Human Pose Lifting with Diffusion [54.0238087499699]
We show that diffusion models enhance the accuracy, robustness, and coherence of human pose estimations.
We introduce DiffHPE, a novel strategy for harnessing diffusion models in 3D-HPE.
Our findings indicate that while standalone diffusion models provide commendable performance, their accuracy is even better in combination with supervised models.
arXiv Detail & Related papers (2023-09-04T12:54:10Z)
- Denoising Diffusion for 3D Hand Pose Estimation from Images [38.20064386142944]
This paper addresses the problem of 3D hand pose estimation from monocular images or sequences.
We present a novel end-to-end framework for 3D hand regression that employs diffusion models that have shown excellent ability to capture the distribution of data for generative purposes.
The proposed model provides state-of-the-art performance when lifting a 2D single-hand image to 3D.
arXiv Detail & Related papers (2023-08-18T12:57:22Z)
- Diffusion-Based 3D Human Pose Estimation with Multi-Hypothesis Aggregation [64.874000550443]
A Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed.
The proposed JPMA assembles multiple hypotheses generated by D3DP into a single 3D pose for practical use.
Our method outperforms the state-of-the-art deterministic and probabilistic approaches by 1.5% and 8.9%, respectively.
arXiv Detail & Related papers (2023-03-21T04:00:47Z)
- DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion models [5.908471365011943]
We propose DiffPose, a conditional diffusion model that predicts multiple hypotheses for a given input image.
We show that DiffPose slightly improves upon the state of the art for multi-hypothesis pose estimation for simple poses and outperforms it by a large margin for highly ambiguous poses.
arXiv Detail & Related papers (2022-11-29T18:55:13Z)
- A generic diffusion-based approach for 3D human pose prediction in the wild [68.00961210467479]
3D human pose forecasting, i.e., predicting a sequence of future human 3D poses given a sequence of past observed ones, is a challenging spatio-temporal task.
We provide a unified formulation in which incomplete elements (no matter in the prediction or observation) are treated as noise and propose a conditional diffusion model that denoises them and forecasts plausible poses.
We investigate our findings on four standard datasets and obtain significant improvements over the state-of-the-art.
arXiv Detail & Related papers (2022-10-11T17:59:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.