DiffHPE: Robust, Coherent 3D Human Pose Lifting with Diffusion
- URL: http://arxiv.org/abs/2309.01575v1
- Date: Mon, 4 Sep 2023 12:54:10 GMT
- Title: DiffHPE: Robust, Coherent 3D Human Pose Lifting with Diffusion
- Authors: Cédric Rommel, Eduardo Valle, Mickaël Chen, Souhaiel Khalfaoui,
Renaud Marlet, Matthieu Cord and Patrick Pérez
- Abstract summary: We show that diffusion models enhance the accuracy, robustness, and coherence of human pose estimations.
We introduce DiffHPE, a novel strategy for harnessing diffusion models in 3D-HPE.
Our findings indicate that while standalone diffusion models provide commendable performance, their accuracy is even better in combination with supervised models.
- Score: 54.0238087499699
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an innovative approach to 3D Human Pose Estimation (3D-HPE) by
integrating cutting-edge diffusion models, which have revolutionized diverse
fields, but are relatively unexplored in 3D-HPE. We show that diffusion models
enhance the accuracy, robustness, and coherence of human pose estimations. We
introduce DiffHPE, a novel strategy for harnessing diffusion models in 3D-HPE,
and demonstrate its ability to refine standard supervised 3D-HPE. We also show
how diffusion models lead to more robust estimations in the face of occlusions,
and improve the time-coherence and the sagittal symmetry of predictions. Using
the Human3.6M dataset, we illustrate the effectiveness of our approach and
its superiority over existing models, even under adverse situations where the
occlusion patterns in training do not match those in inference. Our findings
indicate that while standalone diffusion models provide commendable
performance, their accuracy is even better in combination with supervised
models, opening exciting new avenues for 3D-HPE research.
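As context for the abstract above, here is a minimal, self-contained sketch of how diffusion-based pose lifting works in general: starting from Gaussian noise, a 3D pose is iteratively denoised while conditioning on observed 2D keypoints. This is not the authors' code; the denoiser is a random-weight placeholder, and the schedule, joint count, and function names are illustrative assumptions only.

```python
# Toy DDPM-style sampler for 2D-to-3D pose lifting (illustrative sketch, not DiffHPE).
import numpy as np

N_JOINTS, T_STEPS = 17, 50                         # Human3.6M-style skeleton, short schedule
betas = np.linspace(1e-4, 0.02, T_STEPS)           # linear noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

rng = np.random.default_rng(0)
# Random-weight stand-in for a trained conditional denoiser network.
W = rng.normal(scale=0.01, size=(N_JOINTS * 5, N_JOINTS * 3))

def denoiser(noisy_pose_3d, keypoints_2d, t):
    """Placeholder noise predictor conditioned on 2D keypoints (t is unused here)."""
    feat = np.concatenate([noisy_pose_3d.ravel(), keypoints_2d.ravel()])
    return (feat @ W).reshape(N_JOINTS, 3)

def sample_pose(keypoints_2d):
    """DDPM-style reverse process: start from Gaussian noise, denoise step by step."""
    x = rng.standard_normal((N_JOINTS, 3))
    for t in reversed(range(T_STEPS)):
        eps_hat = denoiser(x, keypoints_2d, t)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:                                  # add stochasticity except at the last step
            x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

keypoints_2d = rng.standard_normal((N_JOINTS, 2))  # stand-in for detected 2D joints
pose_3d = sample_pose(keypoints_2d)
print(pose_3d.shape)                               # (17, 3)
```

In a DiffHPE-style setup, the placeholder denoiser would presumably be replaced by the trained conditional network, with the supervised lifter mentioned in the abstract supplying the conditioning signal or initialization.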
Related papers
- 4Diffusion: Multi-view Video Diffusion Model for 4D Generation [55.82208863521353]
Current 4D generation methods have achieved noteworthy efficacy with the aid of advanced diffusion generative models.
We propose a novel 4D generation pipeline, namely 4Diffusion, aimed at generating spatial-temporally consistent 4D content from a monocular video.
arXiv Detail & Related papers (2024-05-31T08:18:39Z)
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- CAD: Photorealistic 3D Generation via Adversarial Distillation [28.07049413820128]
We propose a novel learning paradigm for 3D synthesis that utilizes pre-trained diffusion models.
Our method unlocks the generation of high-fidelity and photorealistic 3D content conditioned on a single image and prompt.
arXiv Detail & Related papers (2023-12-11T18:59:58Z)
- Learn to Optimize Denoising Scores for 3D Generation: A Unified and Improved Diffusion Prior on NeRF and 3D Gaussian Splatting [60.393072253444934]
We propose a unified framework aimed at enhancing the diffusion priors for 3D generation tasks.
We identify a divergence between the diffusion priors and the training procedures of diffusion models that substantially impairs the quality of 3D generation.
arXiv Detail & Related papers (2023-12-08T03:55:34Z)
- DiffPose: Toward More Reliable 3D Pose Estimation [11.6015323757147]
We propose a novel pose estimation framework (DiffPose) that formulates 3D pose estimation as a reverse diffusion process.
Our proposed DiffPose significantly outperforms existing methods on the widely used pose estimation benchmarks Human3.6M and MPI-INF-3DHP.
arXiv Detail & Related papers (2022-11-30T12:22:22Z)
- A generic diffusion-based approach for 3D human pose prediction in the wild [68.00961210467479]
3D human pose forecasting, i.e., predicting a sequence of future human 3D poses given a sequence of past observed ones, is a challenging spatio-temporal task.
We provide a unified formulation in which incomplete elements (whether in the prediction or the observation) are treated as noise, and propose a conditional diffusion model that denoises them and forecasts plausible poses (a toy sketch of this noise-as-missing-data idea appears after this list).
We investigate our findings on four standard datasets and obtain significant improvements over the state-of-the-art.
arXiv Detail & Related papers (2022-10-11T17:59:54Z)
- Learned Vertex Descent: A New Direction for 3D Human Model Fitting [64.04726230507258]
We propose a novel optimization-based paradigm for 3D human model fitting on images and scans.
Our approach is able to capture the underlying body of clothed people with very different body shapes, achieving a significant improvement over the state of the art.
LVD is also applicable to 3D model fitting of humans and hands, for which we show a significant improvement over the SOTA with a much simpler and faster method.
arXiv Detail & Related papers (2022-05-12T17:55:51Z)
- Distribution-Aware Single-Stage Models for Multi-Person 3D Pose Estimation [29.430404703883084]
We present a novel Distribution-Aware Single-stage (DAS) model for tackling the challenging multi-person 3D pose estimation problem.
The proposed DAS model simultaneously localizes person positions and their corresponding body joints in the 3D camera space in a one-pass manner.
Comprehensive experiments on benchmarks CMU Panoptic and MuPoTS-3D demonstrate the superior efficiency of the proposed DAS model.
arXiv Detail & Related papers (2022-03-15T07:30:27Z)
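Following up on the noise-as-missing-data formulation mentioned in the diffusion-based pose prediction entry above, here is a toy, assumption-laden sketch (not that paper's implementation) of the general idea: occluded joints start as pure noise and are denoised, while observed joints are clamped to appropriately re-noised versions of themselves at every reverse step. The predictor, schedule, and mask layout are placeholders for illustration.

```python
# Inpainting-style conditional sampling: observed joints guide the reverse diffusion,
# missing joints are filled in (illustrative sketch only).
import numpy as np

N_JOINTS, T_STEPS = 17, 50
betas = np.linspace(1e-4, 0.02, T_STEPS)
alphas, alpha_bars = 1.0 - betas, np.cumprod(1.0 - betas)
rng = np.random.default_rng(1)
W = rng.normal(scale=0.01, size=(N_JOINTS * 3, N_JOINTS * 3))  # placeholder "network"

def predict_noise(x, t):
    """Stand-in for a trained noise predictor (t is unused by this placeholder)."""
    return (x.ravel() @ W).reshape(N_JOINTS, 3)

def inpaint_pose(observed, mask):
    """observed: (17, 3) pose, arbitrary values where mask == 0; mask: 1 = observed joint."""
    x = rng.standard_normal((N_JOINTS, 3))          # missing parts start as pure noise
    for t in reversed(range(T_STEPS)):
        # Keep observed joints consistent with the forward process at step t.
        noised_obs = (np.sqrt(alpha_bars[t]) * observed
                      + np.sqrt(1.0 - alpha_bars[t]) * rng.standard_normal(observed.shape))
        x = mask * noised_obs + (1 - mask) * x
        eps_hat = predict_noise(x, t)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return mask * observed + (1 - mask) * x          # return observed joints untouched

mask = np.ones((N_JOINTS, 3)); mask[5:9] = 0.0       # pretend four joints are occluded
pose = rng.standard_normal((N_JOINTS, 3))
completed = inpaint_pose(pose, mask)
print(completed.shape)                               # (17, 3)
```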