Related papers: PHD: Personalized 3D Human Body Fitting with Point Diffusion

URL: http://arxiv.org/abs/2508.21257v1
Date: Thu, 28 Aug 2025 23:03:35 GMT
Title: PHD: Personalized 3D Human Body Fitting with Point Diffusion
Authors: Hsuan-I Ho, Chen Guo, Po-Chen Wu, Ivan Shugurov, Chengcheng Tang, Abhay Mittal, Sizhe An, Manuel Kaufmann, Linguang Zhang,
Abstract summary: PHD is a novel approach for personalized 3D human mesh recovery (HMR) and body fitting. It leverages user-specific shape information to improve pose estimation accuracy from videos.
Score: 19.282384138333537
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce PHD, a novel approach for personalized 3D human mesh recovery (HMR) and body fitting that leverages user-specific shape information to improve pose estimation accuracy from videos. Traditional HMR methods are designed to be user-agnostic and optimized for generalization. While these methods often refine poses using constraints derived from the 2D image to improve alignment, this process compromises 3D accuracy by failing to jointly account for person-specific body shapes and the plausibility of 3D poses. In contrast, our pipeline decouples this process by first calibrating the user's body shape and then employing a personalized pose fitting process conditioned on that shape. To achieve this, we develop a body shape-conditioned 3D pose prior, implemented as a Point Diffusion Transformer, which iteratively guides the pose fitting via a Point Distillation Sampling loss. This learned 3D pose prior effectively mitigates errors arising from an over-reliance on 2D constraints. Consequently, our approach improves not only pelvis-aligned pose accuracy but also absolute pose accuracy -- an important metric often overlooked by prior work. Furthermore, our method is highly data-efficient, requiring only synthetic data for training, and serves as a versatile plug-and-play module that can be seamlessly integrated with existing 3D pose estimators to enhance their performance. Project page: https://phd-pose.github.io/
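The abstract distinguishes pelvis-aligned pose accuracy from absolute pose accuracy without defining either. The sketch below illustrates the difference under an assumption that is not taken from the paper: that both are measured as mean per-joint position error (MPJPE), a common convention in 3D pose estimation. The helper names `mpjpe` and `pelvis_aligned`, the three-joint toy skeleton, and all numeric values are hypothetical.

```python
import math

def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def pelvis_aligned(joints, pelvis_idx=0):
    """Translate all joints so that the pelvis joint sits at the origin."""
    px, py, pz = joints[pelvis_idx]
    return [(x - px, y - py, z - pz) for x, y, z in joints]

# Hypothetical toy skeleton in metres: pelvis, neck, head.
gt = [(0.0, 0.0, 3.0), (0.0, 0.5, 3.0), (0.0, 0.7, 3.0)]

# Prediction with the correct pose but placed 0.3 m too far along the z axis.
pred = [(x, y, z + 0.3) for x, y, z in gt]

# Absolute error keeps the global offset; pelvis alignment cancels it.
absolute_err = mpjpe(pred, gt)
aligned_err = mpjpe(pelvis_aligned(pred), pelvis_aligned(gt))

print(f"absolute MPJPE:       {absolute_err:.3f} m")
print(f"pelvis-aligned MPJPE: {aligned_err:.3f} m")
```

In this toy case the pelvis-aligned error is zero although every joint is misplaced in space, which is why an evaluation that reports only the pelvis-aligned metric can hide errors in global placement.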

Related papers

CameraHMR: Aligning People with Perspective [54.05758012879385]
We address the challenge of accurate 3D human pose and shape estimation from monocular images. Existing training datasets containing real images with pseudo ground truth (pGT) use SMPLify to fit SMPL to sparse 2D joint locations. We make two contributions that improve pGT accuracy.
arXiv Detail & Related papers (2024-11-12T19:12:12Z)
TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation [48.08156777874614]
Current methods leverage 3D pseudo-ground-truth (p-GT) and 2D keypoints, leading to robust performance. With such methods, we observe a paradoxical decline in 3D pose accuracy with increasing 2D accuracy. We quantify the error induced by current camera models and show that fitting 2D keypoints and p-GT accurately causes incorrect 3D poses.
arXiv Detail & Related papers (2024-04-25T17:09:14Z)
UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation. It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
LInKs "Lifting Independent Keypoints" -- Partial Pose Lifting for Occlusion Handling with Improved Accuracy in 2D-3D Human Pose Estimation [4.648549457266638]
We present LInKs, a novel unsupervised learning method to recover 3D human poses from 2D kinematic skeletons. Our approach follows a unique two-step process, which involves first lifting the occluded 2D pose to the 3D domain. This lift-then-fill approach leads to significantly more accurate results compared to models that complete the pose in 2D space alone.
arXiv Detail & Related papers (2023-09-13T18:28:04Z)
3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation [28.24765523800196]
We propose 3D-aware Neural Body Fitting (3DNBF) for 3D human pose estimation. In particular, we propose a generative model of deep features based on a volumetric human representation with Gaussian ellipsoidal kernels emitting 3D pose-dependent feature vectors. The neural features are trained with contrastive learning to become 3D-aware and hence to overcome the 2D-3D ambiguity.
arXiv Detail & Related papers (2023-08-19T22:41:00Z)
Shape of You: Precise 3D shape estimations for diverse body types [5.037272815698192]
This paper presents an approach to improve the accuracy of 3D body shape estimation for vision-based clothing recommendation systems. We propose two loss functions that can be readily integrated into parametric 3D human reconstruction pipelines. Our method improves over the recent SHAPY method by 17.7% on the challenging SSP-3D dataset.
arXiv Detail & Related papers (2023-04-14T20:53:16Z)
Adversarial Parametric Pose Prior [106.12437086990853]
We learn a prior that restricts the SMPL parameters to values that produce realistic poses via adversarial training. We show that our learned prior covers the diversity of the real-data distribution, facilitates optimization for 3D reconstruction from 2D keypoints, and yields better pose estimates when used for regression from images.
arXiv Detail & Related papers (2021-12-08T10:05:32Z)
HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation [39.67289969828706]
We propose a novel hybrid inverse kinematics solution (HybrIK) to bridge the gap between body mesh estimation and 3D keypoint estimation. HybrIK directly transforms accurate 3D joints to relative body-part rotations for 3D body mesh reconstruction. We show that HybrIK preserves both the accuracy of 3D pose and the realistic body structure of the parametric human model.
arXiv Detail & Related papers (2020-11-30T10:32:30Z)
Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D mesh of multiple body parts with large-scale differences from a single RGB image. The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images. We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild. We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits. The resulting annotations are sufficient to train 3D pose regressor networks from scratch that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.