Global Adaptation meets Local Generalization: Unsupervised Domain
Adaptation for 3D Human Pose Estimation
- URL: http://arxiv.org/abs/2303.16456v2
- Date: Thu, 17 Aug 2023 06:55:15 GMT
- Title: Global Adaptation meets Local Generalization: Unsupervised Domain
Adaptation for 3D Human Pose Estimation
- Authors: Wenhao Chai, Zhongyu Jiang, Jenq-Neng Hwang, and Gaoang Wang
- Abstract summary: PoseDA achieves 61.3 mm of MPJPE on MPI-INF-3DHP under a cross-dataset evaluation setup, improving upon the previous state-of-the-art method by 10.2%.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When applying a pre-trained 2D-to-3D human pose lifting model to an
unseen target dataset, large performance degradation is commonly encountered due
to domain shift. We observe that the degradation stems from two factors:
1) the large distribution gap in the global positions of poses between the source
and target datasets, caused by varying camera parameters and settings, and 2) the
insufficient diversity of local pose structures seen during training. To this end, we
combine \textbf{global adaptation} and \textbf{local generalization} in
\textit{PoseDA}, a simple yet effective framework of unsupervised domain
adaptation for 3D human pose estimation. Specifically, global adaptation aims
to align global positions of poses from the source domain to the target domain
with a proposed global position alignment (GPA) module. Local
generalization is designed to enhance the diversity of the 2D-3D pose mapping with
a local pose augmentation (LPA) module. Both modules bring significant
performance improvements without introducing additional learnable parameters.
LPA enhances the diversity of 3D poses through an adversarial training scheme
consisting of 1) an augmentation generator that produces the parameters of
pre-defined pose transformations and 2) an anchor discriminator that ensures the
realism and quality of the augmented data. Our approach is applicable to almost all
2D-3D lifting models. \textit{PoseDA} achieves 61.3 mm of MPJPE on MPI-INF-3DHP
under a cross-dataset evaluation setup, improving upon the previous
state-of-the-art method by 10.2\%.
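For reference, MPJPE (mean per-joint position error), the metric quoted above, is the Euclidean distance between predicted and ground-truth joint positions, averaged over all joints and frames. A minimal pure-Python sketch (the `mpjpe` helper and the toy poses below are illustrative, not from the paper):

```python
import math

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: mean Euclidean distance between
    predicted and ground-truth joints, averaged over all joints and
    frames (reported in millimetres for 3D human pose)."""
    total, count = 0.0, 0
    for pred_pose, gt_pose in zip(pred, gt):
        for p, g in zip(pred_pose, gt_pose):
            total += math.dist(p, g)  # per-joint Euclidean distance
            count += 1
    return total / count

# Toy example: 2 frames x 3 joints; prediction offset by 10 mm along x.
gt = [[(0.0, 0.0, 0.0)] * 3 for _ in range(2)]
pred = [[(10.0, 0.0, 0.0)] * 3 for _ in range(2)]
print(mpjpe(pred, gt))  # 10.0
```

Cross-dataset evaluation, as used here, trains on one dataset (e.g. Human3.6M) and reports this metric on another (MPI-INF-3DHP) to measure generalization under domain shift.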
Related papers
- Toward Efficient Generalization in 3D Human Pose Estimation via a Canonical Domain Approach [0.0]
Performance degradation caused by domain gaps between source and target domains remains a major challenge to generalization.
We propose a novel canonical domain approach that maps both the source and target domains into a unified canonical domain.
The proposed method substantially improves generalization capability across datasets while using the same data volume.
arXiv Detail & Related papers (2025-01-27T15:39:39Z)
- Optimizing Local-Global Dependencies for Accurate 3D Human Pose Estimation [2.1330933342577096]
We propose SSR-STF, a dual-stream model that integrates local features with global dependencies to enhance 3D human pose estimation.
Specifically, we introduce SSRFormer, a simple yet effective module that employs the skeleton selective refine attention (SSRA) mechanism to capture fine-grained local dependencies.
Experiments on the Human3.6M and MPI-INF-3DHP datasets demonstrate that SSR-STF achieves state-of-the-art performance, with P1 errors of 37.4 mm and 13.2 mm respectively.
arXiv Detail & Related papers (2024-12-27T14:54:12Z)
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Priors Distillation (RPD) method to extract priors from the well-trained transformers on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z)
- A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation [11.525573321175925]
3D human pose data collected in controlled laboratory settings present challenges for pose estimators that generalize across diverse scenarios.
We propose a novel framework featuring two pose augmentors: the weak and the strong augmentors.
Our proposed approach significantly outperforms existing methods, as demonstrated through comprehensive experiments on various benchmark datasets.
arXiv Detail & Related papers (2024-03-17T19:10:07Z)
- Source-Free and Image-Only Unsupervised Domain Adaptation for Category Level Object Pose Estimation [18.011044932979143]
3DUDA is a method capable of adapting to a nuisance-ridden target domain without 3D or depth data.
We represent object categories as simple cuboid meshes, and harness a generative model of neural feature activations.
We show that our method simulates fine-tuning on a global pseudo-labeled dataset under mild assumptions.
arXiv Detail & Related papers (2024-01-19T17:48:05Z)
- Double-chain Constraints for 3D Human Pose Estimation in Images and Videos [21.42410292863492]
Reconstructing 3D poses from 2D poses lacking depth information is challenging due to the complexity and diversity of human motion.
We propose a novel model, called Double-chain Graph Convolutional Transformer (DC-GCT), to constrain the pose.
We show that DC-GCT achieves state-of-the-art performance on two challenging datasets.
arXiv Detail & Related papers (2023-08-10T02:41:18Z)
- Non-Local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation [63.199549837604444]
3D human pose estimation approaches leverage different forms of strong (2D/3D pose) or weak (multi-view or depth) paired supervision.
We cast 3D pose learning as a self-supervised adaptation problem that aims to transfer the task knowledge from a labeled source domain to a completely unpaired target.
We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-05T03:52:57Z)
- Aligning Silhouette Topology for Self-Adaptive 3D Human Pose Recovery [70.66865453410958]
Articulation-centric 2D/3D pose supervision forms the core training objective in most existing 3D human pose estimation techniques.
We propose a novel framework that relies only on silhouette supervision to adapt a source-trained model-based regressor.
We develop a series of convolution-friendly spatial transformations in order to disentangle a topological-skeleton representation from the raw silhouette.
arXiv Detail & Related papers (2022-04-04T06:58:15Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation [83.50127973254538]
Existing 3D human pose estimators suffer poor generalization performance to new datasets.
We present PoseAug, a new auto-augmentation framework that learns to augment the available training poses towards a greater diversity.
arXiv Detail & Related papers (2021-05-06T06:57:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.