Leveraging Synthetic Adult Datasets for Unsupervised Infant Pose Estimation
- URL: http://arxiv.org/abs/2504.05789v1
- Date: Tue, 08 Apr 2025 08:13:38 GMT
- Title: Leveraging Synthetic Adult Datasets for Unsupervised Infant Pose Estimation
- Authors: Sarosij Bose, Hannah Dela Cruz, Arindam Dutta, Elena Kokkoni, Konstantinos Karydis, Amit K. Roy-Chowdhury
- Abstract summary: SHIFT: Leveraging SyntHetic Adult datasets for Unsupervised InFanT Pose Estimation is presented. It exploits the pseudo-labeling-based Mean-Teacher framework to compensate for the lack of labeled data. It also addresses distribution shifts by enforcing consistency between the student and the teacher pseudo-labels. It significantly outperforms existing state-of-the-art unsupervised domain adaptation (UDA) pose estimation methods by 5% and supervised infant pose estimation methods by a margin of 16%.
- Score: 22.117963103350164
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human pose estimation is a critical tool across a variety of healthcare applications. Despite significant progress in pose estimation algorithms targeting adults, such developments for infants remain limited. Existing algorithms for infant pose estimation, despite achieving commendable performance, depend on fully supervised approaches that require large amounts of labeled data. These algorithms also struggle with poor generalizability under distribution shifts. To address these challenges, we introduce SHIFT: Leveraging SyntHetic Adult Datasets for Unsupervised InFanT Pose Estimation, which leverages the pseudo-labeling-based Mean-Teacher framework to compensate for the lack of labeled data and addresses distribution shifts by enforcing consistency between the student and the teacher pseudo-labels. Additionally, to penalize implausible predictions obtained from the mean-teacher framework, we incorporate an infant manifold pose prior. To enhance SHIFT's self-occlusion perception ability, we propose a novel visibility consistency module for improved alignment of the predicted poses with the original image. Extensive experiments on multiple benchmarks show that SHIFT significantly outperforms existing state-of-the-art unsupervised domain adaptation (UDA) pose estimation methods by 5% and supervised infant pose estimation methods by a margin of 16%. The project page is available at: https://sarosijbose.github.io/SHIFT.
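The abstract names two generic ingredients of a Mean-Teacher setup: a teacher that produces pseudo-labels and a consistency term between student and teacher predictions. The sketch below illustrates only that general pattern in plain Python and is not the SHIFT implementation. The function names `ema_update` and `consistency_loss`, the `decay` parameter, the exponential-moving-average teacher update, and the squared-error form of the loss are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List, Tuple

# A pose is represented here as a list of (x, y) keypoints (illustrative choice).
Keypoints = List[Tuple[float, float]]


def ema_update(teacher: Dict[str, float],
               student: Dict[str, float],
               decay: float = 0.99) -> Dict[str, float]:
    """Return new teacher weights as an exponential moving average.

    Each teacher weight keeps a `decay` fraction of its old value and
    takes the remaining (1 - decay) fraction from the student weight.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


def consistency_loss(student_kpts: Keypoints,
                     teacher_kpts: Keypoints) -> float:
    """Mean squared distance between student keypoints and teacher pseudo-labels."""
    if len(student_kpts) != len(teacher_kpts):
        raise ValueError("student and teacher must predict the same number of keypoints")
    total = sum(
        (sx - tx) ** 2 + (sy - ty) ** 2
        for (sx, sy), (tx, ty) in zip(student_kpts, teacher_kpts)
    )
    return total / len(student_kpts)


# Hypothetical usage: one scalar "weight" and two keypoints per pose.
new_teacher = ema_update({"w": 1.0}, {"w": 0.0}, decay=0.9)
loss = consistency_loss([(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (1.0, 3.0)])
```

In a real training loop the weights would be tensors and the student would be updated by gradient descent on this loss, with the teacher refreshed by `ema_update` after each step.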
Related papers
- Unsupervised Domain Adaptation for Occlusion Resilient Human Pose Estimation [23.0839810713682]
Occlusions are a significant challenge to human pose estimation algorithms. We propose OR-POSE: Unsupervised Domain Adaptation for Occlusion Resilient Human POSE Estimation.
arXiv Detail & Related papers (2025-01-06T05:30:37Z)
- Affinity-Graph-Guided Contractive Learning for Pretext-Free Medical Image Segmentation with Minimal Annotation [55.325956390997]
This paper proposes an affinity-graph-guided semi-supervised contrastive learning framework (Semi-AGCL) for medical image segmentation.
The framework first designs an average-patch-entropy-driven inter-patch sampling method, which can provide a robust initial feature space.
With merely 10% of the complete annotation set, our model approaches the accuracy of the fully annotated baseline, manifesting a marginal deviation of only 2.52%.
arXiv Detail & Related papers (2024-10-14T10:44:47Z)
- Semi-supervised 2D Human Pose Estimation via Adaptive Keypoint Masking [2.297586471170049]
This paper proposes an adaptive keypoint masking method, which can fully mine the information in the samples and obtain better estimation performance.
The effectiveness of the proposed method is verified on COCO and MPII, outperforming the state-of-the-art semi-supervised pose estimation by 5.2% and 0.3%, respectively.
arXiv Detail & Related papers (2024-04-23T08:41:50Z)
- Modeling the Uncertainty with Maximum Discrepant Students for Semi-supervised 2D Pose Estimation [57.17120203327993]
We propose a framework to estimate the quality of pseudo-labels in semi-supervised pose estimation tasks.
Our method improves the performance of semi-supervised pose estimation on three datasets.
arXiv Detail & Related papers (2023-11-03T08:11:06Z)
- Semi-Supervised 2D Human Pose Estimation Driven by Position Inconsistency Pseudo Label Correction Module [74.80776648785897]
Previous methods ignored two problems: (i) When conducting interactive training between a large model and a lightweight model, the pseudo labels of the lightweight model are used to guide the large model.
We propose a semi-supervised 2D human pose estimation framework driven by a position inconsistency pseudo label correction module (SSPCM).
To further improve the performance of the student model, we use the semi-supervised Cut-Occlude based on pseudo keypoint perception to generate harder and more effective samples.
arXiv Detail & Related papers (2023-03-08T02:57:05Z)
- AggPose: Deep Aggregation Vision Transformer for Infant Pose Estimation [6.9000851935487075]
We propose an infant pose dataset and a Deep Aggregation Vision Transformer for human pose estimation.
AggPose is a fast-training, fully transformer-based framework that does not use convolution operations to extract features in the early stages.
We show that AggPose could effectively learn the multi-scale features among different resolutions and significantly improve the performance of infant pose estimation.
arXiv Detail & Related papers (2022-05-11T05:34:14Z)
- Unsupervised Domain Adaptation Learning for Hierarchical Infant Pose Recognition with Synthetic Data [28.729049747477085]
We present a CNN-based model which takes any infant image as input and predicts the coarse and fine-level pose labels.
Our experimental results show that the proposed method can significantly align the distribution of synthetic and real-world datasets.
arXiv Detail & Related papers (2022-05-04T04:59:26Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- FP-Age: Leveraging Face Parsing Attention for Facial Age Estimation in the Wild [50.8865921538953]
We propose a method to explicitly incorporate facial semantics into age estimation.
We design a face parsing-based network to learn semantic information at different scales.
We show that our method consistently outperforms all existing age estimation methods.
arXiv Detail & Related papers (2021-06-21T14:31:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.