Semi-Supervised Unconstrained Head Pose Estimation in the Wild
- URL: http://arxiv.org/abs/2404.02544v2
- Date: Fri, 23 Aug 2024 10:38:07 GMT
- Title: Semi-Supervised Unconstrained Head Pose Estimation in the Wild
- Authors: Huayi Zhou, Fei Jiang, Jin Yuan, Yong Rui, Hongtao Lu, Kui Jia
- Abstract summary: We propose the first semi-supervised unconstrained head pose estimation method SemiUHPE.
Our method is based on the observation that the aspect-ratio invariant cropping of wild heads is superior to the previous landmark-based affine alignment.
Experiments and ablation studies show that SemiUHPE outperforms existing methods greatly on public benchmarks.
- Score: 60.08319512840091
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing research on unconstrained in-the-wild head pose estimation suffers from the flaws of its datasets, which consist of either numerous samples by non-realistic synthesis or constrained collection, or small-scale natural images yet with plausible manual annotations. To alleviate this, we propose the first semi-supervised unconstrained head pose estimation method SemiUHPE, which can leverage abundant easily available unlabeled head images. Technically, we choose semi-supervised rotation regression and adapt it to the error-sensitive and label-scarce problem of unconstrained head pose. Our method is based on the observation that the aspect-ratio invariant cropping of wild heads is superior to the previous landmark-based affine alignment, given that landmarks of unconstrained human heads are usually unavailable, especially for less-explored non-frontal heads. Instead of using an empirically fixed threshold to filter out pseudo-labeled heads, we propose dynamic entropy-based filtering to adaptively remove unlabeled outliers as training progresses by updating the threshold in multiple stages. We then revisit the design of weak-strong augmentations and improve it by devising two novel head-oriented strong augmentations, termed pose-irrelevant cut-occlusion and pose-altering rotation consistency respectively. Extensive experiments and ablation studies show that SemiUHPE outperforms existing methods greatly on public benchmarks under both the front-range and full-range settings. Code is released at https://github.com/hnuzhy/SemiUHPE.
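To make the idea of aspect-ratio invariant cropping concrete, here is a minimal sketch in plain Python. It is not taken from the SemiUHPE code: the function names `square_crop_box` and `crop_with_padding`, the `margin` and `fill` parameters, and the choice to pad out-of-image pixels with a constant are all assumptions made for illustration. The point it shows is that a head bounding box is expanded to a square around its center before cropping, so a later resize to a square network input does not stretch the head, and no facial landmarks are needed.

```python
def square_crop_box(x, y, w, h, margin=0.0):
    """Return (left, top, side) of a square box centred on the head box.

    (x, y, w, h) is a hypothetical head bounding box in pixels.
    `margin` is an illustrative parameter that enlarges the square.
    """
    if w <= 0 or h <= 0:
        raise ValueError("box width and height must be positive")
    side = max(w, h) * (1.0 + margin)   # longer edge decides the square size
    cx, cy = x + w / 2.0, y + h / 2.0   # keep the head centred
    return cx - side / 2.0, cy - side / 2.0, side


def crop_with_padding(image, left, top, side, fill=0):
    """Crop a square patch from a 2D list-of-rows image.

    Pixels that fall outside the image are set to `fill` (an assumption),
    so the patch is always side x side and the head is never stretched.
    """
    left, top, side = int(round(left)), int(round(top)), int(round(side))
    height = len(image)
    width = len(image[0]) if height else 0
    patch = []
    for r in range(top, top + side):
        row = []
        for c in range(left, left + side):
            inside = 0 <= r < height and 0 <= c < width
            row.append(image[r][c] if inside else fill)
        patch.append(row)
    return patch


if __name__ == "__main__":
    # Toy 4x6 "image" whose pixel value encodes its (row, column) position.
    img = [[r * 10 + c for c in range(6)] for r in range(4)]
    # A tall, narrow head box touching the right image border.
    left, top, side = square_crop_box(4, 0, 2, 4)
    for row in crop_with_padding(img, left, top, side, fill=-1):
        print(row)
```

In this toy run the 2x4 box becomes a 4x4 square, and the one column that falls outside the image is padded rather than squeezed. A real pipeline would operate on image arrays and then resize the square patch to the network input size.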
Related papers
- Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection [58.87142367781417]
A naively trained detector tends to favor overfitting to the limited and monotonous fake patterns, causing the feature space to become highly constrained and low-ranked.
One potential remedy is incorporating the pre-trained knowledge within the vision foundation models to expand the feature space.
By freezing the principal components and adapting only the remaining components, we preserve the pre-trained knowledge while learning forgery-related patterns.
arXiv Detail & Related papers (2024-11-23T19:10:32Z)
- Adaptive Bidirectional Displacement for Semi-Supervised Medical Image Segmentation [11.195959019678314]
Consistency learning is a central strategy to tackle unlabeled data in semi-supervised medical image segmentation.
In this paper, we propose an Adaptive Bidirectional Displacement approach to solve the above challenge.
arXiv Detail & Related papers (2024-05-01T08:17:43Z)
- Towards Robust and Unconstrained Full Range of Rotation Head Pose Estimation [2.915868985330569]
We present a novel method for unconstrained end-to-end head pose estimation.
We propose a continuous 6D rotation matrix representation for efficient and robust direct regression.
Our method significantly outperforms other state-of-the-art methods in an efficient and robust manner.
arXiv Detail & Related papers (2023-09-14T12:17:38Z)
- Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation [33.86986028882488]
Occlusion poses a great threat to monocular multi-person 3D human pose estimation due to large variability in terms of the shape, appearance, and position of occluders.
Existing methods try to handle occlusion with pose priors/constraints, data augmentation, or implicit reasoning.
We develop a method to explicitly model this process that significantly improves bottom-up multi-person human pose estimation.
arXiv Detail & Related papers (2022-07-29T22:12:50Z)
- Non-Local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation [63.199549837604444]
3D human pose estimation approaches leverage different forms of strong (2D/3D pose) or weak (multi-view or depth) paired supervision.
We cast 3D pose learning as a self-supervised adaptation problem that aims to transfer the task knowledge from a labeled source domain to a completely unpaired target.
We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-05T03:52:57Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective [17.733087434470907]
Real-world data universally confronts a severe class-imbalance problem and exhibits a long-tailed distribution.
We propose two novel methods from the prior perspective to alleviate this dilemma.
First, we deduce a balance-oriented data augmentation named Uniform Mixup (UniMix) to promote mixup in long-tailed scenarios.
Second, motivated by Bayesian theory, we identify the Bayes Bias (Bayias) and compensate for it by modifying the standard cross-entropy loss.
arXiv Detail & Related papers (2021-11-06T12:53:34Z)
- Regressive Domain Adaptation for Unsupervised Keypoint Detection [67.2950306888855]
Domain adaptation (DA) aims at transferring knowledge from a labeled source domain to an unlabeled target domain.
We present a method of regressive domain adaptation (RegDA) for unsupervised keypoint detection.
Our method brings a large improvement of 8% to 11% in PCK on different datasets.
arXiv Detail & Related papers (2021-03-10T16:45:22Z)
- Weakly-Supervised Cross-Domain Adaptation for Endoscopic Lesions Segmentation [79.58311369297635]
We propose a new weakly-supervised lesions transfer framework, which can explore transferable domain-invariant knowledge across different datasets.
A Wasserstein quantified transferability framework is developed to highlight wide-range transferable contextual dependencies.
A novel self-supervised pseudo label generator is designed to equally provide confident pseudo pixel labels for both hard-to-transfer and easy-to-transfer target samples.
arXiv Detail & Related papers (2020-12-08T02:26:03Z)