Related papers: Semi-Supervised Unconstrained Head Pose Estimation in the Wild

Semi-Supervised Unconstrained Head Pose Estimation in the Wild

URL: http://arxiv.org/abs/2404.02544v2
Date: Fri, 23 Aug 2024 10:38:07 GMT
Title: Semi-Supervised Unconstrained Head Pose Estimation in the Wild
Authors: Huayi Zhou, Fei Jiang, Jin Yuan, Yong Rui, Hongtao Lu, Kui Jia,
Abstract summary: We propose the first semi-supervised unconstrained head pose estimation method SemiUHPE. Our method is based on the observation that the aspect-ratio invariant cropping of wild heads is superior to the previous landmark-based affine alignment. Experiments and ablation studies show that SemiUHPE outperforms existing methods greatly on public benchmarks.
Score: 60.08319512840091
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing research on unconstrained in-the-wild head pose estimation suffers from the flaws of its datasets, which consist of either numerous samples by non-realistic synthesis or constrained collection, or small-scale natural images yet with plausible manual annotations. To alleviate it, we propose the first semi-supervised unconstrained head pose estimation method SemiUHPE, which can leverage abundant easily available unlabeled head images. Technically, we choose semi-supervised rotation regression and adapt it to the error-sensitive and label-scarce problem of unconstrained head pose. Our method is based on the observation that the aspect-ratio invariant cropping of wild heads is superior to the previous landmark-based affine alignment given that landmarks of unconstrained human heads are usually unavailable, especially for less-explored non-frontal heads. Instead of using an empirically fixed threshold to filter out pseudo labeled heads, we propose dynamic entropy based filtering to adaptively remove unlabeled outliers as training progresses by updating the threshold in multiple stages. We then revisit the design of weak-strong augmentations and improve it by devising two novel head-oriented strong augmentations, termed pose-irrelevant cut-occlusion and pose-altering rotation consistency respectively. Extensive experiments and ablation studies show that SemiUHPE outperforms existing methods greatly on public benchmarks under both the front-range and full-range settings. Code is released in \url{https://github.com/hnuzhy/SemiUHPE}.

Related papers

Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection [58.87142367781417]
A naively trained detector tends to favor overfitting to the limited and monotonous fake patterns, causing the feature space to become highly constrained and low-ranked. One potential remedy is incorporating the pre-trained knowledge within the vision foundation models to expand the feature space. By freezing the principal components and adapting only the remained components, we preserve the pre-trained knowledge while learning forgery-related patterns.
arXiv Detail & Related papers (2024-11-23T19:10:32Z)
Consistency Regularisation for Unsupervised Domain Adaptation in Monocular Depth Estimation [15.285720572043678]
We formulate unsupervised domain adaptation for monocular depth estimation as a consistency-based semi-supervised learning problem. We introduce a pairwise loss function that regularises predictions on the source domain while enforcing consistency across multiple augmented views. In our experiments, we rely on the standard depth estimation benchmarks KITTI and NYUv2 to demonstrate state-of-the-art results.
arXiv Detail & Related papers (2024-05-27T23:32:06Z)
Adaptive Bidirectional Displacement for Semi-Supervised Medical Image Segmentation [11.195959019678314]
Consistency learning is a central strategy to tackle unlabeled data in semi-supervised medical image segmentation. In this paper, we propose an Adaptive Bidirectional Displacement approach to solve the above challenge.
arXiv Detail & Related papers (2024-05-01T08:17:43Z)
Towards Robust and Unconstrained Full Range of Rotation Head Pose Estimation [2.915868985330569]
We present a novel method for unconstrained end-to-end head pose estimation. We propose a continuous 6D rotation matrix representation for efficient and robust direct regression. Our method significantly outperforms other state-of-the-art methods in an efficient and robust manner.
arXiv Detail & Related papers (2023-09-14T12:17:38Z)
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels. Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels. We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation [33.86986028882488]
Occlusion poses a great threat to monocular multi-person 3D human pose estimation due to large variability in terms of the shape, appearance, and position of occluders. Existing methods try to handle occlusion with pose priors/constraints, data augmentation, or implicit reasoning. We develop a method to explicitly model this process that significantly improves bottom-up multi-person human pose estimation.
arXiv Detail & Related papers (2022-07-29T22:12:50Z)
Semi-supervised Contrastive Outlier removal for Pseudo Expectation Maximization (SCOPE) [2.33877878310217]
We present a new approach to suppress confounding errors through a method we describe as Semi-supervised Contrastive Outlier removal for Pseudo Expectation Maximization (SCOPE) Our results show that SCOPE greatly improves semi-supervised classification accuracy over a baseline, and furthermore when combined with consistency regularization achieves the highest reported accuracy for the semi-supervised CIFAR-10 classification task using 250 and 4000 labeled samples.
arXiv Detail & Related papers (2022-06-28T19:32:50Z)
Non-Local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation [63.199549837604444]
3D human pose estimation approaches leverage different forms of strong (2D/3D pose) or weak (multi-view or depth) paired supervision. We cast 3D pose learning as a self-supervised adaptation problem that aims to transfer the task knowledge from a labeled source domain to a completely unpaired target. We evaluate different self-adaptation settings and demonstrate state-of-the-art 3D human pose estimation performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-05T03:52:57Z)
Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations. We derive suitable measures to quantify prediction uncertainty at both pose and joint level. We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective [17.733087434470907]
Real-world data universally confronts a severe class-imbalance problem and exhibits a long-tailed distribution. We propose two novel methods from the prior perspective to alleviate this dilemma. First, we deduce a balance-oriented data augmentation named Uniform Mixup (UniMix) to promote mixup in long-tailed scenarios. Second, motivated by the Bayesian theory, we figure out the Bayes Bias (Bayias) to compensate it as a modification on standard cross-entropy loss.
arXiv Detail & Related papers (2021-11-06T12:53:34Z)
Regressive Domain Adaptation for Unsupervised Keypoint Detection [67.2950306888855]
Domain adaptation (DA) aims at transferring knowledge from a labeled source domain to an unlabeled target domain. We present a method of regressive domain adaptation (RegDA) for unsupervised keypoint detection. Our method brings large improvement by 8% to 11% in terms of PCK on different datasets.
arXiv Detail & Related papers (2021-03-10T16:45:22Z)
Weakly-Supervised Cross-Domain Adaptation for Endoscopic Lesions Segmentation [79.58311369297635]
We propose a new weakly-supervised lesions transfer framework, which can explore transferable domain-invariant knowledge across different datasets. A Wasserstein quantified transferability framework is developed to highlight widerange transferable contextual dependencies. A novel self-supervised pseudo label generator is designed to equally provide confident pseudo pixel labels for both hard-to-transfer and easy-to-transfer target samples.
arXiv Detail & Related papers (2020-12-08T02:26:03Z)
The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the em Simulator. We prove the first instance-based lower bounds the top-k problem which incorporate the appropriate log-factors. Our new analysis inspires a simple and near-optimal for the best-arm and top-k identification, the first em practical of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.