Latent Embedding Clustering for Occlusion Robust Head Pose Estimation
- URL: http://arxiv.org/abs/2403.20251v1
- Date: Fri, 29 Mar 2024 15:57:38 GMT
- Title: Latent Embedding Clustering for Occlusion Robust Head Pose Estimation
- Authors: José Celestino, Manuel Marques, Jacinto C. Nascimento
- Abstract summary: Head pose estimation has become a crucial area of research in computer vision given its usefulness in a wide range of applications.
One of the most difficult challenges in this field is managing head occlusions that frequently take place in real-world scenarios.
We propose a novel and efficient framework that is robust in real-world head occlusion scenarios.
- Score: 7.620379605206596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Head pose estimation has become a crucial area of research in computer vision given its usefulness in a wide range of applications, including robotics, surveillance, and driver attention monitoring. One of the most difficult challenges in this field is managing head occlusions that frequently take place in real-world scenarios. In this paper, we propose a novel and efficient framework that is robust in real-world head occlusion scenarios. In particular, we propose an unsupervised latent embedding clustering with regression and classification components for each pose angle. The model optimizes latent feature representations for occluded and non-occluded images through a clustering term while improving fine-grained angle predictions. Experimental evaluation on in-the-wild head pose benchmark datasets reveals competitive performance in comparison to state-of-the-art methodologies, with the advantage of a significant data reduction. We observe a substantial improvement in occluded head pose estimation. An ablation study is also conducted to ascertain the impact of the clustering term within our proposed framework.
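The abstract describes a loss with three ingredients: a classification component and a regression component for each pose angle, plus a clustering term on the latent embeddings of occluded and non-occluded images. The abstract does not give the exact formulation, so the following is only a minimal NumPy sketch under stated assumptions: angles are discretized into bins, the regression target is compared against the softmax-expected angle, and the clustering term is taken to be the mean squared distance between an occluded image's embedding and the embedding of its non-occluded counterpart. All function names, the bin layout, and the weights `alpha` and `beta` are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def angle_loss(logits, target_deg, bin_centers, alpha=1.0):
    """Classification + regression loss for ONE pose angle (assumed form)."""
    probs = softmax(logits)  # shape (batch, n_bins)
    # Classification: cross-entropy against the bin nearest to the target angle.
    target_bin = np.abs(bin_centers[None, :] - target_deg[:, None]).argmin(axis=1)
    ce = -np.log(probs[np.arange(len(target_deg)), target_bin] + 1e-12).mean()
    # Regression: squared error between the expected angle and the target.
    expected = (probs * bin_centers).sum(axis=1)
    mse = ((expected - target_deg) ** 2).mean()
    return ce + alpha * mse

def clustering_term(z_occluded, z_clean):
    """Assumed clustering term: pull each occluded embedding toward the
    embedding of its non-occluded counterpart."""
    return ((z_occluded - z_clean) ** 2).sum(axis=1).mean()

def total_loss(logits, targets, z_occluded, z_clean, bin_centers,
               alpha=1.0, beta=1.0):
    # Sum the per-angle losses, then add the weighted clustering term.
    pose = sum(angle_loss(logits[k], targets[k], bin_centers, alpha)
               for k in ("yaw", "pitch", "roll"))
    return pose + beta * clustering_term(z_occluded, z_clean)
```

With `beta=0` the sketch reduces to the pose terms alone, which mirrors the kind of comparison an ablation of the clustering term would make.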
Related papers
- Neighbor-Aware Calibration of Segmentation Networks with Penalty-Based Constraints [19.897181782914437]
We propose a principled and simple solution based on equality constraints on the logit values, which makes it possible to control explicitly both the enforced constraint and the weight of the penalty.
Our approach can be used to train a wide span of deep segmentation networks.
arXiv Detail & Related papers (2024-01-25T19:46:57Z)
- 2D Image head pose estimation via latent space regression under occlusion settings [7.620379605206596]
The strategy is based on latent space regression as a fundamental key to better structure the problem for occluded scenarios.
We demonstrate the usefulness of the proposed approach with: (i) two synthetically occluded versions of the BIWI and AFLW2000 datasets, (ii) real-life occlusions of the Pandora dataset, and (iii) a real-life application to human-robot interaction scenarios.
arXiv Detail & Related papers (2023-11-10T12:53:02Z)
- Group-Conditional Conformal Prediction via Quantile Regression Calibration for Crop and Weed Classification [0.0]
This article presents the conformal prediction framework that provides valid statistical guarantees on the predictive performance of any black box prediction machine.
The framework is exposed with a focus on its practical aspects and special attention accorded to the Adaptive Prediction Sets (APS) approach.
To tackle this shortcoming, group-conditional conformal approaches are presented.
arXiv Detail & Related papers (2023-08-29T08:02:41Z)
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
- The Probabilistic Normal Epipolar Constraint for Frame-To-Frame Rotation Optimization under Uncertain Feature Positions [53.478856119297284]
We introduce the probabilistic normal epipolar constraint (PNEC) that overcomes the limitation by accounting for anisotropic and inhomogeneous uncertainties in the feature positions.
In experiments on synthetic data, we demonstrate that the novel PNEC yields more accurate rotation estimates than the original NEC.
We integrate the proposed method into a state-of-the-art monocular rotation-only odometry system and achieve consistently improved results for the real-world KITTI dataset.
arXiv Detail & Related papers (2022-04-05T14:47:11Z)
- Welsch Based Multiview Disparity Estimation [0.8594140167290096]
We experimentally identify occlusions as a key challenge for disparity estimation for applications with high numbers of views.
We propose the use of a Welsch loss function for the data term in a global variational framework for disparity estimation.
arXiv Detail & Related papers (2021-10-02T13:44:49Z)
- Unsupervised Learning of Debiased Representations with Pseudo-Attributes [85.5691102676175]
We propose a simple but effective debiasing technique in an unsupervised manner.
We perform clustering on the feature embedding space and identify pseudoattributes by taking advantage of the clustering results.
We then employ a novel cluster-based reweighting scheme for learning debiased representation.
arXiv Detail & Related papers (2021-08-06T05:20:46Z)
- Adversarial Motion Modelling helps Semi-supervised Hand Pose Estimation [116.07661813869196]
We propose to combine ideas from adversarial training and motion modelling to tap into unlabeled videos.
We show that an adversarial formulation leads to better properties of the hand pose estimator via semi-supervised training on unlabeled video sequences.
The main advantage of our approach is that we can make use of unpaired videos and joint sequence data both of which are much easier to attain than paired training data.
arXiv Detail & Related papers (2021-06-10T17:50:19Z)
- Learning to Detect Head Movement in Unconstrained Remote Gaze Estimation in the Wild [19.829721663742124]
We propose end-to-end appearance-based gaze estimation methods that could more robustly incorporate different levels of head-pose representations into gaze estimation.
Our method could generalize to real-world scenarios with low image quality, varying lighting, and scenarios where direct head-pose information is not available.
arXiv Detail & Related papers (2020-04-07T22:38:49Z)
- Peeking into occluded joints: A novel framework for crowd pose estimation [88.56203133287865]
OPEC-Net is an Image-Guided Progressive GCN module that estimates invisible joints from an inference perspective.
OCPose is the most complex Occluded Pose dataset with respect to average IoU between adjacent instances.
arXiv Detail & Related papers (2020-03-23T19:32:40Z)
- Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.