Learning to Detect Head Movement in Unconstrained Remote Gaze Estimation in the Wild
- URL: http://arxiv.org/abs/2004.03737v1
- Date: Tue, 7 Apr 2020 22:38:49 GMT
- Title: Learning to Detect Head Movement in Unconstrained Remote Gaze Estimation in the Wild
- Authors: Zhecan Wang, Jian Zhao, Cheng Lu, Han Huang, Fan Yang, Lianji Li, Yandong Guo
- Abstract summary: We propose end-to-end appearance-based gaze estimation methods that more robustly incorporate different levels of head-pose representation into gaze estimation.
Our method generalizes to real-world scenarios with low image quality and varying lighting, as well as to settings where direct head-pose information is unavailable.
- Score: 19.829721663742124
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unconstrained remote gaze estimation remains challenging, largely due to its
vulnerability to large variability in head pose. Prior solutions struggle
to maintain reliable accuracy in unconstrained remote gaze tracking. Among
them, appearance-based solutions demonstrate tremendous potential for improving
gaze accuracy. However, existing works still suffer from head movement and are
not robust enough to handle real-world scenarios. In particular, most of them study
gaze estimation under controlled conditions, where the collected datasets
cover only limited ranges of head pose and gaze, which introduces further bias.
In this paper, we propose novel end-to-end appearance-based gaze estimation
methods that more robustly incorporate different levels of head-pose
representation into gaze estimation. Our method generalizes to real-world
scenarios with low image quality and varying lighting, as well as to settings where
direct head-pose information is unavailable. To better demonstrate the
advantage of our methods, we further propose a new benchmark dataset with the
richest distribution of head-gaze combinations, reflecting real-world
scenarios. Extensive evaluations on several public datasets and our own dataset
demonstrate that our method consistently outperforms the state of the art by a
significant margin.
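The abstract does not detail the architecture, but the core idea of fusing a head-pose representation with appearance features in an end-to-end model can be sketched. The PyTorch snippet below is a minimal illustration under assumed layer sizes and concatenation-based fusion, not the authors' actual network.

```python
# A minimal sketch, NOT the authors' network: appearance features from a tiny
# CNN are concatenated with a head-pose vector (e.g., pitch/yaw/roll) before
# regressing gaze. Layer sizes and the fusion scheme are assumptions.
import torch
import torch.nn as nn

class GazeWithHeadPose(nn.Module):
    def __init__(self, feat_dim=128, pose_dim=3):
        super().__init__()
        self.encoder = nn.Sequential(              # stand-in appearance encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.head = nn.Sequential(                 # fuse pose, regress gaze
            nn.Linear(feat_dim + pose_dim, 64), nn.ReLU(),
            nn.Linear(64, 2),                      # gaze pitch and yaw
        )

    def forward(self, img, head_pose):
        # When direct head pose is unavailable, head_pose could instead be
        # predicted from the image by an auxiliary branch.
        return self.head(torch.cat([self.encoder(img), head_pose], dim=1))

model = GazeWithHeadPose()
print(model(torch.randn(4, 3, 96, 96), torch.randn(4, 3)).shape)  # (4, 2)
```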
Related papers
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces ADer, a comprehensive visual anomaly detection benchmark designed as a modular framework for integrating new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
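The abstract does not describe ADer's API, but the modular pattern it names can be sketched generically: methods and metrics sit behind common interfaces, and every (dataset, method) pair is scored with every metric. Everything below, from the names to the toy method and metric, is illustrative only.

```python
# Illustrative benchmark runner; interfaces and names are NOT ADer's API.
import numpy as np

def l2_method(train, test):
    # Toy detector: anomaly score = distance to the training-set mean.
    return np.linalg.norm(test - train.mean(axis=0), axis=1)

def mean_anomaly_score(labels, scores):
    # Toy metric standing in for AUROC/AP-style metrics.
    return float(scores[labels == 1].mean())

datasets = {"toy": (np.random.randn(100, 8),         # train features
                    np.random.randn(20, 8),          # test features
                    np.array([0] * 10 + [1] * 10))}  # test anomaly labels
methods = {"l2_to_mean": l2_method}
metrics = {"mean_anom_score": mean_anomaly_score}

results = {}
for ds, (train, test, labels) in datasets.items():
    for name, method in methods.items():
        scores = method(train, test)
        results[(ds, name)] = {m: f(labels, scores) for m, f in metrics.items()}
print(results)
```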
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- Semi-Supervised Unconstrained Head Pose Estimation in the Wild [60.08319512840091]
We propose SemiUHPE, the first semi-supervised unconstrained head pose estimation method.
It is based on the observation that aspect-ratio-invariant cropping of wild heads is superior to the previous landmark-based affine alignment.
Experiments and ablation studies show that SemiUHPE outperforms existing methods greatly on public benchmarks.
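As a rough illustration of what aspect-ratio-invariant cropping could mean in practice (an assumption based on the abstract, not SemiUHPE's actual code): expand the head box to a padded square around its center using the longer side, so resizing never stretches the head.

```python
# Hypothetical helper: expand a head box to a padded square around its center
# so that resizing to the network input never distorts the aspect ratio.
def square_head_crop(box, img_w, img_h, pad=1.2):
    """box = (x1, y1, x2, y2) in pixels; pad factor is an assumption."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    half = pad * max(box[2] - box[0], box[3] - box[1]) / 2  # longer side wins
    # Clipping at image borders can still break squareness; real code would
    # pad the image instead of clipping.
    return (int(max(0, cx - half)), int(max(0, cy - half)),
            int(min(img_w, cx + half)), int(min(img_h, cy + half)))

print(square_head_crop((40, 30, 100, 150), img_w=640, img_h=480))
```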
arXiv Detail & Related papers (2024-04-03T08:01:00Z)
- Latent Embedding Clustering for Occlusion Robust Head Pose Estimation [7.620379605206596]
Head pose estimation has become a crucial area of research in computer vision given its usefulness in a wide range of applications.
One of the most difficult challenges in this field is managing head occlusions that frequently take place in real-world scenarios.
We propose a novel and efficient framework that is robust to real-world head occlusions.
arXiv Detail & Related papers (2024-03-29T15:57:38Z)
- Towards Robust and Accurate Visual Prompting [11.918195429308035]
We study whether a visual prompt derived from a robust model can inherit that robustness while suffering a decline in generalization performance.
We introduce a novel technique named Prompt Boundary Loose (PBL) to effectively mitigate the suboptimal standard accuracy of visual prompts.
Our findings are universal and demonstrate the significant benefits of the proposed method.
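The abstract does not define PBL, but the underlying visual-prompt mechanism is commonly implemented as a learnable pixel border added to every input image. A minimal sketch of that generic mechanism, with assumed sizes:

```python
# Generic visual prompt as a learnable image border; PBL itself is not shown.
import torch
import torch.nn as nn

class BorderPrompt(nn.Module):
    def __init__(self, size=224, width=16):
        super().__init__()
        self.prompt = nn.Parameter(torch.zeros(1, 3, size, size))
        mask = torch.zeros(1, 1, size, size)
        mask[..., :width, :] = 1.0   # top border
        mask[..., -width:, :] = 1.0  # bottom border
        mask[..., :, :width] = 1.0   # left border
        mask[..., :, -width:] = 1.0  # right border
        self.register_buffer("mask", mask)

    def forward(self, x):
        # Only border pixels of the learnable prompt modify the input.
        return x + self.prompt * self.mask

print(BorderPrompt()(torch.randn(2, 3, 224, 224)).shape)  # (2, 3, 224, 224)
```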
arXiv Detail & Related papers (2023-11-18T07:00:56Z)
- 3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from Synthetic Views [67.00931529296788]
We propose to train general gaze estimation models which can be directly employed in novel environments without adaptation.
We create a large-scale dataset of diverse faces with gaze pseudo-annotations, which we extract based on the 3D geometry of the scene.
We test our method in the task of gaze generalization, in which we demonstrate improvement of up to 30% compared to state-of-the-art when no ground truth data are available.
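As a hedged sketch of the kind of geometric pseudo-annotation the abstract describes: given a 3D eye center and a 3D gaze target expressed in the same camera frame, the pseudo gaze label is simply the unit vector from eye to target. The actual pipeline is more involved; these names are illustrative.

```python
# Hypothetical pseudo-labeling step: unit gaze vector from eye to target,
# both given in the same (camera) coordinate frame.
import numpy as np

def gaze_pseudo_label(eye_center, target):
    d = np.asarray(target, dtype=float) - np.asarray(eye_center, dtype=float)
    return d / np.linalg.norm(d)

print(gaze_pseudo_label(eye_center=[0.0, 0.0, 0.5], target=[0.3, -0.1, 2.0]))
```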
arXiv Detail & Related papers (2022-12-06T14:15:17Z)
- Evaluating the Label Efficiency of Contrastive Self-Supervised Learning for Multi-Resolution Satellite Imagery [0.0]
Self-supervised learning has been applied in the remote sensing domain to exploit readily-available unlabeled data.
In this paper, we study self-supervised visual representation learning through the lens of label efficiency.
arXiv Detail & Related papers (2022-10-13T06:54:13Z)
- Weakly-Supervised Physically Unconstrained Gaze Estimation [80.66438763587904]
We tackle the previously unexplored problem of weakly-supervised gaze estimation from videos of human interactions.
We propose a training algorithm along with several novel loss functions especially designed for the task.
We show significant improvements in (a) the accuracy of semi-supervised gaze estimation and (b) cross-domain generalization on the state-of-the-art physically unconstrained in-the-wild Gaze360 gaze estimation benchmark.
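The abstract does not spell out the loss functions, but one plausible weak-supervision signal for interaction videos is mutual gaze: when two people look at each other, each predicted gaze vector should point toward the other person's head. This specific formulation is an assumption, sketched below.

```python
# Assumed mutual-gaze loss, not the paper's actual objective: penalize the
# angle between each person's predicted gaze and the direction to the other
# person's head position.
import torch
import torch.nn.functional as F

def mutual_gaze_loss(gaze_a, gaze_b, head_a, head_b):
    """gaze_*: (B, 3) predicted gaze vectors; head_*: (B, 3) head positions."""
    dir_ab = F.normalize(head_b - head_a, dim=1)  # A should look toward B
    dir_ba = F.normalize(head_a - head_b, dim=1)  # B should look toward A
    cos_a = (F.normalize(gaze_a, dim=1) * dir_ab).sum(dim=1)
    cos_b = (F.normalize(gaze_b, dim=1) * dir_ba).sum(dim=1)
    return ((1 - cos_a) + (1 - cos_b)).mean()     # 0 when perfectly mutual

g_a, g_b, h_a, h_b = (torch.randn(8, 3) for _ in range(4))
print(mutual_gaze_loss(g_a, g_b, h_a, h_b).item())
```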
arXiv Detail & Related papers (2021-05-20T14:58:52Z)
- SuctionNet-1Billion: A Large-Scale Benchmark for Suction Grasping [47.221326169627666]
We propose a new physical model to analytically evaluate the seal formation and wrench resistance of a suction grasp.
A two-step methodology is adopted to generate annotations on a large-scale dataset collected in real-world cluttered scenarios.
A standard online evaluation system is proposed to evaluate suction poses in continuous operation space.
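The paper's analytic seal and wrench model is far richer than anything the abstract spells out, but the order of reasoning can be illustrated with an idealized check: vacuum force equals the pressure difference times the contact area and must exceed the object's weight by a safety margin. The formula and all numbers below are simplifying assumptions.

```python
# Idealized feasibility check under simplifying assumptions (perfect circular
# seal, load along the cup axis); margin and all numbers are illustrative.
import math

def suction_holds(radius_m, delta_p_pa, mass_kg, margin=2.0, g=9.81):
    force = delta_p_pa * math.pi * radius_m ** 2  # vacuum force = dP * area
    return force >= margin * mass_kg * g

print(suction_holds(radius_m=0.015, delta_p_pa=60_000, mass_kg=1.0))  # True
```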
arXiv Detail & Related papers (2021-03-23T05:02:52Z)
- Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal.
We show that by incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, while still using the self-supervised formulation and not relying on any additional sensors.
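One common camera-configuration prior that removes scale ambiguity is the known camera height above the ground plane: the ratio of true to estimated height rescales the entire depth map. Whether the paper uses exactly this prior is an assumption; the sketch below only illustrates the mechanism.

```python
# Assumed scale-recovery step: est_cam_height would come from fitting a ground
# plane to the up-to-scale 3D points predicted by the network.
import numpy as np

def rescale_depth(depth, est_cam_height, true_cam_height):
    return depth * (true_cam_height / est_cam_height)

depth = np.random.rand(4, 4) * 10  # relative (up-to-scale) depth map
print(rescale_depth(depth, est_cam_height=0.8, true_cam_height=1.6).max())
```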
arXiv Detail & Related papers (2020-09-16T14:35:45Z)
- ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation [52.5465548207648]
ETH-XGaze is a new gaze estimation dataset consisting of over one million high-resolution images of varying gaze under extreme head poses.
We show that our dataset can significantly improve the robustness of gaze estimation methods across different head poses and gaze angles.
arXiv Detail & Related papers (2020-07-31T04:15:53Z)