Appearance-based Gaze Estimation With Deep Learning: A Review and Benchmark
- URL: http://arxiv.org/abs/2104.12668v2
- Date: Wed, 24 Apr 2024 16:17:13 GMT
- Title: Appearance-based Gaze Estimation With Deep Learning: A Review and Benchmark
- Authors: Yihua Cheng, Haofei Wang, Yiwei Bao, Feng Lu
- Abstract summary: We present a systematic review of the appearance-based gaze estimation methods using deep learning.
We summarize the data pre-processing and post-processing methods, including face/eye detection, data rectification, 2D/3D gaze conversion and gaze origin conversion.
- Score: 14.306488668615883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human gaze provides valuable information on human focus and intentions, making it a crucial area of research. Recently, deep learning has revolutionized appearance-based gaze estimation. However, due to the unique features of gaze estimation research, such as the unfair comparison between 2D gaze positions and 3D gaze vectors and the different pre-processing and post-processing methods, there is a lack of a definitive guideline for developing deep learning-based gaze estimation algorithms. In this paper, we present a systematic review of the appearance-based gaze estimation methods using deep learning. Firstly, we survey the existing gaze estimation algorithms along the typical gaze estimation pipeline: deep feature extraction, deep learning model design, personal calibration and platforms. Secondly, to fairly compare the performance of different approaches, we summarize the data pre-processing and post-processing methods, including face/eye detection, data rectification, 2D/3D gaze conversion and gaze origin conversion. Finally, we set up a comprehensive benchmark for deep learning-based gaze estimation. We characterize all the public datasets and provide the source code of typical gaze estimation algorithms. This paper serves not only as a reference to develop deep learning-based gaze estimation methods, but also a guideline for future gaze estimation research. The project web page can be found at https://phi-ai.buaa.edu.cn/Gazehub.
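The unfair comparison the abstract mentions stems from the different output spaces: 2D on-screen gaze positions versus 3D gaze direction vectors. As an illustration of the 2D/3D conversion and evaluation involved, the sketch below converts a 3D gaze vector to pitch/yaw angles and computes the standard angular error; the coordinate convention (x right, y down, z forward, gaze pointing from eye to target) is an assumption common in gaze benchmarks, not necessarily the exact one used in the paper's released code.

```python
import numpy as np

def gaze3d_to_pitchyaw(g):
    """Convert a 3D gaze direction vector to (pitch, yaw) in radians.
    Assumes camera coordinates with x right, y down, z forward."""
    g = g / np.linalg.norm(g)
    pitch = np.arcsin(-g[1])        # vertical angle
    yaw = np.arctan2(-g[0], -g[2])  # horizontal angle
    return pitch, yaw

def angular_error_deg(g_pred, g_true):
    """Angular error in degrees between predicted and ground-truth gaze
    vectors, the standard metric for 3D gaze estimation."""
    g_pred = g_pred / np.linalg.norm(g_pred)
    g_true = g_true / np.linalg.norm(g_true)
    cos_sim = np.clip(np.dot(g_pred, g_true), -1.0, 1.0)
    return np.degrees(np.arccos(cos_sim))
```

Going the other way, a 2D point of gaze follows from intersecting the gaze ray, anchored at the gaze origin, with the screen plane, which is why the gaze origin conversion summarized above matters for fair comparison.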
Related papers
- TPP-Gaze: Modelling Gaze Dynamics in Space and Time with Neural Temporal Point Processes [63.95928298690001]
We present TPP-Gaze, a novel and principled approach to model scanpath dynamics based on Neural Temporal Point Processes (TPP).
Our results show the overall superior performance of the proposed model compared to state-of-the-art approaches.
arXiv Detail & Related papers (2024-10-30T19:22:38Z) - Diffusion-Refined VQA Annotations for Semi-Supervised Gaze Following [74.30960564603917]
Training gaze following models requires a large number of images with gaze target coordinates annotated by human annotators.
We propose the first semi-supervised method for gaze following by introducing two novel priors to the task.
Our method outperforms simple pseudo-annotation generation baselines on the GazeFollow image dataset.
arXiv Detail & Related papers (2024-06-04T20:43:26Z) - Modeling State Shifting via Local-Global Distillation for Event-Frame Gaze Tracking [61.44701715285463]
This paper tackles the problem of passive gaze estimation using both event and frame data.
We reformulate gaze estimation as quantifying the state shift from the current state to several previously registered anchor states.
To improve the generalization ability, instead of learning a large gaze estimation network directly, we align a group of local experts with a student network.
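The expert-to-student alignment above is essentially a knowledge distillation step. A minimal PyTorch sketch of prediction-level distillation under assumed (B, 2) pitch/yaw outputs follows; the simple weighted-average target is an illustration, not the paper's actual local-global scheme.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_out, expert_outs, weights=None):
    """Pull the student's gaze prediction toward a (weighted) average of
    several frozen local experts' predictions.

    student_out: (B, 2) pitch/yaw from the student network.
    expert_outs: list of (B, 2) predictions from the local experts.
    weights: optional per-expert weights; uniform averaging if None.
    """
    experts = torch.stack(expert_outs)                 # (E, B, 2)
    if weights is None:
        target = experts.mean(dim=0)
    else:
        w = torch.tensor(weights, dtype=experts.dtype).view(-1, 1, 1)
        target = (w * experts).sum(dim=0) / w.sum()
    return F.mse_loss(student_out, target.detach())    # stop expert gradients
```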
arXiv Detail & Related papers (2024-03-31T03:30:37Z) - 3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from Synthetic Views [67.00931529296788]
We propose to train general gaze estimation models which can be directly employed in novel environments without adaptation.
We create a large-scale dataset of diverse faces with gaze pseudo-annotations, which we extract based on the 3D geometry of the scene.
We test our method in the task of gaze generalization, in which we demonstrate improvement of up to 30% compared to state-of-the-art when no ground truth data are available.
arXiv Detail & Related papers (2022-12-06T14:15:17Z) - LatentGaze: Cross-Domain Gaze Estimation through Gaze-Aware Analytic Latent Code Manipulation [0.0]
We propose a gaze-aware analytic manipulation method based on a data-driven approach that exploits the disentanglement characteristics of generative adversarial network (GAN) inversion.
By utilizing a GAN-based encoder-generator process, we shift the input image from the target domain to the source domain, with which the gaze estimator is sufficiently familiar.
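The encoder-generator shift reduces to a two-step GAN-inversion pipeline. Below is a hypothetical sketch where `encoder` and `generator` are stand-ins for pretrained networks; the gaze-aware latent manipulation that makes the method work is deliberately omitted here.

```python
import torch

def shift_to_source_domain(image, encoder, generator):
    """Hypothetical GAN-inversion domain shift: invert a target-domain image
    into the latent space of a source-domain generator, then regenerate it
    so the downstream gaze estimator sees a familiar distribution."""
    with torch.no_grad():
        latent = encoder(image)      # inversion: image -> latent code
        return generator(latent)     # regeneration in the source domain
```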
arXiv Detail & Related papers (2022-09-21T08:05:53Z) - Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z) - Eye Gaze Estimation Model Analysis [2.4366811507669124]
We discuss various model types for eye gaze estimation and present the results from predicting gaze direction using eye landmarks in unconstrained settings.
In unconstrained real-world settings, feature-based and model-based methods are outperformed by recent appearance-based methods, which cope better with illumination changes and other visual artifacts.
arXiv Detail & Related papers (2022-07-28T20:40:03Z) - GazeOnce: Real-Time Multi-Person Gaze Estimation [18.16091280655655]
Appearance-based gaze estimation aims to predict the 3D eye gaze direction from a single image.
Recent deep learning-based approaches have demonstrated excellent performance, but cannot output multi-person gaze in real time.
We propose GazeOnce, which is capable of simultaneously predicting gaze directions for multiple faces in an image.
arXiv Detail & Related papers (2022-04-20T14:21:47Z) - Effect Of Personalized Calibration On Gaze Estimation Using Deep-Learning [10.815594142396497]
We train a convolutional neural network and analyse its performance with and without calibration.
This evaluation provides clear insights into how calibration improves the performance of the deep learning model when estimating gaze in the wild.
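The paper above fine-tunes a CNN with calibration samples; a lighter and widely used alternative is to fit a small per-person correction on top of a frozen estimator. A minimal sketch, assuming an affine correction in pitch/yaw space:

```python
import numpy as np

def fit_affine_calibration(pred, true):
    """Fit a per-person affine correction y = A x + b from a handful of
    calibration samples, leaving the network itself untouched.

    pred, true: (N, 2) arrays of predicted and ground-truth (pitch, yaw).
    Returns (A, b); apply the correction with `pred @ A.T + b`.
    """
    X = np.hstack([pred, np.ones((len(pred), 1))])   # add a bias column
    W, *_ = np.linalg.lstsq(X, true, rcond=None)     # least-squares fit
    return W[:2].T, W[2]
```

With only a few calibration samples, such a low-capacity correction is less prone to overfitting than fine-tuning the whole network.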
arXiv Detail & Related papers (2021-09-27T05:14:12Z) - Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be reduced to an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z) - 360-Degree Gaze Estimation in the Wild Using Multiple Zoom Scales [26.36068336169795]
We develop a model that mimics humans' ability to estimate gaze by aggregating information from multiple focused looks.
The model avoids the need to extract clear eye patches.
We extend the model to handle the challenging task of 360-degree gaze estimation.
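The aggregation of focused looks can be emulated by running one estimator on progressively tighter center crops and averaging the predictions. A hedged PyTorch sketch, where `model` is any estimator mapping a (B, 3, H, W) face tensor to (B, 2) pitch/yaw:

```python
import torch
import torch.nn.functional as F

def multiscale_gaze(model, face, zooms=(1.0, 1.5, 2.0)):
    """Run the estimator on center crops at several zoom levels, resize each
    crop back to the input resolution, and average the predictions."""
    _, _, h, w = face.shape
    preds = []
    for z in zooms:
        ch, cw = int(h / z), int(w / z)
        top, left = (h - ch) // 2, (w - cw) // 2
        crop = face[:, :, top:top + ch, left:left + cw]
        crop = F.interpolate(crop, size=(h, w), mode='bilinear',
                             align_corners=False)
        preds.append(model(crop))
    return torch.stack(preds).mean(dim=0)  # aggregate over zoom scales
```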
arXiv Detail & Related papers (2020-09-15T08:45:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.