ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head
Pose and Gaze Variation
- URL: http://arxiv.org/abs/2007.15837v1
- Date: Fri, 31 Jul 2020 04:15:53 GMT
- Title: ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head
Pose and Gaze Variation
- Authors: Xucong Zhang and Seonwook Park and Thabo Beeler and Derek Bradley and
Siyu Tang and Otmar Hilliges
- Abstract summary: ETH-XGaze is a new gaze estimation dataset consisting of over one million high-resolution images of varying gaze under extreme head poses.
We show that our dataset can significantly improve the robustness of gaze estimation methods across different head poses and gaze angles.
- Score: 52.5465548207648
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaze estimation is a fundamental task in many applications of computer
vision, human computer interaction and robotics. Many state-of-the-art methods
are trained and tested on custom datasets, making comparison across methods
challenging. Furthermore, existing gaze estimation datasets have limited head
pose and gaze variations, and the evaluations are conducted using different
protocols and metrics. In this paper, we propose a new gaze estimation dataset
called ETH-XGaze, consisting of over one million high-resolution images of
varying gaze under extreme head poses. We collect this dataset from 110
participants with a custom hardware setup including 18 digital SLR cameras and
adjustable illumination conditions, and a calibrated system to record ground
truth gaze targets. We show that our dataset can significantly improve the
robustness of gaze estimation methods across different head poses and gaze
angles. Additionally, we define a standardized experimental protocol and
evaluation metric on ETH-XGaze, to better unify gaze estimation research going
forward. The dataset and benchmark website are available at
https://ait.ethz.ch/projects/2020/ETH-XGaze
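For reference, below is a minimal sketch of the mean angular-error metric that gaze benchmarks such as ETH-XGaze report. The pitch/yaw-to-vector sign convention is one common choice and an assumption here; this is not the official benchmark code.

```python
import numpy as np

def pitchyaw_to_vector(pitchyaw: np.ndarray) -> np.ndarray:
    """Convert (N, 2) gaze angles in radians (pitch, yaw) to (N, 3) unit vectors.
    The sign convention below is one common choice, not the official one."""
    pitch, yaw = pitchyaw[:, 0], pitchyaw[:, 1]
    return np.stack([
        -np.cos(pitch) * np.sin(yaw),  # x
        -np.sin(pitch),                # y
        -np.cos(pitch) * np.cos(yaw),  # z
    ], axis=1)

def mean_angular_error(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean angle in degrees between predicted and ground-truth gaze directions."""
    a, b = pitchyaw_to_vector(pred), pitchyaw_to_vector(gt)
    cos = np.clip(np.sum(a * b, axis=1), -1.0, 1.0)  # rows are unit vectors
    return float(np.degrees(np.arccos(cos)).mean())

# A constant 5-degree yaw offset yields ~5 degrees of angular error.
gt = np.zeros((4, 2))
pred = gt + np.array([0.0, np.radians(5.0)])
print(mean_angular_error(pred, gt))  # ~5.0
```

Since predictions and ground truth go through the same conversion, the reported angle is insensitive to the particular sign convention.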
Related papers
- Merging Multiple Datasets for Improved Appearance-Based Gaze Estimation [10.682719521609743]
The Two-stage Transformer-based Gaze-feature Fusion (TTGF) method uses transformers to merge information from each eye and the face separately, and then merges across the two eyes (see the fusion sketch after this list).
The proposed Gaze Adaptation Module (GAM) handles annotation inconsistency by applying a per-dataset adaptation module that corrects the gaze estimates of a single shared estimator.
arXiv Detail & Related papers (2024-09-02T02:51:40Z)
- Semi-Synthetic Dataset Augmentation for Application-Specific Gaze Estimation [0.3683202928838613]
We show how to generate a three-dimensional mesh of the face and render training images from a virtual camera at a position and orientation specific to the application.
This leads to an average 47% decrease in gaze estimation angular error.
arXiv Detail & Related papers (2023-10-27T20:27:22Z)
- A Large Scale Homography Benchmark [52.55694707744518]
We present Pi3D (Planes in 3D), a large-scale dataset of roughly 1,000 planes observed in 10,000 images from the 1DSfM dataset.
We also present HEB, a large-scale homography estimation benchmark leveraging Pi3D.
arXiv Detail & Related papers (2023-02-20T14:18:09Z)
- Towards Precision in Appearance-based Gaze Estimation in the Wild [3.4253416336476246]
We present a large gaze estimation dataset, PARKS-Gaze, with wider head pose and illumination variation.
The proposed dataset is more challenging and enables models to generalize to unseen participants better than existing in-the-wild datasets.
arXiv Detail & Related papers (2023-02-05T10:09:35Z)
- NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation [37.977032771941715]
We propose a novel Head-Eye redirection parametric model based on Neural Radiance Field.
Our model can decouple the face and eyes for separate neural rendering.
This makes it possible to control the attributes of the face, identity, illumination, and eye gaze direction separately.
arXiv Detail & Related papers (2022-12-30T13:52:28Z)
- 3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from Synthetic Views [67.00931529296788]
We propose to train general gaze estimation models which can be directly employed in novel environments without adaptation.
We create a large-scale dataset of diverse faces with gaze pseudo-annotations, which we extract based on the 3D geometry of the scene.
We test our method in the task of gaze generalization, in which we demonstrate improvement of up to 30% compared to state-of-the-art when no ground truth data are available.
arXiv Detail & Related papers (2022-12-06T14:15:17Z)
- Gaze Estimation with an Ensemble of Four Architectures [116.53389064096139]
We train several gaze estimators based on four different network architectures.
We select the best six estimators and ensemble their predictions through a linear combination (see the ensemble sketch after this list).
The method ranks first on the leaderboard of the ETH-XGaze Competition, achieving an average angular error of $3.11^\circ$ on the ETH-XGaze test set.
arXiv Detail & Related papers (2021-07-05T12:40:26Z)
- 360-Degree Gaze Estimation in the Wild Using Multiple Zoom Scales [26.36068336169795]
We develop a model that mimics humans' ability to estimate gaze by aggregating information from multiple focused looks (see the multi-zoom sketch after this list).
The model avoids the need to extract clear eye patches.
We extend the model to handle the challenging task of 360-degree gaze estimation.
arXiv Detail & Related papers (2020-09-15T08:45:12Z)
- Towards End-to-end Video-based Eye-Tracking [50.0630362419371]
Estimating eye gaze from images alone is a challenging task due to unobservable person-specific factors.
We propose a novel dataset and accompanying method which aims to explicitly learn these semantic and temporal relationships.
We demonstrate that fusing information from visual stimuli and eye images can achieve performance comparable to literature-reported figures.
arXiv Detail & Related papers (2020-07-26T12:39:15Z)
- Speak2Label: Using Domain Knowledge for Creating a Large Scale Driver Gaze Zone Estimation Dataset [55.391532084304494]
The Driver Gaze in the Wild dataset contains 586 recordings from 338 subjects aged 18-63 years, captured at different times of day, including evenings.
arXiv Detail & Related papers (2020-04-13T14:47:34Z)
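As referenced in the TTGF entry above, here is a hedged sketch of two-stage transformer-based gaze-feature fusion: a first transformer merges each eye's features with the face features, and a second merges across the two eyes. The module layout, feature dimension, and pooling are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TwoStageGazeFusion(nn.Module):
    """Stage 1 fuses each eye with the face; stage 2 fuses across the eyes."""
    def __init__(self, dim: int = 128):
        super().__init__()
        def encoder():
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            return nn.TransformerEncoder(layer, num_layers=2)
        self.eye_face_fusion = encoder()   # stage 1
        self.cross_eye_fusion = encoder()  # stage 2
        self.head = nn.Linear(dim, 2)      # predict (pitch, yaw)

    def forward(self, left_eye, right_eye, face):
        # Inputs: (B, dim) features from some per-patch backbone (assumed given).
        def fuse_with_face(eye):
            tokens = torch.stack([eye, face], dim=1)         # (B, 2, dim)
            return self.eye_face_fusion(tokens).mean(dim=1)  # (B, dim)
        eyes = torch.stack([fuse_with_face(left_eye),
                            fuse_with_face(right_eye)], dim=1)  # (B, 2, dim)
        return self.head(self.cross_eye_fusion(eyes).mean(dim=1))  # (B, 2)

model = TwoStageGazeFusion()
f = lambda: torch.randn(4, 128)
print(model(f(), f(), f()).shape)  # torch.Size([4, 2])
```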
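As referenced in the ensemble entry above, a minimal sketch of combining several gaze estimators through a linear combination of their predictions. Fitting the weights by least squares on held-out data is an assumption for illustration, not necessarily the authors' procedure.

```python
import numpy as np

def fit_ensemble_weights(preds: np.ndarray, gt: np.ndarray) -> np.ndarray:
    """preds: (K, N, 2) pitch/yaw from K estimators; gt: (N, 2).
    Returns K weights minimizing the squared error of the weighted sum."""
    K, N, D = preds.shape
    A = preds.transpose(1, 2, 0).reshape(N * D, K)  # one column per estimator
    w, *_ = np.linalg.lstsq(A, gt.reshape(N * D), rcond=None)
    return w

def ensemble_predict(preds: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Linearly combine the K estimators' predictions."""
    return np.tensordot(w, preds, axes=(0, 0))  # (N, 2)

# Toy example: six noisy copies of the same ground truth.
rng = np.random.default_rng(0)
gt = rng.normal(size=(100, 2))
preds = gt[None] + 0.1 * rng.normal(size=(6, 100, 2))
w = fit_ensemble_weights(preds, gt)
print(w.round(2), ensemble_predict(preds, w).shape)
```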
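As referenced in the multi-zoom entry above, a small sketch of aggregating a single-image gaze estimator over several zoom levels of the same face image. The crop scales and plain averaging are illustrative assumptions.

```python
import numpy as np

def center_crop(img: np.ndarray, scale: float) -> np.ndarray:
    """Crop a centered window covering `scale` of each image side."""
    h, w = img.shape[:2]
    ch, cw = int(h * scale), int(w * scale)
    y0, x0 = (h - ch) // 2, (w - cw) // 2
    return img[y0:y0 + ch, x0:x0 + cw]

def multi_zoom_gaze(img, estimator, scales=(1.0, 0.75, 0.5)):
    """Run a gaze estimator on several zoom levels and average (pitch, yaw)."""
    preds = [estimator(center_crop(img, s)) for s in scales]
    return np.mean(preds, axis=0)

# Toy usage with a stand-in estimator that ignores the pixels.
dummy = lambda crop: np.array([0.1, -0.2])
print(multi_zoom_gaze(np.zeros((224, 224, 3)), dummy))  # [ 0.1 -0.2]
```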
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.