Predicting Eye Gaze Location on Websites
- URL: http://arxiv.org/abs/2211.08074v1
- Date: Tue, 15 Nov 2022 11:55:46 GMT
- Title: Predicting Eye Gaze Location on Websites
- Authors: Ciheng Zhang, Decky Aspandi, Steffen Staab
- Abstract summary: We propose an effective deep learning-based model that leverages the spatial locations of both images and text.
We show the benefit of careful fine-tuning using our unified dataset to improve the accuracy of eye-gaze predictions.
- Score: 4.8633100732964705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The World Wide Web, with websites and webpages as its main
interface, facilitates the dissemination of important information. Hence it is
crucial to optimize them for better user interaction, which is primarily done
by analyzing users' behavior, especially their eye-gaze locations. However,
gathering these data is still labor- and time-intensive. In this work, we
enable the development of automatic eye-gaze estimation given a website
screenshot as input. We do this by curating a unified dataset that consists of
website screenshots, eye-gaze heatmaps, and website layout information in the
form of image and text masks. Our pre-processed dataset allows us to propose
an effective deep learning-based model that leverages the spatial locations of
both images and text, combined through an attention mechanism, for effective
eye-gaze prediction. In our experiments, we show the benefit of careful
fine-tuning on our unified dataset to improve the accuracy of eye-gaze
predictions. We further observe that our model is able to focus on the
targeted areas (images and text) to achieve high accuracy. Finally, a
comparison with alternative approaches shows that our model achieves
state-of-the-art results, establishing a benchmark for the eye-gaze prediction
task.
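As a rough illustration of the fusion step described in the abstract (a minimal sketch, not the authors' published architecture; all layer names and sizes below are assumptions), screenshot features can be modulated by an attention map derived from the image/text layout masks:

```python
# Hedged sketch: fusing screenshot features with image/text layout masks via
# spatial attention for gaze-heatmap prediction. Layer sizes, names, and the
# overall layout are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn

class GazeFusionNet(nn.Module):
    def __init__(self, feat_ch: int = 32):
        super().__init__()
        # Encode the RGB screenshot into a feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        # Turn the 2-channel layout masks (image mask, text mask)
        # into a single spatial attention map in [0, 1].
        self.attn = nn.Sequential(
            nn.Conv2d(2, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, 1, 1), nn.Sigmoid(),
        )
        # Decode the attended features into a 1-channel gaze heatmap.
        self.decoder = nn.Conv2d(feat_ch, 1, 1)

    def forward(self, screenshot, masks):
        feats = self.encoder(screenshot)   # (B, C, H, W)
        attention = self.attn(masks)       # (B, 1, H, W)
        attended = feats * attention       # emphasize image/text regions
        return torch.sigmoid(self.decoder(attended))

# Usage with dummy tensors (batch of one 256x256 screenshot).
model = GazeFusionNet()
heatmap = model(torch.rand(1, 3, 256, 256), torch.rand(1, 2, 256, 256))
print(heatmap.shape)  # torch.Size([1, 1, 256, 256])
```

The sigmoid attention map acts as a soft spatial gate: regions flagged by the layout masks are emphasized before the gaze heatmap is decoded.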
Related papers
- Data Augmentation via Latent Diffusion for Saliency Prediction [67.88936624546076]
Saliency prediction models are constrained by the limited diversity and quantity of labeled data.
We propose a novel data augmentation method for deep saliency prediction that edits natural images while preserving the complexity and variability of real-world scenes.
arXiv Detail & Related papers (2024-09-11T14:36:24Z)
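As loose background on the augmentation idea in the entry above (the paper's exact editing procedure is not described here), a latent-diffusion image-to-image pass at low strength perturbs a photo while keeping its overall scene structure; the model id, prompt, and strength are assumptions:

```python
# Hedged sketch: perturbing a natural image with a latent-diffusion img2img
# pass at low strength, so scene layout is preserved while appearance varies.
# Model id, prompt, and strength are assumptions, not the paper's pipeline.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init = Image.open("scene.jpg").convert("RGB").resize((512, 512))
# Low strength keeps the edit close to the original image.
augmented = pipe(prompt="a photo", image=init, strength=0.3).images[0]
augmented.save("scene_augmented.jpg")
```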
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
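The overlap measurement mentioned in the Sim2Real entry above could look roughly like this (a sketch; the choice of PCA and a nearest-neighbor distance as the overlap proxy are assumptions, not the paper's exact measure):

```python
# Hedged sketch: gauging the sim-to-real gap by projecting real and synthetic
# eye images into a shared low-dimensional space and comparing the two point
# clouds. The metric here (mean nearest-neighbor distance) is an assumption.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def domain_overlap(real_imgs: np.ndarray, synth_imgs: np.ndarray) -> float:
    """real_imgs, synth_imgs: arrays of shape (N, H*W) of flattened images."""
    pca = PCA(n_components=2).fit(np.vstack([real_imgs, synth_imgs]))
    real_2d = pca.transform(real_imgs)
    synth_2d = pca.transform(synth_imgs)
    # Average distance from each real point to its nearest synthetic point:
    # smaller values suggest the synthetic data covers the real distribution.
    nn = NearestNeighbors(n_neighbors=1).fit(synth_2d)
    dist, _ = nn.kneighbors(real_2d)
    return float(dist.mean())

# Dummy example with random 32x32 grayscale "images".
rng = np.random.default_rng(0)
print(domain_overlap(rng.random((100, 1024)), rng.random((100, 1024))))
```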
- SeeBel: Seeing is Believing [0.9790236766474201]
We propose three visualizations that enable users to compare dataset statistics and AI performance for segmenting all images.
Our project tries to further increase the interpretability of the trained AI model for segmentation by visualizing its image attention weights.
We propose to conduct surveys on real users to study the efficacy of our visualization tool in the computer vision and AI domains.
arXiv Detail & Related papers (2023-12-18T05:11:00Z)
- 3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from Synthetic Views [67.00931529296788]
We propose to train general gaze estimation models which can be directly employed in novel environments without adaptation.
We create a large-scale dataset of diverse faces with gaze pseudo-annotations, which we extract based on the 3D geometry of the scene.
We test our method on the task of gaze generalization, where we demonstrate an improvement of up to 30% over the state of the art when no ground-truth data are available.
arXiv Detail & Related papers (2022-12-06T14:15:17Z)
- RAZE: Region Guided Self-Supervised Gaze Representation Learning [5.919214040221055]
RAZE is a Region-guided self-supervised gAZE representation learning framework that leverages non-annotated facial image data.
Ize-Net is a capsule-layer-based CNN architecture that can efficiently capture rich eye representations.
arXiv Detail & Related papers (2022-08-04T06:23:49Z)
- Gaze Estimation with Eye Region Segmentation and Self-Supervised Multistream Learning [8.422257363944295]
We present a novel multistream network that learns robust eye representations for gaze estimation.
We first use a simulator to create a synthetic dataset containing eye region masks that detail the visible eyeball and iris.
We then perform eye region segmentation with a U-Net type model which we later use to generate eye region masks for real-world images.
arXiv Detail & Related papers (2021-12-15T04:44:45Z)
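A minimal U-Net-style segmenter of the kind mentioned in the entry above might look as follows (a sketch assuming grayscale inputs and three classes for background, eyeball, and iris; the paper's actual configuration is not given here):

```python
# Hedged sketch: a minimal U-Net-style model for eye-region segmentation.
# Channel counts, depth, and the 3-class labeling are assumptions.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    def __init__(self, n_classes: int = 3):
        super().__init__()
        self.down = conv_block(1, 16)   # grayscale eye image in
        self.pool = nn.MaxPool2d(2)
        self.mid = conv_block(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.out = nn.Sequential(conv_block(32, 16), nn.Conv2d(16, n_classes, 1))

    def forward(self, x):
        skip = self.down(x)                         # full-resolution features
        mid = self.mid(self.pool(skip))             # downsampled bottleneck
        up = self.up(mid)                           # back to full resolution
        return self.out(torch.cat([up, skip], 1))   # skip connection, logits

model = TinyUNet()
logits = model(torch.rand(1, 1, 64, 64))
print(logits.shape)  # torch.Size([1, 3, 64, 64]) - per-pixel class logits
```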
- HighlightMe: Detecting Highlights from Human-Centric Videos [52.84233165201391]
We present a domain- and user-preference-agnostic approach to detect highlightable excerpts from human-centric videos.
We use an autoencoder network equipped with spatial-temporal graph convolutions to detect human activities and interactions.
We observe a 4-12% improvement over state-of-the-art methods in the mean average precision of matching human-annotated highlights.
arXiv Detail & Related papers (2021-10-05T01:18:15Z)
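For background on the spatial-temporal graph convolutions mentioned in the HighlightMe entry, here is one common minimal formulation (joints as graph nodes, frames as the temporal axis); the shapes and normalization are assumptions, not the paper's layer:

```python
# Hedged sketch: one spatial-temporal graph-convolution layer for human pose
# sequences. Spatial step aggregates over the joint graph; temporal step
# mixes each joint's features across frames. Shapes are assumptions.
import torch
import torch.nn as nn

class STGraphConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, n_joints: int):
        super().__init__()
        self.spatial = nn.Linear(in_ch, out_ch)   # per-node feature map
        # Grouped 1D conv mixes each joint's features across frames.
        self.temporal = nn.Conv1d(out_ch * n_joints, out_ch * n_joints,
                                  kernel_size=3, padding=1, groups=n_joints)
        self.n_joints, self.out_ch = n_joints, out_ch

    def forward(self, x, adj):
        # x: (batch, frames, joints, in_ch); adj: (joints, joints), normalized.
        b, t, j, _ = x.shape
        x = torch.einsum('ij,btjc->btic', adj, x)   # spatial aggregation
        x = torch.relu(self.spatial(x))             # (b, t, j, out_ch)
        x = x.permute(0, 2, 3, 1).reshape(b, j * self.out_ch, t)
        x = self.temporal(x)                        # mix across frames
        return x.reshape(b, j, self.out_ch, t).permute(0, 3, 1, 2)

layer = STGraphConv(in_ch=2, out_ch=8, n_joints=5)
adj = torch.eye(5)                                  # toy skeleton graph
out = layer(torch.rand(4, 10, 5, 2), adj)
print(out.shape)  # torch.Size([4, 10, 5, 8])
```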
- Towards End-to-end Video-based Eye-Tracking [50.0630362419371]
Estimating eye-gaze from images alone is a challenging task due to unobservable person-specific factors.
We propose a novel dataset and accompanying method which aims to explicitly learn these semantic and temporal relationships.
We demonstrate that fusing information from visual stimuli as well as eye images can achieve performance similar to figures reported in the literature.
arXiv Detail & Related papers (2020-07-26T12:39:15Z)
- Spatio-Temporal Graph for Video Captioning with Knowledge Distillation [50.034189314258356]
We propose a graph model for video captioning that exploits object interactions in space and time.
Our model builds interpretable links and is able to provide explicit visual grounding.
To avoid correlations caused by the variable number of objects, we propose an object-aware knowledge distillation mechanism.
arXiv Detail & Related papers (2020-03-31T03:58:11Z)
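As general background on the distillation mentioned in the last entry, the sketch below shows the standard soft-target loss (the classic Hinton-style recipe, not the paper's object-aware variant):

```python
# Hedged sketch: generic knowledge distillation, where a student is
# regularized toward a teacher's tempered predictions. This is the standard
# soft-target loss, not the paper's object-aware mechanism.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    # Soft targets: KL divergence against the teacher's tempered distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction='batchmean',
    ) * temperature ** 2
    # Hard targets: usual cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: 4 samples, 10 classes.
s, t = torch.randn(4, 10), torch.randn(4, 10)
print(distillation_loss(s, t, torch.tensor([1, 3, 0, 7])))
```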
This list is automatically generated from the titles and abstracts of the papers on this site.