State-of-the-Art in Human Scanpath Prediction
- URL: http://arxiv.org/abs/2102.12239v1
- Date: Wed, 24 Feb 2021 12:01:28 GMT
- Title: State-of-the-Art in Human Scanpath Prediction
- Authors: Matthias Kümmerer, Matthias Bethge
- Abstract summary: We evaluate models based on how well they predict each fixation in a scanpath given the previous scanpath history.
This makes model evaluation closely aligned with the biological processes thought to underlie scanpath generation.
We evaluate many existing models of scanpath prediction on the datasets MIT1003, MIT300, CAT2000 train and CAT2000 test.
- Score: 22.030889583780514
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The last years have seen a surge in models predicting the scanpaths of
fixations made by humans when viewing images. However, the field is lacking a
principled comparison of those models with respect to their predictive power.
In the past, models have usually been evaluated based on comparing human
scanpaths to scanpaths generated from the model. Here, instead we evaluate
models based on how well they predict each fixation in a scanpath given the
previous scanpath history. This makes model evaluation closely aligned with the
biological processes thought to underlie scanpath generation and allows applying
established saliency metrics like AUC and NSS in an intuitive and interpretable
way. We evaluate many existing models of scanpath prediction on the datasets
MIT1003, MIT300, CAT2000 train and CAT2000 test, for the first time giving a
detailed picture of the current state of the art of human scanpath prediction.
We also show that the discussed method of model benchmarking allows for more
detailed analyses leading to interesting insights about where and when models
fail to predict human behaviour. The MIT/Tuebingen Saliency Benchmark will
implement the evaluation of scanpath models as detailed here, allowing
researchers to score their models on the established benchmark datasets MIT300
and CAT2000.
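The per-fixation evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the benchmark's implementation: `toy_conditional_map` is a hypothetical stand-in model (a Gaussian around the previous fixation, or around the image center for the first fixation), and the AUC variant shown treats all other pixels as negatives, which is an assumption about the metric's details.

```python
import numpy as np

def nss(prediction, y, x):
    """Normalized Scanpath Saliency: z-score the map, read it at the fixation."""
    z = (prediction - prediction.mean()) / prediction.std()
    return float(z[y, x])

def auc(prediction, y, x):
    """Single-fixation AUC, assuming every other pixel is a negative.

    Equals the fraction of other pixels scored below the fixated pixel,
    with ties counted as one half.
    """
    value = prediction[y, x]
    below = np.sum(prediction < value)
    ties = np.sum(prediction == value) - 1  # exclude the fixated pixel itself
    return float((below + 0.5 * ties) / (prediction.size - 1))

def toy_conditional_map(shape, history, sigma=3.0):
    """Hypothetical model: Gaussian around the last fixation (or image center)."""
    h, w = shape
    cy, cx = history[-1] if history else (h // 2, w // 2)
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def evaluate_scanpath(shape, scanpath, model=toy_conditional_map):
    """Score each fixation against the model's prediction given the history."""
    scores = []
    for i, (y, x) in enumerate(scanpath):
        prediction = model(shape, scanpath[:i])  # only earlier fixations are visible
        scores.append((nss(prediction, y, x), auc(prediction, y, x)))
    return scores

scores = evaluate_scanpath((20, 20), [(10, 10), (11, 12), (2, 18)])
```

Each entry in `scores` is an (NSS, AUC) pair for one fixation, so averages over fixations, or breakdowns by fixation index, can be computed afterwards.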
Related papers
- Unified Dynamic Scanpath Predictors Outperform Individually Trained Neural Models [18.327960366321655]
We develop a deep learning-based social cue integration model for saliency prediction to predict scanpaths in videos.
We evaluate our approach on gaze of dynamic social scenes, observed under the free-viewing condition.
Results indicate that a single unified model, trained on all the observers' scanpaths, performs on par or better than individually trained models.
arXiv Detail & Related papers (2024-05-05T13:15:11Z)
- EyeFormer: Predicting Personalized Scanpaths with Transformer-Guided Reinforcement Learning [31.583764158565916]
We present EyeFormer, a machine learning model for predicting scanpaths in a visual user interface.
Our model has the unique capability of producing personalized predictions when given a few user scanpath samples.
It can predict full scanpath information, including fixation positions and duration, across individuals and various stimulus types.
arXiv Detail & Related papers (2024-04-15T22:26:27Z)
- JRDB-Traj: A Dataset and Benchmark for Trajectory Forecasting in Crowds [79.00975648564483]
Trajectory forecasting models, employed in fields such as robotics, autonomous vehicles, and navigation, face challenges in real-world scenarios.
This dataset provides comprehensive data, including the locations of all agents, scene images, and point clouds, all from the robot's perspective.
The objective is to predict the future positions of agents relative to the robot using raw sensory input data.
arXiv Detail & Related papers (2023-11-05T18:59:31Z)
- Scanpath Prediction in Panoramic Videos via Expected Code Length Minimization [27.06179638588126]
We present a new criterion for scanpath prediction based on principles from lossy data compression.
This criterion suggests minimizing the expected code length of quantized scanpaths in a training set.
We also introduce a proportional-integral-derivative (PID) controller-based sampler to generate realistic human-like scanpaths.
arXiv Detail & Related papers (2023-05-04T04:10:47Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Zero-shot Model Diagnosis [80.36063332820568]
A common approach to evaluating deep learning models is to build a labeled test set with attributes of interest and assess how well the model performs on it.
This paper argues that Zero-shot Model Diagnosis (ZOOM) is possible without a test set or labeling.
arXiv Detail & Related papers (2023-03-27T17:59:33Z)
- TempSAL -- Uncovering Temporal Information for Deep Saliency Prediction [64.63645677568384]
We introduce a novel saliency prediction model that learns to output saliency maps in sequential time intervals.
Our approach locally modulates the saliency predictions by combining the learned temporal maps.
Our code will be publicly available on GitHub.
arXiv Detail & Related papers (2023-01-05T22:10:16Z)
- MRCLens: an MRC Dataset Bias Detection Toolkit [82.44296974850639]
We introduce MRCLens, a toolkit that detects whether biases exist before users train the full model.
For the convenience of introducing the toolkit, we also provide a categorization of common biases in MRC.
arXiv Detail & Related papers (2022-07-18T21:05:39Z)
- A Probabilistic Time-Evolving Approach to Scanpath Prediction [8.669748138523758]
We present a probabilistic time-evolving approach to scanpath prediction, based on Bayesian deep learning.
Our model yields results that outperform those of current state-of-the-art approaches, and are almost on par with the human baseline.
arXiv Detail & Related papers (2022-04-20T11:50:29Z)
- A Simple and efficient deep Scanpath Prediction [6.294759639481189]
We explore the efficiency of using common deep learning architectures, in a simple fully convolutional regressive manner.
We test how well these models can predict scanpaths on 2 datasets.
We also compare the different leveraged backbone architectures based on their performances on the experiment to deduce which ones are the most suitable for the task.
arXiv Detail & Related papers (2021-12-08T22:43:45Z)
- LoopReg: Self-supervised Learning of Implicit Surface Correspondences, Pose and Shape for 3D Human Mesh Registration [123.62341095156611]
LoopReg is an end-to-end learning framework to register a corpus of scans to a common 3D human model.
A backward map, parameterized by a Neural Network, predicts the correspondence from every scan point to the surface of the human model.
A forward map, parameterized by a human model, transforms the corresponding points back to the scan based on the model parameters (pose and shape).
arXiv Detail & Related papers (2020-10-23T14:39:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.