Improving saliency models' predictions of the next fixation with humans'
intrinsic cost of gaze shifts
- URL: http://arxiv.org/abs/2207.04250v3
- Date: Sun, 18 Feb 2024 11:18:38 GMT
- Title: Improving saliency models' predictions of the next fixation with humans'
intrinsic cost of gaze shifts
- Authors: Florian Kadner, Tobias Thomas, David Hoppe and Constantin A. Rothkopf
- Abstract summary: We develop a principled framework for predicting the next gaze target and the empirical measurement of the human cost for gaze.
We provide an implementation of human gaze preferences, which can be used to improve arbitrary saliency models' predictions of humans' next gaze targets.
- Score: 6.315366433343492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The human prioritization of image regions can be modeled in a time-invariant
fashion with saliency maps or sequentially with scanpath models. However, while
both types of models have steadily improved on several benchmarks and datasets,
there is still a considerable gap in predicting human gaze. Here, we leverage
two recent developments to reduce this gap: theoretical analyses establishing a
principled framework for predicting the next gaze target and the empirical
measurement of the human cost for gaze switches independently of image content.
We introduce an algorithm in the framework of sequential decision making, which
converts any static saliency map into a sequence of dynamic history-dependent
value maps, which are recomputed after each gaze shift. These maps are based on
1) a saliency map provided by an arbitrary saliency model, 2) the recently
measured human cost function quantifying preferences in magnitude and direction
of eye movements, and 3) a sequential exploration bonus, which changes with
each subsequent gaze shift. The parameters of the spatial extent and temporal
decay of this exploration bonus are estimated from human gaze data. The
relative contributions of these three components were optimized on the MIT1003
dataset for the NSS score and are sufficient to significantly outperform
predictions of the next gaze target on NSS and AUC scores for five
state-of-the-art saliency models on three image datasets. Thus, we provide an
implementation of human gaze preferences, which can be used to improve
arbitrary saliency models' predictions of humans' next gaze targets.
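The three components described in the abstract can be sketched as a simple value-map update. This is a minimal illustration only: the weights, the form of the cost and bonus terms, and all parameter values below are assumptions, not the paper's measured cost function or fitted parameters.

```python
import numpy as np

def next_fixation(saliency, fixation_history, current_fix,
                  w_cost=0.3, w_bonus=0.5, sigma=20.0, decay=0.8):
    """Sketch: turn a static saliency map into a history-dependent
    value map and pick the next gaze target. All weights and
    functional forms are illustrative stand-ins."""
    h, w = saliency.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # 1) Gaze-shift cost: penalize large saccade amplitudes from the
    #    current fixation (a stand-in for the measured human cost
    #    function, which also depends on saccade direction).
    amp = np.hypot(ys - current_fix[0], xs - current_fix[1])
    cost = amp / amp.max()
    # 2) Exploration bonus: Gaussian inhibition around past fixations
    #    whose influence decays with each subsequent gaze shift.
    bonus = np.zeros_like(saliency)
    for k, (fy, fx) in enumerate(reversed(fixation_history)):
        d2 = (ys - fy) ** 2 + (xs - fx) ** 2
        bonus -= decay ** k * np.exp(-d2 / (2 * sigma ** 2))
    # 3) History-dependent value map, recomputed after each gaze shift.
    value = saliency - w_cost * cost + w_bonus * bonus
    return np.unravel_index(np.argmax(value), value.shape)
```

Because the cost and bonus terms depend on the current fixation and the fixation history, the value map changes after every gaze shift even though the underlying saliency map stays fixed, which is what turns a static saliency model into a scanpath predictor.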
Related papers
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
- Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation [7.545077734926115]
We propose a simple and novel deep learning model designed to estimate gaze from videos.
Our method employs a spatial attention mechanism that tracks spatial dynamics within videos.
Experimental results confirm the efficacy of the proposed approach, demonstrating its success in both within-dataset and cross-dataset settings.
arXiv Detail & Related papers (2024-04-08T06:07:32Z)
- TempSAL -- Uncovering Temporal Information for Deep Saliency Prediction [64.63645677568384]
We introduce a novel saliency prediction model that learns to output saliency maps in sequential time intervals.
Our approach locally modulates the saliency predictions by combining the learned temporal maps.
Our code will be publicly available on GitHub.
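The idea of modulating a per-image saliency map with maps learned for sequential time intervals can be sketched as follows (the function name, the uniform weighting, and the multiplicative combination are assumptions for illustration, not TempSAL's exact design):

```python
import numpy as np

def combine_temporal_maps(image_saliency, temporal_maps, weights=None):
    """Sketch: locally modulate a static saliency map with
    interval-wise temporal saliency maps."""
    t = len(temporal_maps)
    if weights is None:
        weights = np.full(t, 1.0 / t)  # uniform weighting by default
    # Weighted combination of the interval-wise maps...
    temporal = sum(w * m for w, m in zip(weights, temporal_maps))
    # ...applied as a multiplicative local modulation of the static map.
    combined = image_saliency * temporal
    # Renormalize to a valid saliency distribution.
    return combined / combined.sum()
```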
arXiv Detail & Related papers (2023-01-05T22:10:16Z)
- A generic diffusion-based approach for 3D human pose prediction in the wild [68.00961210467479]
3D human pose forecasting, i.e., predicting a sequence of future human 3D poses given a sequence of past observed ones, is a challenging spatio-temporal task.
We provide a unified formulation in which incomplete elements (whether in the prediction or the observation) are treated as noise, and propose a conditional diffusion model that denoises them and forecasts plausible poses.
We evaluate our approach on four standard datasets and obtain significant improvements over the state-of-the-art.
arXiv Detail & Related papers (2022-10-11T17:59:54Z)
- Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z)
- Learned Vertex Descent: A New Direction for 3D Human Model Fitting [64.04726230507258]
We propose a novel optimization-based paradigm for 3D human model fitting on images and scans.
Our approach is able to capture the underlying body of clothed people with very different body shapes, achieving a significant improvement compared to state-of-the-art.
LVD is also applicable to 3D model fitting of humans and hands, for which we show a significant improvement to the SOTA with a much simpler and faster method.
arXiv Detail & Related papers (2022-05-12T17:55:51Z)
- L2CS-Net: Fine-Grained Gaze Estimation in Unconstrained Environments [2.5234156040689237]
We propose a robust CNN-based model for predicting gaze in unconstrained settings.
We use two identical losses, one for each angle, to improve network learning and increase its generalization.
Our proposed model achieves state-of-the-art accuracy of 3.92° and 10.41° on the MPIIGaze and Gaze360 datasets, respectively.
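The "two identical losses, one for each angle" idea can be sketched minimally as below; plain MSE is used here as a stand-in loss, and the function name is hypothetical (L2CS-Net's actual per-angle loss is more elaborate):

```python
import numpy as np

def per_angle_loss(pred_yaw, pred_pitch, true_yaw, true_pitch):
    """Sketch: apply the same loss separately to each gaze angle
    rather than jointly to the angle pair."""
    loss_yaw = np.mean((pred_yaw - true_yaw) ** 2)
    loss_pitch = np.mean((pred_pitch - true_pitch) ** 2)
    # Identical losses, one per angle, summed into a single objective.
    return loss_yaw + loss_pitch
```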
arXiv Detail & Related papers (2022-03-07T12:35:39Z)
- An Adversarial Human Pose Estimation Network Injected with Graph Structure [75.08618278188209]
In this paper, we design a novel generative adversarial network (GAN) to improve the localization accuracy of visible joints when some joints are invisible.
The network consists of two simple but efficient modules: the Cascade Feature Network (CFN) and the Graph Structure Network (GSN).
arXiv Detail & Related papers (2021-03-29T12:07:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.