Simulating Human Gaze with Neural Visual Attention
- URL: http://arxiv.org/abs/2211.12100v1
- Date: Tue, 22 Nov 2022 09:02:09 GMT
- Title: Simulating Human Gaze with Neural Visual Attention
- Authors: Leo Schwinn, Doina Precup, Bjoern Eskofier and Dario Zanca
- Abstract summary: We propose the Neural Visual Attention (NeVA) algorithm to integrate guidance of any downstream visual task into attention modeling.
We observe that biologically constrained neural networks generate human-like scanpaths without being trained for this objective.
- Score: 44.65733084492857
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing models of human visual attention are generally unable to incorporate
direct task guidance and therefore cannot model an intent or goal when
exploring a scene. To integrate guidance of any downstream visual task into
attention modeling, we propose the Neural Visual Attention (NeVA) algorithm. To
this end, we impose on neural networks the biological constraint of foveated
vision and train an attention mechanism to generate visual explorations that
maximize the performance with respect to the downstream task. We observe that
biologically constrained neural networks generate human-like scanpaths without
being trained for this objective. Extensive experiments on three common
benchmark datasets show that our method outperforms state-of-the-art
unsupervised human attention models in generating human-like scanpaths.
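The foveation constraint and task-driven exploration described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the Gaussian foveation mask, the mean-filled "blurred" periphery, the greedy grid search (used here in place of a trained attention mechanism), and every function and parameter name below are assumptions made for illustration only.

```python
import numpy as np

def gaussian_mask(shape, fixation, sigma):
    """Weights close to 1 at the fixation, decaying towards 0 in the periphery."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    fy, fx = fixation
    d2 = (ys - fy) ** 2 + (xs - fx) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def foveate(image, blurred, mask):
    """Blend the sharp image (where mask is high) with a degraded periphery."""
    return mask * image + (1.0 - mask) * blurred

def greedy_scanpath(image, task_loss, n_fixations=3, sigma=4.0, stride=4):
    """Pick each fixation on a coarse grid so that the hypothetical
    downstream task loss on the foveated percept is minimized."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    # Crude stand-in for a blurred periphery: the global mean intensity.
    blurred = np.full_like(image, image.mean())
    candidates = [(y, x) for y in range(0, h, stride) for x in range(0, w, stride)]
    memory = np.zeros((h, w))  # accumulates what earlier fixations revealed
    scanpath = []
    for _ in range(n_fixations):
        def loss_at(fix):
            mask = np.maximum(memory, gaussian_mask((h, w), fix, sigma))
            return task_loss(foveate(image, blurred, mask))
        best = min(candidates, key=loss_at)
        memory = np.maximum(memory, gaussian_mask((h, w), best, sigma))
        scanpath.append(best)
    return scanpath

# Toy usage: the "task" is reconstructing an image with one bright pixel.
image = np.zeros((16, 16))
image[12, 4] = 1.0
path = greedy_scanpath(image, lambda percept: np.mean((percept - image) ** 2))
```

In this toy setting the loss is a plain reconstruction error; in the setting the abstract describes, it would be the loss of a downstream vision model evaluated on the foveated input.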
Related papers
- HINT: Learning Complete Human Neural Representations from Limited Viewpoints [69.76947323932107]
We propose a NeRF-based algorithm able to learn a detailed and complete human model from limited viewing angles.
As a result, our method can reconstruct complete humans even from a few viewing angles, increasing performance by more than 15% PSNR.
arXiv Detail & Related papers (2024-05-30T05:43:09Z)
- Contrastive Language-Image Pretrained Models are Zero-Shot Human Scanpath Predictors [2.524526956420465]
CapMIT1003 is a database of captions and click-contingent image explorations collected during captioning tasks.
NevaClip is a novel zero-shot method for predicting visual scanpaths.
arXiv Detail & Related papers (2023-05-21T07:24:50Z)
- An Inter-observer consistent deep adversarial training for visual scanpath prediction [66.46953851227454]
We propose an inter-observer consistent adversarial training approach for scanpath prediction through a lightweight deep neural network.
We show the competitiveness of our approach in regard to state-of-the-art methods.
arXiv Detail & Related papers (2022-11-14T13:22:29Z)
- BI AVAN: Brain inspired Adversarial Visual Attention Network [67.05560966998559]
We propose a brain-inspired adversarial visual attention network (BI-AVAN) to characterize human visual attention directly from functional brain activity.
Our model imitates the biased competition process between attention-related/neglected objects to identify and locate the visual objects in a movie frame the human brain focuses on in an unsupervised manner.
arXiv Detail & Related papers (2022-10-27T22:20:36Z)
- Guiding Visual Attention in Deep Convolutional Neural Networks Based on Human Eye Movements [0.0]
Deep Convolutional Neural Networks (DCNNs) were originally inspired by principles of biological vision.
Recent advances in deep learning seem to decrease this similarity.
We investigate a purely data-driven approach to obtain useful models.
arXiv Detail & Related papers (2022-06-21T17:59:23Z)
- Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises [7.689542442882423]
We designed a dual-stream vision model inspired by the human brain.
This model features retina-like input layers and includes two streams: one determining the next point of focus (the fixation), while the other interprets the visuals surrounding the fixation.
We evaluated this model against various benchmarks in terms of object recognition, gaze behavior and adversarial robustness.
arXiv Detail & Related papers (2022-06-15T03:44:42Z)
- Behind the Machine's Gaze: Biologically Constrained Neural Networks Exhibit Human-like Visual Attention [40.878963450471026]
We propose the Neural Visual Attention (NeVA) algorithm to generate visual scanpaths in a top-down manner.
We show that the proposed method outperforms state-of-the-art unsupervised human attention models in terms of similarity to human scanpaths.
arXiv Detail & Related papers (2022-04-19T18:57:47Z)
- Overcoming the Domain Gap in Neural Action Representations [60.47807856873544]
3D pose data can now be reliably extracted from multi-view video sequences without manual intervention.
We propose to use it to guide the encoding of neural action representations together with a set of neural and behavioral augmentations.
To reduce the domain gap, during training, we swap neural and behavioral data across animals that seem to be performing similar actions.
arXiv Detail & Related papers (2021-12-02T12:45:46Z)
- Neural encoding with visual attention [17.020869686284165]
We propose a novel approach to neural encoding by including a trainable soft-attention module.
We find that attention locations estimated by the model on independent data agree well with the corresponding eye fixation patterns.
arXiv Detail & Related papers (2020-10-01T16:04:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.