Gravitational Models Explain Shifts on Human Visual Attention
- URL: http://arxiv.org/abs/2009.06963v1
- Date: Tue, 15 Sep 2020 10:12:41 GMT
- Title: Gravitational Models Explain Shifts on Human Visual Attention
- Authors: Dario Zanca, Marco Gori, Stefano Melacci, Alessandra Rufa
- Abstract summary: Visual attention refers to the human brain's ability to select relevant sensory information for preferential processing.
Various methods to estimate saliency have been proposed in the last three decades.
We propose a gravitational model (GRAV) to describe the attentional shifts.
- Score: 80.76475913429357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual attention refers to the human brain's ability to select relevant
sensory information for preferential processing, improving performance in
visual and cognitive tasks. It proceeds in two phases: one in which visual feature maps are acquired and processed in parallel, and another where the information from these maps is merged in order to select a single location to be attended for further and more complex computations and reasoning. Its
computational description is challenging, especially if the temporal dynamics
of the process are taken into account. Numerous methods to estimate saliency
have been proposed in the last three decades. They achieve almost perfect
performance in estimating saliency at the pixel level, but the way they
generate shifts in visual attention fully depends on winner-take-all (WTA)
circuitry. WTA is implemented} by the biological hardware in order to select a
location with maximum saliency, towards which to direct overt attention. In
this paper we propose a gravitational model (GRAV) to describe the attentional
shifts. Every single feature acts as an attractor and {the shifts are the
result of the joint effects of the attractors. In the current framework, the
assumption of a single, centralized saliency map is no longer necessary, though
still plausible. Quantitative results on two large image datasets show that
this model predicts shifts more accurately than winner-take-all.
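The two mechanisms can be contrasted in a minimal sketch (illustrative only, not the authors' implementation: the saliency-as-mass assignment, inverse-square law, and step size below are assumptions). WTA jumps the gaze to the global saliency maximum, while the gravitational shift follows the net pull of all features acting as attractors:

```python
import numpy as np

def wta_shift(saliency):
    """Winner-take-all: move gaze to the location of maximum saliency."""
    return np.unravel_index(np.argmax(saliency), saliency.shape)

def grav_shift(saliency, gaze, step=1.0, eps=1e-6):
    """Gravitational sketch: every pixel is an attractor whose 'mass' is its
    saliency; the gaze moves one step along the net gravitational force."""
    ys, xs = np.indices(saliency.shape)
    d = np.stack([ys - gaze[0], xs - gaze[1]], axis=-1).astype(float)
    r2 = (d ** 2).sum(axis=-1) + eps                      # squared distance to each attractor
    force = (saliency / r2)[..., None] * d / np.sqrt(r2)[..., None]  # inverse-square pull
    net = force.sum(axis=(0, 1))                          # joint effect of all attractors
    new_gaze = np.asarray(gaze, float) + step * net / (np.linalg.norm(net) + eps)
    return tuple(np.clip(new_gaze, 0, np.array(saliency.shape) - 1))

sal = np.random.rand(64, 64)                              # stand-in saliency map
print(wta_shift(sal), grav_shift(sal, gaze=(32, 32)))
```

Note how no single centralized maximum is ever computed in `grav_shift`: the shift emerges from the joint field of all attractors, which is the sense in which a single saliency map becomes unnecessary.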
Related papers
- GaitContour: Efficient Gait Recognition based on a Contour-Pose Representation [38.39173742709181]
Gait recognition holds the promise to robustly identify subjects based on walking patterns instead of appearance information.
In this work, we propose a novel, point-based Contour-Pose representation, which compactly expresses both body shape and body parts information.
We further propose a local-to-global architecture, called GaitContour, to leverage this novel representation.
arXiv Detail & Related papers (2023-11-27T17:06:25Z)
- What Makes Pre-Trained Visual Representations Successful for Robust Manipulation? [57.92924256181857]
We find that visual representations designed for manipulation and control tasks do not necessarily generalize under subtle changes in lighting and scene texture.
We find that emergent segmentation ability is a strong predictor of out-of-distribution generalization among ViT models.
arXiv Detail & Related papers (2023-11-03T18:09:08Z)
- Calculating and Visualizing Counterfactual Feature Importance Values [0.0]
Counterfactual explanations have surged as one potential solution to explain individual decision results.
Two major drawbacks directly impact their usability: (1) the isonomic view of feature changes, in which it is not possible to observe how much each modified feature influences the prediction, and (2) the lack of graphical resources to visualize the counterfactual explanation.
We introduce Counterfactual Feature (change) Importance (CFI) values as a solution: a way of assigning an importance value to each feature change in a given counterfactual explanation.
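As a rough sketch of that idea (the revert-one-change attribution rule below is an assumption, not necessarily the paper's exact definition), each changed feature can be scored by how much the prediction moves when that single change is undone:

```python
import numpy as np

def cfi_values(model_proba, x, x_cf):
    """Assign an importance value to each feature change in a counterfactual.
    Sketch: importance of feature i = prediction shift lost when feature i is
    reverted from its counterfactual value back to its original value."""
    p_cf = model_proba(x_cf)
    importances = {}
    for i in range(len(x)):
        if x[i] != x_cf[i]:                    # only features the counterfactual changed
            x_rev = x_cf.copy()
            x_rev[i] = x[i]                    # undo this single change
            importances[i] = p_cf - model_proba(x_rev)
    return importances

# toy linear "model" returning a probability-like score
w = np.array([0.8, -0.5, 0.3])
model = lambda v: 1 / (1 + np.exp(-(v @ w)))
x, x_cf = np.array([0.0, 1.0, 0.0]), np.array([1.0, 1.0, 0.5])
print(cfi_values(model, x, x_cf))              # e.g. {0: ..., 2: ...}
```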
arXiv Detail & Related papers (2023-06-10T18:54:15Z)
- Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z)
- DisenHCN: Disentangled Hypergraph Convolutional Networks for Spatiotemporal Activity Prediction [53.76601630407521]
We propose a hypergraph network model called DisenHCN to bridge the gaps in existing solutions.
In particular, we first unify fine-grained user similarity and the complex matching between user preferences and spatiotemporal activity into a heterogeneous hypergraph.
We then disentangle the user representations into different aspects (location-aware, time-aware, and activity-aware) and aggregate each aspect's features on the constructed hypergraph.
arXiv Detail & Related papers (2022-08-14T06:51:54Z)
- Understanding Character Recognition using Visual Explanations Derived from the Human Visual System and Deep Networks [6.734853055176694]
We examine the congruence, or lack thereof, between the information-gathering strategies of deep neural networks and those of the human visual system.
For correctly classified characters, the deep learning model considers character regions similar to those that humans fixate on.
We propose to use the visual fixation maps obtained from the eye-tracking experiment as a supervisory input to align the model's focus on relevant character regions.
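A minimal sketch of such fixation supervision, assuming a KL-divergence alignment term and a weighting factor `lam` (both illustrative choices, not necessarily the paper's objective):

```python
import numpy as np

def fixation_alignment_loss(attn_map, fixation_map, eps=1e-8):
    """Sketch of a supervisory term aligning model attention with human
    fixations: KL divergence between the two maps treated as distributions."""
    p = fixation_map / (fixation_map.sum() + eps)   # human fixation density
    q = attn_map / (attn_map.sum() + eps)           # model attention density
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def total_loss(ce_loss, attn_map, fixation_map, lam=0.1):
    """Classification loss plus the fixation-alignment penalty (lam assumed)."""
    return ce_loss + lam * fixation_alignment_loss(attn_map, fixation_map)

attn = np.random.rand(7, 7)          # e.g. a Grad-CAM-style attention map
fix = np.random.rand(7, 7)           # eye-tracking fixation density, same grid
print(total_loss(ce_loss=0.42, attn_map=attn, fixation_map=fix))
```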
arXiv Detail & Related papers (2021-08-10T10:09:37Z)
- Understanding top-down attention using task-oriented ablation design [0.22940141855172028]
Top-down attention allows neural networks, both artificial and biological, to focus on the information most relevant for a given task.
We aim to answer how it helps with a computational experiment based on a general framework called task-oriented ablation design.
We compare the performance of two neural networks, one with top-down attention and one without.
arXiv Detail & Related papers (2021-06-08T21:01:47Z)
- Robust Person Re-Identification through Contextual Mutual Boosting [77.1976737965566]
We propose the Contextual Mutual Boosting Network (CMBN), which localizes pedestrians and recalibrates features by effectively exploiting contextual information and statistical inference.
Experiments on the benchmarks demonstrate the superiority of the architecture compared to the state-of-the-art.
arXiv Detail & Related papers (2020-09-16T06:33:35Z)
- Leveraging the Self-Transition Probability of Ordinal Pattern Transition Graph for Transportation Mode Classification [0.0]
We propose the use of a feature derived from the Ordinal Pattern Transition Graph, the self-transition probability, for transportation mode classification.
The proposed feature presents better accuracy results than Permutation Entropy and Statistical Complexity, even when these two are combined.
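As a rough sketch (the embedding dimension and delay below are assumed parameters), the self-transition probability is the fraction of consecutive windows whose ordinal patterns, i.e. the permutations that sort them, coincide:

```python
import numpy as np

def ordinal_patterns(series, dim=3, delay=1):
    """Map each window of the series to its ordinal pattern: the permutation
    of indices that sorts the window's values."""
    n = len(series) - (dim - 1) * delay
    return [tuple(np.argsort(series[i:i + dim * delay:delay])) for i in range(n)]

def self_transition_probability(series, dim=3, delay=1):
    """Probability that consecutive windows share the same ordinal pattern,
    i.e. the weight of the self-loops in the ordinal pattern transition graph."""
    pats = ordinal_patterns(series, dim, delay)
    same = sum(a == b for a, b in zip(pats, pats[1:]))
    return same / max(len(pats) - 1, 1)

# toy signals: a smooth trend repeats its pattern; noise rarely does
t = np.linspace(0, 4 * np.pi, 200)
print(self_transition_probability(np.sin(t)))            # close to 1
print(self_transition_probability(np.random.rand(200)))  # much lower
```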
arXiv Detail & Related papers (2020-07-16T23:25:09Z)
- Orientation Attentive Robotic Grasp Synthesis with Augmented Grasp Map Representation [62.79160608266713]
Morphological characteristics in objects may offer a wide range of plausible grasping orientations that obfuscates the visual learning of robotic grasping.
Existing grasp generation approaches are cursed to construct discontinuous grasp maps by aggregating annotations for drastically different orientations per grasping point.
We propose a novel augmented grasp map representation, suitable for pixel-wise synthesis, that locally disentangles grasping orientations by partitioning the angle space into multiple bins.
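A minimal sketch of the binning idea (the number of bins and the [0, π) wrap-around convention are assumptions): each pixel's orientation annotation contributes to one per-bin map, so no single map has to mix drastically different angles.

```python
import numpy as np

def angle_to_bin(theta, n_bins=6):
    """Partition the grasp orientation space [0, pi) into n_bins bins;
    angles wrap because a grasp at theta equals a grasp at theta + pi."""
    theta = np.mod(theta, np.pi)                       # fold onto [0, pi)
    return (theta / (np.pi / n_bins)).astype(int) % n_bins

def build_binned_grasp_maps(angle_map, quality_map, n_bins=6):
    """Scatter per-pixel grasp quality into per-orientation-bin maps, so each
    map stays locally smooth instead of becoming discontinuous."""
    h, w = angle_map.shape
    maps = np.zeros((n_bins, h, w))
    bins = angle_to_bin(angle_map, n_bins)
    for b in range(n_bins):
        maps[b][bins == b] = quality_map[bins == b]
    return maps

angles = np.random.uniform(-np.pi, np.pi, (32, 32))    # toy angle annotations
quality = np.random.rand(32, 32)                       # toy grasp quality
print(build_binned_grasp_maps(angles, quality).shape)  # (6, 32, 32)
```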
arXiv Detail & Related papers (2020-06-09T08:54:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.