What Do Deep Saliency Models Learn about Visual Attention?
- URL: http://arxiv.org/abs/2310.09679v1
- Date: Sat, 14 Oct 2023 23:15:57 GMT
- Title: What Do Deep Saliency Models Learn about Visual Attention?
- Authors: Shi Chen, Ming Jiang, Qi Zhao
- Abstract summary: We present a novel analytic framework that sheds light on the implicit features learned by saliency models.
Our approach decomposes these implicit features into interpretable bases that are explicitly aligned with semantic attributes.
- Score: 28.023464783469738
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, deep saliency models have made significant progress in
predicting human visual attention. However, the mechanisms behind their success
remain largely unexplained due to the opaque nature of deep neural networks. In
this paper, we present a novel analytic framework that sheds light on the
implicit features learned by saliency models and provides principled
interpretation and quantification of their contributions to saliency
prediction. Our approach decomposes these implicit features into interpretable
bases that are explicitly aligned with semantic attributes and reformulates
saliency prediction as a weighted combination of probability maps connecting
the bases and saliency. By applying our framework, we conduct extensive
analyses from various perspectives, including the positive and negative weights
of semantics, the impact of training data and architectural designs, the
progressive influences of fine-tuning, and common failure patterns of
state-of-the-art deep saliency models. Additionally, we demonstrate the
effectiveness of our framework by exploring visual attention characteristics in
various application scenarios, such as the atypical attention of people with
autism spectrum disorder, attention to emotion-eliciting stimuli, and attention
evolution over time. Our code is publicly available at
https://github.com/szzexpoi/saliency_analysis.
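The abstract's reformulation, saliency as a weighted combination of probability maps over interpretable bases, can be pictured with a minimal sketch. The tensor shapes, the softmax normalization, and the function name saliency_from_bases below are illustrative assumptions rather than the authors' implementation; the actual code lives in the repository linked above.
```python
import torch

def saliency_from_bases(features, bases, weights):
    """Combine per-basis probability maps into a saliency map.

    features: (C, H, W) implicit feature maps from a deep saliency model
    bases:    (K, C)    interpretable bases aligned with semantic attributes
    weights:  (K,)      learned positive/negative contribution of each basis
    returns:  (H, W)    predicted saliency map
    """
    C, H, W = features.shape
    flat = features.reshape(C, H * W)                 # (C, H*W)
    activations = bases @ flat                        # (K, H*W): attribute response per location
    prob_maps = torch.softmax(activations, dim=-1)    # probability map per basis over locations
    saliency = (weights[:, None] * prob_maps).sum(0)  # weighted combination of probability maps
    return saliency.reshape(H, W)

# Toy usage with random tensors
feats, basis_set, w = torch.randn(256, 32, 32), torch.randn(16, 256), torch.randn(16)
print(saliency_from_bases(feats, basis_set, w).shape)  # torch.Size([32, 32])
```
Under this kind of decomposition, the sign and magnitude of each weight directly quantify how much a semantic attribute promotes or suppresses saliency, which is the handle the paper's analyses (positive/negative semantics, fine-tuning effects, failure patterns) rely on.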
Related papers
- SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques [4.013324399289249]
We present a novel taxonomy of model merging techniques organized by their core algorithmic principles.
We distill repeated empirical observations from the literature in these fields into characterizations of four major aspects of loss landscape geometry.
arXiv Detail & Related papers (2024-10-16T18:14:05Z)
- A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the transferability of adversarial examples.
arXiv Detail & Related papers (2023-10-26T17:45:26Z)
- Causal Analysis for Robust Interpretability of Neural Networks [0.2519906683279152]
We develop a robust intervention-based method to capture cause-effect mechanisms in pre-trained neural networks.
We apply our method to vision models trained on classification tasks.
arXiv Detail & Related papers (2023-05-15T18:37:24Z)
- Study of Distractors in Neural Models of Code [4.043200001974071]
Finding important features that contribute to the prediction of neural models is an active area of research in explainable AI.
In this work, we present an inverse perspective of distractor features: features that cast doubt on the prediction by affecting the model's confidence in its prediction.
Our experiments across various tasks, models, and datasets of code reveal that the removal of tokens can have a significant impact on the confidence of models in their predictions.
arXiv Detail & Related papers (2023-03-03T06:54:01Z)
- Learnable Visual Words for Interpretable Image Recognition [70.85686267987744]
We propose Learnable Visual Words (LVW) to interpret model prediction behaviors with two novel modules.
The semantic visual words learning relaxes the category-specific constraint, enabling general visual words to be shared across different categories.
Our experiments on six visual benchmarks demonstrate the superior effectiveness of our proposed LVW in both accuracy and model interpretation.
arXiv Detail & Related papers (2022-05-22T03:24:45Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Variational Structured Attention Networks for Deep Visual Representation Learning [49.80498066480928]
We propose a unified deep framework to jointly learn both spatial attention maps and channel attention in a principled manner.
Specifically, we integrate the estimation and the interaction of the attentions within a probabilistic representation learning framework.
We implement the inference rules within the neural network, thus allowing for end-to-end learning of the probabilistic and the CNN front-end parameters.
arXiv Detail & Related papers (2021-03-05T07:37:24Z)
- Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
- Proactive Pseudo-Intervention: Causally Informed Contrastive Learning for Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene on, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)