Improve the Interpretability of Attention: A Fast, Accurate, and
Interpretable High-Resolution Attention Model
- URL: http://arxiv.org/abs/2106.02566v2
- Date: Mon, 7 Jun 2021 10:25:01 GMT
- Title: Improve the Interpretability of Attention: A Fast, Accurate, and
Interpretable High-Resolution Attention Model
- Authors: Tristan Gomez, Suiyi Ling, Thomas Fréour, Harold Mouchère
- Abstract summary: We propose a novel Bilinear Representative Non-Parametric Attention (BR-NPA) strategy that captures task-relevant, human-interpretable information.
The proposed model can be easily adapted to a wide variety of modern deep models that involve classification.
It is also more accurate and faster than usual neural attention modules, with a smaller memory footprint.
- Score: 6.906621279967867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The prevalence of attention mechanisms has raised concerns about the
interpretability of attention distributions. Although attention provides insight
into how a model operates, using it to explain model predictions remains highly
dubious. The community is still seeking more interpretable strategies for
identifying the local active regions that contribute most to the final decision.
To improve the interpretability of existing attention models, we propose a novel
Bilinear Representative Non-Parametric Attention (BR-NPA) strategy that captures
task-relevant, human-interpretable information. The target model is first
distilled to have higher-resolution intermediate feature maps. From these maps,
representative features are grouped by local pairwise feature similarity to
produce finer-grained, more precise attention maps that highlight task-relevant
parts of the input. The resulting attention maps are ranked by the 'active
level' of the compound feature, which indicates the importance of the
highlighted regions. The proposed model can be easily adapted to a wide variety
of modern deep models that involve classification, and it is more accurate and
faster than usual neural attention modules, with a smaller memory footprint.
Extensive experiments showcase more comprehensive visual explanations than
state-of-the-art visualization models across multiple tasks, including few-shot
classification, person re-identification, and fine-grained image
classification. The proposed visualization model sheds light on how neural
networks 'pay attention' differently in different tasks.
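To make the grouping step concrete, the following is a minimal sketch of one plausible non-parametric grouping: pixels of an intermediate feature map are greedily clustered by cosine similarity, and each resulting attention map is ranked by the mean activity of its members. The function and parameter names (`npa_attention_maps`, `sim_thresh`, `n_maps`) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def npa_attention_maps(feat, sim_thresh=0.5, n_maps=3):
    """Greedy similarity grouping over a (C, H, W) feature map (illustrative)."""
    C, H, W = feat.shape
    flat = feat.reshape(C, -1).T                 # (H*W, C) per-pixel features
    unit = F.normalize(flat, dim=1)              # unit vectors for cosine similarity
    activity = flat.norm(dim=1)                  # per-pixel 'active level'
    unassigned = torch.ones(H * W, dtype=torch.bool)
    ranked = []
    for _ in range(n_maps):
        if not unassigned.any():
            break
        # Seed the group with the most active unassigned pixel, then absorb
        # every unassigned pixel whose similarity to the seed passes the threshold.
        seed = (activity * unassigned).argmax()
        sim = unit @ unit[seed]                  # (H*W,) cosine similarities
        member = (sim > sim_thresh) & unassigned
        attn = torch.zeros(H * W)
        attn[member] = sim[member]
        ranked.append((activity[member].mean().item(), attn.reshape(H, W)))
        unassigned &= ~member
    # Higher mean activity of the compound feature -> more important region.
    ranked.sort(key=lambda t: t[0], reverse=True)
    return [attn for _, attn in ranked]

# Toy usage on a random backbone feature map:
maps = npa_attention_maps(torch.randn(64, 14, 14))
print(len(maps), maps[0].shape)  # e.g. 3 torch.Size([14, 14])
```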
Related papers
- Locally Grouped and Scale-Guided Attention for Dense Pest Counting [1.9580473532948401]
This study introduces a new dense pest counting problem: predicting densely distributed pests captured by digital traps.
To address this problem, it is essential to incorporate a local attention mechanism.
This study presents a novel design that integrates locally grouped and scale-guided attention into a multiscale CenterNet framework.
arXiv Detail & Related papers (2024-08-29T13:02:01Z)
- Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
Existing literature addresses this challenge with local representation approaches.
This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z)
- Where to Look: A Unified Attention Model for Visual Recognition with Reinforcement Learning [5.247711598719703]
We propose to unify top-down and bottom-up attention for recurrent visual attention.
Our model exploits image pyramids and Q-learning to select regions of interest in the top-down attention mechanism.
We train our model in an end-to-end reinforcement learning framework, and evaluate our method on visual classification tasks.
arXiv Detail & Related papers (2021-11-13T18:44:50Z)
- Bayesian Attention Belief Networks [59.183311769616466]
Attention-based neural networks have achieved state-of-the-art results on a wide range of tasks.
This paper introduces Bayesian attention belief networks, which construct a decoder network by modeling unnormalized attention weights.
We show that our method outperforms deterministic attention and state-of-the-art stochastic attention in accuracy, uncertainty estimation, generalization across domains, and robustness to adversarial attacks.
arXiv Detail & Related papers (2021-06-09T17:46:22Z)
- Variational Structured Attention Networks for Deep Visual Representation Learning [49.80498066480928]
We propose a unified deep framework to jointly learn both spatial attention maps and channel attention in a principled manner.
Specifically, we integrate the estimation and interaction of the attention maps within a probabilistic representation-learning framework.
We implement the inference rules within the neural network, thus allowing for end-to-end learning of the probabilistic and the CNN front-end parameters.
arXiv Detail & Related papers (2021-03-05T07:37:24Z)
- Attention-gating for improved radio galaxy classification [0.0]
We introduce attention as a state-of-the-art mechanism for the classification of radio galaxies using convolutional neural networks.
We show how the selection of normalisation and aggregation methods used in attention-gating can affect the output of individual models.
The resulting attention maps can be used to interpret the classification choices made by the model.
arXiv Detail & Related papers (2020-12-02T14:49:53Z)
- Bayesian Attention Modules [65.52970388117923]
We propose a scalable version of attention that is easy to implement and optimize.
Our experiments show the proposed method brings consistent improvements over the corresponding baselines.
arXiv Detail & Related papers (2020-10-20T20:30:55Z)
- Attention Model Enhanced Network for Classification of Breast Cancer Image [54.83246945407568]
AMEN is formulated in a multi-branch fashion with a pixel-wise attention model and a classification submodule.
To focus on subtle detail information, the sample image is enhanced by the pixel-wise attention map generated by the former branch (see the sketch after this list).
Experiments conducted on three benchmark datasets demonstrate the superiority of the proposed method under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:44:21Z)
- Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method that optimizes and rapidly adapts the query sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z)
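As an illustration of the pixel-wise enhancement idea in the AMEN entry above, here is a minimal sketch; the residual `1 + attention` re-weighting and the tensor shapes are assumptions for illustration, not the paper's exact formulation.

```python
import torch

def enhance_with_attention(image, attn):
    # image: (B, 3, H, W) input batch; attn: (B, 1, H, W) pixel-wise
    # attention map in [0, 1] produced by a previous branch.
    # Residual re-weighting: attended pixels are amplified while the
    # original signal is preserved everywhere else.
    return image * (1.0 + attn)

# Toy usage with a random image and a sigmoid-squashed attention map:
img = torch.rand(2, 3, 224, 224)
attn = torch.sigmoid(torch.randn(2, 1, 224, 224))
enhanced = enhance_with_attention(img, attn)
print(enhanced.shape)  # torch.Size([2, 3, 224, 224])
```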
This list is automatically generated from the titles and abstracts of the papers on this site.