Is Attention Interpretation? A Quantitative Assessment On Sets
- URL: http://arxiv.org/abs/2207.13018v1
- Date: Tue, 26 Jul 2022 16:25:38 GMT
- Title: Is Attention Interpretation? A Quantitative Assessment On Sets
- Authors: Jonathan Haab, Nicolas Deutschmann, Maria Rodríguez Martínez
- Abstract summary: We study the interpretability of attention in the context of set machine learning.
We find that attention distributions are indeed often reflective of the relative importance of individual instances.
We propose to use ensembling to minimize the risk of misleading attention-based explanations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The debate around the interpretability of attention mechanisms is centered on
whether attention scores can be used as a proxy for the relative amounts of
signal carried by sub-components of data. We propose to study the
interpretability of attention in the context of set machine learning, where
each data point is composed of an unordered collection of instances with a
global label. For classical multiple-instance-learning problems and simple
extensions, there is a well-defined "importance" ground truth that can be
leveraged to cast interpretation as a binary classification problem, which we
can quantitatively evaluate. By building synthetic datasets over several data
modalities, we perform a systematic assessment of attention-based
interpretations. We find that attention distributions are indeed often
reflective of the relative importance of individual instances, but that silent
failures occur where a model achieves high classification performance while its
attention patterns do not align with expectations. Based on these
observations, we propose to use ensembling to minimize the risk of misleading
attention-based explanations.
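Since the abstract casts interpretation as a binary classification problem, a small worked example may help. The following Python sketch is not the authors' code: the bag, the ground-truth importance labels, the attention weights, and the simulated ensemble are all invented for illustration. It scores a bag's attention weights against a known instance-importance ground truth with ROC AUC, and averages attention maps across models as the ensembling step:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical bag of 10 instances: instances 2 and 7 carry the class
# signal (importance ground truth 1), the rest are background (0), as in
# a classic multiple-instance-learning setup.
importance = np.array([0, 0, 1, 0, 0, 0, 0, 1, 0, 0])

# Attention weights over the bag, e.g. the softmax output of an
# attention-pooling layer (values invented for illustration).
attention = np.array([0.02, 0.03, 0.40, 0.05, 0.04,
                      0.03, 0.02, 0.32, 0.05, 0.04])

# Interpretation as binary classification: treat each attention score as
# a classifier score for "this instance is important" and measure how
# well attention ranks signal instances above background ones.
print(f"single-model AUC: {roc_auc_score(importance, attention):.3f}")

# Ensembling sketch: average attention maps from several independently
# trained models (simulated here by perturbing one map) to reduce the
# risk that a single model's misleading attention drives the explanation.
ensemble = np.stack([attention + rng.normal(0.0, 0.02, size=10)
                     for _ in range(5)]).mean(axis=0)
print(f"ensembled AUC:   {roc_auc_score(importance, ensemble):.3f}")
```

A threshold-free ranking metric such as AUC suits this setup because attention weights are normalized scores rather than calibrated probabilities: an attention map that ranks every signal instance above every background instance scores 1.0 regardless of how peaked the softmax is.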
Related papers
- Perturbation-based Self-supervised Attention for Attention Bias in Text Classification [31.144857032681905]
We propose a perturbation-based self-supervised attention approach to guide attention learning.
We add as much noise as possible to every word in the sentence without changing its semantics or the model's predictions.
Experimental results on three text classification tasks show that our approach can significantly improve the performance of current attention-based models.
arXiv Detail & Related papers (2023-05-25T03:18:18Z)
- Guiding Visual Question Answering with Attention Priors [76.21671164766073]
We propose to guide the attention mechanism using explicit linguistic-visual grounding.
This grounding is derived by connecting structured linguistic concepts in the query to their referents among the visual objects.
The resultant algorithm is capable of probing attention-based reasoning models, injecting relevant associative knowledge, and regulating the core reasoning process.
arXiv Detail & Related papers (2022-05-25T09:53:47Z)
- Resolving label uncertainty with implicit posterior models [71.62113762278963]
We propose a method for jointly inferring labels across a collection of data samples.
By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs.
arXiv Detail & Related papers (2022-02-28T18:09:44Z)
- Learning to Detect Instance-level Salient Objects Using Complementary Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem.
We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z)
- Improve the Interpretability of Attention: A Fast, Accurate, and Interpretable High-Resolution Attention Model [6.906621279967867]
We propose a novel Bilinear Representative Non-Parametric Attention (BR-NPA) strategy that captures the task-relevant human-interpretable information.
The proposed model can be easily adapted to a wide variety of modern deep models that involve classification.
It is also more accurate and faster, and has a smaller memory footprint, than typical neural attention modules.
arXiv Detail & Related papers (2021-06-04T15:57:37Z)
- Is Sparse Attention more Interpretable? [52.85910570651047]
We investigate how sparsity affects our ability to use attention as an explainability tool.
We find that, under sparse attention, only a weak relationship exists between inputs and co-indexed intermediate representations.
In this setting, inducing sparsity may make it less plausible that attention can be used as a tool for understanding model behavior.
arXiv Detail & Related papers (2021-06-02T11:42:56Z)
- Disambiguation of weak supervision with exponential convergence rates [88.99819200562784]
In weakly supervised learning, data are annotated with incomplete yet discriminative information.
In this paper, we focus on partial labelling, an instance of weak supervision where, from a given input, we are given a set of potential targets.
We propose an empirical disambiguation algorithm to recover full supervision from weak supervision.
arXiv Detail & Related papers (2021-02-04T18:14:32Z)
- Why Attentions May Not Be Interpretable? [46.69116768203185]
Recent research has found that attention-as-importance interpretations often do not work as expected.
We show that one root cause of this phenomenon is shortcuts, which means that the attention weights themselves may carry extra information.
We propose two methods to mitigate this issue.
arXiv Detail & Related papers (2020-06-10T05:08:30Z)
- Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization [86.45110800123216]
In the task of text summarization, salience estimation for words, phrases or sentences is a critical component.
We propose a Multi-Attention Learning framework which contains two new attention learning components for salience estimation.
arXiv Detail & Related papers (2020-04-07T02:38:56Z)