Why Attentions May Not Be Interpretable?
- URL: http://arxiv.org/abs/2006.05656v4
- Date: Thu, 3 Jun 2021 06:51:16 GMT
- Title: Why Attentions May Not Be Interpretable?
- Authors: Bing Bai, Jian Liang, Guanhua Zhang, Hao Li, Kun Bai, Fei Wang
- Abstract summary: Recent research found that attention-as-importance interpretations often do not work as expected.
We show that one root cause of this phenomenon is combinatorial shortcuts: the attention weights themselves may carry extra information that downstream models can exploit.
We propose two methods to mitigate this issue.
- Score: 46.69116768203185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attention-based methods have played important roles in model interpretations,
where the calculated attention weights are expected to highlight the critical
parts of inputs (e.g., keywords in sentences). However, recent research found
that attention-as-importance interpretations often do not work as we expected.
For example, learned attention weights sometimes highlight less meaningful
tokens like "[SEP]", ",", and ".", and are frequently uncorrelated with other
feature importance indicators like gradient-based measures. A recent debate
over whether attention is an explanation or not has drawn considerable
interest. In this paper, we demonstrate that one root cause of this phenomenon
is the combinatorial shortcuts, which means that, in addition to the
highlighted parts, the attention weights themselves may carry extra information
that could be utilized by downstream models after attention layers. As a
result, the attention weights are no longer pure importance indicators. We
theoretically analyze combinatorial shortcuts, design one intuitive experiment
to show their existence, and propose two methods to mitigate this issue. We
conduct empirical studies on attention-based interpretation models. The results
show that the proposed methods can effectively improve the interpretability of
attention mechanisms.
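To make the combinatorial shortcut concrete, here is a minimal toy sketch (our own illustration, not the paper's code): the token content is identically distributed across classes, yet a linear probe on the attention output still recovers the label, because the attention weights themselves encode it.

```python
# Toy illustration of a combinatorial shortcut (our construction, not the paper's code):
# token content carries no label information, but the *pattern* of attention weights does,
# so a downstream linear probe still separates the classes from the attention output.
import numpy as np

rng = np.random.default_rng(0)
n, seq_len, dim = 2000, 4, 8

values = rng.normal(size=(n, seq_len, dim))        # token content: same distribution for both classes
labels = rng.integers(0, 2, size=n)

# A leaky attention module: it routes almost all mass to position 0 for class 0
# and to position 1 for class 1 (e.g., because the query correlates with the label).
attn = np.full((n, seq_len), 1e-3)
attn[np.arange(n), labels] = 1.0
attn /= attn.sum(axis=1, keepdims=True)

# Append one-hot positional markers to each token before attention pooling.
pos = np.tile(np.eye(seq_len), (n, 1, 1))          # (n, seq_len, seq_len)
tokens = np.concatenate([values, pos], axis=-1)
pooled = (attn[:, :, None] * tokens).sum(axis=1)   # attention-weighted sum, (n, dim + seq_len)

# Linear probe: fit on the first half, test on the second half.
X = np.concatenate([pooled, np.ones((n, 1))], axis=1)
w, *_ = np.linalg.lstsq(X[:1000], labels[:1000] * 2.0 - 1.0, rcond=None)
acc = ((X[1000:] @ w > 0) == labels[1000:]).mean()
print(f"probe accuracy with label-free token content: {acc:.2f}")  # close to 1.0
```

Because the information travels through the weights rather than through the highlighted content, the weights stop being pure importance indicators.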
Related papers
- Attention Meets Post-hoc Interpretability: A Mathematical Perspective [6.492879435794228]
We mathematically study a simple attention-based architecture and pinpoint the differences between post-hoc and attention-based explanations.
We show that they provide quite different results, and that, despite their limitations, post-hoc methods are capable of capturing more useful insights than merely examining the attention weights.
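As an illustration of the kind of comparison involved (a hedged sketch, not the paper's construction), one can contrast raw attention weights with a simple post-hoc gradient-times-input attribution on a single-head attention pooler:

```python
# Hedged sketch (not the paper's construction): compare raw attention weights with a
# post-hoc gradient-times-input attribution on a minimal single-head attention pooler.
import torch

torch.manual_seed(0)
seq_len, dim = 6, 16

query = torch.randn(dim)                       # attention scoring vector
head = torch.nn.Linear(dim, 1)                 # downstream classifier head
x = torch.randn(seq_len, dim, requires_grad=True)

attn = torch.softmax(x @ query, dim=0)         # attention-based explanation
logit = head(attn @ x)
logit.backward()

grad_x_input = (x.grad * x).sum(dim=-1)        # post-hoc explanation, one score per token

print("attention weights:", attn.detach().numpy().round(3))
print("gradient x input :", grad_x_input.detach().numpy().round(3))
# The two rankings frequently disagree, which is the gap such analyses make precise.
```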
arXiv Detail & Related papers (2024-02-05T19:56:56Z)
- Is Attention Interpretation? A Quantitative Assessment On Sets [0.0]
We study the interpretability of attention in the context of set machine learning.
We find that attention distributions are indeed often reflective of the relative importance of individual instances.
We propose to use ensembling to minimize the risk of misleading attention-based explanations.
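A minimal sketch of the ensembling idea, assuming a simple attention pooler over set instances (the specific architecture is our assumption, not the paper's):

```python
# Hedged sketch of the ensembling idea: average the per-instance attention distributions
# of several independently trained (here: independently initialized) set models, and use
# their disagreement to flag unreliable attention-based explanations.
import torch

torch.manual_seed(0)
dim, set_size, n_models = 8, 5, 10

def make_scorer():
    # Minimal per-instance scoring network for attention pooling over a set.
    return torch.nn.Sequential(torch.nn.Linear(dim, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))

instances = torch.randn(set_size, dim)                       # one input set
ensemble = [make_scorer() for _ in range(n_models)]

with torch.no_grad():
    attn_per_model = torch.stack(
        [torch.softmax(s(instances).squeeze(-1), dim=0) for s in ensemble]
    )                                                        # (n_models, set_size)

mean_attn = attn_per_model.mean(dim=0)                       # ensembled importance estimate
spread = attn_per_model.std(dim=0)                           # high spread = low trust
print("ensembled attention:", mean_attn.numpy().round(3))
print("per-instance spread:", spread.numpy().round(3))
```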
arXiv Detail & Related papers (2022-07-26T16:25:38Z)
- Guiding Visual Question Answering with Attention Priors [76.21671164766073]
We propose to guide the attention mechanism using explicit linguistic-visual grounding.
This grounding is derived by connecting structured linguistic concepts in the query to their referents among the visual objects.
The resultant algorithm is capable of probing attention-based reasoning models, injecting relevant associative knowledge, and regulating the core reasoning process.
arXiv Detail & Related papers (2022-05-25T09:53:47Z)
- Rethinking Attention-Model Explainability through Faithfulness Violation Test [29.982295060192904]
We study the explainability of current attention-based techniques, such as Attention$\odot$Gradient and LRP-based attention explanations.
We show that most tested explanation methods are unexpectedly hindered by the faithfulness violation issue.
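For reference, an Attention$\odot$Gradient-style explanation weights each attention value by the gradient of the prediction with respect to that value; below is a minimal, simplified single-head sketch (our own, not the paper's code):

```python
# Hedged sketch of an Attention*Gradient-style explanation: weight each attention value by
# the gradient of the prediction with respect to that attention value (single-head toy model).
import torch

torch.manual_seed(0)
seq_len, dim = 6, 16
x = torch.randn(seq_len, dim)
query = torch.randn(dim, requires_grad=True)
head = torch.nn.Linear(dim, 1)

attn = torch.softmax(x @ query, dim=0)
attn.retain_grad()                               # keep the gradient w.r.t. the attention weights
logit = head(attn @ x)
logit.backward()

attn_times_grad = attn * attn.grad               # element-wise (Hadamard) product
print("attention           :", attn.detach().numpy().round(3))
print("attention * gradient:", attn_times_grad.detach().numpy().round(3))
```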
arXiv Detail & Related papers (2022-01-28T13:42:31Z)
- Attention cannot be an Explanation [99.37090317971312]
We ask how effective attention-based explanations are in increasing human trust in and reliance on the underlying models.
We perform extensive human-study experiments that aim to qualitatively and quantitatively assess the degree to which attention-based explanations are suitable.
Our experimental results show that attention cannot be used as an explanation.
arXiv Detail & Related papers (2022-01-26T21:34:05Z)
- Is Sparse Attention more Interpretable? [52.85910570651047]
We investigate how sparsity affects our ability to use attention as an explainability tool.
We find that only a weak relationship exists between inputs and co-indexed intermediate representations under sparse attention.
We observe in this setting that inducing sparsity may make it less plausible that attention can be used as a tool for understanding model behavior.
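Sparse attention in this line of work is typically induced with a projection such as sparsemax (Martins & Astudillo, 2016); the sketch below contrasts it with softmax and is our illustration, not tied to this paper's exact setup:

```python
# Hedged illustration: sparse attention is commonly induced with sparsemax
# (Martins & Astudillo, 2016), which projects the score vector onto the probability
# simplex and gives exactly-zero weight to low-scoring tokens, unlike softmax.
import numpy as np

def sparsemax(z):
    """Euclidean projection of a score vector z onto the probability simplex."""
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, len(z) + 1)
    k = ks[ks * z_sorted > cumsum - 1][-1]       # size of the support
    tau = (cumsum[k - 1] - 1) / k                # threshold
    return np.maximum(z - tau, 0.0)

scores = np.array([2.0, 1.0, 0.1, -1.0])
softmax = np.exp(scores) / np.exp(scores).sum()
print("softmax  :", softmax.round(3))            # every token keeps some mass
print("sparsemax:", sparsemax(scores).round(3))  # low-scoring tokens get exactly 0
```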
arXiv Detail & Related papers (2021-06-02T11:42:56Z)
- SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity.
Attention map visualization of a pre-trained model is one direct method for understanding the self-attention mechanism.
We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide the design of SparseBERT.
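The exact DAM formulation is not spelled out here; as a rough, hypothetical sketch of the general mechanism, one can learn a soft gate over attention positions and regularize it toward sparsity alongside the task loss:

```python
# Rough, hypothetical sketch of a differentiable attention mask (the exact DAM
# formulation may differ): learn a soft gate per attention position, fold it into
# the attention logits, and regularize it toward sparsity alongside the task loss.
import torch

torch.manual_seed(0)
seq_len, dim = 6, 16
x = torch.randn(seq_len, dim)
q_proj, k_proj = torch.nn.Linear(dim, dim), torch.nn.Linear(dim, dim)
head = torch.nn.Linear(dim, 1)

mask_logits = torch.zeros(seq_len, seq_len, requires_grad=True)   # learnable gates

scores = q_proj(x) @ k_proj(x).T / dim ** 0.5
gate = torch.sigmoid(mask_logits)                 # soft mask in (0, 1)
masked_scores = scores + torch.log(gate + 1e-9)   # gate near 0 suppresses a position
attn = torch.softmax(masked_scores, dim=-1)

task_loss = head(attn @ x).pow(2).mean()          # placeholder task objective
sparsity_penalty = gate.mean()                    # pushes gates toward closed
(task_loss + 0.1 * sparsity_penalty).backward()
print("gradient norm on mask logits:", mask_logits.grad.norm().item())
```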
arXiv Detail & Related papers (2021-02-25T14:13:44Z)
- Explain and Improve: LRP-Inference Fine-Tuning for Image Captioning Models [82.3793660091354]
This paper analyzes the predictions of image captioning models with attention mechanisms beyond visualizing the attention itself.
We develop variants of layer-wise relevance propagation (LRP) and gradient-based explanation methods, tailored to image captioning models with attention mechanisms.
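For context, the LRP building block being adapted is a per-layer relevance-redistribution rule; here is a minimal sketch of the standard LRP-epsilon rule for a single linear layer (bias omitted, our simplification):

```python
# Hedged sketch of the standard LRP-epsilon rule for one linear layer (bias omitted),
# the per-layer building block that LRP-based captioning explanations extend to
# attention modules.
import numpy as np

def lrp_epsilon_linear(a, W, relevance_out, eps=1e-6):
    """Redistribute the relevance of y = W @ a back onto the inputs a."""
    z = W @ a                                    # pre-activations, (out_dim,)
    s = relevance_out / (z + eps * np.sign(z))   # stabilized ratios
    return a * (W.T @ s)                         # input relevance, (in_dim,)

rng = np.random.default_rng(0)
a = rng.normal(size=4)                           # layer input activations
W = rng.normal(size=(3, 4))                      # layer weights
R_out = W @ a                                    # initialize relevance at the output
R_in = lrp_epsilon_linear(a, W, R_out)
print("input relevance:", R_in.round(3))
print("relevance conserved:", np.isclose(R_in.sum(), R_out.sum()))
```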
arXiv Detail & Related papers (2020-01-04T05:15:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.