Is Sparse Attention more Interpretable?
- URL: http://arxiv.org/abs/2106.01087v1
- Date: Wed, 2 Jun 2021 11:42:56 GMT
- Title: Is Sparse Attention more Interpretable?
- Authors: Clara Meister, Stefan Lazov, Isabelle Augenstein, Ryan Cotterell
- Abstract summary: We investigate how sparsity affects our ability to use attention as an explainability tool.
We find that only a weak relationship exists between inputs and co-indexed intermediate representations, under sparse attention and otherwise.
We observe in this setting that inducing sparsity may make it less plausible that attention can be used as a tool for understanding model behavior.
- Score: 52.85910570651047
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse attention has been claimed to increase model interpretability under
the assumption that it highlights influential inputs. Yet the attention
distribution is typically over representations internal to the model rather
than the inputs themselves, suggesting this assumption may not have merit. We
build on the recent work exploring the interpretability of attention; we design
a set of experiments to help us understand how sparsity affects our ability to
use attention as an explainability tool. On three text classification tasks, we
verify that only a weak relationship between inputs and co-indexed intermediate
representations exists -- under sparse attention and otherwise. Further, we do
not find any plausible mappings from sparse attention distributions to a sparse
set of influential inputs through other avenues. Rather, we observe in this
setting that inducing sparsity may make it less plausible that attention can be
used as a tool for understanding model behavior.
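As a concrete illustration of what "sparse attention" changes, below is a minimal NumPy sketch of sparsemax (Martins and Astudillo, 2016), a sparse normalizer of the kind typically used to induce sparse attention; this is an illustration only and is not tied to the paper's exact experimental setup. Even when several positions receive exactly zero weight, the surviving weights still index intermediate representations inside the model, each of which may already mix information from many input tokens; that is the gap the abstract points out.

```python
import numpy as np

def sparsemax(z):
    """Sparsemax: Euclidean projection of the score vector z onto the
    probability simplex, which can assign exactly zero probability."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, z.size + 1)
    cumsum = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cumsum      # positions kept in the support
    k_z = k[support][-1]                     # support size
    tau = (cumsum[support][-1] - 1.0) / k_z  # threshold
    return np.maximum(z - tau, 0.0)

# Toy attention scores over five (internal) token representations.
scores = np.array([1.8, 0.3, 2.1, -0.5, 0.9])
softmax = np.exp(scores) / np.exp(scores).sum()
print(softmax.round(3))            # dense: every position gets some mass
print(sparsemax(scores).round(3))  # sparse: several positions are exactly zero
```

The sparse distribution is certainly easier to read, but interpretability would additionally require that each nonzero weight correspond to an influential input token, which is the assumption the paper tests.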
Related papers
- Revisiting Attention Weights as Explanations from an Information
Theoretic Perspective [4.499369811647602]
Our findings indicate that attention mechanisms have the potential to function as a shortcut to model explanations when they are carefully combined with other model elements.
arXiv Detail & Related papers (2022-10-31T12:53:20Z)
- Is Attention Interpretation? A Quantitative Assessment On Sets [0.0]
We study the interpretability of attention in the context of set machine learning.
We find that attention distributions are indeed often reflective of the relative importance of individual instances.
We propose to use ensembling to minimize the risk of misleading attention-based explanations (a simplified sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-07-26T16:25:38Z)
- Guiding Visual Question Answering with Attention Priors [76.21671164766073]
We propose to guide the attention mechanism using explicit linguistic-visual grounding.
This grounding is derived by connecting structured linguistic concepts in the query to their referents among the visual objects.
The resultant algorithm is capable of probing attention-based reasoning models, injecting relevant associative knowledge, and regulating the core reasoning process.
arXiv Detail & Related papers (2022-05-25T09:53:47Z)
- Attention cannot be an Explanation [99.37090317971312]
We ask: how effective are attention-based explanations in increasing human trust in, and reliance on, the underlying models?
We perform extensive human-study experiments that aim to qualitatively and quantitatively assess the degree to which attention-based explanations are suitable.
Our experimental results show that attention cannot be used as an explanation.
arXiv Detail & Related papers (2022-01-26T21:34:05Z)
- Alignment Attention by Matching Key and Query Distributions [48.93793773929006]
This paper introduces alignment attention, which explicitly encourages self-attention to match the distributions of the key and query within each head (a simplified distribution-matching sketch appears after this list).
It is simple to convert any model with self-attention, including pre-trained ones, to the proposed alignment attention.
On a variety of language understanding tasks, we show the effectiveness of our method in accuracy, uncertainty estimation, generalization across domains, and robustness to adversarial attacks.
arXiv Detail & Related papers (2021-10-25T00:54:57Z)
- More Identifiable yet Equally Performant Transformers for Text Classification [13.439554931699695]
A Transformer's predictions are widely explained by its attention weights, i.e., a probability distribution generated at its self-attention unit (head).
Current empirical studies provide evidence that attention weights are not explanations by proving that they are not unique.
For a given input to a head and its output, if the attention weights generated in it are unique, we call the weights identifiable.
We provide a variant of the encoder layer that decouples the relationship between the key and value vectors and provides identifiable weights up to the desired length of the input (a small numerical illustration of the identifiability issue appears after this list).
arXiv Detail & Related papers (2021-06-02T16:21:38Z)
- SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity.
Attention-map visualization of a pre-trained model is one direct method for understanding the self-attention mechanism.
We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide the design of SparseBERT (a generic sketch of a differentiable attention mask appears after this list).
arXiv Detail & Related papers (2021-02-25T14:13:44Z)
- Why Attentions May Not Be Interpretable? [46.69116768203185]
Recent research has found that attention-as-importance interpretations often do not work as expected.
We show that one root cause of this phenomenon is shortcuts: the attention weights themselves may carry extra information.
We propose two methods to mitigate this issue.
arXiv Detail & Related papers (2020-06-10T05:08:30Z)
- Towards Transparent and Explainable Attention Models [34.0557018891191]
We first explain why current attention mechanisms in LSTM-based encoders can provide neither a faithful nor a plausible explanation of the model's predictions.
We propose a modified LSTM cell with a diversity-driven training objective that ensures that the hidden representations learned at different time steps are diverse (a simplified sketch of such a diversity penalty appears after this list).
Human evaluations indicate that the attention distributions learned by our model offer a plausible explanation of the model's predictions.
arXiv Detail & Related papers (2020-04-29T14:47:50Z)
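For "Is Attention Interpretation? A Quantitative Assessment On Sets" above, the proposed safeguard is ensembling. The NumPy sketch below is an assumed, simplified reading of that idea (the attention maps are random stand-ins, not the paper's models): average the attention maps of several independently trained models and treat positions where the models disagree as unreliable explanations.

```python
import numpy as np

# Hypothetical attention maps from K independently trained models over the
# same 6-instance input set (random stand-in data).
rng = np.random.default_rng(1)
K, n = 5, 6
logits = rng.normal(size=(K, n))
maps = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

mean_attn = maps.mean(axis=0)   # ensemble explanation
std_attn = maps.std(axis=0)     # disagreement across the ensemble

# Flag positions where the ensemble disagrees strongly relative to the mean:
# a single model's attention there should not be trusted as an explanation.
unreliable = std_attn > 0.5 * mean_attn
print(mean_attn.round(3), std_attn.round(3), unreliable)
```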
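The alignment-attention entry above encourages the key and query distributions within each head to match. The snippet below is a deliberately simple stand-in for that idea, matching only the first two moments of randomly generated stand-in tensors; it is not the regularizer used in the paper.

```python
import torch

def moment_matching_loss(queries: torch.Tensor, keys: torch.Tensor) -> torch.Tensor:
    """Penalize the gap between per-head query and key distributions by
    matching their means and covariances. queries, keys: (seq_len, d_head)."""
    mean_gap = (queries.mean(dim=0) - keys.mean(dim=0)).pow(2).sum()
    cov_gap = (torch.cov(queries.T) - torch.cov(keys.T)).pow(2).sum()
    return mean_gap + cov_gap

# Usage sketch: add this penalty for every head to the task loss.
q = torch.randn(20, 8)   # stand-in queries for one head
k = torch.randn(20, 8)   # stand-in keys for one head
print(float(moment_matching_loss(q, k)))
```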
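The identifiability point made in "More Identifiable yet Equally Performant Transformers for Text Classification" can be reproduced with a few lines of linear algebra: whenever the number of attended positions exceeds the value dimension plus one, distinct attention distributions can yield exactly the same head output. The NumPy sketch below uses randomly generated values rather than the paper's encoder.

```python
import numpy as np

# Non-identifiability of attention weights, with random stand-in values:
# when n (attended positions) > d + 1 (value dimension + 1), two different
# attention distributions can produce exactly the same head output.
rng = np.random.default_rng(0)
n, d = 10, 4
V = rng.normal(size=(n, d))              # stand-in value vectors
alpha = np.exp(rng.normal(size=n))
alpha /= alpha.sum()                     # a valid attention distribution
out = alpha @ V                          # head output under these weights

# Find a direction z with z @ V = 0 and z.sum() = 0 (null space of [V | 1]^T);
# adding a small multiple of z changes the weights but not the output.
A = np.concatenate([V, np.ones((n, 1))], axis=1)
z = np.linalg.svd(A.T)[2][-1]            # a null-space direction of A^T
eps = 0.5 * alpha.min() / np.abs(z).max()
alpha2 = alpha + eps * z                 # still non-negative and sums to 1

print(np.allclose(alpha2 @ V, out))      # True: identical head output
print(np.allclose(alpha2, alpha))        # False: different attention weights
```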
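SparseBERT's Differentiable Attention Mask (DAM) is only named above; the PyTorch module below is a generic, simplified stand-in for the broader idea of learning which attention positions to keep through a sigmoid-relaxed mask plus a sparsity penalty. The class and parameter names are invented for illustration, and this is not the paper's algorithm.

```python
import torch
import torch.nn.functional as F

class MaskedAttention(torch.nn.Module):
    """Toy self-attention layer with a learnable, differentiable position mask."""
    def __init__(self, d_model: int, seq_len: int):
        super().__init__()
        self.q = torch.nn.Linear(d_model, d_model)
        self.k = torch.nn.Linear(d_model, d_model)
        self.v = torch.nn.Linear(d_model, d_model)
        # One learnable logit per (query, key) pair; sigmoid gives a soft mask.
        self.mask_logits = torch.nn.Parameter(torch.zeros(seq_len, seq_len))

    def forward(self, x):                        # x: (batch, seq_len, d_model)
        scores = self.q(x) @ self.k(x).transpose(-2, -1) / x.shape[-1] ** 0.5
        gate = torch.sigmoid(self.mask_logits)   # in (0, 1), differentiable
        # Downweight masked positions before normalizing; log keeps softmax stable.
        attn = F.softmax(scores + torch.log(gate + 1e-9), dim=-1)
        return attn @ self.v(x), gate

x = torch.randn(2, 8, 16)
layer = MaskedAttention(d_model=16, seq_len=8)
out, gate = layer(x)
sparsity_penalty = gate.mean()   # add to the task loss to push gates toward 0
print(out.shape, float(sparsity_penalty))
```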
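For "Towards Transparent and Explainable Attention Models", the diversity-driven objective can be approximated by penalizing how tightly the hidden states at different time steps cluster around their mean (a conicity-style term). The PyTorch sketch below is an assumed simplification, not the paper's exact training objective.

```python
import torch
import torch.nn.functional as F

def conicity(hidden: torch.Tensor) -> torch.Tensor:
    """hidden: (seq_len, d) hidden states from an encoder (e.g. an LSTM).
    Returns the mean cosine similarity of each state to the mean state;
    lower values mean more diverse (spread-out) representations."""
    mean_vec = hidden.mean(dim=0, keepdim=True)
    return F.cosine_similarity(hidden, mean_vec, dim=-1).mean()

# Usage sketch: add the penalty to the usual task loss so training prefers
# hidden states that are not near-copies of one another.
hidden_states = torch.randn(12, 32, requires_grad=True)  # stand-in LSTM outputs
diversity_penalty = conicity(hidden_states)
# total_loss = task_loss + 0.1 * diversity_penalty   # 0.1 is an assumed weight
print(float(diversity_penalty))
```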
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.