More Identifiable yet Equally Performant Transformers for Text
Classification
- URL: http://arxiv.org/abs/2106.01269v1
- Date: Wed, 2 Jun 2021 16:21:38 GMT
- Title: More Identifiable yet Equally Performant Transformers for Text
Classification
- Authors: Rishabh Bhardwaj, Navonil Majumder, Soujanya Poria, Eduard Hovy
- Abstract summary: Transformer's predictions are widely explained by attention weights, i.e., a probability distribution generated at its self-attention unit (head).
Current empirical studies provide evidence that attention weights are not explanations by showing that they are not unique.
For a given input to a head and its output, if the attention weights generated in it are unique, we call the weights identifiable.
We provide a variant of the encoder layer that decouples the key and value vectors and provides identifiable weights up to the desired length of the input.
- Score: 13.439554931699695
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Interpretability is an important aspect of the trustworthiness of a model's
predictions. Transformer's predictions are widely explained by the attention
weights, i.e., a probability distribution generated at its self-attention unit
(head). Current empirical studies provide evidence that attention weights are
not explanations by showing that they are not unique. A recent study gave a
theoretical justification for this observation by proving the
non-identifiability of attention weights. For a given input to a head and its
output, if the attention weights generated in it are unique, we call the
weights identifiable. In this work, we provide a deeper theoretical analysis
and empirical observations on the identifiability of attention weights. By
uncovering the hidden role of the key vector, which previous works ignored, we
find that attention weights are more identifiable than currently perceived.
However, the weights are still prone to being non-unique, which makes them unfit
for interpretation. To tackle this issue, we provide a variant of the encoder
layer that decouples the key and value vectors and provides identifiable
weights up to the desired length of the input. We demonstrate the applicability
of this variant with empirical justification on
varied text classification tasks. The implementations are available at
https://github.com/declare-lab/identifiable-transformers.
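As a rough illustration, the PyTorch sketch below (module and variable names are
illustrative; the authors' actual implementation is in the linked repository)
shows why attention weights can be non-unique when the value width is smaller
than the input length, and a single head in which the key/query width d_k and
the value width d_v are decoupled so that d_v can be set at least as large as
the intended maximum input length.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledHead(nn.Module):
    """Single attention head with independent key/query width d_k and value
    width d_v (illustrative sketch, not the authors' released code).
    Informally, once d_v >= the maximum input length T, the T x T attention
    matrix applied to the values has no non-trivial left null space to hide
    in, which is what makes the weights identifiable."""

    def __init__(self, d_model: int, d_k: int, d_v: int):
        super().__init__()
        self.w_q = nn.Linear(d_model, d_k, bias=False)
        self.w_k = nn.Linear(d_model, d_k, bias=False)
        self.w_v = nn.Linear(d_model, d_v, bias=False)  # d_v no longer tied to d_k
        self.w_o = nn.Linear(d_v, d_model, bias=False)
        self.scale = d_k ** -0.5

    def forward(self, x):                                # x: (batch, T, d_model)
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)
        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)  # (batch, T, T)
        return self.w_o(attn @ v), attn

# Why d_v < T makes the weights non-unique: any vector in the left null space
# of the value matrix V can be added to the rows of the attention matrix A
# without changing A @ V (ignoring the simplex constraint for simplicity).
T, d_v = 8, 4
V = torch.randn(T, d_v)
null_vec = torch.linalg.svd(V.T).Vh[d_v]            # a left-null-space vector of V
A = torch.softmax(torch.randn(T, T), dim=-1)
A_alt = A + 0.1 * null_vec                          # different "weights" ...
print(torch.allclose(A @ V, A_alt @ V, atol=1e-5))  # ... identical product A @ V: True

# Decoupled head with d_v chosen at least as large as the maximum length (16 here).
head = DecoupledHead(d_model=64, d_k=8, d_v=16)
out, weights = head(torch.randn(2, 16, 64))
print(out.shape, weights.shape)                     # (2, 16, 64) and (2, 16, 16)
```

Note that a standard multi-head Transformer layer ties d_v to d_k = d_model / n_heads
(e.g., 64 in BERT-base), which for typical sequence lengths leaves d_v far below T;
the decoupling above removes that tie.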
Related papers
- An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models [64.87562101662952]
We show that input tokens are often exchangeable since they already include positional encodings.
We establish the existence of a sufficient and minimal representation of input tokens.
We prove that attention with the desired parameter infers the latent posterior up to an approximation error.
arXiv Detail & Related papers (2022-12-30T17:59:01Z) - Guiding Visual Question Answering with Attention Priors [76.21671164766073]
We propose to guide the attention mechanism using explicit linguistic-visual grounding.
This grounding is derived by connecting structured linguistic concepts in the query to their referents among the visual objects.
The resultant algorithm is capable of probing attention-based reasoning models, injecting relevant associative knowledge, and regulating the core reasoning process.
arXiv Detail & Related papers (2022-05-25T09:53:47Z) - Rethinking Attention-Model Explainability through Faithfulness Violation Test [29.982295060192904]
We study the explainability of current attention-based techniques, such as Attention$\odot$Gradient and LRP-based attention explanations (a toy sketch of the attention-gradient product appears after this list).
We show that most tested explanation methods are unexpectedly hindered by the faithfulness violation issue.
arXiv Detail & Related papers (2022-01-28T13:42:31Z) - Attention cannot be an Explanation [99.37090317971312]
We ask how effective attention-based explanations are at increasing human trust in and reliance on the underlying models.
We perform extensive human study experiments that aim to qualitatively and quantitatively assess the degree to which attention based explanations are suitable.
Our experimental results show that attention cannot be used as an explanation.
arXiv Detail & Related papers (2022-01-26T21:34:05Z) - Is Sparse Attention more Interpretable? [52.85910570651047]
We investigate how sparsity affects our ability to use attention as an explainability tool.
We find that only a weak relationship exists between inputs and co-indexed intermediate representations under sparse attention.
We observe in this setting that inducing sparsity may make it less plausible that attention can be used as a tool for understanding model behavior.
arXiv Detail & Related papers (2021-06-02T11:42:56Z) - SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity.
Attention map visualization of a pre-trained model is one direct method for understanding the self-attention mechanism.
We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide the design of SparseBERT.
arXiv Detail & Related papers (2021-02-25T14:13:44Z) - Why Attentions May Not Be Interpretable? [46.69116768203185]
Recent research has found that attention-as-importance interpretations often do not work as expected.
We show that one root cause of this phenomenon is shortcuts, meaning that the attention weights themselves may carry extra information.
We propose two methods to mitigate this issue.
arXiv Detail & Related papers (2020-06-10T05:08:30Z) - Towards Transparent and Explainable Attention Models [34.0557018891191]
We first explain why current attention mechanisms in LSTM-based encoders can provide neither a faithful nor a plausible explanation of the model's predictions.
We propose a modified LSTM cell with a diversity-driven training objective that ensures that the hidden representations learned at different time steps are diverse.
Human evaluations indicate that the attention distributions learned by our model offer a plausible explanation of the model's predictions.
arXiv Detail & Related papers (2020-04-29T14:47:50Z)
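For concreteness, the Attention$\odot$Gradient explanation referenced in the
faithfulness-violation entry above scores each attention entry by the
element-wise product of the attention weight and the gradient of the prediction
with respect to that weight. The toy PyTorch sketch below (illustrative names
and shapes, not the cited paper's code) shows the basic computation.

```python
import torch
import torch.nn.functional as F

# Toy single-head attention over T = 5 tokens of width 8; names are illustrative.
torch.manual_seed(0)
x = torch.randn(1, 5, 8, requires_grad=True)
w_q, w_k, w_v = (torch.randn(8, 8) for _ in range(3))

q, k, v = x @ w_q, x @ w_k, x @ w_v
attn = F.softmax(q @ k.transpose(-2, -1) / 8 ** 0.5, dim=-1)   # (1, 5, 5)
attn.retain_grad()                        # keep the gradient that flows into the weights
score = (attn @ v).sum()                  # stand-in for a downstream prediction score
score.backward()

attribution = attn * attn.grad            # element-wise product: attention ⊙ gradient
print(attribution.shape)                  # torch.Size([1, 5, 5])
```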