Explaining Text Similarity in Transformer Models
- URL: http://arxiv.org/abs/2405.06604v1
- Date: Fri, 10 May 2024 17:11:31 GMT
- Title: Explaining Text Similarity in Transformer Models
- Authors: Alexandros Vasileiou, Oliver Eberle
- Abstract summary: Recent advances in explainable AI have made it possible to mitigate the opacity of similarity models by leveraging improved explanations for Transformers through layer-wise relevance propagation (LRP).
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
- Score: 52.571158418102584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As Transformers have become state-of-the-art models for natural language processing (NLP) tasks, the need to understand and explain their predictions is increasingly apparent. Especially in unsupervised applications, such as information retrieval tasks, similarity models built on top of foundation model representations have been widely applied. However, their inner prediction mechanisms have mostly remained opaque. Recent advances in explainable AI have made it possible to mitigate these limitations by leveraging improved explanations for Transformers through layer-wise relevance propagation (LRP). Using BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, we investigate which feature interactions drive similarity in NLP models. We validate the resulting explanations and demonstrate their utility in three corpus-level use cases, analyzing grammatical interactions, multilingual semantics, and biomedical text retrieval. Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
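To make the idea of a second-order explanation concrete, here is a minimal sketch under strong simplifying assumptions: each sentence embedding is a plain mean of its token vectors, and similarity is a dot product. In that linear case the similarity score splits exactly into one contribution per token pair. This is not the paper's BiLRP procedure, which propagates relevance through the Transformer layers with LRP; the function name `token_pair_relevance` and the random vectors standing in for token embeddings are hypothetical.

```python
import numpy as np

def token_pair_relevance(tokens_a, tokens_b):
    """Split a dot-product similarity of mean-pooled embeddings into token-pair scores.

    Assumption: sentence embedding = mean of token vectors (no Transformer layers).
    """
    A = np.asarray(tokens_a, dtype=float)  # shape (n, d): tokens of sentence A
    B = np.asarray(tokens_b, dtype=float)  # shape (m, d): tokens of sentence B
    # Token pair (i, j) contributes <a_i, b_j> / (n * m) to the similarity.
    return (A @ B.T) / (A.shape[0] * B.shape[0])

# Hypothetical stand-ins for token embeddings of two sentences.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(4, 8))
emb_b = rng.normal(size=(3, 8))

R = token_pair_relevance(emb_a, emb_b)
similarity = emb_a.mean(axis=0) @ emb_b.mean(axis=0)
```

The entries of `R` sum to `similarity`, which mirrors the conservation property that relevance-propagation methods aim to preserve; large positive or negative entries mark the token pairs that drive the score up or down.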
Related papers
- LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multi-modal Foundation Models [4.675123839851372]
LatentExplainer is a framework for automatically generating semantically meaningful explanations of latent variables in deep generative models.
Our approach perturbs latent variables, interpreting changes in generated data, and uses multi-modal large language models (MLLMs) to produce human-understandable explanations.
arXiv Detail & Related papers (2024-06-21T04:39:03Z)
- Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively with transformers on small- to medium-sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Counterfactuals of Counterfactuals: a back-translation-inspired approach to analyse counterfactual editors [3.4253416336476246]
We focus on the analysis of counterfactual, contrastive explanations.
We propose a new back translation-inspired evaluation methodology.
We show that by iteratively feeding the counterfactual to the explainer we can obtain valuable insights into the behaviour of both the predictor and the explainer models.
arXiv Detail & Related papers (2023-05-26T16:04:28Z)
- Learning Semantic Textual Similarity via Topic-informed Discrete Latent Variables [17.57873577962635]
We develop a topic-informed discrete latent variable model for semantic textual similarity.
Our model learns a shared latent space for sentence-pair representation via vector quantization.
We show that our model is able to surpass several strong neural baselines in semantic textual similarity tasks.
arXiv Detail & Related papers (2022-11-07T15:09:58Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Combining Transformers with Natural Language Explanations [13.167758466408825]
We propose an extension to transformer models that makes use of external memories to store natural language explanations and use them to explain classification outputs.
We conduct an experimental evaluation on two domains, legal text analysis and argument mining, to show that our approach can produce relevant explanations while retaining or even improving classification performance.
arXiv Detail & Related papers (2021-09-02T09:17:04Z)
- Towards Interpretable Natural Language Understanding with Explanations as Latent Variables [146.83882632854485]
We develop a framework for interpretable natural language understanding that requires only a small set of human annotated explanations for training.
Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model.
arXiv Detail & Related papers (2020-10-24T02:05:56Z)
- Ontology-based Interpretable Machine Learning for Textual Data [35.01650633374998]
We introduce a novel interpreting framework that learns an interpretable model based on a sampling technique to explain prediction models.
To narrow down the search space for explanations, we design a learnable anchor algorithm.
A set of regulations is further introduced for combining learned interpretable representations with anchors to generate comprehensible explanations.
arXiv Detail & Related papers (2020-04-01T02:51:57Z)
- Temporal Embeddings and Transformer Models for Narrative Text Understanding [72.88083067388155]
We present two approaches to narrative text understanding for character relationship modelling.
The temporal evolution of these relations is described by dynamic word embeddings, which are designed to learn semantic changes over time.
A supervised learning approach based on the state-of-the-art transformer model BERT is used instead to detect static relations between characters.
arXiv Detail & Related papers (2020-03-19T14:23:12Z)