Related papers: Interpretable Text Embeddings and Text Similarity Explanation: A Primer

URL: http://arxiv.org/abs/2502.14862v1
Date: Thu, 20 Feb 2025 18:59:34 GMT
Title: Interpretable Text Embeddings and Text Similarity Explanation: A Primer
Authors: Juri Opitz, Lucas Möller, Andrianos Michail, Simon Clematide
Abstract summary: We give a structured overview of interpretability methods specializing in explaining obtained similarity scores. We study the methods' individual ideas and techniques, evaluating their potential for improving interpretability of text embeddings and explaining predicted similarities.
Score: 5.474797258314828
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text embeddings and text embedding models are a backbone of many AI and NLP systems, particularly those involving search. However, interpretability challenges persist, especially in explaining obtained similarity scores, which is crucial for applications requiring transparency. In this paper, we give a structured overview of interpretability methods specializing in explaining those similarity scores, an emerging research area. We study the methods' individual ideas and techniques, evaluating their potential for improving interpretability of text embeddings and explaining predicted similarities.

Related papers

Integration of Contextual Descriptors in Ontology Alignment for Enrichment of Semantic Correspondence [13.69268253901738]
A formalization was developed that enables the integration of essential and contextual descriptors to create a comprehensive knowledge model. The hierarchical structure of the semantic approach and the mathematical apparatus for analyzing potential conflicts between concepts are demonstrated.
arXiv Detail & Related papers (2024-11-28T12:59:32Z)
Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers. We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models. Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored. Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges. We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
Composition-contrastive Learning for Sentence Embeddings [23.85590618900386]
This work is the first to do so without incurring costs in auxiliary training objectives or additional network parameters. Experimental results on semantic textual similarity tasks show improvements over baselines that are comparable with state-of-the-art approaches.
arXiv Detail & Related papers (2023-07-14T14:39:35Z)
Natural Language Decompositions of Implicit Content Enable Better Text Representations [56.85319224208865]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account. We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed. Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z)
An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP. We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z)
Interpreting BERT-based Text Similarity via Activation and Saliency Maps [26.279593839644836]
We present an unsupervised technique for explaining paragraph similarities inferred by pre-trained BERT models. By looking at a pair of paragraphs, our technique identifies important words that dictate each paragraph's semantics, matches between the words in both paragraphs, and retrieves the most important pairs that explain the similarity between the two.
arXiv Detail & Related papers (2022-08-13T10:06:24Z)
On the Faithfulness Measurements for Model Interpretations [100.2730234575114]
Post-hoc interpretations aim to uncover how natural language processing (NLP) models make predictions. To tackle these issues, we start with three criteria: the removal-based criterion, the sensitivity of interpretations, and the stability of interpretations. Motivated by the desideratum of these faithfulness notions, we introduce a new class of interpretation methods that adopt techniques from the adversarial domain.
arXiv Detail & Related papers (2021-04-18T09:19:44Z)
Interpretable Deep Learning: Interpretations, Interpretability, Trustworthiness, and Beyond [49.93153180169685]
We introduce and clarify two basic concepts, interpretations and interpretability, that people often confuse. We elaborate on the design of several recent interpretation algorithms, from different perspectives, by proposing a new taxonomy. We summarize the existing work in evaluating models' interpretability using "trustworthy" interpretation algorithms.
arXiv Detail & Related papers (2021-03-19T08:40:30Z)
"Let's Eat Grandma": When Punctuation Matters in Sentence Representation for Sentiment Analysis [13.873803872380229]
We argue that punctuation could play a significant role in sentiment analysis and propose a novel representation model to improve syntactic and contextual performance. We conduct experiments on publicly available datasets and verify that our model can identify sentiments more accurately than other state-of-the-art baseline methods.
arXiv Detail & Related papers (2020-12-10T19:07:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.