Sentence-Based Model Agnostic NLP Interpretability
- URL: http://arxiv.org/abs/2012.13189v2
- Date: Sun, 27 Dec 2020 17:54:38 GMT
- Title: Sentence-Based Model Agnostic NLP Interpretability
- Authors: Yves Rychener, Xavier Renard, Djamé Seddah, Pascal Frossard, Marcin Detyniecki
- Abstract summary: We show that, when using complex classifiers like BERT, the word-based approach raises issues not only of computational complexity, but also of out-of-distribution sampling, eventually leading to unfounded explanations.
By using sentences, the altered text remains in-distribution and the dimensionality of the problem is reduced for better fidelity to the black-box at comparable computational complexity.
- Score: 45.44406712366411
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Today, interpretability of Black-Box Natural Language Processing (NLP) models
based on surrogates, like LIME or SHAP, uses word-based sampling to build the
explanations. In this paper we explore the use of sentences to tackle NLP
interpretability. While this choice may seem straightforward, we show that,
when using complex classifiers like BERT, the word-based approach raises issues
not only of computational complexity, but also of out-of-distribution
sampling, eventually leading to unfounded explanations. By using sentences,
the altered text remains in-distribution and the dimensionality of the problem
is reduced for better fidelity to the black-box at comparable computational
complexity.
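To make the sentence-based sampling idea concrete, the following is a minimal, hypothetical sketch of a LIME-style surrogate built on sentence perturbations rather than word perturbations. It is not the authors' implementation: the naive sentence splitter, the `black_box_predict` interface, the exponential proximity kernel, and the ridge surrogate are all assumptions chosen for brevity.

```python
# Minimal sketch of LIME-style interpretability with sentence-level sampling.
# Assumption: `black_box_predict` takes a list of texts and returns the
# positive-class probability for each. Sentences are split naively on periods.
import numpy as np
from sklearn.linear_model import Ridge

def explain_with_sentences(text, black_box_predict, n_samples=200, seed=0):
    rng = np.random.default_rng(seed)
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    k = len(sentences)

    # Binary masks: 1 keeps a sentence, 0 removes it. Perturbed texts are
    # assembled from whole sentences, so they stay grammatical.
    masks = rng.integers(0, 2, size=(n_samples, k))
    masks[0, :] = 1  # include the unperturbed document

    perturbed = [
        ". ".join(s for s, keep in zip(sentences, mask) if keep)
        for mask in masks
    ]
    preds = np.asarray(black_box_predict(perturbed))

    # Weight samples by proximity to the original (exponential kernel on the
    # fraction of removed sentences) and fit a weighted linear surrogate.
    distances = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(distances ** 2) / 0.25)
    surrogate = Ridge(alpha=1.0).fit(masks, preds, sample_weight=weights)

    # The surrogate coefficients act as sentence-level importance scores.
    return list(zip(sentences, surrogate.coef_))
```

Because every perturbed input is assembled from whole sentences of the original document, the samples remain in-distribution text, which is the property the abstract argues is lost under word-level perturbation; the number of interpretable features also drops from the number of words to the number of sentences.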
Related papers
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice.
arXiv Detail & Related papers (2023-11-18T00:24:26Z)
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a Simple method named Self-Contrastive Learning (SSCL) to alleviate this issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Principled Paraphrase Generation with Parallel Corpora [52.78059089341062]
We formalize the implicit similarity function induced by round-trip Machine Translation.
We show that it is susceptible to non-paraphrase pairs sharing a single ambiguous translation.
We design an alternative similarity metric that mitigates this issue.
arXiv Detail & Related papers (2022-05-24T17:22:42Z)
- Argumentative Explanations for Pattern-Based Text Classifiers [15.81939090849456]
We focus on explanations for a specific interpretable model, namely pattern-based logistic regression (PLR) for binary text classification.
We propose AXPLR, a novel explanation method using (forms of) computational argumentation to generate explanations.
arXiv Detail & Related papers (2022-05-22T21:16:49Z)
- Obtaining Better Static Word Embeddings Using Contextual Embedding Models [53.86080627007695]
Our proposed distillation method is a simple extension of CBOW-based training.
As a side-effect, our approach also allows a fair comparison of both contextual and static embeddings.
arXiv Detail & Related papers (2021-06-08T12:59:32Z)
- On Guaranteed Optimal Robust Explanations for NLP Models [16.358394218953833]
We build on abduction-based explanations for machine learning and develop a method for computing local explanations for neural network models.
We present two solution algorithms, respectively based on implicit hitting sets and maximum universal subsets.
We evaluate our framework on three widely used sentiment analysis tasks and texts of up to 100 words from the SST, Twitter and IMDB datasets.
arXiv Detail & Related papers (2021-05-08T08:44:48Z)
- Disentangled Contrastive Learning for Learning Robust Textual Representations [13.880693856907037]
We introduce the concept of momentum representation consistency to align features and leverage power normalization while conforming to the uniformity.
Our experimental results for the NLP benchmarks demonstrate that our approach can obtain better results compared with the baselines.
arXiv Detail & Related papers (2021-04-11T03:32:49Z)
- Interpretation of NLP models through input marginalization [28.031961925541466]
Several methods have been proposed to interpret predictions by measuring the change in prediction probability after erasing each token of an input.
Since existing methods replace each token with a predefined value (i.e., zero), the resulting sentence lies out of the training data distribution, yielding misleading interpretations.
In this study, we raise the out-of-distribution problem induced by the existing interpretation methods and present a remedy.
We interpret various NLP models trained for sentiment analysis and natural language inference using the proposed method.
arXiv Detail & Related papers (2020-10-27T01:40:41Z)
- Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling [11.594541142399223]
Occlusion is a well-established method that provides explanations on discrete language data.
We argue that current Occlusion-based methods often produce invalid or syntactically incorrect language data.
We propose OLM: a novel explanation method that combines Occlusion and language models to sample valid and syntactically correct replacements.
arXiv Detail & Related papers (2020-04-21T10:37:44Z)
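The last two entries above (input marginalization and OLM) both score a token by how much the prediction changes when it is replaced, and both stress that the replacement should come from the data distribution rather than being a fixed placeholder. The sketch below is a hypothetical illustration of that shared idea, not either paper's method: a masked language model proposes plausible fillers for the occluded position, and the importance score is the classifier's probability drop averaged over those fillers. The Hugging Face model choices and the sentiment classifier interface are assumptions.

```python
# Hypothetical sketch of occlusion with language-model replacements.
# A fill-mask model proposes in-distribution substitutes for the occluded word;
# importance = average drop in the classifier's positive-class probability.
# Model names and the POSITIVE/NEGATIVE label set are assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
classifier = pipeline("sentiment-analysis")  # stand-in black-box classifier

def positive_prob(text):
    out = classifier(text)[0]
    return out["score"] if out["label"] == "POSITIVE" else 1.0 - out["score"]

def token_importance(words, idx, top_k=5):
    base = positive_prob(" ".join(words))

    # Replace the target word with the mask token and let the LM propose
    # fillers, so the perturbed sentences stay syntactically valid.
    masked = " ".join(w if i != idx else fill_mask.tokenizer.mask_token
                      for i, w in enumerate(words))
    candidates = fill_mask(masked, top_k=top_k)

    # Average the probability drop over the LM-sampled replacements.
    drops = [base - positive_prob(c["sequence"]) for c in candidates]
    return sum(drops) / len(drops)

words = "the movie was surprisingly good".split()
scores = {w: token_importance(words, i) for i, w in enumerate(words)}
```

A closer reproduction of the papers would weight each replacement by its language-model probability rather than averaging uniformly; that refinement is omitted here for brevity.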