On Guaranteed Optimal Robust Explanations for NLP Models
- URL: http://arxiv.org/abs/2105.03640v1
- Date: Sat, 8 May 2021 08:44:48 GMT
- Title: On Guaranteed Optimal Robust Explanations for NLP Models
- Authors: Emanuele La Malfa, Agnieszka Zbrzezny, Rhiannon Michelmore, Nicola
Paoletti and Marta Kwiatkowska
- Abstract summary: We build on abduction-based explanations for machine learning and develop a method for computing local explanations for neural network models.
We present two solution algorithms, respectively based on implicit hitting sets and maximum universal subsets.
We evaluate our framework on three widely used sentiment analysis tasks and texts of up to 100 words from the SST, Twitter and IMDB datasets.
- Score: 16.358394218953833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We build on abduction-based explanations for machine learning and develop a
method for computing local explanations for neural network models in natural
language processing (NLP). Our explanations comprise a subset of the words of
the input text that satisfies two key features: optimality w.r.t. a
user-defined cost function, such as the length of the explanation, and robustness,
in that they ensure prediction invariance for any bounded perturbation in the
embedding space of the left-out words. We present two solution algorithms,
respectively based on implicit hitting sets and maximum universal subsets,
introducing a number of algorithmic improvements to speed up convergence on
hard instances. We show how our method can be configured with different
perturbation sets in the embedding space and used to detect bias in predictions
by enforcing include/exclude constraints on biased terms, as well as to enhance
existing heuristic-based NLP explanation frameworks such as Anchors. We
evaluate our framework on three widely used sentiment analysis tasks and texts
of up to 100 words from the SST, Twitter and IMDB datasets, demonstrating the
effectiveness of the derived explanations.
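The optimality and robustness criteria above can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a toy linear sentiment model, an L2 perturbation budget per left-out word (for which the worst-case score shift has a closed form), and a brute-force smallest-first search in place of implicit hitting sets; the words and embeddings are invented.

```python
from itertools import combinations
import numpy as np

# Toy linear sentiment model: score = w . sum(word embeddings); positive iff score > 0.
rng = np.random.default_rng(0)
dim = 4
w = rng.normal(size=dim)

words = ["great", "plot", "but", "boring"]
emb = {t: rng.normal(size=dim) for t in words}

eps = 0.3  # L2 perturbation budget for each left-out word

def robust(subset):
    """True iff the prediction is invariant when every word NOT in `subset`
    is perturbed within an L2 ball of radius eps (closed form: each such
    word can shift the score by at most eps * ||w||)."""
    score = sum(w @ emb[t] for t in words)
    slack = eps * np.linalg.norm(w) * (len(words) - len(subset))
    return abs(score) > slack

def optimal_explanation():
    # Smallest-first enumeration: the first robust subset found is
    # optimal w.r.t. a length cost function.
    for k in range(len(words) + 1):
        for subset in combinations(words, k):
            if robust(subset):
                return subset
    return tuple(words)

print(optimal_explanation())
```

Under a length cost, smallest-first enumeration guarantees optimality; the paper's hitting-set and maximum-universal-subset algorithms are needed precisely because this exhaustive search is exponential in the text length.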
Related papers
- Enhancing adversarial robustness in Natural Language Inference using explanations [41.46494686136601]
We cast the spotlight on the underexplored task of Natural Language Inference (NLI).
We validate the usage of natural language explanation as a model-agnostic defence strategy through extensive experimentation.
We research the correlation of widely used language generation metrics with human perception, in order for them to serve as a proxy towards robust NLI models.
arXiv Detail & Related papers (2024-09-11T17:09:49Z) - Reconsidering Degeneration of Token Embeddings with Definitions for Encoder-based Pre-trained Language Models [20.107727903240065]
We propose DefinitionEMB to re-construct isotropically distributed and semantics-related token embeddings for encoder-based language models.
Our experiments demonstrate the effectiveness of leveraging definitions from Wiktionary to re-construct such embeddings.
arXiv Detail & Related papers (2024-08-02T15:00:05Z) - Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z) - Exploiting Inferential Structure in Neural Processes [15.058161307401864]
Neural Processes (NPs) are appealing due to their ability to perform fast adaptation based on a context set.
We provide a framework that allows NPs' latent variable to be given a rich prior defined by a graphical model.
arXiv Detail & Related papers (2023-06-27T03:01:43Z) - SUN: Exploring Intrinsic Uncertainties in Text-to-SQL Parsers [61.48159785138462]
This paper aims to improve the performance of text-to-SQL parsing by exploring the intrinsic uncertainties in neural network based approaches (called SUN).
Extensive experiments on five benchmark datasets demonstrate that our method significantly outperforms competitors and achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-09-14T06:27:51Z) - Optimal Counterfactual Explanations in Tree Ensembles [3.8073142980733]
We advocate for a model-based search aiming at "optimal" explanations and propose efficient mixed-integer programming approaches.
We show that isolation forests can be modeled within our framework to focus the search on plausible explanations with a low outlier score.
arXiv Detail & Related papers (2021-06-11T22:44:27Z) - Obtaining Better Static Word Embeddings Using Contextual Embedding
Models [53.86080627007695]
Our proposed distillation method is a simple extension of CBOW-based training.
As a side-effect, our approach also allows a fair comparison of both contextual and static embeddings.
arXiv Detail & Related papers (2021-06-08T12:59:32Z) - Sentence-Based Model Agnostic NLP Interpretability [45.44406712366411]
We show that, when using complex classifiers like BERT, the word-based approach raises issues not only of computational complexity, but also of out-of-distribution sampling, eventually leading to unfounded explanations.
By using sentences, the altered text remains in-distribution and the dimensionality of the problem is reduced for better fidelity to the black-box at comparable computational complexity.
arXiv Detail & Related papers (2020-12-24T10:32:41Z) - A Constraint-Based Algorithm for the Structural Learning of
Continuous-Time Bayesian Networks [70.88503833248159]
We propose the first constraint-based algorithm for learning the structure of continuous-time Bayesian networks.
We discuss the different statistical tests and the underlying hypotheses used by our proposal to establish conditional independence.
arXiv Detail & Related papers (2020-07-07T07:34:09Z) - Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood
Ensemble [163.3333439344695]
Dirichlet Neighborhood Ensemble (DNE) is a randomized smoothing method for training a model that is robust to substitution-based attacks.
DNE forms virtual sentences by sampling embedding vectors for each word in an input sentence from a convex hull spanned by the word and its synonyms, and it augments them with the training data.
We demonstrate through extensive experimentation that our method consistently outperforms recently proposed defense methods by a significant margin across different network architectures and multiple data sets.
arXiv Detail & Related papers (2020-06-20T18:01:16Z)
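The convex-hull sampling at the heart of DNE can be sketched as follows. The 2-D embeddings and synonym list are invented for illustration; the key point is that Dirichlet-distributed weights sum to one and are non-negative, so the sampled vector is guaranteed to lie in the convex hull spanned by the word and its synonyms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy embeddings for a word and two synonyms.
emb = {
    "good": np.array([1.0, 0.2]),
    "great": np.array([0.9, 0.4]),
    "fine": np.array([0.8, 0.1]),
}

def dne_sample(word, synonyms, alpha=1.0):
    """Sample a virtual embedding from the convex hull of `word` and its
    synonyms: Dirichlet weights are non-negative and sum to 1, so the
    weighted average is a convex combination of the embeddings."""
    vocab = [word] + synonyms
    weights = rng.dirichlet(alpha * np.ones(len(vocab)))
    vectors = np.stack([emb[t] for t in vocab])
    return weights @ vectors

virtual = dne_sample("good", ["great", "fine"])
print(virtual)
```

In DNE these virtual embeddings replace the original word vectors during training, augmenting the data with smoothed neighbors of each input sentence.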
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.