Explaining Natural Language Processing Classifiers with Occlusion and Language Modeling
- URL: http://arxiv.org/abs/2101.11889v1
- Date: Thu, 28 Jan 2021 09:44:04 GMT
- Title: Explaining Natural Language Processing Classifiers with Occlusion and Language Modeling
- Authors: David Harbecke
- Abstract summary: We present a novel explanation method, called OLM, for natural language processing classifiers.
OLM gives explanations that are theoretically sound and easy to understand.
We make several contributions to the theory of explanation methods.
- Score: 4.9342793303029975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks are powerful statistical learners. However, their
predictions do not come with an explanation of their process. To analyze these
models, explanation methods are being developed. We present a novel explanation
method, called OLM, for natural language processing classifiers. This method
combines occlusion and language modeling, which are techniques central to
explainability and NLP, respectively. OLM gives explanations that are
theoretically sound and easy to understand.
We make several contributions to the theory of explanation methods. Axioms for explanation
methods are an interesting theoretical concept for exploring their foundations and deriving
new methods. We introduce a new axiom, give the intuition behind it, and show that it
contradicts another existing axiom. Additionally, we point out theoretical difficulties of
existing gradient-based and some occlusion-based explanation methods in natural language
processing. We provide an extensive argument for why the evaluation of explanation methods
is difficult. We compare OLM to other explanation methods and underline its uniqueness
experimentally. Finally, we investigate corner cases of OLM and discuss its validity and
possible improvements.
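As a rough illustration of the approach described in the abstract, the sketch below estimates a token's relevance as the difference between the classifier's prediction on the original input and its expected prediction when that token is resampled from a masked language model conditioned on the remaining context. This is a minimal sketch of the occlusion-plus-language-modeling idea only; the model checkpoints, the whitespace tokenization, and the sampling budget are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch: occlusion with language-model resampling.
# Assumptions (not from the paper): a BERT masked LM as the resampler,
# an off-the-shelf SST-2 classifier, whitespace tokens, 20 Monte Carlo samples.
import torch
from transformers import (AutoModelForMaskedLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

LM_NAME = "bert-base-uncased"                                 # assumed resampler
CLF_NAME = "distilbert-base-uncased-finetuned-sst-2-english"  # assumed classifier

lm_tok = AutoTokenizer.from_pretrained(LM_NAME)
lm = AutoModelForMaskedLM.from_pretrained(LM_NAME).eval()
clf_tok = AutoTokenizer.from_pretrained(CLF_NAME)
clf = AutoModelForSequenceClassification.from_pretrained(CLF_NAME).eval()


@torch.no_grad()
def clf_prob(text: str, label: int) -> float:
    """Probability the classifier assigns to `label` for `text`."""
    logits = clf(**clf_tok(text, return_tensors="pt")).logits
    return torch.softmax(logits, dim=-1)[0, label].item()


@torch.no_grad()
def olm_relevance(tokens: list[str], i: int, label: int, n_samples: int = 20) -> float:
    """Relevance of tokens[i]: f(x) minus the expected prediction when
    tokens[i] is resampled from the masked LM given the rest of the input."""
    original = clf_prob(" ".join(tokens), label)

    # Mask position i and read off the LM's distribution over replacements.
    masked = tokens[:i] + [lm_tok.mask_token] + tokens[i + 1:]
    enc = lm_tok(" ".join(masked), return_tensors="pt")
    mask_pos = (enc.input_ids[0] == lm_tok.mask_token_id).nonzero()[0].item()
    probs = torch.softmax(lm(**enc).logits[0, mask_pos], dim=-1)

    # Monte Carlo estimate of the expected prediction under LM resampling.
    expected = 0.0
    for tok_id in torch.multinomial(probs, n_samples, replacement=True).tolist():
        replaced = tokens[:i] + [lm_tok.convert_ids_to_tokens(tok_id)] + tokens[i + 1:]
        expected += clf_prob(" ".join(replaced), label) / n_samples

    return original - expected


tokens = "the movie was surprisingly good".split()
print({t: round(olm_relevance(tokens, i, label=1), 3) for i, t in enumerate(tokens)})
```

A positive score under this sketch means the classifier's confidence in the explained class drops when the token is replaced by contextually plausible alternatives, which is the intuition behind combining occlusion with language modeling rather than deleting tokens or substituting a fixed baseline.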
Related papers
- Towards More Faithful Natural Language Explanation Using Multi-Level Contrastive Learning in VQA [7.141288053123662]
Natural language explanation in visual question answering (VQA-NLE) aims to explain the decision-making process of models by generating natural language sentences, thereby increasing users' trust in black-box systems.
Existing post-hoc explanations are not always aligned with human logical inference and suffer from three issues: 1) deductive unsatisfiability, where the generated explanations do not logically lead to the answer; 2) factual inconsistency, where the model falsifies its counterfactual explanation without considering the facts in the images; and 3) semantic perturbation insensitivity, where the model cannot recognize the semantic changes caused by small perturbations.
arXiv Detail & Related papers (2023-12-21T05:51:55Z)
- Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations [118.0818807474809]
Abductive reasoning aims to find plausible explanations for an event.
Existing approaches for abductive reasoning in natural language processing often rely on manually generated annotations for supervision.
This work proposes an approach for abductive commonsense reasoning that exploits the fact that only a subset of explanations is correct for a given context.
arXiv Detail & Related papers (2023-05-24T01:35:10Z)
- MaNtLE: Model-agnostic Natural Language Explainer [9.43206883360088]
We introduce MaNtLE, a model-agnostic natural language explainer that analyzes multiple classifier predictions.
MaNtLE uses multi-task training on thousands of synthetic classification tasks to generate faithful explanations.
Simulated user studies indicate that, on average, MaNtLE-generated explanations are at least 11% more faithful than LIME and Anchors explanations.
arXiv Detail & Related papers (2023-05-22T12:58:06Z)
- NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning [59.16962123636579]
This paper proposes a new take on Prolog-based inference engines.
We replace handcrafted rules with a combination of neural language modeling, guided generation, and semi dense retrieval.
Our implementation, NELLIE, is the first system to demonstrate fully interpretable, end-to-end grounded QA.
arXiv Detail & Related papers (2022-09-16T00:54:44Z)
- Learning to Scaffold: Optimizing Model Explanations for Teaching [74.25464914078826]
We train models on three natural language processing and computer vision tasks.
We find that students trained with explanations extracted with our framework are able to simulate the teacher significantly more effectively than students trained with explanations produced by previous methods.
arXiv Detail & Related papers (2022-04-22T16:43:39Z)
- Do Explanations Explain? Model Knows Best [39.86131552976105]
It is a mystery which input features contribute to a neural network's output.
We propose a framework for evaluating the explanations using the neural network model itself.
arXiv Detail & Related papers (2022-03-04T12:39:29Z)
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often misinterpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
- Explaining by Removing: A Unified Framework for Model Explanation [14.50261153230204]
Removal-based explanations are based on the principle of simulating feature removal to quantify each feature's influence.
We develop a framework that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence.
This newly understood class of explanation methods has rich connections that we examine using tools that have been largely overlooked by the explainability literature. (A toy removal-based explainer in this spirit is sketched after this list.)
arXiv Detail & Related papers (2020-11-21T00:47:48Z)
- Towards Interpretable Natural Language Understanding with Explanations as Latent Variables [146.83882632854485]
We develop a framework for interpretable natural language understanding that requires only a small set of human annotated explanations for training.
Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model.
arXiv Detail & Related papers (2020-10-24T02:05:56Z)
- Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? [97.77183117452235]
We carry out human subject tests to isolate the effect of algorithmic explanations on model interpretability.
Clear evidence of method effectiveness is found in very few cases.
Our results provide the first reliable and comprehensive estimates of how explanations influence simulatability.
arXiv Detail & Related papers (2020-05-04T20:35:17Z)
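Relating to the "Explaining by Removing" entry above, the toy sketch below shows the simplest member of that family, leave-one-out occlusion, with comments mapping it onto the framework's three dimensions. The classifier here is a hypothetical stand-in; none of the names come from the papers listed.

```python
# Toy removal-based explanation (leave-one-out occlusion), annotated with the
# three framework dimensions. The "classifier" is a hypothetical stand-in.
from typing import Callable, List


def occlusion_scores(tokens: List[str],
                     predict_proba: Callable[[str], float]) -> List[float]:
    """Leave-one-out occlusion.

    Framework dimensions:
      1) feature removal: delete the token from the input,
      2) model behavior:  the probability of the explained class,
      3) summarization:   influence of token i = f(x) - f(x without token i).
    """
    full = predict_proba(" ".join(tokens))
    scores = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]
        scores.append(full - predict_proba(" ".join(reduced)))
    return scores


def toy_sentiment(text: str) -> float:
    """Keyword-counting stand-in for a real classifier's class probability."""
    positive = {"good", "great", "surprisingly"}
    words = text.split()
    return sum(w in positive for w in words) / max(len(words), 1)


tokens = "the movie was surprisingly good".split()
print(dict(zip(tokens, [round(s, 3) for s in occlusion_scores(tokens, toy_sentiment)])))
```

Methods such as OLM differ from this baseline mainly in the first dimension: instead of deleting a token, the removed feature is marginalized out under a language model conditioned on the remaining context.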
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.