The Irrationality of Neural Rationale Models
- URL: http://arxiv.org/abs/2110.07550v1
- Date: Thu, 14 Oct 2021 17:22:10 GMT
- Title: The Irrationality of Neural Rationale Models
- Authors: Yiming Zheng, Serena Booth, Julie Shah, Yilun Zhou
- Abstract summary: We argue to the contrary, with both philosophical perspectives and empirical evidence suggesting that rationale models are, perhaps, less rational and interpretable than expected.
We call for more rigorous and comprehensive evaluations of these models to ensure desired properties of interpretability are indeed achieved.
- Score: 6.159428088113691
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural rationale models are popular for interpretable predictions of NLP
tasks. In these, a selector extracts segments of the input text, called
rationales, and passes these segments to a classifier for prediction. Since the
rationale is the only information accessible to the classifier, it is plausibly
defined as the explanation. Is such a characterization unconditionally correct?
In this paper, we argue to the contrary, with both philosophical perspectives
and empirical evidence suggesting that rationale models are, perhaps, less
rational and interpretable than expected. We call for more rigorous and
comprehensive evaluations of these models to ensure desired properties of
interpretability are indeed achieved. The code can be found at
https://github.com/yimingz89/Neural-Rationale-Analysis.
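To make the select-then-predict setup concrete, the following is a minimal sketch of a rationale model: a selector scores tokens and keeps a small subset as the rationale, and the classifier only ever sees that subset. The module sizes, hard top-k selection, and straight-through gradient trick are illustrative assumptions, not the specific architectures analyzed in the paper or its repository.

```python
# Minimal sketch of a select-then-predict rationale model. Illustrative only:
# the embedding/GRU sizes, hard top-k selection, and straight-through gradient
# trick are assumptions, not the specific models evaluated in the paper.
import torch
import torch.nn as nn


class RationaleModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Selector: assigns each token a score; high-scoring tokens become the rationale.
        self.selector = nn.Linear(embed_dim, 1)
        # Classifier: encodes and labels *only* the selected tokens.
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids, rationale_frac=0.2):
        emb = self.embed(token_ids)                             # (batch, seq_len, embed_dim)
        probs = torch.sigmoid(self.selector(emb)).squeeze(-1)   # (batch, seq_len)
        # Hard top-k selection, with a straight-through estimator so the
        # selector still receives gradients through the soft probabilities.
        k = max(1, int(rationale_frac * token_ids.size(1)))
        topk = probs.topk(k, dim=-1).indices
        hard_mask = torch.zeros_like(probs).scatter(-1, topk, 1.0)
        mask = hard_mask + probs - probs.detach()
        # The classifier sees nothing but the masked (selected) embeddings,
        # so the rationale is the only information behind the prediction.
        _, hidden = self.encoder(emb * mask.unsqueeze(-1))
        logits = self.classifier(hidden.squeeze(0))
        return logits, hard_mask


# Usage sketch: hard_mask marks which tokens were exposed to the classifier.
model = RationaleModel(vocab_size=10_000)
tokens = torch.randint(0, 10_000, (4, 32))                      # 4 inputs of 32 token ids
logits, rationale_mask = model(tokens)
print(logits.shape, rationale_mask.sum(dim=-1))                 # torch.Size([4, 2]); 6 selected tokens per input
```

Because the classifier's input is exactly the masked subset, the selected tokens look like a complete explanation of the prediction; the paper's argument is that this surface-level reading should not be accepted uncritically.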
Related papers
- AURA: Natural Language Reasoning for Aleatoric Uncertainty in Rationales [0.0]
Rationales behind answers not only explain model decisions but also help language models reason well on complex reasoning tasks.
It is non-trivial to estimate the degree to which rationales are faithful enough to actually improve model performance.
We propose methods for handling imperfect rationales that cause aleatoric uncertainty.
arXiv Detail & Related papers (2024-02-22T07:12:34Z)
- Plausible Extractive Rationalization through Semi-Supervised Entailment Signal [33.35604728012685]
We take a semi-supervised approach to optimize for the plausibility of extracted rationales.
We adopt a pre-trained natural language inference (NLI) model and further fine-tune it on a small set of supervised rationales.
We show that, by enforcing agreement between the explanation and the answer in a question-answering task, performance can be improved without access to ground truth labels.
arXiv Detail & Related papers (2024-02-13T14:12:32Z)
- You Only Forward Once: Prediction and Rationalization in A Single Forward Pass [10.998983921416533]
Unsupervised rationale extraction aims to extract concise and contiguous text snippets to support model predictions without any rationale annotation.
Previous studies have used a two-phase framework known as the Rationalizing Neural Prediction (RNP) framework, which follows a generate-then-predict paradigm.
We propose a novel single-phase framework called You Only Forward Once (YOFO), derived from a relaxed formulation in which rationales are meant to support the model's prediction rather than to make the prediction themselves.
arXiv Detail & Related papers (2023-11-04T08:04:28Z)
- Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
arXiv Detail & Related papers (2023-01-15T03:13:09Z)
- Discovering Invariant Rationales for Graph Neural Networks [104.61908788639052]
Intrinsic interpretability of graph neural networks (GNNs) amounts to finding a small subset of the input graph's features that guides the model's prediction.
We propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs.
arXiv Detail & Related papers (2022-01-30T16:43:40Z)
- Instance-Based Neural Dependency Parsing [56.63500180843504]
We develop neural models that possess an interpretable inference process for dependency parsing.
Our models adopt instance-based inference, where dependency edges are extracted and labeled by comparing them to edges in a training set.
arXiv Detail & Related papers (2021-09-28T05:30:52Z)
- Rationales for Sequential Predictions [117.93025782838123]
Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain.
We consider model explanations through rationales, subsets of context that can explain individual model predictions.
The best rationale is taken to be the smallest subset of input tokens that would yield the same prediction as the full sequence; we propose an efficient greedy algorithm to approximate this combinatorial objective.
arXiv Detail & Related papers (2021-09-14T01:25:15Z)
- Measuring Association Between Labels and Free-Text Rationales [60.58672852655487]
In interpretable NLP, we require faithful rationales that reflect the model's decision-making process for an explained instance.
We demonstrate that pipelines, the existing models for faithful extractive rationalization on information-extraction-style tasks, do not extend as reliably to "reasoning" tasks that require free-text rationales.
We turn to models that jointly predict and rationalize, a class of widely used high-performance models for free-text rationalization whose faithfulness is not yet established.
arXiv Detail & Related papers (2020-10-24T03:40:56Z)
- Learning to Faithfully Rationalize by Construction [36.572594249534866]
In many settings it is important to be able to understand why a model made a particular prediction.
We propose a simpler variant of this approach that provides faithful explanations by construction.
In both automatic and manual evaluations, we find that variants of this simple framework yield results superior to 'end-to-end' approaches.
arXiv Detail & Related papers (2020-04-30T21:45:40Z)
- Invariant Rationalization [84.1861516092232]
A typical rationalization criterion, maximum mutual information (MMI), finds the rationale that maximizes prediction performance based only on the rationale itself (a generic formulation is sketched after this entry).
We introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments.
We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments.
arXiv Detail & Related papers (2020-03-22T00:50:27Z)
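As referenced in the Invariant Rationalization entry above, the MMI criterion is often written as choosing a sparse binary mask whose selected tokens are maximally informative about the label. The LaTeX below is a generic, hedged rendering of that objective and of the invariance idea, using our own notation (mask m, budget k, environment e) rather than the paper's exact formulation.

```latex
% Generic statement of the MMI rationalization criterion (a paraphrase under
% assumed notation, not the paper's exact objective): choose a binary mask m
% over the input X so that the rationale Z = m \odot X is maximally
% informative about the label Y, under a sparsity budget k.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
\[
  \max_{\substack{m \in \{0,1\}^{n} \\ \|m\|_{0} \le k}} \; I\bigl(Y;\, Z\bigr),
  \qquad Z = m \odot X
\]
% Invariant rationalization additionally constrains the rationale so that the
% same predictor stays optimal across environments, informally:
\[
  \text{invariance constraint (informal):} \qquad Y \perp e \mid Z
\]
\end{document}
```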