Enhancing the Rationale-Input Alignment for Self-explaining
Rationalization
- URL: http://arxiv.org/abs/2312.04103v2
- Date: Fri, 15 Dec 2023 03:04:04 GMT
- Title: Enhancing the Rationale-Input Alignment for Self-explaining
Rationalization
- Authors: Wei Liu, Haozhao Wang, Jun Wang, Zhiying Deng, YuanKai Zhang, Cheng
Wang, Ruixuan Li
- Abstract summary: We introduce a novel approach called DAR (Discriminatively Aligned Rationalization) to align the selected rationale and the original input.
Experiments on two widely used real-world benchmarks show that the proposed method significantly improves the explanation quality.
- Score: 22.74436500022893
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rationalization empowers deep learning models with self-explaining
capabilities through a cooperative game, where a generator selects a
semantically consistent subset of the input as a rationale, and a subsequent
predictor makes predictions based on the selected rationale. In this paper, we
discover that rationalization is prone to a problem named \emph{rationale
shift}, which arises from the algorithmic bias of the cooperative game.
Rationale shift refers to a situation where the semantics of the selected
rationale may deviate from the original input, but the predictor still produces
accurate predictions based on the deviation, resulting in a compromised
generator with misleading feedback.
To address this issue, we first demonstrate the importance of the alignment
between the rationale and the full input through both empirical observations
and theoretical analysis. Subsequently, we introduce a novel approach called
DAR (\textbf{D}iscriminatively \textbf{A}ligned \textbf{R}ationalization),
which utilizes an auxiliary module pretrained on the full input to
discriminatively align the selected rationale and the original input. We
theoretically illustrate how DAR accomplishes the desired alignment, thereby
overcoming the rationale shift problem. The experiments on two widely used
real-world benchmarks show that the proposed method significantly improves the
explanation quality (measured by the overlap between the model-selected
explanation and the human-annotated rationale) as compared to state-of-the-art
techniques. Additionally, results on two synthetic settings further validate
the effectiveness of DAR in addressing the rationale shift problem.
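To make the setup concrete, the following is a minimal PyTorch-style sketch of the generator-predictor cooperative game with an added discriminative alignment term in the spirit of DAR; the module classes, the `aux_discriminator` callable, and the loss weights are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch of rationalization with a discriminative alignment term.
# Assumptions: token embeddings are precomputed, the generator outputs a
# soft binary mask over tokens, and a frozen auxiliary module pretrained on
# full inputs scores how "input-like" the masked text looks (hypothetical).
import torch
import torch.nn as nn

class ToyGenerator(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)

    def forward(self, x):                      # x: (batch, seq, dim)
        logits = self.scorer(x).squeeze(-1)    # (batch, seq)
        return torch.sigmoid(logits)           # soft selection mask in [0, 1]

class ToyPredictor(nn.Module):
    def __init__(self, dim, num_classes):
        super().__init__()
        self.clf = nn.Linear(dim, num_classes)

    def forward(self, x, mask):
        pooled = (x * mask.unsqueeze(-1)).mean(dim=1)  # masked mean pooling
        return self.clf(pooled)

def dar_style_loss(x, y, generator, predictor, aux_discriminator,
                   align_weight=1.0, sparsity_weight=0.1):
    """Task loss + alignment loss + sparsity regularizer (illustrative only)."""
    mask = generator(x)
    logits = predictor(x, mask)
    task_loss = nn.functional.cross_entropy(logits, y)
    # Alignment: the auxiliary module, pretrained on full inputs, should give
    # the masked input a high "matches the original input" score.
    align_score = aux_discriminator(x, mask)   # (batch,), higher is better
    align_loss = -align_score.mean()
    sparsity = mask.mean()                     # keep the rationale short
    return task_loss + align_weight * align_loss + sparsity_weight * sparsity
```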
Related papers
- Towards Trustworthy Explanation: On Causal Rationalization [9.48539398357156]
We propose a new model of rationalization based on two causal desiderata, non-spuriousness and efficiency.
The superior performance of the proposed causal rationalization is demonstrated on real-world review and medical datasets.
arXiv Detail & Related papers (2023-06-25T03:34:06Z)
- Decoupled Rationalization with Asymmetric Learning Rates: A Flexible Lipschitz Restraint [16.54547887989801]
A self-explaining rationalization model is generally constructed via a cooperative game in which a generator selects the most human-intelligible pieces of the input text as rationales, and a subsequent predictor makes predictions based on the selected rationales.
Such a cooperative game may incur the degeneration problem, where the predictor overfits to uninformative pieces produced by a not-yet-well-trained generator and, in turn, drives the generator to converge to a sub-optimal model that tends to select senseless pieces.
We empirically propose a simple but effective method named DR, which can naturally and flexibly restrain the Lipschitz constant of the predictor by assigning asymmetric (decoupled) learning rates to the generator and the predictor.
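As a rough illustration of the asymmetric-learning-rate idea suggested by the title and summary above, the sketch below builds decoupled optimizers in which the predictor learns more slowly than the generator; the optimizer choice and the ratio are assumptions, not the paper's settings.

```python
# Sketch: decoupled optimizers with asymmetric learning rates for the
# generator and the predictor of a rationalization model (illustrative).
import torch

def build_optimizers(generator, predictor, gen_lr=1e-3, pred_lr_ratio=0.1):
    # The predictor is updated with a smaller learning rate than the
    # generator, acting as a soft restraint on how sharply the predictor
    # can fit the noisy rationales of an early-stage generator.
    gen_opt = torch.optim.Adam(generator.parameters(), lr=gen_lr)
    pred_opt = torch.optim.Adam(predictor.parameters(), lr=gen_lr * pred_lr_ratio)
    return gen_opt, pred_opt
```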
arXiv Detail & Related papers (2023-05-23T02:01:13Z)
- Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
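A heavily simplified sketch of this adversarial calibration idea, assuming a hypothetical `discriminator` module that tries to tell apart feature vectors from the black-box model and the selector-predictor; the exact losses in the paper may differ.

```python
import torch
import torch.nn as nn

def calibration_losses(blackbox_feats, rationale_feats, discriminator):
    """Adversarial feature calibration between two jointly trained models.

    blackbox_feats / rationale_feats: (batch, dim) representations from the
    full-input model and the selector-predictor model; the discriminator
    outputs a logit for "came from the black-box model" (all hypothetical).
    """
    real_logits = discriminator(blackbox_feats.detach())
    fake_logits = discriminator(rationale_feats.detach())
    # Discriminator: distinguish the two feature distributions; its success
    # indicates missed or over-selected features in the rationale.
    d_loss = (nn.functional.binary_cross_entropy_with_logits(
                  real_logits, torch.ones_like(real_logits)) +
              nn.functional.binary_cross_entropy_with_logits(
                  fake_logits, torch.zeros_like(fake_logits)))
    # Selector-predictor: fool the discriminator, i.e. make rationale features
    # carry the same information as the black-box features.
    g_logits = discriminator(rationale_feats)
    g_loss = nn.functional.binary_cross_entropy_with_logits(
        g_logits, torch.ones_like(g_logits))
    return d_loss, g_loss
```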
arXiv Detail & Related papers (2023-01-15T03:13:09Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
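A schematic sketch of that consistency check, assuming hypothetical `nli_model` and `build_counterfactual` callables that stand in for the paper's actual components.

```python
def faithfulness_through_counterfactual(nli_model, premise, hypothesis,
                                        explanation, build_counterfactual):
    """Check whether an explanation is faithful via a counterfactual.

    `nli_model(premise, hypothesis)` returns a label in
    {"entailment", "contradiction", "neutral"}; `build_counterfactual`
    edits the hypothesis so the logical predicate in the explanation no
    longer holds and returns the label that logic would then imply
    (both callables are hypothetical stand-ins).
    """
    original_label = nli_model(premise, hypothesis)
    cf_hypothesis, expected_label = build_counterfactual(hypothesis, explanation)
    cf_label = nli_model(premise, cf_hypothesis)
    # Faithful explanation: the prediction changes exactly as the logic implies.
    return cf_label == expected_label, original_label, cf_label
```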
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- Understanding Interlocking Dynamics of Cooperative Rationalization [90.6863969334526]
Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output.
We reveal a major problem with this cooperative rationalization paradigm: model interlocking.
We propose a new rationalization framework, called A2R, which introduces a third component into the architecture, a predictor driven by soft attention as opposed to selection.
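A rough sketch of the three-component idea: a hard-selection predictor and a soft-attention predictor are trained on the same task and encouraged to agree, so the generator cannot interlock with a predictor that only works on degenerate selections; the agreement term and module names are assumptions.

```python
import torch
import torch.nn as nn

def a2r_style_loss(x, y, generator, hard_predictor, soft_predictor,
                   agree_weight=1.0):
    """Sketch of a selection predictor plus a soft-attention predictor."""
    scores = generator(x)                       # (batch, seq), assumed in [0, 1]
    hard_mask = (scores > 0.5).float()          # hard rationale (no straight-through
                                                # estimator here; illustrative only)
    attn = torch.softmax(scores, dim=-1)        # soft attention over tokens
    hard_logits = hard_predictor(x, hard_mask)
    soft_logits = soft_predictor(x, attn)
    task_loss = (nn.functional.cross_entropy(hard_logits, y) +
                 nn.functional.cross_entropy(soft_logits, y))
    # Encourage the two predictors' output distributions to agree.
    agree = nn.functional.kl_div(
        nn.functional.log_softmax(hard_logits, dim=-1),
        nn.functional.softmax(soft_logits, dim=-1),
        reduction="batchmean")
    return task_loss + agree_weight * agree
```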
arXiv Detail & Related papers (2021-10-26T17:39:18Z)
- Rationales for Sequential Predictions [117.93025782838123]
Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain.
We consider model explanations through rationales: subsets of the context that can explain individual model predictions.
We propose an efficient greedy algorithm to approximate the combinatorial objective of finding the smallest such subset.
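A generic greedy token-selection sketch in that spirit (not the paper's exact objective), where `score_fn` is a hypothetical callable returning the model's confidence in its original prediction given only the selected tokens.

```python
def greedy_rationale(tokens, score_fn, budget):
    """Greedily pick a subset of context tokens as a rationale.

    `score_fn(selected_indices)` returns the model's probability of its
    original prediction when only the selected tokens are kept (hypothetical).
    """
    selected = []
    remaining = list(range(len(tokens)))
    while remaining and len(selected) < budget:
        # Add the token that most increases the prediction score.
        best_idx, best_score = None, float("-inf")
        for i in remaining:
            score = score_fn(selected + [i])
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return sorted(selected)
```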
arXiv Detail & Related papers (2021-09-14T01:25:15Z)
- SPECTRA: Sparse Structured Text Rationalization [0.0]
We present a unified framework for deterministic extraction of structured explanations via constrained inference on a factor graph.
Our approach greatly eases training and rationale regularization, generally outperforming previous work on the plausibility of the extracted explanations.
arXiv Detail & Related papers (2021-09-09T20:39:56Z)
- Distribution Matching for Rationalization [30.54889533406428]
Rationalization aims to extract pieces of the input text as rationales to justify neural network predictions on text classification tasks.
We propose a novel rationalization method that matches the distributions of rationales and input text in both the feature space and output space.
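A minimal sketch of matching rationale and full-input statistics in both feature space and output space; the particular distances used here (mean-feature MSE and an output KL term) are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

def distribution_matching_loss(full_feats, rationale_feats,
                               full_logits, rationale_logits,
                               feat_weight=1.0, out_weight=1.0):
    """Penalize mismatch between rationale and full-text representations."""
    # Feature-space matching: align batch-mean feature statistics.
    feat_loss = nn.functional.mse_loss(rationale_feats.mean(dim=0),
                                       full_feats.mean(dim=0))
    # Output-space matching: rationale predictions should follow the
    # distribution produced from the full input.
    out_loss = nn.functional.kl_div(
        nn.functional.log_softmax(rationale_logits, dim=-1),
        nn.functional.softmax(full_logits, dim=-1),
        reduction="batchmean")
    return feat_weight * feat_loss + out_weight * out_loss
```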
arXiv Detail & Related papers (2021-06-01T08:49:32Z)
- An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction [84.49035467829819]
We show that the trade-off between rationale conciseness and end-task performance can be better managed by optimizing a bound on the Information Bottleneck (IB) objective.
Our fully unsupervised approach jointly learns an explainer that predicts sparse binary masks over sentences, and an end-task predictor that considers only the extracted rationale.
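A rough sketch of this conciseness control, assuming the sentence masks follow Bernoulli distributions whose KL divergence to a sparse prior is penalized alongside the task loss; the exact IB bound in the paper may differ.

```python
import torch

def ib_style_loss(task_loss, mask_probs, prior_pi=0.1, beta=1.0):
    """Task loss plus a KL term pushing sentence-mask probabilities toward a
    sparse Bernoulli prior with keep-probability `prior_pi` (illustrative)."""
    p = mask_probs.clamp(1e-6, 1 - 1e-6)        # (batch, num_sentences)
    # KL( Bernoulli(p) || Bernoulli(prior_pi) ), elementwise.
    kl = (p * torch.log(p / prior_pi) +
          (1 - p) * torch.log((1 - p) / (1 - prior_pi)))
    return task_loss + beta * kl.sum(dim=-1).mean()
```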
arXiv Detail & Related papers (2020-05-01T23:26:41Z)
- Invariant Rationalization [84.1861516092232]
A typical rationalization criterion, i.e. maximum mutual information (MMI), finds the rationale that maximizes the prediction performance based only on the rationale.
We introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments.
We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments.
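A schematic, IRM-flavoured stand-in for the invariance idea: the variance of the predictor's loss across environments is penalized, so rationales that rely on environment-specific (spurious) correlations are discouraged. This is an assumption-laden simplification of the paper's game-theoretic criterion.

```python
import torch

def invariance_penalty(per_env_losses, weight=1.0):
    """Penalize rationales whose predictive loss varies across environments.

    `per_env_losses`: list of scalar losses, one per training environment,
    computed by the same predictor on the selected rationales (illustrative).
    """
    losses = torch.stack(per_env_losses)
    mean_loss = losses.mean()
    # If the rationale relies on spurious, environment-specific correlations,
    # its loss differs across environments and the variance term grows.
    return mean_loss + weight * losses.var(unbiased=False)
```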
arXiv Detail & Related papers (2020-03-22T00:50:27Z)