You Only Forward Once: Prediction and Rationalization in A Single
Forward Pass
- URL: http://arxiv.org/abs/2311.02344v1
- Date: Sat, 4 Nov 2023 08:04:28 GMT
- Title: You Only Forward Once: Prediction and Rationalization in A Single
Forward Pass
- Authors: Han Jiang, Junwen Duan, Zhe Qu, and Jianxin Wang
- Abstract summary: Unsupervised rationale extraction aims to extract concise and contiguous text snippets to support model predictions without any annotated rationale.
Previous studies have used a two-phase framework known as the Rationalizing Neural Prediction (RNP) framework, which follows a generate-then-predict paradigm.
We propose a novel single-phase framework called You Only Forward Once (YOFO), derived from a relaxed version of rationale where rationales aim to support model predictions rather than make predictions.
- Score: 10.998983921416533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised rationale extraction aims to extract concise and contiguous text
snippets to support model predictions without any annotated rationale. Previous
studies have used a two-phase framework known as the Rationalizing Neural
Prediction (RNP) framework, which follows a generate-then-predict paradigm.
They assumed that the extracted explanation, called the rationale, should be
sufficient to predict the gold label. However, this assumption deviates
from the original definition and is too strict to perform well. Furthermore,
these two-phase models suffer from the interlocking problem and spurious
correlations. To solve the above problems, we propose a novel single-phase
framework called You Only Forward Once (YOFO), derived from a relaxed version
of rationale where rationales aim to support model predictions rather than make
predictions. In our framework, a pre-trained language model such as BERT is
deployed to perform prediction and rationalization simultaneously, with less
impact from interlocking or spurious correlations. Because directly selecting
the important tokens in an unsupervised manner is intractable, YOFO instead
gradually removes unimportant tokens during forward propagation. Through
experiments on the BeerAdvocate and Hotel Review
datasets, we demonstrate that our model is able to extract rationales and make
predictions more accurately than RNP-based models, with an improvement of up to
18.4% in token-level F1 over previous state-of-the-art methods. We also conduct
analyses and experiments to explore the extracted rationales and token-decay
strategies. The results show that YOFO
can extract precise and important rationales while removing unimportant tokens
in the middle part of the model.
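To make the single-pass idea concrete, below is a minimal, hypothetical sketch of prediction with gradual token removal ("token decay"). It is not the authors' released YOFO implementation: the importance score (similarity of each token's hidden state to the [CLS] state), the keep ratio, and the per-layer schedule are illustrative assumptions. Tokens that survive to the last layer are read off as the rationale, while the [CLS] representation drives the prediction.

```python
# Minimal sketch of single-pass prediction with gradual token removal.
# Illustrative assumptions only: scores, keep ratio, and schedule are made up.
import torch
import torch.nn as nn

class TokenDecayEncoder(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, n_heads=4,
                 n_layers=4, n_classes=2, keep_ratio=0.7):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers)
        ])
        self.classifier = nn.Linear(d_model, n_classes)
        self.keep_ratio = keep_ratio  # fraction of non-[CLS] tokens kept per layer

    def forward(self, input_ids):
        # input_ids: (1, seq_len); position 0 plays the role of [CLS].
        h = self.embed(input_ids)
        kept = torch.arange(input_ids.size(1))           # surviving token positions
        for layer in self.layers:
            h = layer(h)
            # Importance proxy: similarity of each token state to the [CLS] state.
            scores = (h[0, 1:] @ h[0, 0]).detach()
            k = max(1, int(self.keep_ratio * scores.numel()))
            top = scores.topk(k).indices + 1             # +1 skips the [CLS] slot
            keep_idx = torch.cat([torch.tensor([0]), top.sort().values])
            h = h[:, keep_idx]                           # drop unimportant tokens
            kept = kept[keep_idx]
        logits = self.classifier(h[:, 0])                # predict from [CLS]
        return logits, kept[1:]                          # survivors = rationale

# One forward pass yields both the prediction and the rationale positions.
model = TokenDecayEncoder()
ids = torch.randint(0, 30522, (1, 32))
logits, rationale_positions = model(ids)
```

The returned positions index into the original sequence, so constraints such as contiguity or sparsity would be enforced by the decay schedule rather than by a separate generator module.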
Related papers
- Plausible Extractive Rationalization through Semi-Supervised Entailment Signal [29.67884478799914]
We take a semi-supervised approach to optimize for the plausibility of extracted rationales.
We adopt a pre-trained natural language inference (NLI) model and further fine-tune it on a small set of supervised rationales.
We show that, by enforcing the alignment agreement between the explanation and answer in a question-answering task, the performance can be improved without access to ground truth labels.
arXiv Detail & Related papers (2024-02-13T14:12:32Z) - Unsupervised Selective Rationalization with Noise Injection [7.17737088382948]
Unsupervised selective rationalization produces rationales alongside predictions by chaining two jointly-trained components, a rationale generator and a predictor (a minimal sketch of this two-phase setup appears after this list).
We introduce a novel training technique that effectively limits generation of implausible rationales by injecting noise between the generator and the predictor.
We achieve sizeable improvements in rationale plausibility and task accuracy over the state-of-the-art across a variety of tasks, including our new benchmark.
arXiv Detail & Related papers (2023-05-27T17:34:36Z) - Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
arXiv Detail & Related papers (2023-01-15T03:13:09Z) - Extracting or Guessing? Improving Faithfulness of Event Temporal
Relation Extraction [87.04153383938969]
We improve the faithfulness of TempRel extraction models from two perspectives.
The first perspective is to extract genuinely based on contextual description.
The second perspective is to provide proper uncertainty estimation.
arXiv Detail & Related papers (2022-10-10T19:53:13Z) - Neuro-Symbolic Entropy Regularization [78.16196949641079]
In structured prediction, the goal is to jointly predict many output variables that together encode a structured object.
One approach -- entropy regularization -- posits that decision boundaries should lie in low-probability regions.
We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object.
arXiv Detail & Related papers (2022-01-25T06:23:10Z) - Understanding Interlocking Dynamics of Cooperative Rationalization [90.6863969334526]
Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output.
We reveal a major problem with such cooperative rationalization paradigm -- model interlocking.
We propose a new rationalization framework, called A2R, which introduces a third component into the architecture, a predictor driven by soft attention as opposed to selection.
arXiv Detail & Related papers (2021-10-26T17:39:18Z) - Rationales for Sequential Predictions [117.93025782838123]
Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain.
We consider model explanations through rationales, subsets of context that can explain individual model predictions.
We propose an efficient greedy algorithm to approximate this objective.
arXiv Detail & Related papers (2021-09-14T01:25:15Z) - Learning from the Best: Rationalizing Prediction by Adversarial
Information Calibration [39.685626118667074]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial-based technique to calibrate the information extracted by the two models.
For natural language tasks, we propose to use a language-model-based regularizer to encourage the extraction of fluent rationales.
arXiv Detail & Related papers (2020-12-16T11:54:15Z) - Understanding Neural Abstractive Summarization Models via Uncertainty [54.37665950633147]
Seq2seq abstractive summarization models generate text in a free-form manner.
We study the entropy, or uncertainty, of the model's token-level predictions.
We show that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly.
arXiv Detail & Related papers (2020-10-15T16:57:27Z)