You Only Forward Once: Prediction and Rationalization in A Single
Forward Pass
- URL: http://arxiv.org/abs/2311.02344v1
- Date: Sat, 4 Nov 2023 08:04:28 GMT
- Title: You Only Forward Once: Prediction and Rationalization in A Single
Forward Pass
- Authors: Han Jiang, Junwen Duan, Zhe Qu, and Jianxin Wang
- Abstract summary: Unsupervised rationale extraction aims to extract concise and contiguous text snippets to support model predictions without any annotated rationale.
Previous studies have used a two-phase framework known as the Rationalizing Neural Prediction (RNP) framework, which follows a generate-then-predict paradigm.
We propose a novel single-phase framework called You Only Forward Once (YOFO), derived from a relaxed version of rationale where rationales aim to support model predictions rather than make predictions.
- Score: 10.998983921416533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised rationale extraction aims to extract concise and contiguous text
snippets to support model predictions without any annotated rationale. Previous
studies have used a two-phase framework known as the Rationalizing Neural
Prediction (RNP) framework, which follows a generate-then-predict paradigm.
They assumed that the extracted explanation, called the rationale, should be
sufficient to predict the gold label. However, this assumption deviates
from the original definition and is too strict to perform well. Furthermore,
these two-phase models suffer from the interlocking problem and spurious
correlations. To solve the above problems, we propose a novel single-phase
framework called You Only Forward Once (YOFO), derived from a relaxed version
of rationale where rationales aim to support model predictions rather than make
predictions. In our framework, a pre-trained language model such as BERT is
deployed to perform prediction and rationalization simultaneously, with less
impact from interlocking or spurious correlations. Because directly selecting
the important tokens in an unsupervised manner is intractable, YOFO instead
gradually removes unimportant tokens during forward propagation. Through
experiments on the BeerAdvocate and Hotel Review
datasets, we demonstrate that our model is able to extract rationales and make
predictions more accurately than RNP-based models, with an improvement of up to
18.4% in token-level F1 over previous state-of-the-art methods. We also conduct
analyses and experiments to explore the extracted rationales and token-decay
strategies. The results show that YOFO
can extract precise and important rationales while removing unimportant tokens
in the middle part of the model.
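To make the single-pass idea concrete, below is a minimal, hypothetical sketch of prediction with gradual token removal ("token decay"). It is not the authors' released YOFO implementation: the importance score (similarity of each token's hidden state to the [CLS] state), the keep ratio, and the per-layer schedule are illustrative assumptions. Tokens that survive to the last layer are read off as the rationale, while the [CLS] representation drives the prediction.

```python
# Minimal sketch of single-pass prediction with gradual token removal.
# Illustrative assumptions only: scores, keep ratio, and schedule are made up.
import torch
import torch.nn as nn

class TokenDecayEncoder(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, n_heads=4,
                 n_layers=4, n_classes=2, keep_ratio=0.7):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers)
        ])
        self.classifier = nn.Linear(d_model, n_classes)
        self.keep_ratio = keep_ratio  # fraction of non-[CLS] tokens kept per layer

    def forward(self, input_ids):
        # input_ids: (1, seq_len); position 0 plays the role of [CLS].
        h = self.embed(input_ids)
        kept = torch.arange(input_ids.size(1))           # surviving token positions
        for layer in self.layers:
            h = layer(h)
            # Importance proxy: similarity of each token state to the [CLS] state.
            scores = (h[0, 1:] @ h[0, 0]).detach()
            k = max(1, int(self.keep_ratio * scores.numel()))
            top = scores.topk(k).indices + 1             # +1 skips the [CLS] slot
            keep_idx = torch.cat([torch.tensor([0]), top.sort().values])
            h = h[:, keep_idx]                           # drop unimportant tokens
            kept = kept[keep_idx]
        logits = self.classifier(h[:, 0])                # predict from [CLS]
        return logits, kept[1:]                          # survivors = rationale

# One forward pass yields both the prediction and the rationale positions.
model = TokenDecayEncoder()
ids = torch.randint(0, 30522, (1, 32))
logits, rationale_positions = model(ids)
```

The returned positions index into the original sequence, so constraints such as contiguity or sparsity would be enforced by the decay schedule rather than by a separate generator module.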
Related papers
- Plausible Extractive Rationalization through Semi-Supervised Entailment Signal [29.67884478799914]
We take a semi-supervised approach to optimize for the plausibility of extracted rationales.
We adopt a pre-trained natural language inference (NLI) model and further fine-tune it on a small set of supervised rationales.
We show that, by enforcing the alignment agreement between the explanation and answer in a question-answering task, the performance can be improved without access to ground truth labels.
arXiv Detail & Related papers (2024-02-13T14:12:32Z) - Unsupervised Selective Rationalization with Noise Injection [7.17737088382948]
Unsupervised selective rationalization produces rationales alongside predictions by chaining two jointly-trained components, a rationale generator and a predictor (a minimal sketch of this two-phase setup appears after this list).
We introduce a novel training technique that effectively limits generation of implausible rationales by injecting noise between the generator and the predictor.
We achieve sizeable improvements in rationale plausibility and task accuracy over the state-of-the-art across a variety of tasks, including our new benchmark.
arXiv Detail & Related papers (2023-05-27T17:34:36Z) - Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
arXiv Detail & Related papers (2023-01-15T03:13:09Z) - Extracting or Guessing? Improving Faithfulness of Event Temporal
Relation Extraction [87.04153383938969]
We improve the faithfulness of TempRel extraction models from two perspectives.
The first perspective is to extract genuinely based on contextual description.
The second perspective is to provide proper uncertainty estimation.
arXiv Detail & Related papers (2022-10-10T19:53:13Z) - Neuro-Symbolic Entropy Regularization [78.16196949641079]
In structured prediction, the goal is to jointly predict many output variables that together encode a structured object.
One approach -- entropy regularization -- posits that decision boundaries should lie in low-probability regions.
We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object.
arXiv Detail & Related papers (2022-01-25T06:23:10Z) - Understanding Interlocking Dynamics of Cooperative Rationalization [90.6863969334526]
Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output.
We reveal a major problem with such cooperative rationalization paradigm -- model interlocking.
We propose a new rationalization framework, called A2R, which introduces a third component into the architecture, a predictor driven by soft attention as opposed to selection.
arXiv Detail & Related papers (2021-10-26T17:39:18Z) - Rationales for Sequential Predictions [117.93025782838123]
Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain.
We consider model explanations through rationales, subsets of context that can explain individual model predictions.
We propose an efficient greedy algorithm to approximate this objective.
arXiv Detail & Related papers (2021-09-14T01:25:15Z) - Learning from the Best: Rationalizing Prediction by Adversarial
Information Calibration [39.685626118667074]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial-based technique to calibrate the information extracted by the two models.
For natural language tasks, we propose to use a language-model-based regularizer to encourage the extraction of fluent rationales.
arXiv Detail & Related papers (2020-12-16T11:54:15Z) - Understanding Neural Abstractive Summarization Models via Uncertainty [54.37665950633147]
Seq2seq abstractive summarization models generate text in a free-form manner.
We study the entropy, or uncertainty, of the model's token-level predictions.
We show that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly.
arXiv Detail & Related papers (2020-10-15T16:57:27Z)