Understanding Interlocking Dynamics of Cooperative Rationalization
- URL: http://arxiv.org/abs/2110.13880v1
- Date: Tue, 26 Oct 2021 17:39:18 GMT
- Title: Understanding Interlocking Dynamics of Cooperative Rationalization
- Authors: Mo Yu, Yang Zhang, Shiyu Chang, Tommi S. Jaakkola
- Abstract summary: Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output.
We reveal a major problem with such a cooperative rationalization paradigm -- model interlocking.
We propose a new rationalization framework, called A2R, which introduces a third component into the architecture, a predictor driven by soft attention as opposed to selection.
- Score: 90.6863969334526
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Selective rationalization explains the prediction of complex neural networks
by finding a small subset of the input that is sufficient to predict the neural
model output. The selection mechanism is commonly integrated into the model
itself by specifying a two-component cascaded system consisting of a rationale
generator, which makes a binary selection of the input features (the rationale),
and a predictor, which predicts the output based only on the
selected features. The components are trained jointly to optimize prediction
performance. In this paper, we reveal a major problem with such a cooperative
rationalization paradigm -- model interlocking. Interlocking arises when the
predictor overfits to the features selected by the generator, thus reinforcing
the generator's selection even if the selected rationales are sub-optimal. The
fundamental cause of the interlocking problem is that the rationalization
objective to be minimized is concave with respect to the generator's selection
policy. We propose a new rationalization framework, called A2R, which
introduces a third component into the architecture, a predictor driven by soft
attention as opposed to selection. The generator now realizes both soft and
hard attention over the features and these are fed into the two different
predictors. While the generator still seeks to support the original predictor
performance, it also minimizes a gap between the two predictors. As we will
show theoretically, since the attention-based predictor exhibits a better
convexity property, A2R can overcome the concavity barrier. Our experiments on
two synthetic benchmarks and two real datasets demonstrate that A2R can
significantly alleviate the interlocking problem and find explanations that better
align with human judgments. We release our code at
https://github.com/Gorov/Understanding_Interlocking.
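To make the architecture above concrete, here is a minimal sketch of the three-component setup, assuming PyTorch; the encoder sizes, the straight-through hard selection, and the JS-divergence gap term are illustrative assumptions rather than the authors' released implementation (see the repository above for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class A2RSketch(nn.Module):
    """Three components: one generator, a selection-based predictor, and an
    attention-based predictor (illustrative, not the released code)."""

    def __init__(self, vocab_size=1000, emb_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gen_scorer = nn.Linear(emb_dim, 1)   # per-token selection logit
        self.hard_predictor = nn.Linear(emb_dim, num_classes)  # sees hard-selected tokens
        self.soft_predictor = nn.Linear(emb_dim, num_classes)  # sees soft-attended input

    def forward(self, tokens):
        emb = self.embed(tokens)                   # (batch, seq, emb_dim)
        logits = self.gen_scorer(emb).squeeze(-1)  # (batch, seq)
        soft_attn = torch.softmax(logits, dim=-1)  # soft attention weights
        probs = torch.sigmoid(logits)
        # Straight-through trick: binary mask in the forward pass,
        # soft gradients in the backward pass.
        hard_mask = (probs > 0.5).float() + probs - probs.detach()
        hard_repr = (emb * hard_mask.unsqueeze(-1)).mean(dim=1)
        soft_repr = (emb * soft_attn.unsqueeze(-1)).sum(dim=1)
        return self.hard_predictor(hard_repr), self.soft_predictor(soft_repr)

def a2r_loss(hard_logits, soft_logits, labels, gap_weight=1.0):
    """Support both predictors on the task and close the gap between them."""
    ce = F.cross_entropy(hard_logits, labels) + F.cross_entropy(soft_logits, labels)
    p = F.softmax(hard_logits, dim=-1)
    q = F.softmax(soft_logits, dim=-1)
    m = 0.5 * (p + q)
    js = 0.5 * (F.kl_div(m.log(), p, reduction="batchmean")
                + F.kl_div(m.log(), q, reduction="batchmean"))
    return ce + gap_weight * js

model = A2RSketch()
tokens = torch.randint(0, 1000, (4, 12))
labels = torch.randint(0, 2, (4,))
loss = a2r_loss(*model(tokens), labels)
loss.backward()
```

The gap term is what distinguishes this setup from the plain two-component cascade: the generator can no longer settle on a selection that only an overfitted selection-based predictor likes, because the attention-based predictor must agree with it.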
Related papers
- Enhancing the Rationale-Input Alignment for Self-explaining Rationalization [22.74436500022893]
We introduce a novel approach called DAR (Discriminatively Aligned Rationalization) to align the selected rationale and the original input.
Experiments on two widely used real-world benchmarks show that the proposed method significantly improves the explanation quality.
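The summary leaves the alignment mechanism abstract; as a rough illustration only, the sketch below substitutes a simple cosine-similarity alignment term for the paper's discriminative alignment, so the loss form and encoder setup are hypothetical.

```python
import torch
import torch.nn.functional as F

def alignment_loss(full_repr, rationale_repr):
    """Cosine stand-in for discriminative alignment: pull the rationale's
    encoding toward the full input's encoding (hypothetical form)."""
    return 1.0 - F.cosine_similarity(full_repr, rationale_repr, dim=-1).mean()

# full_repr / rationale_repr would come from encoding the raw input and the
# generator-selected subset with a shared encoder.
full_repr = torch.randn(4, 64)
rationale_repr = torch.randn(4, 64, requires_grad=True)
alignment_loss(full_repr, rationale_repr).backward()
```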
arXiv Detail & Related papers (2023-12-07T07:37:15Z)
- Unsupervised Selective Rationalization with Noise Injection [7.17737088382948]
Unsupervised selective rationalization produces rationales alongside predictions by chaining two jointly trained components, a rationale generator and a predictor.
We introduce a novel training technique that effectively limits generation of implausible rationales by injecting noise between the generator and the predictor.
We achieve sizeable improvements in rationale plausibility and task accuracy over the state-of-the-art across a variety of tasks, including our new benchmark.
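A minimal sketch of the noise-injection idea, assuming the noise takes the form of random flips applied to the generator's binary mask before the predictor sees it; the flip-based corruption and its rate are assumptions, not necessarily the paper's exact mechanism.

```python
import torch

def inject_mask_noise(rationale_mask, flip_prob=0.1):
    """Randomly flip a fraction of the generator's binary selections before
    the predictor sees them, so the predictor cannot lock onto a single
    implausible selection pattern."""
    flips = (torch.rand_like(rationale_mask) < flip_prob).float()
    return (1.0 - flips) * rationale_mask + flips * (1.0 - rationale_mask)

mask = torch.tensor([[1.0, 0.0, 1.0, 1.0, 0.0]])
noisy_mask = inject_mask_noise(mask, flip_prob=0.2)  # some entries flipped at random
```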
arXiv Detail & Related papers (2023-05-27T17:34:36Z)
- Decoupled Rationalization with Asymmetric Learning Rates: A Flexible Lipschitz Restraint [16.54547887989801]
A self-explaining rationalization model is generally constructed through a cooperative game in which a generator selects the most human-intelligible pieces of the input text as rationales, followed by a predictor that makes predictions based on the selected rationales.
Such a cooperative game may incur the degeneration problem, where the predictor overfits to uninformative pieces produced by a not-yet-well-trained generator and, in turn, drives the generator to converge to a sub-optimal model that tends to select senseless pieces.
We propose a simple but effective empirical method named DR, which can naturally and flexibly restrain the Lipschitz constant of the predictor.
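A minimal sketch of the decoupling idea under one plausible reading: the predictor is updated with a smaller learning rate than the generator, so it cannot race ahead and overfit to early, uninformative selections. The 10x ratio and optimizer choice below are illustrative assumptions.

```python
import torch

# Hypothetical generator and predictor modules.
generator = torch.nn.Linear(64, 1)
predictor = torch.nn.Linear(64, 2)

# Asymmetric learning rates via optimizer parameter groups: the predictor
# updates more slowly than the generator (the 10x ratio is illustrative).
optimizer = torch.optim.Adam([
    {"params": generator.parameters(), "lr": 1e-3},
    {"params": predictor.parameters(), "lr": 1e-4},
])
```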
arXiv Detail & Related papers (2023-05-23T02:01:13Z)
- Rationalizing Predictions by Adversarial Information Calibration [65.19407304154177]
We train two models jointly: one is a typical neural model that solves the task at hand in an accurate but black-box manner, and the other is a selector-predictor model that additionally produces a rationale for its prediction.
We use an adversarial technique to calibrate the information extracted by the two models such that the difference between them is an indicator of the missed or over-selected features.
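A hedged, GAN-style sketch of the calibration step, with placeholder feature tensors standing in for the representations extracted by the two models; the discriminator shape and loss pairing are assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical components: features from the accurate black-box model and
# from the selector-predictor, plus a discriminator to tell them apart.
feat_dim, batch = 32, 8
blackbox_feats = torch.randn(batch, feat_dim)
rationale_feats = torch.randn(batch, feat_dim, requires_grad=True)

discriminator = nn.Sequential(nn.Linear(feat_dim, 16), nn.ReLU(), nn.Linear(16, 1))
bce = nn.BCEWithLogitsLoss()

# Discriminator step: learn to separate the two feature sources.
d_loss = (bce(discriminator(blackbox_feats), torch.ones(batch, 1))
          + bce(discriminator(rationale_feats.detach()), torch.zeros(batch, 1)))

# Calibration step: the selector-predictor tries to make its extracted
# features indistinguishable from the black-box model's.
g_loss = bce(discriminator(rationale_feats), torch.ones(batch, 1))
```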
arXiv Detail & Related papers (2023-01-15T03:13:09Z)
- Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction [97.99938802797377]
Click-through rate (CTR) prediction, whose goal is to predict the probability that a user will click on an item, has become increasingly significant in recommender systems.
Recent deep learning models that automatically extract user interest from user behavior have achieved great success.
We propose a novel approach under the framework of the wrapper method, which is named Meta-Wrapper.
arXiv Detail & Related papers (2022-06-28T03:28:15Z)
- Holistic Transformer: A Joint Neural Network for Trajectory Prediction and Decision-Making of Autonomous Vehicles [15.024503096898634]
Trajectory prediction and behavioral decision-making are important tasks for autonomous vehicles.
A joint neural network that combines multiple cues is proposed to predict trajectories and make behavioral decisions simultaneously.
arXiv Detail & Related papers (2022-06-17T14:38:11Z)
- Rationales for Sequential Predictions [117.93025782838123]
Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain.
We consider model explanations through rationales: subsets of the context that can explain individual model predictions.
We formalize finding such rationales as a combinatorial optimization problem and propose an efficient greedy algorithm to approximate this objective.
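A small sketch of greedy rationale search under stated assumptions: score_fn is a hypothetical callable returning the model's probability of its own prediction given a candidate subset of context positions, and the fixed budget stands in for the paper's stopping criterion.

```python
def greedy_rationale(score_fn, context_len, budget=5):
    """Greedily add the context position whose inclusion most raises the
    model's probability of its own prediction (a fixed budget stands in
    for the paper's stopping rule)."""
    rationale = set()
    for _ in range(budget):
        candidates = [i for i in range(context_len) if i not in rationale]
        best = max(candidates, key=lambda i: score_fn(rationale | {i}))
        rationale.add(best)
    return sorted(rationale)

# Toy score function in which positions 2 and 7 carry almost all the signal.
toy_score = lambda subset: sum(1.0 if i in (2, 7) else 0.01 for i in subset)
print(greedy_rationale(toy_score, context_len=10, budget=2))  # [2, 7]
```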
arXiv Detail & Related papers (2021-09-14T01:25:15Z)
- On the Reproducibility of Neural Network Predictions [52.47827424679645]
We study the problem of churn, i.e., disagreement between the predictions of independently retrained models, identify factors that cause it, and propose two simple means of mitigating it.
We first demonstrate that churn is indeed an issue, even for standard image classification tasks.
We propose using minimum entropy regularizers to increase prediction confidences.
We present empirical results showing the effectiveness of both techniques in reducing churn while improving the accuracy of the underlying model.
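A minimal sketch of a minimum entropy regularizer: adding the mean predictive entropy to the training loss rewards confident predictions, which in turn reduces churn across retrains. The 0.1 weight is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def entropy_regularizer(logits):
    """Mean predictive entropy; adding it to the loss pushes the model
    toward more confident (lower-entropy) predictions."""
    log_probs = F.log_softmax(logits, dim=-1)
    return -(log_probs.exp() * log_probs).sum(dim=-1).mean()

logits = torch.randn(4, 10, requires_grad=True)
labels = torch.randint(0, 10, (4,))
loss = F.cross_entropy(logits, labels) + 0.1 * entropy_regularizer(logits)
loss.backward()
```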
arXiv Detail & Related papers (2021-02-05T18:51:01Z)
- Invariant Rationalization [84.1861516092232]
A typical rationalization criterion, i.e., maximum mutual information (MMI), finds the rationale that maximizes prediction performance when the prediction is based only on the rationale. However, MMI can be problematic because it tends to pick up spurious correlations between the input features and the output.
We introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments.
We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments.
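A rough rendering of the invariance constraint as a loss term, assuming two predictors: an environment-agnostic one and an environment-aware one. The exact game in the paper differs in detail, so treat the hinge form and weighting as assumptions.

```python
import torch

def invrat_generator_loss(loss_env_agnostic, loss_env_aware, lam=1.0):
    """Generator objective sketch: do well with the environment-agnostic
    predictor while keeping it as good as an environment-aware one; a
    positive gap signals the rationale still leaks environment-specific
    (spurious) information."""
    gap = torch.relu(loss_env_agnostic - loss_env_aware)
    return loss_env_agnostic + lam * gap

la = torch.tensor(0.9)  # loss of the shared, environment-agnostic predictor
le = torch.tensor(0.6)  # loss of the environment-aware predictor
print(invrat_generator_loss(la, le, lam=1.0))  # 0.9 + 1.0 * 0.3
```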
arXiv Detail & Related papers (2020-03-22T00:50:27Z)