Related papers: Learnable Game-theoretic Policy Optimization for Data-centric Self-explanation Rationalization

URL: http://arxiv.org/abs/2510.13393v1
Date: Wed, 15 Oct 2025 10:42:52 GMT
Title: Learnable Game-theoretic Policy Optimization for Data-centric Self-explanation Rationalization
Authors: Yunxiao Zhao, Zhiqiang Wang, Xingtong Yu, Xiaoli Li, Jiye Liang, Ru Li
Abstract summary: We study a cooperative game model where a generator generates the most human-intelligible parts of the input and a predictor makes predictions based on these generated rationales. Conventional rationalization methods suffer from a problem called mode collapse, in which the predictor produces correct predictions yet the generator consistently outputs rationales with collapsed patterns. We propose a novel approach, Game-theoretic Policy Optimization oriented RATionalization, which introduces policy interventions to address the game equilibrium.
Score: 39.7708117567249
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Rationalization, a data-centric framework, aims to build self-explanatory models to explain the prediction outcome by generating a subset of human-intelligible pieces of the input data. It involves a cooperative game model where a generator generates the most human-intelligible parts of the input (i.e., rationales), followed by a predictor that makes predictions based on these generated rationales. Conventional rationalization methods typically impose constraints via regularization terms to calibrate or penalize undesired generation. However, these methods suffer from a problem called mode collapse, in which the predictor produces correct predictions yet the generator consistently outputs rationales with collapsed patterns. Moreover, existing studies are typically designed separately for specific collapsed patterns, lacking a unified consideration. In this paper, we systematically revisit cooperative rationalization from a novel game-theoretic perspective and identify the fundamental cause of this problem: the generator no longer tends to explore new strategies to uncover informative rationales, ultimately leading the system to converge to a suboptimal game equilibrium (correct predictions vs. collapsed rationales). To solve this problem, we then propose a novel approach, Game-theoretic Policy Optimization oriented RATionalization (PORAT), which progressively introduces policy interventions to address the game equilibrium in the cooperative game process, thereby guiding the model toward a more optimal solution state. We theoretically analyse the cause of such a suboptimal equilibrium and prove the feasibility of the proposed method. Furthermore, we validate our method on nine widely used real-world datasets and two synthetic settings, where PORAT achieves up to 8.1% performance improvements over existing state-of-the-art methods.
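The generator-predictor pipeline described in the abstract can be illustrated with a minimal sketch. This is not the PORAT method and involves no training or policy optimization; it only shows the data flow in which a generator selects a subset of input tokens (the rationale) and a predictor sees nothing but that subset. The lexicon `TOKEN_SCORES`, the top-k selection rule, and the function names `generator`, `predictor`, and `rationalize` are all hypothetical, chosen purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon; the values are illustrative and not taken from the paper.
TOKEN_SCORES: Dict[str, float] = {
    "great": 2.0,
    "awful": -2.0,
    "good": 1.0,
    "bad": -1.0,
}


def generator(tokens: List[str], k: int) -> List[int]:
    """Return a binary mask that selects the k tokens with the largest |score|.

    Stands in for a learned generator; here the "policy" is a fixed heuristic.
    """
    ranked = sorted(
        range(len(tokens)),
        key=lambda i: abs(TOKEN_SCORES.get(tokens[i], 0.0)),
        reverse=True,
    )
    chosen = set(ranked[:k])
    return [1 if i in chosen else 0 for i in range(len(tokens))]


def predictor(tokens: List[str], mask: List[int]) -> int:
    """Predict a binary label (1 = positive) from the selected tokens only."""
    total = sum(TOKEN_SCORES.get(t, 0.0) for t, m in zip(tokens, mask) if m)
    return 1 if total > 0 else 0


def rationalize(text: str, k: int = 2) -> Tuple[List[str], int]:
    """Run the generator, then the predictor, and return (rationale, label)."""
    tokens = text.lower().split()
    mask = generator(tokens, k)
    rationale = [t for t, m in zip(tokens, mask) if m]
    return rationale, predictor(tokens, mask)


if __name__ == "__main__":
    rationale, label = rationalize("the food was great but service was bad", k=2)
    print(rationale, label)
```

In this toy setting the predictor depends entirely on what the generator passes through, which is the coupling the abstract refers to: in a learned system, a generator that stops exploring alternative selections can leave the pair at an equilibrium where predictions are correct but rationales are uninformative.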

Related papers

The Stability of Online Algorithms in Performative Prediction [32.283056647528845]
We show that any no-regret algorithm deployed in performative settings converges to a (mixed) performatively stable equilibrium. Our work sheds light on why common algorithms, like gradient descent, are naturally stabilizing and prevent runaway feedback loops.
arXiv Detail & Related papers (2026-02-27T17:35:03Z)
Adversarial Attack for Explanation Robustness of Rationalization Models [17.839644167949906]
Rationalization models select a subset of input text as the rationale, which is crucial for humans to understand and trust predictions. This paper aims to undermine the explainability of rationalization models without altering their predictions, thereby eliciting distrust in these models from human users.
arXiv Detail & Related papers (2024-08-20T12:43:58Z)
Sequential Manipulation Against Rank Aggregation: Theory and Algorithm [119.57122943187086]
We leverage an online attack on the vulnerable data collection process. From the game-theoretic perspective, the confrontation scenario is formulated as a distributionally robust game. The proposed method manipulates the results of rank aggregation methods in a sequential manner.
arXiv Detail & Related papers (2024-07-02T03:31:21Z)
Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data. This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis. To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
Enhancing the Rationale-Input Alignment for Self-explaining Rationalization [22.74436500022893]
We introduce a novel approach called DAR (Discriminatively Aligned Rationalization) to align the selected rationale and the original input. Experiments on two widely used real-world benchmarks show that the proposed method significantly improves the explanation quality.
arXiv Detail & Related papers (2023-12-07T07:37:15Z)
Unsupervised Selective Rationalization with Noise Injection [7.17737088382948]
Unsupervised selective rationalization produces rationales alongside predictions by chaining two jointly trained components, a rationale generator and a predictor. We introduce a novel training technique that effectively limits generation of implausible rationales by injecting noise between the generator and the predictor. We achieve sizeable improvements in rationale plausibility and task accuracy over the state-of-the-art across a variety of tasks, including our new benchmark.
arXiv Detail & Related papers (2023-05-27T17:34:36Z)
Decoupled Rationalization with Asymmetric Learning Rates: A Flexible Lipschitz Restraint [16.54547887989801]
A self-explaining rationalization model is generally constructed as a cooperative game where a generator selects the most human-intelligible pieces from the input text as rationales, followed by a predictor that makes predictions based on the selected rationales. Such a cooperative game may incur the degeneration problem, where the predictor overfits to the uninformative pieces generated by a not yet well-trained generator and, in turn, leads the generator to converge to a sub-optimal model that tends to select senseless pieces. We empirically propose a simple but effective method named DR, which can naturally and flexibly restrain the Lipschitz constant of the
arXiv Detail & Related papers (2023-05-23T02:01:13Z)
Extension of Dynamic Mode Decomposition for dynamic systems with incomplete information based on t-model of optimal prediction [69.81996031777717]
The Dynamic Mode Decomposition has proved to be a very efficient technique to study dynamic data. The application of this approach becomes problematic if the available data is incomplete because some dimensions of smaller scale are either missing or unmeasured. We consider a first-order approximation of the Mori-Zwanzig decomposition, state the corresponding optimization problem and solve it with the gradient-based optimization method.
arXiv Detail & Related papers (2022-02-23T11:23:59Z)
Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points. Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters. We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions. We make robust and efficient counterfactual predictions for both individual and average treatment effects. The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.