Related papers: Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets

URL: http://arxiv.org/abs/2505.02118v5
Date: Wed, 06 Aug 2025 11:31:36 GMT
Title: Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets
Authors: Wei Liu, Zhongyu Niu, Lang Gao, Zhiying Deng, Jun Wang, Haozhao Wang, Ruixuan Li
Abstract summary: This study investigates the self-rationalization framework constructed with a cooperative game. We first uncover a potential caveat: such a cooperative game could unintentionally introduce a sampling bias during rationale extraction.
Score: 16.29120098985359
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This study investigates the self-rationalization framework constructed with a cooperative game, where a generator initially extracts the most informative segment from raw input, and a subsequent predictor utilizes the selected subset for its input. The generator and predictor are trained collaboratively to maximize prediction accuracy. In this paper, we first uncover a potential caveat: such a cooperative game could unintentionally introduce a sampling bias during rationale extraction. Specifically, the generator might inadvertently create an incorrect correlation between the selected rationale candidate and the label, even when they are semantically unrelated in the original dataset. Subsequently, we elucidate the origins of this bias using both detailed theoretical analysis and empirical evidence. Our findings suggest a direction for inspecting these correlations through attacks, based on which we further introduce an instruction to prevent the predictor from learning the correlations. Through experiments on six text classification datasets and two graph classification datasets using three network architectures (GRUs, BERT, and GCN), we show that our method not only significantly outperforms recent rationalization methods, but also achieves comparable or even better results than a representative LLM (llama3.1-8b-instruct).
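The sampling bias described in the abstract can be illustrated with a toy sketch. This is not the authors' method or code: the four-sentence dataset, the `generator` rule, and the `label_given_token` helper are all hypothetical, chosen only to show how a selection step can make a token look predictive even though it carries no label information in the full inputs.

```python
from collections import Counter

# Hypothetical toy dataset: every review contains both "the" and "a",
# so neither token says anything about the label in the full input.
DATA = [
    ("the film was a good one", 1),
    ("a good plot and the cast", 1),
    ("the film was a bad one", 0),
    ("a bad plot and the cast", 0),
]

def generator(tokens):
    """Hypothetical degenerate generator: selects a single token as the rationale.

    It keys on the sentiment word but outputs an uninformative token instead.
    """
    return "the" if "good" in tokens else "a"

def label_given_token(pairs):
    """Empirical P(label = 1 | token present) for every token seen in `pairs`."""
    positives, totals = Counter(), Counter()
    for tokens, label in pairs:
        for tok in set(tokens):
            totals[tok] += 1
            positives[tok] += label
    return {tok: positives[tok] / totals[tok] for tok in totals}

# Full inputs versus the one-token rationales chosen by the generator.
full = [(text.split(), y) for text, y in DATA]
rationales = [([generator(tokens)], y) for tokens, y in full]

p_full = label_given_token(full)
p_rationale = label_given_token(rationales)

print("full input:", p_full["the"], p_full["a"])
print("rationales:", p_rationale["the"], p_rationale["a"])
```

In the full inputs, "the" and "a" each co-occur with the positive label half of the time. In the selected rationales, "the" appears only with positive labels and "a" only with negative ones, so a predictor trained on these rationales could reach high accuracy from tokens that are semantically unrelated to sentiment.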

Related papers

A Reverse Causal Framework to Mitigate Spurious Correlations for Debiasing Scene Graph Generation [59.473751744275496]
Scene Graph Generation (SGG) frameworks typically incorporate a detector to extract relationship features and a classifier to categorize these relationships. Such a causal chain structure can yield spurious correlations between the detector's inputs and the final predictions. We propose reconstructing the causal chain structure into a reverse causal structure, wherein the classifier's inputs are treated as the confounder.
arXiv Detail & Related papers (2025-05-29T13:57:01Z)
ShortcutProbe: Probing Prediction Shortcuts for Learning Robust Models [26.544938760265136]
Deep learning models inadvertently learn spurious correlations between targets and non-essential features. In this paper, we propose a novel post hoc spurious bias mitigation framework without requiring group labels. Our framework, termed ShortcutProbe, identifies prediction shortcuts that reflect potential non-robustness in predictions in a given model's latent space.
arXiv Detail & Related papers (2025-05-20T04:21:17Z)
Ranking and Combining Latent Structured Predictive Scores without Labeled Data [2.5064967708371553]
This paper introduces a novel structured unsupervised ensemble learning model (SUEL). It exploits the dependency between a set of predictors with continuous predictive scores, ranks the predictors without labeled data, and combines them into a weighted ensemble score. The efficacy of the proposed methods is rigorously assessed through both simulation studies and a real-world application to risk gene discovery.
arXiv Detail & Related papers (2024-08-14T20:14:42Z)
Learning Robust Classifiers with Self-Guided Spurious Correlation Mitigation [26.544938760265136]
Deep neural classifiers rely on spurious correlations between spurious attributes of inputs and targets to make predictions. We propose a self-guided spurious correlation mitigation framework. We show that training the classifier to distinguish different prediction behaviors reduces its reliance on spurious correlations without knowing them a priori.
arXiv Detail & Related papers (2024-05-06T17:12:21Z)
Enhancing the Rationale-Input Alignment for Self-explaining Rationalization [22.74436500022893]
We introduce a novel approach called DAR (Discriminatively Aligned Rationalization) to align the selected rationale and the original input. Experiments on two widely used real-world benchmarks show that the proposed method significantly improves the explanation quality.
arXiv Detail & Related papers (2023-12-07T07:37:15Z)
XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification. XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations. Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
Decoupled Rationalization with Asymmetric Learning Rates: A Flexible Lipschitz Restraint [16.54547887989801]
A self-explaining rationalization model is generally constructed by a cooperative game where a generator selects the most human-intelligible pieces from the input text as rationales, followed by a predictor that makes predictions based on the selected rationales. Such a cooperative game may incur the degeneration problem, where the predictor overfits to the uninformative pieces generated by a not yet well-trained generator and, in turn, leads the generator to converge to a sub-optimal model that tends to select senseless pieces. We empirically propose a simple but effective method named DR, which can naturally and flexibly restrain the Lipschitz constant of the
arXiv Detail & Related papers (2023-05-23T02:01:13Z)
An Evaluation Study of Generative Adversarial Networks for Collaborative Filtering [75.83628561622287]
This work successfully replicates the results published in the original paper and discusses the impact of certain differences between the CFGAN framework and the model used in the original evaluation. The work further expands the experimental analysis comparing CFGAN against a selection of simple and well-known properly optimized baselines, observing that CFGAN is not consistently competitive against them despite its high computational cost.
arXiv Detail & Related papers (2022-01-05T20:53:27Z)
Understanding Interlocking Dynamics of Cooperative Rationalization [90.6863969334526]
Selective rationalization explains the prediction of complex neural networks by finding a small subset of the input that is sufficient to predict the neural model output. We reveal a major problem with such a cooperative rationalization paradigm -- model interlocking. We propose a new rationalization framework, called A2R, which introduces a third component into the architecture, a predictor driven by soft attention as opposed to selection.
arXiv Detail & Related papers (2021-10-26T17:39:18Z)
Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task. The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator. We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
Decorrelated Clustering with Data Selection Bias [55.91842043124102]
We propose a novel Decorrelation regularized K-Means algorithm (DCKM) for clustering with data selection bias. Our DCKM algorithm achieves significant performance gains, indicating the necessity of removing unexpected feature correlations induced by selection bias.
arXiv Detail & Related papers (2020-06-29T08:55:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.