Debiasing Stance Detection Models with Counterfactual Reasoning and Adversarial Bias Learning
- URL: http://arxiv.org/abs/2212.10392v1
- Date: Tue, 20 Dec 2022 16:20:56 GMT
- Title: Debiasing Stance Detection Models with Counterfactual Reasoning and Adversarial Bias Learning
- Authors: Jianhua Yuan and Yanyan Zhao and Bing Qin
- Abstract summary: Stance detection models tend to rely on dataset bias in the text part as a shortcut.
We propose an adversarial bias learning module to model the bias more accurately.
- Score: 15.68462203989933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stance detection models tend to rely on dataset bias in the text part as
a shortcut and thus fail to sufficiently learn the interaction between the
targets and texts. Recent debiasing methods usually treat features learned by
small models, or by large models at early training steps, as bias features and
propose to exclude the branch that learns those bias features during inference.
However, most of these methods fail to disentangle the "good" stance features
from the "bad" bias features in the text part. In this paper, we investigate how
to mitigate dataset bias in stance detection. Motivated by causal effects, we
leverage a novel counterfactual inference framework that captures the dataset
bias in the text part as the direct causal effect of the text on stances and
removes this bias by subtracting the direct text effect from the total causal
effect. We further propose a novel view of bias features as features that
correlate with the stance labels but fail on intermediate stance-reasoning
subtasks, and introduce an adversarial bias learning module to model the bias
more accurately. To verify whether our model better captures the interaction
between texts and targets, we evaluate it on recently proposed test sets that
probe understanding of the task from various aspects. Experiments demonstrate
that our proposed method (1) models the bias features better and (2) outperforms
existing debiasing baselines on both the original dataset and most of the newly
constructed test sets.
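
To make the counterfactual inference step above concrete, here is a minimal sketch assuming a two-branch classifier: a fused target-plus-text branch and a text-only branch whose logits stand in for the direct text effect. The module names, the additive fusion, and the scaling factor `alpha` are illustrative assumptions rather than the paper's released implementation.

```python
import torch
import torch.nn as nn

class CounterfactualStanceClassifier(nn.Module):
    """Minimal sketch of counterfactual debiasing for stance detection.

    Training uses the total effect (fused target+text branch plus text-only branch).
    Inference subtracts the direct text effect (text-only logits) from the total
    effect, which is the debiasing step described in the abstract. All architectural
    details here are simplifying assumptions.
    """

    def __init__(self, hidden: int = 256, num_labels: int = 3, alpha: float = 1.0):
        super().__init__()
        self.fused_head = nn.Linear(2 * hidden, num_labels)  # sees target AND text
        self.text_head = nn.Linear(hidden, num_labels)       # sees text only (bias path)
        self.alpha = alpha

    def forward(self, target_vec, text_vec, debias: bool = False):
        fused_logits = self.fused_head(torch.cat([target_vec, text_vec], dim=-1))
        text_logits = self.text_head(text_vec)
        total_effect = fused_logits + text_logits             # factual prediction
        if not debias:
            return total_effect                               # used during training
        # Counterfactual inference: remove the direct text effect from the total effect.
        return total_effect - self.alpha * text_logits


# Toy usage with random encoder outputs standing in for real target/text encoders.
target_vec = torch.randn(4, 256)
text_vec = torch.randn(4, 256)
model = CounterfactualStanceClassifier()
debiased_logits = model(target_vec, text_vec, debias=True)
```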
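
The adversarial bias learning module is described only at a high level in the abstract, so the sketch below shows one standard way to realize it: a gradient reversal layer that keeps the bias features predictive of stance labels while penalizing their usefulness on an intermediate stance-reasoning subtask. The reversal mechanism, the heads, and the particular subtask are assumptions, not necessarily the authors' exact design.

```python
import torch
import torch.nn.functional as F
from torch.autograd import Function

class GradReverse(Function):
    """Identity in the forward pass; negates (and scales) gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd=1.0):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


def adversarial_bias_losses(bias_features, stance_head, subtask_head,
                            stance_labels, subtask_labels, lambd=1.0):
    """Bias features are trained to correlate with the stance labels, while the
    reversed gradient discourages them from solving the intermediate subtask,
    matching the abstract's characterization of bias features."""
    stance_loss = F.cross_entropy(stance_head(bias_features), stance_labels)
    reversed_features = GradReverse.apply(bias_features, lambd)
    subtask_loss = F.cross_entropy(subtask_head(reversed_features), subtask_labels)
    return stance_loss + subtask_loss
```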
Related papers
- Projective Methods for Mitigating Gender Bias in Pre-trained Language Models [10.418595661963062]
Projective methods are fast to implement, use a small number of saved parameters, and make no updates to the existing model parameters.
We find that projective methods can be effective at both intrinsic bias and downstream bias mitigation, but that the two outcomes are not necessarily correlated.
arXiv Detail & Related papers (2024-03-27T17:49:31Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models (a minimal sketch of this projection idea appears after this list).
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features, which prior debiasing methods still encode while neglecting the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- Mind Your Bias: A Critical Review of Bias Detection Methods for Contextual Language Models [2.170169149901781]
We conduct a rigorous analysis and comparison of bias detection methods for contextual language models.
Our results show that minor design and implementation decisions (or errors) have a substantial and often significant impact on the derived bias scores.
arXiv Detail & Related papers (2022-11-15T19:27:54Z)
- Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation [44.319739489968164]
Deep neural networks often take dataset biases as a shortcut to make decisions rather than understanding the task.
In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution.
We propose the training strategy Less-Learn-Shortcut (LLS), which quantifies how biased each example is and down-weights biased examples accordingly (a minimal sketch of such down-weighting appears after this list).
arXiv Detail & Related papers (2022-05-25T09:08:35Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Double Perturbation: On the Robustness of Robustness and Counterfactual Bias Evaluation [109.06060143938052]
We propose a "double perturbation" framework to uncover model weaknesses beyond the test dataset.
We apply this framework to study two perturbation-based approaches that are used to analyze models' robustness and counterfactual bias in English.
arXiv Detail & Related papers (2021-04-12T06:57:36Z)
- Towards Robustifying NLI Models Against Lexical Dataset Biases [94.79704960296108]
This paper explores both data-level and model-level debiasing methods to robustify models against lexical dataset biases.
First, we debias the dataset through data augmentation and enhancement, but show that the model bias cannot be fully removed via this method.
The second approach employs a bag-of-words sub-model to capture the features that are likely to exploit the bias and prevents the original model from learning these biased features.
arXiv Detail & Related papers (2020-05-10T17:56:10Z)
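
As referenced in the projection-based entries above (projective methods; debiasing via biased prompts), the sketch below removes a bias subspace from text embeddings by orthogonal projection. The embeddings, the way the bias direction is built, and the single-direction setup are illustrative assumptions rather than either paper's calibrated projection matrix.

```python
import numpy as np

def project_out(embeddings: np.ndarray, bias_directions: np.ndarray) -> np.ndarray:
    """Remove the span of `bias_directions` from each embedding.

    embeddings      : (n, d) array of text embeddings.
    bias_directions : (k, d) array of directions to project out (e.g., he - she).
    Returns debiased embeddings of shape (n, d).
    """
    # Orthonormalize the bias directions, then subtract each embedding's
    # component that lies in their span: x <- x - Q Q^T x.
    q, _ = np.linalg.qr(bias_directions.T)      # (d, k), orthonormal columns
    return embeddings - (embeddings @ q) @ q.T

# Toy usage: a single bias direction built from a word-pair difference.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
direction = (emb[0] - emb[1])[None, :]          # stand-in for a gender direction
debiased = project_out(emb, direction)
```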
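
And as referenced in the Less-Learn-Shortcut entry, a common form of example down-weighting scales each example's loss by a bias score; here the score is a stand-in taken from a bias-only model's confidence on the gold label. The exact scoring and weighting rules in LLS or the training-reweighting paper may differ.

```python
import torch
import torch.nn.functional as F

def down_weighted_loss(main_logits: torch.Tensor,
                       bias_logits: torch.Tensor,
                       labels: torch.Tensor) -> torch.Tensor:
    """Weight each example's cross-entropy by how little a bias-only model already
    explains it: examples the bias model gets right with high confidence receive
    smaller weights. The specific bias score is an illustrative assumption."""
    per_example_ce = F.cross_entropy(main_logits, labels, reduction="none")
    bias_prob_of_gold = F.softmax(bias_logits, dim=-1).gather(1, labels.unsqueeze(1)).squeeze(1)
    weights = 1.0 - bias_prob_of_gold.detach()   # down-weight bias-aligned examples
    return (weights * per_example_ce).mean()
```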