CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing
- URL: http://arxiv.org/abs/2305.02865v1
- Date: Thu, 4 May 2023 14:22:26 GMT
- Title: CausalAPM: Generalizable Literal Disentanglement for NLU Debiasing
- Authors: Songyang Gao, Shihan Dou, Junjie Shan, Qi Zhang, Xuanjing Huang
- Abstract summary: We analyze the causes of dataset bias from the perspective of causal inference.
We propose CausalAPM, a generalizable literal-disentangling framework that ameliorates the bias problem at the level of feature granularity.
- Score: 47.129713744669075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dataset bias, i.e., the over-reliance on dataset-specific literal heuristics,
is getting increasing attention for its detrimental effect on the
generalization ability of NLU models. Existing works focus on eliminating
dataset bias by down-weighting problematic data during training, an approach
that discards valid feature information while mitigating bias. In this work,
we analyze the causes of dataset bias from the perspective of causal inference
and propose CausalAPM, a generalizable literal-disentangling framework that
ameliorates the bias problem at the level of feature granularity. The proposed
approach projects literal and semantic information into independent feature
subspaces, and constrains the involvement of literal information in subsequent
predictions. Extensive experiments on three NLP benchmarks (MNLI, FEVER, and
QQP) demonstrate that our proposed framework significantly improves the OOD
generalization performance while maintaining ID performance.
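The abstract names the mechanism (two independent feature subspaces, prediction restricted to the semantic one) but not its implementation. Below is a minimal sketch of one way to realize that idea, assuming a PyTorch encoder output `h`; the module names, the linear projections, and the cross-correlation penalty standing in for an independence constraint are our illustrative choices, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LiteralSemanticDisentangler(nn.Module):
    """Sketch: project an encoder representation into a 'literal' and a
    'semantic' subspace, then predict from the semantic part only."""

    def __init__(self, hidden_dim: int, sub_dim: int, num_classes: int):
        super().__init__()
        self.to_literal = nn.Linear(hidden_dim, sub_dim)   # surface/lexical cues
        self.to_semantic = nn.Linear(hidden_dim, sub_dim)  # task semantics
        self.classifier = nn.Linear(sub_dim, num_classes)

    def forward(self, h: torch.Tensor):
        z_lit = self.to_literal(h)
        z_sem = self.to_semantic(h)
        logits = self.classifier(z_sem)  # literal features are held out of prediction
        return logits, z_lit, z_sem

def independence_penalty(z_lit: torch.Tensor, z_sem: torch.Tensor) -> torch.Tensor:
    """Crude stand-in for an independence constraint: penalize the
    cross-correlation between the two batch-normalized subspaces."""
    z_lit = (z_lit - z_lit.mean(0)) / (z_lit.std(0) + 1e-6)
    z_sem = (z_sem - z_sem.mean(0)) / (z_sem.std(0) + 1e-6)
    cross_corr = (z_lit.T @ z_sem) / z_lit.size(0)
    return cross_corr.pow(2).mean()

# Toy usage with random 'encoder' features.
model = LiteralSemanticDisentangler(hidden_dim=768, sub_dim=64, num_classes=3)
h = torch.randn(16, 768)
y = torch.randint(0, 3, (16,))
logits, z_lit, z_sem = model(h)
loss = F.cross_entropy(logits, y) + 0.1 * independence_penalty(z_lit, z_sem)
loss.backward()
```

The correlation penalty only removes linear dependence; the paper's actual constraint may be stronger (e.g., adversarial or mutual-information based).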
Related papers
- Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness [10.081447621656523]
The impact on language modeling ability can be alleviated given a high-quality and long-contextualized debiasing corpus.
The effectiveness of task-agnostic debiasing hinges on the quantitative bias level of both the task-specific data used for downstream applications and the debiased model.
We propose ProSocialTuning, a novel framework that propagates socially fair debiasing to downstream fine-tuning.
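The summary does not spell out the mechanism, so the following is only a guess at its general shape: an EWC-style penalty that keeps fine-tuned weights near a debiased checkpoint, weighted by per-parameter importance. Every name here is hypothetical and the actual ProSocialTuning loss may differ.

```python
import torch
import torch.nn as nn

def anti_forgetting_penalty(model, debiased_state, importance, lam=1.0):
    """EWC-style regularizer (our assumption, not the paper's stated loss):
    pull fine-tuned weights toward the debiased checkpoint, weighted by an
    importance estimate of each parameter for the debiased behavior."""
    penalty = torch.zeros(())
    for name, p in model.named_parameters():
        penalty = penalty + (importance[name] * (p - debiased_state[name]).pow(2)).sum()
    return lam * penalty

# Toy usage: uniform importance over a linear probe.
model = nn.Linear(4, 2)
debiased_state = {k: v.detach().clone() for k, v in model.named_parameters()}
importance = {k: torch.ones_like(v) for k, v in model.named_parameters()}
loss = anti_forgetting_penalty(model, debiased_state, importance)
loss.backward()
```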
arXiv Detail & Related papers (2024-06-06T15:11:11Z) - Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs).
Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations.
Our approach CIE not only significantly enhances the performance of GNNs but outperforms state-of-the-art debiased node classification methods.
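As a reading aid only: one common way to "estimate causal and spurious features and mitigate spurious correlations" is to split each node representation in two and sever the spurious half's tie to the label by shuffling it across the batch. The sketch below shows that generic recipe, not CIE's actual architecture; `adj @ x` stands in for a real GNN layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSpuriousSplit(nn.Module):
    """Split node representations into causal and spurious parts
    (illustrative module, not the CIE authors' code)."""
    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.causal = nn.Linear(in_dim, hid_dim)
        self.spurious = nn.Linear(in_dim, hid_dim)
        self.classifier = nn.Linear(2 * hid_dim, num_classes)

    def forward(self, x, adj):
        h = adj @ x  # stand-in for one GNN aggregation step
        return self.causal(h), self.spurious(h)

# Intervention by shuffling: pairing each causal feature with a random
# node's spurious feature breaks the spurious-feature/label correlation.
model = CausalSpuriousSplit(in_dim=16, hid_dim=32, num_classes=4)
x, adj = torch.randn(8, 16), torch.eye(8)
y = torch.randint(0, 4, (8,))
zc, zs = model(x, adj)
logits = model.classifier(torch.cat([zc, zs[torch.randperm(8)]], dim=-1))
loss = F.cross_entropy(logits, y)
loss.backward()
```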
arXiv Detail & Related papers (2023-10-14T13:56:24Z) - Robust Natural Language Understanding with Residual Attention Debiasing [28.53546504339952]
We propose READ, an end-to-end debiasing method that mitigates unintended biases arising from attention.
Experiments show that READ significantly improves the performance of BERT-based models on OOD data with shortcuts removed.
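The summary states READ's goal but not its mechanics. For context, the standard recipe in this line of work is a product-of-experts ensemble: the main model is trained through logits combined with a frozen bias branch so it stops re-learning what the bias branch already explains. The sketch below is that generic recipe, not necessarily READ's exact formulation.

```python
import torch
import torch.nn.functional as F

def poe_debias_loss(main_logits, bias_logits, labels):
    """Product-of-experts debiasing loss (generic recipe): summing
    log-probabilities multiplies the two distributions, and detaching the
    bias branch means only the main model receives gradients."""
    combined = F.log_softmax(main_logits, dim=-1) + F.log_softmax(bias_logits.detach(), dim=-1)
    return F.cross_entropy(combined, labels)

# Toy usage; at inference time only main_logits would be used.
main_logits = torch.randn(4, 3, requires_grad=True)
bias_logits = torch.randn(4, 3)
loss = poe_debias_loss(main_logits, bias_logits, torch.tensor([0, 1, 2, 0]))
loss.backward()
```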
arXiv Detail & Related papers (2023-05-28T04:25:04Z) - Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias, which prior methods neglect.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
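The summary gives only the name, so the sketch below shows one plausible shape of a "debiasing contrastive" objective: an InfoNCE loss whose positive for each anchor is the same-label sample with the most different bias score. The sampling rule and the `bias_score` input are our assumptions, not DCT's published procedure.

```python
import torch
import torch.nn.functional as F

def bias_aware_infonce(z, bias_score, labels, temp=0.1):
    """InfoNCE where each anchor's positive is the same-label sample whose
    bias score differs most from the anchor's (our guess at a 'debiasing
    contrastive' objective)."""
    z = F.normalize(z, dim=-1)
    sim = z @ z.T / temp
    n, loss = z.size(0), z.new_zeros(())
    for i in range(n):
        cand = (labels == labels[i]).nonzero().flatten()
        cand = cand[cand != i]
        if cand.numel() == 0:
            continue
        pos = cand[(bias_score[cand] - bias_score[i]).abs().argmax()]
        mask = torch.ones(n, dtype=torch.bool)
        mask[i] = False  # exclude self from the denominator
        loss = loss - sim[i, pos] + torch.logsumexp(sim[i][mask], dim=0)
    return loss / n

# Toy usage.
z = torch.randn(6, 8, requires_grad=True)
loss = bias_aware_infonce(z, torch.rand(6), torch.tensor([0, 0, 1, 1, 0, 1]))
loss.backward()
```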
arXiv Detail & Related papers (2022-12-11T06:16:14Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, in analogy to gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
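"Greedily ... like gradient descent in functional space" reads like boosting: each new stage is fit through the frozen sum of earlier stages' logits, so it only learns what the ensemble so far fails to explain. The training loop below illustrates that reading under our own simplifications; it is not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_stage(new_model, prev_models, data, epochs=3, lr=1e-2):
    """Fit one stage through the frozen ensemble-so-far, boosting-style
    (our simplified reading of greedy functional-space training)."""
    opt = torch.optim.SGD(new_model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in data:
            with torch.no_grad():
                prev = sum(m(x) for m in prev_models) if prev_models else 0.0
            loss = F.cross_entropy(prev + new_model(x), y)
            opt.zero_grad(); loss.backward(); opt.step()

# Toy usage: a biased stage first, then the base model on what remains.
data = [(torch.randn(8, 4), torch.randint(0, 3, (8,))) for _ in range(5)]
biased, base = nn.Linear(4, 3), nn.Linear(4, 3)
train_stage(biased, [], data)
train_stage(base, [biased], data)
```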
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Learning Bias-Invariant Representation by Cross-Sample Mutual
Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
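A "cross-sample neural mutual information estimator" is typically a Donsker-Varadhan (MINE-style) network scored on paired versus shuffled batches. A generic version follows, with the architecture and sizes chosen by us rather than taken from CSAD.

```python
import torch
import torch.nn as nn

class NeuralMI(nn.Module):
    """Donsker-Varadhan lower bound on I(X;Y) (MINE-style, generic):
    I(X;Y) >= E_joint[T(x,y)] - log E_marginal[exp(T(x,y'))],
    where the marginal term shuffles y across samples."""
    def __init__(self, dx, dy, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dx + dy, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, x, y):
        joint = self.net(torch.cat([x, y], dim=-1)).mean()
        y_shuffled = y[torch.randperm(y.size(0))]  # the cross-sample trick
        marginal = self.net(torch.cat([x, y_shuffled], dim=-1)).exp().mean().log()
        return joint - marginal

# Debiasing would minimize this estimate between task features and bias
# features; here we just evaluate it on random data.
mi = NeuralMI(dx=8, dy=8)
print(mi(torch.randn(32, 8), torch.randn(32, 8)).item())
```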
arXiv Detail & Related papers (2021-08-11T21:17:02Z) - Regularizing Models via Pointwise Mutual Information for Named Entity
Recognition [17.767466724342064]
We propose a Pointwise Mutual Information (PMI)-based regularizer that enhances generalization ability while also improving in-domain performance.
Our approach debiases highly correlated words and labels in the benchmark datasets.
For long-named and complex-structured entities, our method can still predict them correctly by debiasing conjunctions and special characters.
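PMI between a word and a label is a short formula, PMI(w, y) = log p(w, y) / (p(w) p(y)), with large positive values flagging word-label shortcuts. The counting code below makes this concrete; it is illustrative only, not the paper's regularizer.

```python
import math
from collections import Counter

def pmi_table(token_label_pairs):
    """PMI(w, y) = log( p(w, y) / (p(w) * p(y)) ); large positive values
    flag word-label shortcuts that a regularizer could down-weight."""
    joint = Counter(token_label_pairs)
    words = Counter(w for w, _ in token_label_pairs)
    labels = Counter(y for _, y in token_label_pairs)
    n = len(token_label_pairs)
    return {(w, y): math.log((c / n) / ((words[w] / n) * (labels[y] / n)))
            for (w, y), c in joint.items()}

# Hypothetical NER counts: 'St.' co-occurs mostly with LOC.
pairs = [("St.", "LOC"), ("St.", "LOC"), ("St.", "PER"), ("John", "PER")]
print(pmi_table(pairs))  # ('St.', 'LOC') gets a positive PMI
```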
arXiv Detail & Related papers (2021-04-15T05:47:27Z) - Adversarial Filters of Dataset Biases [96.090959788952]
Large neural models have demonstrated human-level performance on language and vision benchmarks.
Their performance degrades considerably on adversarial or out-of-distribution samples.
We propose AFLite, which adversarially filters such dataset biases.
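AFLite scores each instance by how often cheap probes classify it correctly across random train/held-out splits, then filters the most predictable ones. Here is a simplified scorer along those lines, with our own parameter choices and scikit-learn probes; the paper's exact procedure differs in details such as ensembling and iterative refiltering.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def predictability_scores(X, y, n_rounds=20, train_frac=0.8, seed=0):
    """AFLite-style scoring (simplified): train linear probes on random
    splits and record how often each held-out instance is classified
    correctly; high scores mark 'easy', likely bias-carrying examples."""
    rng = np.random.default_rng(seed)
    hits, seen = np.zeros(len(y)), np.zeros(len(y))
    for _ in range(n_rounds):
        idx = rng.permutation(len(y))
        cut = int(train_frac * len(y))
        tr, te = idx[:cut], idx[cut:]
        probe = LogisticRegression(max_iter=200).fit(X[tr], y[tr])
        hits[te] += probe.predict(X[te]) == y[te]
        seen[te] += 1
    return hits / np.maximum(seen, 1)

# Toy usage: drop the top-scoring (most predictable) tenth of the data.
X = np.random.default_rng(1).normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)
scores = predictability_scores(X, y)
keep = scores < np.quantile(scores, 0.9)
```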
arXiv Detail & Related papers (2020-02-10T21:59:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.