Improving the Out-Of-Distribution Generalization Capability of Language
Models: Counterfactually-Augmented Data is not Enough
- URL: http://arxiv.org/abs/2302.09345v1
- Date: Sat, 18 Feb 2023 14:39:03 GMT
- Title: Improving the Out-Of-Distribution Generalization Capability of Language
Models: Counterfactually-Augmented Data is not Enough
- Authors: Caoyun Fan, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, Yaohui Jin
- Abstract summary: Counterfactually-Augmented Data (CAD) has the potential to improve language models' Out-Of-Distribution (OOD) generalization capability.
In this paper, we attribute the inefficiency to the Myopia Phenomenon caused by CAD.
We design two additional constraints to help language models extract more complete causal features contained in CAD.
- Score: 19.38778317110205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Counterfactually-Augmented Data (CAD) has the potential to improve language
models' Out-Of-Distribution (OOD) generalization capability, as CAD induces
language models to exploit causal features and exclude spurious correlations.
However, the empirical OOD generalization results obtained with CAD are not as
strong as expected. In this paper, we attribute this inefficiency to the Myopia
Phenomenon caused by CAD: language models focus only on the causal features that
are edited in the augmentation and exclude other, non-edited causal features. As
a result, the potential of CAD is not fully exploited. Based on the structural
properties of CAD, we design two additional constraints to help language models
extract more complete causal features contained in CAD, thus improving the OOD
generalization capability. We evaluate our method on two tasks: Sentiment
Analysis and Natural Language Inference, and the experimental results
demonstrate that our method could unlock CAD's potential and improve language
models' OOD generalization capability.
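The abstract does not spell out the two constraints, so the following is only a rough sketch of how CAD's pairing structure (each original example comes with a minimally edited, label-flipping counterfactual) could be turned into auxiliary loss terms. The batch keys, the model interface, and both constraint terms are illustrative assumptions, not the authors' actual method.
```python
import torch
import torch.nn.functional as F

def cad_objective(model, batch, alpha=0.1, beta=0.1, margin=1.0):
    """Hypothetical pair-aware training objective for CAD.

    Assumed batch keys: 'orig_ids', 'cf_ids', 'orig_label', 'cf_label'.
    `model` is assumed to return (pooled_representation, class_logits).
    """
    h_o, logits_o = model(batch["orig_ids"])
    h_c, logits_c = model(batch["cf_ids"])

    # Standard task loss on both halves of every counterfactual pair.
    task = F.cross_entropy(logits_o, batch["orig_label"]) \
         + F.cross_entropy(logits_c, batch["cf_label"])

    # Assumed constraint A: examples that share a label, even across different
    # pairs, should have similar representations, so the model cannot rely
    # solely on the span edited within its own pair.
    h = torch.cat([h_o, h_c], dim=0)
    y = torch.cat([batch["orig_label"], batch["cf_label"]], dim=0)
    same = (y.unsqueeze(0) == y.unsqueeze(1)).float()
    align = (same * torch.cdist(h, h)).sum() / same.sum().clamp(min=1)

    # Assumed constraint B: an original and its counterfactual carry different
    # labels, so their representations should stay at least `margin` apart.
    separate = F.relu(margin - (h_o - h_c).norm(dim=-1)).mean()

    return task + alpha * align + beta * separate
```
In words, the sketch keeps the usual cross-entropy objective and adds two pair-aware regularizers so that non-edited but label-relevant features also shape the representation; the constraints actually proposed in the paper are derived from CAD's structural properties and may differ.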
Related papers
- PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning [49.60634126342945]
Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes.
Recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information.
We employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues.
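As a hedged illustration of the contrastive idea described above (not PairCFR's exact formulation), a supervised InfoNCE-style term can pull together same-label examples from across the whole batch rather than only within a counterfactual pair; the function name and the temperature below are assumptions.
```python
import torch
import torch.nn.functional as F

def supervised_contrastive(z, labels, temperature=0.1):
    """Pull same-label examples together in embedding space (global alignment)."""
    z = F.normalize(z, dim=-1)                       # (batch, dim)
    sim = z @ z.t() / temperature                    # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    # Exclude self-similarity from the softmax denominator.
    log_prob = F.log_softmax(sim.masked_fill(self_mask, -1e9), dim=1)
    # Average log-probability assigned to same-label (positive) examples.
    loss = -(log_prob * pos_mask.float()).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()
```
Such a term would typically be added to the standard cross-entropy loss with a small weight.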
arXiv Detail & Related papers (2024-06-09T07:29:55Z)
- Zero-Shot Cross-Lingual Sentiment Classification under Distribution Shift: an Exploratory Study [11.299638372051795]
We study generalization to out-of-distribution (OOD) test data specifically in zero-shot cross-lingual transfer settings.
We analyze performance impacts of both language and domain shifts between train and test data.
We propose two new approaches for OOD generalization that avoid the costly annotation process.
arXiv Detail & Related papers (2023-11-11T11:56:56Z)
- Unlock the Potential of Counterfactually-Augmented Data in Out-Of-Distribution Generalization [25.36416774024584]
Counterfactually-Augmented Data (CAD) has the potential to improve the Out-Of-Distribution (OOD) generalization capability of language models.
In this study, we attribute the inefficiency to the myopia phenomenon caused by CAD.
We introduce two additional constraints based on CAD's structural properties to help language models extract more complete causal features in CAD.
arXiv Detail & Related papers (2023-10-10T14:41:38Z)
- L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models [102.00201523306986]
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs).
We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods.
In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
arXiv Detail & Related papers (2023-09-29T17:57:00Z)
- How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study [59.13867562744973]
This work systematically assesses LMs' capabilities for out-of-distribution (OOD) scenarios.
We find that the efficacy of such learning paradigms varies with the type of OOD.
Specifically, while ICL excels for domain shifts, prompt-based fine-tuning performs better for topic shifts.
arXiv Detail & Related papers (2023-09-15T11:15:47Z)
- AutoCAD: Automatically Generating Counterfactuals for Mitigating Shortcut Learning [70.70393006697383]
In this paper, we present AutoCAD, a fully automatic and task-agnostic CAD generation framework.
arXiv Detail & Related papers (2022-11-29T13:39:53Z)
- Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection [35.29235215101502]
Over-relying on core features may lead to unintended model bias.
We test models for sexism and hate speech detection on challenging data.
Using a diverse set of CAD -- construct-driven and construct-agnostic -- reduces such unintended bias.
arXiv Detail & Related papers (2022-05-09T12:39:26Z)
- Examining Scaling and Transfer of Language Model Architectures for Machine Translation [51.69212730675345]
Language models (LMs) process sequences in a single stack of layers, and encoder-decoder models (EncDec) utilize separate layer stacks for input and output processing.
In machine translation, EncDec has long been the favoured approach, but few studies have investigated the performance of LMs.
arXiv Detail & Related papers (2022-02-01T16:20:15Z)
- How Does Counterfactually Augmented Data Impact Models for Social Computing Constructs? [35.29235215101502]
We investigate the benefits of counterfactually augmented data (CAD) for social NLP models by focusing on three social computing constructs -- sentiment, sexism, and hate speech.
We find that while models trained on CAD show lower in-domain performance, they generalize better out-of-domain.
arXiv Detail & Related papers (2021-09-14T23:46:39Z)
- An Investigation of the (In)effectiveness of Counterfactually Augmented Data [10.316235366821111]
We show that while counterfactually-augmented data (CAD) is effective at identifying robust features, it may prevent the model from learning unperturbed robust features.
Our results show that the lack of perturbation diversity in current CAD datasets limits its effectiveness on OOD generalization.
arXiv Detail & Related papers (2021-07-01T21:46:43Z)
- Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.