Unlock the Potential of Counterfactually-Augmented Data in
Out-Of-Distribution Generalization
- URL: http://arxiv.org/abs/2310.06666v1
- Date: Tue, 10 Oct 2023 14:41:38 GMT
- Title: Unlock the Potential of Counterfactually-Augmented Data in
Out-Of-Distribution Generalization
- Authors: Caoyun Fan, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, Yaohui Jin
- Abstract summary: Counterfactually-Augmented Data (CAD) has the potential to improve the Out-Of-Distribution (OOD) generalization capability of language models.
In this study, we attribute the inefficiency to the myopia phenomenon caused by CAD.
We introduce two additional constraints based on CAD's structural properties to help language models extract more complete causal features in CAD.
- Score: 25.36416774024584
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Counterfactually-Augmented Data (CAD) -- minimal editing of sentences to flip
the corresponding labels -- has the potential to improve the
Out-Of-Distribution (OOD) generalization capability of language models, as CAD
induces language models to exploit domain-independent causal features and
exclude spurious correlations. However, the empirical results of CAD's OOD
generalization are not as efficient as anticipated. In this study, we attribute
the inefficiency to the myopia phenomenon caused by CAD: language models only
focus on causal features that are edited in the augmentation operation and
exclude other non-edited causal features. Therefore, the potential of CAD is
not fully exploited. To address this issue, we analyze the myopia phenomenon in
feature space from the perspective of Fisher's Linear Discriminant, then we
introduce two additional constraints based on CAD's structural properties
(dataset-level and sentence-level) to help language models extract more
complete causal features in CAD, thereby mitigating the myopia phenomenon and
improving OOD generalization capability. We evaluate our method on two tasks:
Sentiment Analysis and Natural Language Inference, and the experimental results
demonstrate that our method could unlock the potential of CAD and improve the
OOD generalization performance of language models by 1.0% to 5.9%.
Related papers
- Profiling Patient Transcript Using Large Language Model Reasoning Augmentation for Alzheimer's Disease Detection [4.961581278723015]
Alzheimer's disease (AD) stands as the predominant cause of dementia, characterized by a gradual decline in speech and language capabilities.
Recent deep-learning advancements have facilitated automated AD detection through spontaneous speech.
Common transcript-based detection methods directly model text patterns in each utterance without a global view of the patient's linguistic characteristics.
arXiv Detail & Related papers (2024-09-19T07:58:07Z) - PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning [49.60634126342945]
Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes.
Recent research reveals that training with CAD may lead models to overly focus on modified features while ignoring other important contextual information.
We employ contrastive learning to promote global feature alignment in addition to learning counterfactual clues.
arXiv Detail & Related papers (2024-06-09T07:29:55Z) - L2CEval: Evaluating Language-to-Code Generation Capabilities of Large
Language Models [102.00201523306986]
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs)
We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods.
In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
arXiv Detail & Related papers (2023-09-29T17:57:00Z) - Adaptive Contextual Perception: How to Generalize to New Backgrounds and
Ambiguous Objects [75.15563723169234]
We investigate how vision models adaptively use context for out-of-distribution generalization.
We show that models that excel in one setting tend to struggle in the other.
To replicate the generalization abilities of biological vision, computer vision models must have factorized object vs. background representations.
arXiv Detail & Related papers (2023-06-09T15:29:54Z) - Trusting Your Evidence: Hallucinate Less with Context-aware Decoding [91.91468712398385]
Language models (LMs) often struggle to pay enough attention to the input context, and generate texts that are unfaithful or contain hallucinations.
We present context-aware decoding (CAD), which follows a contrastive output distribution that amplifies the difference between the probabilities when a model is used with and without context.
arXiv Detail & Related papers (2023-05-24T05:19:15Z) - Improving the Out-Of-Distribution Generalization Capability of Language
Models: Counterfactually-Augmented Data is not Enough [19.38778317110205]
Counterfactually-Augmented Data (CAD) has the potential to improve language models' Out-Of-Distribution (OOD) generalization capability.
In this paper, we attribute the inefficiency to Myopia Phenomenon caused by CAD.
We design two additional constraints to help language models extract more complete causal features contained in CAD.
arXiv Detail & Related papers (2023-02-18T14:39:03Z) - CausalDialogue: Modeling Utterance-level Causality in Conversations [83.03604651485327]
We have compiled and expanded upon a new dataset called CausalDialogue through crowd-sourcing.
This dataset includes multiple cause-effect pairs within a directed acyclic graph (DAG) structure.
We propose a causality-enhanced method called Exponential Average Treatment Effect (ExMATE) to enhance the impact of causality at the utterance level in training neural conversation models.
arXiv Detail & Related papers (2022-12-20T18:31:50Z) - Counterfactually Augmented Data and Unintended Bias: The Case of Sexism
and Hate Speech Detection [35.29235215101502]
Over-relying on core features may lead to unintended model bias.
We test models for sexism and hate speech detection on challenging data.
Using a diverse set of CAD -- construct-driven and construct-agnostic -- reduces such unintended bias.
arXiv Detail & Related papers (2022-05-09T12:39:26Z) - Improving Classifier Training Efficiency for Automatic Cyberbullying
Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z) - How Does Counterfactually Augmented Data Impact Models for Social
Computing Constructs? [35.29235215101502]
We investigate the benefits of counterfactually augmented data (CAD) for social NLP models by focusing on three social computing constructs -- sentiment, sexism, and hate speech.
We find that while models trained on CAD show lower in-domain performance, they generalize better out-of-domain.
arXiv Detail & Related papers (2021-09-14T23:46:39Z) - An Investigation of the (In)effectiveness of Counterfactually Augmented
Data [10.316235366821111]
We show that while counterfactually-augmented data (CAD) is effective at identifying robust features, it may prevent the model from learning unperturbed robust features.
Our results show that the lack of perturbation diversity in current CAD datasets limits its effectiveness on OOD generalization.
arXiv Detail & Related papers (2021-07-01T21:46:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.