Generative Data Augmentation for Commonsense Reasoning
- URL: http://arxiv.org/abs/2004.11546v3
- Date: Tue, 17 Nov 2020 04:37:31 GMT
- Title: Generative Data Augmentation for Commonsense Reasoning
- Authors: Yiben Yang, Chaitanya Malaviya, Jared Fernandez, Swabha Swayamdipta,
Ronan Le Bras, Ji-Ping Wang, Chandra Bhagavatula, Yejin Choi, Doug Downey
- Abstract summary: G-DAUG^C is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUG^C consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUG^C produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
- Score: 75.26876609249197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in commonsense reasoning depend on large-scale
human-annotated training data to achieve peak performance. However, manual
curation of training examples is expensive and has been shown to introduce
annotation artifacts that neural models can readily exploit and overfit on. We
investigate G-DAUG^C, a novel generative data augmentation method that aims to
achieve more accurate and robust learning in the low-resource setting. Our
approach generates synthetic examples using pretrained language models, and
selects the most informative and diverse set of examples for data augmentation.
In experiments with multiple commonsense reasoning benchmarks, G-DAUG^C
consistently outperforms existing data augmentation methods based on
back-translation, and establishes a new state-of-the-art on WinoGrande, CODAH,
and CommonsenseQA. Further, in addition to improvements in in-distribution
accuracy, G-DAUG^C-augmented training also enhances out-of-distribution
generalization, showing greater robustness against adversarial or perturbed
examples. Our analysis demonstrates that G-DAUG^C produces a diverse set of
fluent training examples, and that its selection and training approaches are
important for performance. Our findings encourage future research toward
generative data augmentation to enhance both in-distribution learning and
out-of-distribution generalization.
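In spirit, the method is a generate-then-select pipeline: a pretrained language model proposes synthetic examples, and a filtering step keeps an informative and diverse subset for augmentation. The short Python sketch below illustrates only that overall shape; the candidate pool, score_informativeness, and embed functions are illustrative stubs (random proxies so the sketch runs end to end), not the authors' generation or selection code.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pool of LM-generated candidates (in practice: text produced by
# a pretrained language model conditioned on the task format).
candidates = [f"synthetic example {i}" for i in range(200)]

def score_informativeness(example):
    # Placeholder: a real pipeline might use a task model's loss or
    # confidence; a random proxy keeps the sketch self-contained.
    return float(rng.random())

def embed(example):
    # Placeholder sentence embedding; a real pipeline would use an encoder.
    return rng.random(16)

# 1) Keep the candidates ranked as most informative.
ranked = sorted(candidates, key=score_informativeness, reverse=True)[:100]

# 2) Greedy max-min selection for diversity in embedding space.
embs = np.stack([embed(x) for x in ranked])
selected = [0]
while len(selected) < 32:
    dists = np.linalg.norm(embs[:, None, :] - embs[None, selected, :], axis=-1)
    selected.append(int(dists.min(axis=1).argmax()))

augmented_training_set = [ranked[i] for i in selected]
print(len(augmented_training_set), "synthetic examples selected")

In an actual run, the candidates would come from a fine-tuned language model and the scores from trained task models, matching the abstract's description of selecting the most informative and diverse examples.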
Related papers
- Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders [63.28408887247742]
We study whether training procedures can be improved to yield better generalization capabilities in the resulting models.
We recommend a simple recipe for training dense encoders: train on MSMARCO with parameter-efficient methods such as LoRA, and use in-batch negatives unless well-constructed hard negatives are available.
arXiv Detail & Related papers (2023-11-16T10:42:58Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods assess their adapted models only on the target training set, neglecting unseen but identically distributed test sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- Implicit Counterfactual Data Augmentation for Robust Learning [24.795542869249154]
This study proposes an Implicit Counterfactual Data Augmentation method to remove spurious correlations and make stable predictions.
Experiments have been conducted across various biased learning scenarios covering both image and text datasets.
arXiv Detail & Related papers (2023-04-26T10:36:40Z)
- A Guide for Practical Use of ADMG Causal Data Augmentation [0.0]
Causal data augmentation strategies have been proposed as a way to address these challenges.
This paper experimentally analyzes the ADMG causal augmentation method under different settings.
arXiv Detail & Related papers (2023-04-03T09:31:13Z)
- Instance-Conditioned GAN Data Augmentation for Representation Learning [29.36473147430433]
We introduce DA_IC-GAN, a learnable data augmentation module that can be used off-the-shelf in conjunction with most state-of-the-art training recipes.
We show that DA_IC-GAN can boost accuracy by 1 to 2 percentage points with the highest-capacity models.
We additionally couple DA_IC-GAN with a self-supervised training recipe and show an improvement of 1 percentage point in accuracy in some settings.
arXiv Detail & Related papers (2023-03-16T22:45:43Z)
- An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation [91.62129090006745]
This paper studies the distribution shift problem from the perspective of pre-training and data augmentation.
We provide the first comprehensive empirical study focusing on pre-training and data augmentation.
arXiv Detail & Related papers (2022-05-25T13:04:53Z)
- Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
- Generalization in Reinforcement Learning by Soft Data Augmentation [11.752595047069505]
SOft Data Augmentation (SODA) is a method that decouples augmentation from policy learning.
We find SODA to significantly advance sample efficiency, generalization, and stability in training over state-of-the-art vision-based RL methods.
arXiv Detail & Related papers (2020-11-26T17:00:34Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that, compared to data augmentation, feature averaging reduces generalization error when used with convex losses and tightens PAC-Bayes bounds (a minimal feature-averaging sketch follows this list).
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
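As a companion to the invariance entry above, the following sketch shows what feature averaging over augmented views looks like in practice: features of several augmented copies of an input are averaged before a single linear classifier is applied. It is a minimal illustration with assumed pieces; the augment and encode functions and all weights are random stand-ins, not code from the cited paper.

import numpy as np

rng = np.random.default_rng(0)
W_feat = rng.standard_normal((16, 32))  # stand-in "encoder" weights
w_clf = rng.standard_normal(16)         # stand-in linear classifier weights

def augment(x):
    # Stand-in augmentation: small additive noise.
    return x + 0.1 * rng.standard_normal(x.shape)

def encode(x):
    # Stand-in feature extractor: a fixed random linear map.
    return W_feat @ x

x = rng.standard_normal(32)  # one input example

# Average features over k augmented views, then apply the classifier once.
k = 8
features = np.stack([encode(augment(x)) for _ in range(k)])
avg_feature = features.mean(axis=0)
score = float(w_clf @ avg_feature)
print("score from averaged features:", score)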
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.