SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving
Out-of-Domain Robustness
- URL: http://arxiv.org/abs/2009.10195v2
- Date: Sun, 4 Oct 2020 22:47:00 GMT
- Title: SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving
Out-of-Domain Robustness
- Authors: Nathan Ng, Kyunghyun Cho, Marzyeh Ghassemi
- Abstract summary: In natural language, it is difficult to generate new examples that stay on the underlying data manifold.
We introduce SSMBA, a data augmentation method for generating synthetic training examples.
In experiments on benchmarks across 3 tasks and 9 datasets, SSMBA consistently outperforms existing data augmentation methods.
- Score: 66.37077266814822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Models that perform well on a training domain often fail to generalize to
out-of-domain (OOD) examples. Data augmentation is a common method used to
prevent overfitting and improve OOD generalization. However, in natural
language, it is difficult to generate new examples that stay on the underlying
data manifold. We introduce SSMBA, a data augmentation method for generating
synthetic training examples by using a pair of corruption and reconstruction
functions to move randomly on a data manifold. We investigate the use of SSMBA
in the natural language domain, leveraging the manifold assumption to
reconstruct corrupted text with masked language models. In experiments on
robustness benchmarks across 3 tasks and 9 datasets, SSMBA consistently
outperforms existing data augmentation methods and baseline models on both
in-domain and OOD data, achieving gains of 0.8% accuracy on OOD Amazon reviews,
1.8% accuracy on OOD MNLI, and 1.4 BLEU on in-domain IWSLT14 German-English.
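The corrupt-and-reconstruct procedure described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it masks a random subset of tokens (the corruption function) and then fills each masked position back in. A real SSMBA setup would sample reconstructions from a masked language model such as BERT; here a placeholder that samples from the sentence's own vocabulary stands in for the MLM so the sketch stays self-contained.

```python
import random

MASK = "<mask>"

def corrupt(tokens, p=0.15, rng=None):
    # Corruption function q(x'|x): mask each token independently with prob p.
    rng = rng or random.Random(0)
    return [MASK if rng.random() < p else t for t in tokens]

def reconstruct(tokens, vocab):
    # Placeholder reconstruction function r(x_hat|x'): a real SSMBA setup
    # would sample each masked position from a masked language model.
    rng = random.Random(1)
    return [rng.choice(vocab) if t == MASK else t for t in tokens]

def ssmba_augment(sentence, n_samples=4, p=0.15):
    # Generate n_samples synthetic neighbors of `sentence` on the data
    # manifold by corrupting and then reconstructing it.
    tokens = sentence.split()
    vocab = sorted(set(tokens))  # toy vocabulary for the placeholder
    out = []
    for i in range(n_samples):
        corrupted = corrupt(tokens, p=p, rng=random.Random(i))
        out.append(" ".join(reconstruct(corrupted, vocab)))
    return out

augmented = ssmba_augment("the movie was surprisingly good", n_samples=3)
for s in augmented:
    print(s)
```

In the paper's usage, the original label is carried over to each augmented example (or assigned by a teacher model), and the augmented pairs are simply added to the training set.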
Related papers
- Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data [39.40116554523575]
We present Drift-Resilient TabPFN, a fresh approach based on In-Context Learning with a Prior-Data Fitted Network.
It learns to approximate Bayesian inference on synthetic datasets drawn from a prior.
It improves accuracy from 0.688 to 0.744 and ROC AUC from 0.786 to 0.832 while maintaining stronger calibration.
arXiv Detail & Related papers (2024-11-15T23:49:23Z)
- SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
By further elaborating the robustness metric, a model is judged robust only if its performance is consistently accurate across all cliques.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method.
We study total variation distance (TVD) as an alternative and develop practical bounds to apply it to language generation.
We introduce the TaiLr objective, which balances the tradeoff in estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z)
- GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective [36.24251509242988]
This paper presents the first attempt at creating a unified benchmark named GLUE-X for evaluating OOD robustness in NLP models.
Evaluations are conducted on 8 classic NLP tasks over 21 popularly used PLMs, including GPT-3 and GPT-3.5.
arXiv Detail & Related papers (2022-11-15T11:53:55Z)
- Training OOD Detectors in their Natural Habitats [31.565635192716712]
Out-of-distribution (OOD) detection is important for machine learning models deployed in the wild.
Recent methods use auxiliary outlier data to regularize the model for improved OOD detection.
We propose a novel framework that leverages wild mixture data, which naturally consists of both ID and OOD samples.
arXiv Detail & Related papers (2022-02-07T15:38:39Z)
- Enhancing the Generalization for Intent Classification and Out-of-Domain Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU).
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z)
- SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.