A Simple but Tough-to-Beat Data Augmentation Approach for Natural
Language Understanding and Generation
- URL: http://arxiv.org/abs/2009.13818v2
- Date: Fri, 23 Oct 2020 03:19:58 GMT
- Title: A Simple but Tough-to-Beat Data Augmentation Approach for Natural
Language Understanding and Generation
- Authors: Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- Abstract summary: We introduce a set of simple yet effective data augmentation strategies dubbed cutoff.
cutoff relies merely on stochastic sampling and thus adds little computational overhead.
cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
- Score: 53.8171136907856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training has been shown effective at endowing the learned
representations with stronger generalization ability. However, it typically
requires expensive computation to determine the direction of the injected
perturbations. In this paper, we introduce a set of simple yet effective data
augmentation strategies dubbed cutoff, where part of the information within an
input sentence is erased to yield its restricted views (during the fine-tuning
stage). Notably, this process relies merely on stochastic sampling and thus
adds little computational overhead. A Jensen-Shannon Divergence consistency
loss is further utilized to incorporate these augmented samples into the
training objective in a principled manner. To verify the effectiveness of the
proposed strategies, we apply cutoff to both natural language understanding and
generation problems. On the GLUE benchmark, it is demonstrated that cutoff, in
spite of its simplicity, performs on par with or better than several competitive
adversarial-based approaches. We further extend cutoff to machine translation
and observe significant gains in BLEU scores (based upon the Transformer Base
model). Moreover, cutoff consistently outperforms adversarial training and
achieves state-of-the-art results on the IWSLT2014 German-English dataset.
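
The two ingredients named in the abstract, erasing part of an input to obtain restricted views and tying the views together with a Jensen-Shannon divergence consistency term, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names `token_cutoff`, `span_cutoff`, and `js_divergence` are hypothetical, erasure is shown as replacing tokens with a `[PAD]` placeholder, and the erase ratio is an arbitrary choice.

```python
import math
import random


def token_cutoff(tokens, ratio, rng, pad="[PAD]"):
    """Restricted view: replace a random subset of tokens with a placeholder."""
    n_erase = int(len(tokens) * ratio)
    erased = set(rng.sample(range(len(tokens)), n_erase))
    return [pad if i in erased else tok for i, tok in enumerate(tokens)]


def span_cutoff(tokens, ratio, rng, pad="[PAD]"):
    """Restricted view: replace one contiguous span of tokens with a placeholder."""
    n_erase = int(len(tokens) * ratio)
    start = rng.randint(0, len(tokens) - n_erase)
    return [pad if start <= i < start + n_erase else tok
            for i, tok in enumerate(tokens)]


def kl(p, q):
    """KL divergence KL(p || q) for discrete distributions (terms with p_i = 0 are skipped)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(dists):
    """Generalized Jensen-Shannon divergence: mean KL from each distribution to their average."""
    n = len(dists)
    mean = [sum(col) / n for col in zip(*dists)]
    return sum(kl(p, mean) for p in dists) / n


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "the quick brown fox jumps over the lazy dog today".split()
    print(token_cutoff(sentence, 0.3, rng))
    print(span_cutoff(sentence, 0.3, rng))
    # Hypothetical class probabilities predicted for the original input and two views.
    preds = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.65, 0.2, 0.15]]
    print(js_divergence(preds))
```

In a training loop, a term like `js_divergence` would typically be added to the ordinary task loss with a weighting coefficient, so that predictions on the original input and on its restricted views are pushed to agree. The weighting and the way views are batched are design choices that the abstract does not specify.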
Related papers
- Enhancing In-Context Learning via Implicit Demonstration Augmentation [26.78252788538567]
In-context learning (ICL) enables pre-trained language models to make predictions for unseen inputs without updating parameters.
Despite its potential, ICL's effectiveness heavily relies on the quality, quantity, and permutation of demonstrations.
In this paper, we tackle this challenge for the first time from the perspective of demonstration augmentation.
arXiv Detail & Related papers (2024-06-27T05:25:46Z)
- Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning [28.059563581973432]
Large Language Models (LLMs) often memorize sensitive, private, or copyrighted data during pre-training.
LLM unlearning aims to eliminate the influence of undesirable data from the pre-trained model.
We propose Negative Preference Optimization (NPO) as a simple alignment-inspired method that could efficiently unlearn a target dataset.
arXiv Detail & Related papers (2024-04-08T21:05:42Z)
- Dynamic Transformers Provide a False Sense of Efficiency [75.39702559746533]
Multi-exit models make a trade-off between efficiency and accuracy, where the saving of computation comes from an early exit.
We propose a simple yet effective attacking framework, SAME, which is specially tailored to reduce the efficiency of the multi-exit models.
Experiments on the GLUE benchmark show that SAME can effectively diminish the efficiency gain of various multi-exit models by 80% on average.
arXiv Detail & Related papers (2023-05-20T16:41:48Z)
- Adversarial Style Augmentation for Domain Generalization [41.72506801753435]
We introduce a novel Adversarial Style Augmentation (ASA) method, which explores broader style spaces by generating more effective statistics perturbation.
To facilitate the application of ASA, we design a simple yet effective module, namely AdvStyle, which instantiates the ASA method in a plug-and-play manner.
Our method significantly outperforms its competitors on the PACS dataset under the single source generalization setting.
arXiv Detail & Related papers (2023-01-30T03:52:16Z)
- Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR).
Specifically, we propose to inject the standard Gaussian noise and regularize hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
- Phrase-level Adversarial Example Generation for Neural Machine Translation [75.01476479100569]
We propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.
We verify our method on three benchmarks, including LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks.
arXiv Detail & Related papers (2022-01-06T11:00:49Z)
- A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation [74.8791451327354]
We propose a simple yet effective semi-supervised learning framework for semantic segmentation.
A set of simple design and training techniques can collectively improve the performance of semi-supervised semantic segmentation significantly.
Our method achieves state-of-the-art results in the semi-supervised settings on the Cityscapes and Pascal VOC datasets.
arXiv Detail & Related papers (2021-04-15T06:01:39Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Representation Learning via Invariant Causal Mechanisms [19.0976564154636]
Self-supervised learning has emerged as a strategy to reduce the reliance on costly supervised signal by pretraining representations only using unlabeled data.
We show how data augmentations can be more effectively utilized through explicit invariance constraints on the proxy classifiers employed during pretraining.
We propose a novel self-supervised objective, Representation Learning via Invariant Causal Mechanisms (ReLIC), that enforces invariant prediction of proxy targets across augmentations.
arXiv Detail & Related papers (2020-10-15T17:53:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.