Sequence-Level Mixed Sample Data Augmentation
- URL: http://arxiv.org/abs/2011.09039v1
- Date: Wed, 18 Nov 2020 02:18:04 GMT
- Title: Sequence-Level Mixed Sample Data Augmentation
- Authors: Demi Guo, Yoon Kim and Alexander M. Rush
- Abstract summary: This work proposes a simple data augmentation approach to encourage compositional behavior in neural models for sequence-to-sequence problems.
Our approach, SeqMix, creates new synthetic examples by softly combining input/output sequences from the training set.
- Score: 119.94667752029143
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Despite their empirical success, neural networks still have difficulty
capturing compositional aspects of natural language. This work proposes a
simple data augmentation approach to encourage compositional behavior in neural
models for sequence-to-sequence problems. Our approach, SeqMix, creates new
synthetic examples by softly combining input/output sequences from the training
set. We connect this approach to existing techniques such as SwitchOut and word
dropout, and show that these techniques are all approximating variants of a
single objective. SeqMix consistently yields an improvement of approximately 1.0
BLEU over strong Transformer baselines on five different translation datasets. On
tasks that require strong compositional generalization, such as SCAN and
semantic parsing, SeqMix also offers further improvements.
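The abstract describes SeqMix only at a high level (softly combining input/output sequences from the training set), so the following is a minimal sketch of the general idea rather than the authors' implementation. The toy vocabulary, padding to a common length, and the Beta(0.1, 0.1) mixing prior are all assumptions made purely for illustration: two (source, target) pairs are combined into per-position soft token distributions with a single mixing weight.

```python
# Hypothetical sketch of a SeqMix-style "soft" combination of two training
# pairs. Not the authors' code: the toy vocabulary, padding scheme, and
# Beta(0.1, 0.1) prior are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = {"<pad>": 0, "jump": 1, "walk": 2, "twice": 3, "JUMP": 4, "WALK": 5}

def one_hot(token_ids, vocab_size):
    """Turn a list of token ids into a (seq_len, vocab_size) one-hot matrix."""
    out = np.zeros((len(token_ids), vocab_size))
    out[np.arange(len(token_ids)), token_ids] = 1.0
    return out

def soft_mix(seq_a, seq_b, lam, vocab_size):
    """Convex combination of two token sequences viewed as soft distributions."""
    length = max(len(seq_a), len(seq_b))
    pad = VOCAB["<pad>"]
    a = one_hot(seq_a + [pad] * (length - len(seq_a)), vocab_size)
    b = one_hot(seq_b + [pad] * (length - len(seq_b)), vocab_size)
    return lam * a + (1.0 - lam) * b  # shape: (length, vocab_size)

def seqmix_example(pair_a, pair_b, alpha=0.1):
    """Build one synthetic (source, target) pair of soft token distributions."""
    lam = rng.beta(alpha, alpha)
    src = soft_mix(pair_a[0], pair_b[0], lam, len(VOCAB))
    tgt = soft_mix(pair_a[1], pair_b[1], lam, len(VOCAB))
    return src, tgt, lam

# Two toy SCAN-style training pairs: (source token ids, target token ids).
pair_a = ([1, 3], [4, 4])   # "jump twice" -> "JUMP JUMP"
pair_b = ([2], [5])         # "walk"       -> "WALK"
src, tgt, lam = seqmix_example(pair_a, pair_b)
print(lam, src.shape, tgt.shape)
```

In a real seq2seq setup the soft source would typically enter the encoder as an embedding-weighted sum and the soft target would serve as the target distribution in the cross-entropy loss; those details are not specified in the summary above.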
Related papers
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification [56.817386699291305]
This paper proposes a simple yet effective data augmentation approach termed DoubleMix.
DoubleMix first generates several perturbed samples for each training example.
It then uses the perturbed data and original data to carry out a two-step interpolation in the hidden space of neural models.
arXiv Detail & Related papers (2022-09-12T15:01:04Z) - A Well-Composed Text is Half Done! Composition Sampling for Diverse
Conditional Generation [79.98319703471596]
We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality.
It builds on recently proposed plan-based neural generation models that are trained to first create a composition of the output and then generate by conditioning on it and the input.
arXiv Detail & Related papers (2022-03-28T21:24:03Z)
- Improving Compositional Generalization with Latent Structure and Data Augmentation [39.24527889685699]
We present a more powerful data recombination method using a model called the Compositional Structure Learner (CSL).
CSL is a generative model with a quasi-synchronous context-free grammar backbone.
This procedure effectively transfers most of CSL's compositional bias to T5 for diagnostic tasks.
arXiv Detail & Related papers (2021-12-14T18:03:28Z)
- Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions.
Existing neural models have been shown to lack this basic ability in learning symbolic structures.
We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z)
- Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks [75.69896269357005]
Mixup is a recent data augmentation technique that linearly interpolates input examples and the corresponding labels.
In this paper, we explore how to apply mixup to natural language processing tasks.
We incorporate mixup into a transformer-based pre-trained architecture, named "mixup-transformer", for a wide range of NLP tasks (a generic sketch of this interpolation appears after this list).
arXiv Detail & Related papers (2020-10-05T23:37:30Z)
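As a companion to the Mixup-Transformer entry above, here is a minimal sketch of the generic mixup interpolation it builds on. The feature dimensionality, one-hot label encoding, and Beta(0.2, 0.2) prior are illustrative assumptions, not details taken from that paper.

```python
# Minimal sketch of generic mixup: linear interpolation of two examples and
# their labels. Dimensions and the Beta(0.2, 0.2) prior are assumptions made
# for illustration only.
import numpy as np

rng = np.random.default_rng(0)

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Linearly interpolate two examples and their one-hot labels."""
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

# Toy sentence representations (e.g. pooled encoder states) and one-hot labels.
x1, y1 = rng.normal(size=8), np.array([1.0, 0.0])
x2, y2 = rng.normal(size=8), np.array([0.0, 1.0])
x_mix, y_mix = mixup(x1, y1, x2, y2)
print(x_mix.shape, y_mix)
```

DoubleMix's two-step interpolation in hidden space and SeqMix's soft combination of sequences can both be read as variations of this basic recipe applied to different representations.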
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.