Learning to Recombine and Resample Data for Compositional Generalization
- URL: http://arxiv.org/abs/2010.03706v6
- Date: Tue, 8 Jun 2021 00:43:04 GMT
- Title: Learning to Recombine and Resample Data for Compositional Generalization
- Authors: Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas
- Abstract summary: We describe R&R, a learned data augmentation scheme that enables a large category of compositional generalizations without appeal to latent symbolic structure.
R&R has two components: recombination of original training examples via a prototype-based generative model and resampling of generated examples to encourage extrapolation.
- Score: 35.868789086531685
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Flexible neural sequence models outperform grammar- and automaton-based
counterparts on a variety of tasks. However, neural models perform poorly in
settings requiring compositional generalization beyond the training data --
particularly to rare or unseen subsequences. Past work has found symbolic
scaffolding (e.g. grammars or automata) essential in these settings. We
describe R&R, a learned data augmentation scheme that enables a large category
of compositional generalizations without appeal to latent symbolic structure.
R&R has two components: recombination of original training examples via a
prototype-based generative model and resampling of generated examples to
encourage extrapolation. Training an ordinary neural sequence model on a
dataset augmented with recombined and resampled examples significantly improves
generalization in two language processing problems -- instruction following
(SCAN) and morphological analysis (SIGMORPHON 2018) -- where R&R enables
learning of new constructions and tenses from as few as eight initial examples.
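To make the two components concrete, the following is a minimal sketch of a recombine-and-resample loop on toy SCAN-style data. The hard-coded primitive table and rarity-based weights are illustrative assumptions only: in R&R, recombination is performed by a learned prototype-based generative model, and the resampling criterion is likewise learned rather than a token-count heuristic.

```python
import random
from collections import Counter

# Toy SCAN-style training pairs: (instruction, action sequence).
TRAIN = [
    ("jump", "JUMP"),
    ("walk", "WALK"),
    ("jump twice", "JUMP JUMP"),
    ("walk twice", "WALK WALK"),
    ("walk and jump", "WALK JUMP"),
]

# Hypothetical primitive alignments; R&R's prototype-based generative model
# learns where examples may be edited instead of using such a table.
PRIMITIVES = {"jump": "JUMP", "walk": "WALK", "look": "LOOK"}

def recombine(example):
    """Turn one example into a near neighbor by swapping an aligned
    primitive, mimicking an edit to a retrieved prototype."""
    src, tgt = example
    swappable = [w for w in src.split() if w in PRIMITIVES]
    if not swappable:
        return example
    old = random.choice(swappable)
    new = random.choice([w for w in PRIMITIVES if w != old])
    return (src.replace(old, new),
            tgt.replace(PRIMITIVES[old], PRIMITIVES[new]))

def resample(candidates, k):
    """Upweight candidates whose output tokens are rare in the original
    data, nudging the augmented set toward unseen subsequences."""
    counts = Counter(tok for _, tgt in TRAIN for tok in tgt.split())
    weights = [1.0 / (1 + min(counts[tok] for tok in tgt.split()))
               for _, tgt in candidates]
    return random.choices(candidates, weights=weights, k=k)

candidates = [recombine(ex) for ex in TRAIN for _ in range(4)]
augmented = TRAIN + resample(candidates, k=10)
print(augmented)
```

An ordinary seq2seq model is then trained on `augmented`; no grammar or other symbolic scaffold is consulted at any point.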
Related papers
- Iterative self-transfer learning: A general methodology for response time-history prediction based on small dataset [0.0]
An iterative self-transfer learning method for training neural networks on small datasets is proposed in this study.
The results show that the proposed method can improve model performance by nearly an order of magnitude on small datasets.
arXiv Detail & Related papers (2023-06-14T18:48:04Z)
- Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z)
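A minimal sketch of the periodic key re-encoding idea described above. Here `encode_keys`, `encode_values`, and `decode_step` are hypothetical stand-ins for the model's learned components; Dangle's actual architecture is a Transformer and differs in detail.

```python
def generate(src, encode_keys, encode_values, decode_step,
             max_len=20, reencode_every=4, eos="<eos>"):
    """Decode while re-encoding the source *keys* only every few steps;
    the *values* are encoded once and reused throughout."""
    values = encode_values(src)           # encoded once, fixed thereafter
    keys = encode_keys(src, [])           # refreshed only periodically
    out = []
    for t in range(max_len):
        if t > 0 and t % reencode_every == 0:
            keys = encode_keys(src, out)  # adapt keys to the decoded prefix
        tok = decode_step(keys, values, out)
        if tok == eos:
            break
        out.append(tok)
    return out
```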
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- Neural-Symbolic Recursive Machine for Systematic Generalization [113.22455566135757]
We introduce the Neural-Symbolic Recursive Machine (NSR), whose core is a Grounded Symbol System (GSS).
NSR integrates neural perception, syntactic parsing, and semantic reasoning.
We evaluate NSR's efficacy across four challenging benchmarks designed to probe systematic generalization capabilities.
arXiv Detail & Related papers (2022-10-04T13:27:38Z)
- Compositionality as Lexical Symmetry [42.37422271002712]
In tasks like semantic parsing, instruction following, and question answering, standard deep networks fail to generalize compositionally from small datasets.
We present a domain-general and model-agnostic formulation of compositionality as a constraint on symmetries of data distributions rather than models.
We describe a procedure called LEXSYM that discovers these transformations automatically, then applies them to training data for ordinary neural sequence models.
arXiv Detail & Related papers (2022-01-30T21:44:46Z)
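A minimal sketch of applying one such transformation as a symmetry of the data distribution, assuming a hypothetical discovered substitution; LEXSYM induces these maps from token alignments rather than taking them as given.

```python
# Hypothetical discovered symmetry: exchanging "walk"/"jump" in inputs and
# "WALK"/"JUMP" in outputs. This table is an assumption for illustration.
SYMMETRY = {"walk": "jump", "jump": "walk", "WALK": "JUMP", "JUMP": "WALK"}

def apply_symmetry(pair):
    """Apply one lexical substitution jointly to an (input, output) pair,
    so the transformed pair stays inside the data distribution."""
    swap = lambda s: " ".join(SYMMETRY.get(tok, tok) for tok in s.split())
    src, tgt = pair
    return (swap(src), swap(tgt))

print(apply_symmetry(("walk twice", "WALK WALK")))
# -> ('jump twice', 'JUMP JUMP')
```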
- Recursive Decoding: A Situated Cognition Approach to Compositional Generation in Grounded Language Understanding [0.0]
We present Recursive Decoding (RD), a novel procedure for training and using seq2seq models.
Rather than generating an entire output sequence in one pass, models are trained to predict one token at a time.
RD yields dramatic improvement on two previously neglected generalization tasks in gSCAN.
arXiv Detail & Related papers (2022-01-27T19:13:42Z)
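A minimal sketch of the one-token-per-pass loop described above, with `predict_next` as a hypothetical stand-in for the trained model; any gSCAN-specific handling of the situated world state is omitted.

```python
def recursive_decode(predict_next, src, max_len=20, eos="<eos>"):
    """Emit one token per pass: each prediction is appended to the context,
    so the next pass is conditioned on an updated view of the task."""
    context, out = list(src), []
    while len(out) < max_len:
        tok = predict_next(context)
        if tok == eos:
            break
        out.append(tok)
        context.append(tok)
    return out

# e.g. recursive_decode(lambda ctx: "<eos>", ["walk", "left"]) -> []
```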
- Improving Compositional Generalization with Latent Structure and Data Augmentation [39.24527889685699]
We present a more powerful data recombination method using a model called the Compositional Structure Learner (CSL).
CSL is a generative model with a quasi-synchronous context-free grammar backbone.
This procedure effectively transfers most of CSL's compositional bias to T5 for diagnostic tasks.
arXiv Detail & Related papers (2021-12-14T18:03:28Z)
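A schematic, hand-written stand-in for grammar-based recombination: sampling paired source/target strings from toy synchronous rules. CSL learns a quasi-synchronous grammar from data, so nothing here is its actual rule format.

```python
import random

# Toy synchronous rules: nonterminal -> (source RHS, target RHS). Keys of
# RULES are nonterminals, expanded in parallel on both sides.
RULES = {
    "S": [(["V", "twice"], ["V", "V"]), (["V"], ["V"])],
    "V": [(["walk"], ["WALK"]), (["jump"], ["JUMP"])],
}

def sample(sym="S"):
    """Sample one (source, target) pair, sharing each nonterminal's
    expansion across both sides so repeated slots stay consistent."""
    src_rhs, tgt_rhs = random.choice(RULES[sym])
    subs = {}
    def expand(rhs, side):
        out = []
        for tok in rhs:
            if tok in RULES:
                if tok not in subs:
                    subs[tok] = sample(tok)
                out.extend(subs[tok][side])
            else:
                out.append(tok)
        return out
    return expand(src_rhs, 0), expand(tgt_rhs, 1)

print(sample())  # e.g. (['walk', 'twice'], ['WALK', 'WALK'])
```

Sampled pairs are then added to the training data of a downstream model such as T5, which is how the bias transfer described above proceeds.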
- Finding needles in a haystack: Sampling Structurally-diverse Training Sets from Synthetic Data for Compositional Generalization [33.30539396439008]
We investigate automatic generation of synthetic utterance-program pairs for improving compositional generalization in semantic parsing.
We select a subset of synthetic examples that are structurally-diverse and use them to improve compositional generalization.
We evaluate our approach on a new split of the schema2QA dataset, and show that it leads to dramatic improvements in compositional generalization as well as moderate improvements in the traditional i.i.d. setup.
arXiv Detail & Related papers (2021-09-06T16:20:47Z)
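A greedy sketch of selecting a structurally-diverse subset, under the simplifying assumption that a program's "template" is its token sequence with literals masked; the paper's structural criteria and selection procedure are richer than this.

```python
from collections import Counter

def template(program):
    # Assumption: masking literal values leaves the structural template.
    return " ".join("<VAL>" if tok.isdigit() or tok.startswith('"') else tok
                    for tok in program.split())

def diverse_subset(pairs, k):
    """Greedily pick synthetic (utterance, program) pairs whose template is
    least represented so far, approximating structural diversity."""
    chosen, seen, pool = [], Counter(), list(pairs)
    while pool and len(chosen) < k:
        best = min(pool, key=lambda p: seen[template(p[1])])
        pool.remove(best)
        chosen.append(best)
        seen[template(best[1])] += 1
    return chosen

pairs = [("add 2 and 3", "add 2 3"), ("add 4 and 5", "add 4 5"),
         ("negate 7", "neg 7")]
print(diverse_subset(pairs, k=2))
# picks one "add"-template example and the "neg" example, not two "add"s
```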
- Learning Algebraic Recombination for Compositional Generalization [71.78771157219428]
We propose LeAR, an end-to-end neural model to learn algebraic recombination for compositional generalization.
The key insight is to model the semantic parsing task as a homomorphism between a latent syntactic algebra and a semantic algebra.
Experiments on two realistic and comprehensive compositional generalization benchmarks demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2021-07-14T07:23:46Z)
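A toy illustration of the homomorphism idea: each syntactic constructor maps to a fixed semantic operation, so interpreting a tree commutes with composition. The operators and lexicon below are invented for illustration; LeAR learns the latent syntactic algebra end to end.

```python
# Syntactic algebra: trees built from "and" and "twice" over a lexicon.
# Semantic algebra: action strings with one operation per constructor.
SEM_OPS = {
    "and": lambda a, b: f"{a} {b}",
    "twice": lambda a: f"{a} {a}",
}
LEXICON = {"walk": "WALK", "jump": "JUMP"}

def interpret(tree):
    """The meaning of a node is a fixed semantic operation applied to the
    meanings of its children (a homomorphism between the two algebras)."""
    if isinstance(tree, str):
        return LEXICON[tree]
    op, *children = tree
    return SEM_OPS[op](*(interpret(c) for c in children))

print(interpret(("and", "walk", ("twice", "jump"))))  # WALK JUMP JUMP
```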
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)