Improving Compositional Generalization with Latent Structure and Data
Augmentation
- URL: http://arxiv.org/abs/2112.07610v1
- Date: Tue, 14 Dec 2021 18:03:28 GMT
- Title: Improving Compositional Generalization with Latent Structure and Data
Augmentation
- Authors: Linlu Qiu, Peter Shaw, Panupong Pasupat, Paweł Krzysztof Nowak, Tal Linzen, Fei Sha, Kristina Toutanova
- Abstract summary: We present a more powerful data recombination method using a model called Compositional Structure Learner (CSL).
CSL is a generative model with a quasi-synchronous context-free grammar backbone.
This procedure effectively transfers most of CSL's compositional bias to T5 for diagnostic tasks.
- Score: 39.24527889685699
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generic unstructured neural networks have been shown to struggle on
out-of-distribution compositional generalization. Compositional data
augmentation via example recombination has transferred some prior knowledge
about compositionality to such black-box neural models for several semantic
parsing tasks, but this often required task-specific engineering or provided
limited gains.
We present a more powerful data recombination method using a model called
Compositional Structure Learner (CSL). CSL is a generative model with a
quasi-synchronous context-free grammar backbone, which we induce from the
training data. We sample recombined examples from CSL and add them to the
fine-tuning data of a pre-trained sequence-to-sequence model (T5). This
procedure effectively transfers most of CSL's compositional bias to T5 for
diagnostic tasks, and results in a model even stronger than a T5-CSL ensemble
on two real-world compositional generalization tasks. This yields new
state-of-the-art results on these challenging semantic parsing tasks, which
require generalization to both natural language variation and novel
compositions of elements.
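
The recipe in the abstract is procedural: fit a generative recombination model on the training pairs, sample synthetic examples from it, and mix them into T5's fine-tuning data. The following is a minimal sketch of that data flow only; `GrammarModel` is a hypothetical stand-in for CSL (the quasi-synchronous grammar induction itself is not shown), and the fine-tuning step uses the Hugging Face `transformers` T5 classes.

```python
# Hedged sketch of the recombine-then-fine-tune pipeline described in the abstract.
# `GrammarModel` is a hypothetical placeholder: the real CSL induces a
# quasi-synchronous CFG from the training data and samples from it.
import random
from transformers import T5ForConditionalGeneration, T5TokenizerFast

class GrammarModel:
    """Placeholder for a generative model fit on (source, target) training pairs."""
    def __init__(self, pairs):
        self.pairs = pairs  # CSL would induce grammar rules from these pairs

    def sample(self, n):
        # Stand-in behavior: naively concatenate fragments of two random examples.
        # CSL instead samples coherent recombinations from its learned grammar.
        out = []
        for _ in range(n):
            (s1, t1), (s2, t2) = random.sample(self.pairs, 2)
            out.append((s1 + " " + s2, t1 + " " + t2))
        return out

train_pairs = [("book a flight to boston", "book(flight, dest=boston)"),
               ("cancel my reservation", "cancel(reservation)")]

csl = GrammarModel(train_pairs)
augmented = train_pairs + csl.sample(100)   # original data plus recombined samples

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# One illustrative gradient step on an augmented example (no Trainer, no batching).
src, tgt = augmented[0]
inputs = tokenizer(src, return_tensors="pt")
labels = tokenizer(tgt, return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss
loss.backward()  # a full training loop with an optimizer would follow in practice
```

In the paper, T5 is fine-tuned on the union of original and sampled data; the single step above only illustrates where the recombined examples enter the pipeline.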
Related papers
- Limits of Transformer Language Models on Learning to Compose Algorithms [77.2443883991608]
We evaluate training LLaMA models and prompting GPT-4 and Gemini on four tasks that require learning a composition of several discrete sub-tasks.
Our results indicate that compositional learning in state-of-the-art Transformer language models is highly sample inefficient.
arXiv Detail & Related papers (2024-02-08T16:23:29Z)
- Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z)
- Compositional Generalisation with Structured Reordering and Fertility Layers [121.37328648951993]
Seq2seq models have been shown to struggle with compositional generalisation.
We present a flexible end-to-end differentiable neural model that composes two structural operations.
arXiv Detail & Related papers (2022-10-06T19:51:31Z)
- Compositionality as Lexical Symmetry [42.37422271002712]
In tasks like semantic parsing, instruction following, and question answering, standard deep networks fail to generalize compositionally from small datasets.
We present a domain-general and model-agnostic formulation of compositionality as a constraint on symmetries of data distributions rather than models.
We describe a procedure called LEXSYM that discovers these transformations automatically, then applies them to training data for ordinary neural sequence models; a minimal sketch of applying such a symmetry appears after this list.
arXiv Detail & Related papers (2022-01-30T21:44:46Z)
- Learning to Generalize Compositionally by Transferring Across Semantic Parsing Tasks [37.66114618645146]
We investigate learning representations that facilitate transfer learning from one compositional task to another.
We apply this method to semantic parsing, using three very different datasets.
Our method significantly improves compositional generalization over baselines on the test set of the target task.
arXiv Detail & Related papers (2021-11-09T09:10:21Z)
- Improving Compositional Generalization with Self-Training for Data-to-Text Generation [36.973617793800315]
We study the compositional generalization of current generation models in data-to-text tasks.
By simulating structural shifts in the compositional Weather dataset, we show that T5 models fail to generalize to unseen structures.
We propose an approach based on self-training, using a fine-tuned BLEURT model for pseudo-response selection.
arXiv Detail & Related papers (2021-10-16T04:26:56Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- Sequence-Level Mixed Sample Data Augmentation [119.94667752029143]
This work proposes a simple data augmentation approach to encourage compositional behavior in neural models for sequence-to-sequence problems.
Our approach, SeqMix, creates new synthetic examples by softly combining input/output sequences from the training set.
arXiv Detail & Related papers (2020-11-18T02:18:04Z)
- Learning to Recombine and Resample Data for Compositional Generalization [35.868789086531685]
We describe R&R, a learned data augmentation scheme that enables a large category of compositional generalizations without appeal to latent symbolic structure.
R&R has two components: recombination of original training examples via a prototype-based generative model and resampling of generated examples to encourage extrapolation.
arXiv Detail & Related papers (2020-10-08T00:36:33Z)
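
Several of the papers above (LEXSYM, SeqMix, R&R) share the idea of creating new training examples by transforming or recombining existing ones. As a self-contained illustration of the lexical-symmetry flavor of that idea, the sketch below applies a hand-specified token substitution consistently to an input/output pair; the token pair and its alignment are assumptions made here for illustration, whereas LEXSYM discovers such transformations automatically.

```python
# Minimal sketch (an assumption-laden illustration, not the actual LEXSYM algorithm):
# swap a pair of interchangeable lexical items consistently in both the input and
# the output to create a new, compositionally recombined training example.
def apply_symmetry(pair, swap_in, swap_out):
    """Swap two aligned tokens in the source and in the target of one example."""
    src, tgt = pair
    a, b = swap_in
    x, y = swap_out
    new_src = " ".join(b if w == a else a if w == b else w for w in src.split())
    new_tgt = " ".join(y if w == x else x if w == y else w for w in tgt.split())
    return new_src, new_tgt

example = ("paint the small cube red", "paint ( cube , size=small , color=red )")
# Hand-specified symmetry: "red" <-> "blue" in language, "color=red" <-> "color=blue" in logic.
augmented = apply_symmetry(example, ("red", "blue"), ("color=red", "color=blue"))
print(augmented)
# ('paint the small cube blue', 'paint ( cube , size=small , color=blue )')
```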