Revisiting the Compositional Generalization Abilities of Neural Sequence Models
- URL: http://arxiv.org/abs/2203.07402v1
- Date: Mon, 14 Mar 2022 18:03:21 GMT
- Title: Revisiting the Compositional Generalization Abilities of Neural Sequence Models
- Authors: Arkil Patel, Satwik Bhattamishra, Phil Blunsom, Navin Goyal
- Abstract summary: We focus on one-shot primitive generalization as introduced by the popular SCAN benchmark.
We demonstrate that modifying the training distribution in simple and intuitive ways enables standard seq-to-seq models to achieve near-perfect generalization performance.
- Score: 23.665350744415004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compositional generalization is a fundamental trait in humans, allowing us to
effortlessly combine known phrases to form novel sentences. Recent works have
claimed that standard seq-to-seq models severely lack the ability to
compositionally generalize. In this paper, we focus on one-shot primitive
generalization as introduced by the popular SCAN benchmark. We demonstrate that
modifying the training distribution in simple and intuitive ways enables
standard seq-to-seq models to achieve near-perfect generalization performance,
thereby showing that their compositional generalization abilities were
previously underestimated. We perform detailed empirical analysis of this
phenomenon. Our results indicate that the generalization performance of models
is highly sensitive to the characteristics of the training data which should be
carefully considered while designing such benchmarks in future.
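To make the one-shot primitive setup concrete, the sketch below builds a toy SCAN-like training set in which a held-out primitive ("jump") appears only in isolation, and then shows one hypothetical way of "modifying the training distribution" by adding extra primitives with varied usage. The helper names, the extra primitives, and the specific augmentation are illustrative assumptions, not the paper's exact procedure.

```python
# Toy illustration of SCAN-style one-shot primitive generalization.
# The "rebalanced" augmentation is a hypothetical example of modifying the
# training distribution; it is NOT the paper's exact procedure.

PRIMITIVES = {"walk": "WALK", "run": "RUN", "look": "LOOK", "jump": "JUMP"}

def translate(command: str) -> str:
    """Map a simple SCAN-like command to its action sequence."""
    tokens = command.split()
    if len(tokens) == 1:                 # "jump"       -> "JUMP"
        return PRIMITIVES[tokens[0]]
    if tokens[1] == "twice":             # "walk twice" -> "WALK WALK"
        return " ".join([PRIMITIVES[tokens[0]]] * 2)
    if tokens[1] == "left":              # "run left"   -> "LTURN RUN"
        return "LTURN " + PRIMITIVES[tokens[0]]
    raise ValueError(f"unsupported command: {command}")

def one_shot_training_set():
    """'Add jump' style split: the new primitive appears only in isolation."""
    data = [("jump", "JUMP")]
    for prim in ("walk", "run", "look"):
        for ctx in ("", " twice", " left"):
            data.append((prim + ctx, translate(prim + ctx)))
    return data

def rebalanced_training_set(extra_primitives=("sneak", "crawl")):
    """Hypothetical modification: add more primitives, some of which also
    appear only in isolation, so that pattern is no longer an outlier in
    the training distribution."""
    data = one_shot_training_set()
    for i, prim in enumerate(extra_primitives):
        PRIMITIVES[prim] = prim.upper()
        data.append((prim, prim.upper()))
        if i % 2 == 0:                   # some new primitives also compose
            data.append((prim + " twice", translate(prim + " twice")))
    return data

if __name__ == "__main__":
    for cmd, act in rebalanced_training_set():
        print(f"{cmd!r:>15} -> {act}")
```

Under this kind of setup, the test set would still pair the held-out primitive with composed contexts (e.g. "jump twice"); the sketch only illustrates how the training side of the distribution might be altered.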
Related papers
- SLOG: A Structural Generalization Benchmark for Semantic Parsing [68.19511282584304]
The goal of compositional generalization benchmarks is to evaluate how well models generalize to new complex linguistic expressions.
Existing benchmarks often focus on lexical generalization, the interpretation of novel lexical items in syntactic structures familiar from training, while structural generalization cases are often underrepresented.
We introduce SLOG, a semantic parsing dataset that extends COGS with 17 structural generalization cases.
arXiv Detail & Related papers (2023-10-23T15:39:09Z)
- Compositional Generalisation with Structured Reordering and Fertility Layers [121.37328648951993]
Seq2seq models have been shown to struggle with compositional generalisation.
We present a flexible end-to-end differentiable neural model that composes two structural operations.
arXiv Detail & Related papers (2022-10-06T19:51:31Z)
- Generalization Gap in Amortized Inference [17.951010274427187]
We study the generalization properties of a popular class of probabilistic models, the Variational Auto-Encoder (VAE).
We show that the over-fitting phenomenon is usually dominated by the amortized inference network.
We propose a new training objective, inspired by the classic wake-sleep algorithm, to improve the generalization properties of amortized inference.
arXiv Detail & Related papers (2022-05-23T21:28:47Z)
- Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus.
We show structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z)
- Grounded Graph Decoding Improves Compositional Generalization in Question Answering [68.72605660152101]
Question answering models struggle to generalize to novel compositions of training patterns, such as longer sequences or more complex test structures.
We propose Grounded Graph Decoding, a method to improve compositional generalization of language representations by grounding structured predictions with an attention mechanism.
Our model significantly outperforms state-of-the-art baselines on the Compositional Freebase Questions (CFQ) dataset, a challenging benchmark for compositional generalization in question answering.
arXiv Detail & Related papers (2021-11-05T17:50:14Z)
- Disentangled Sequence to Sequence Learning for Compositional Generalization [62.954842223732435]
We propose an extension to sequence-to-sequence models which allows us to learn disentangled representations by adaptively re-encoding the source input.
Experimental results on semantic parsing and machine translation empirically show that our proposal yields more disentangled representations and better generalization.
arXiv Detail & Related papers (2021-10-09T22:27:19Z)
- Unlocking Compositional Generalization in Pre-trained Models Using Intermediate Representations [27.244943870086175]
Sequence-to-sequence (seq2seq) models have been found to struggle at out-of-distribution compositional generalization.
We study the impact of intermediate representations on compositional generalization in pre-trained seq2seq models.
arXiv Detail & Related papers (2021-04-15T14:15:14Z)
- Robustness to Augmentations as a Generalization metric [0.0]
Generalization is the ability of a model to predict on unseen domains.
We propose a method to predict a model's generalization performance based on the premise that models robust to augmentations generalize better than those that are not (a minimal sketch of such a robustness score follows this list).
The proposed method was the first runner-up solution in the NeurIPS competition on Predicting Generalization in Deep Learning.
arXiv Detail & Related papers (2021-01-16T15:36:38Z)
- Improving Compositional Generalization in Semantic Parsing [54.4720965813889]
Generalization of models to out-of-distribution (OOD) data has captured tremendous attention recently.
We investigate compositional generalization in semantic parsing, a natural test-bed for compositional generalization.
arXiv Detail & Related papers (2020-10-12T12:34:58Z)
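As a rough illustration of the augmentation-robustness idea summarized in the entry above ("Robustness to Augmentations as a Generalization metric"), the minimal sketch below scores a model by the agreement between its predictions on original and augmented inputs. The function names, the toy classifier, and the Gaussian-noise augmentation are illustrative assumptions, not the competition solution.

```python
import numpy as np

def augmentation_robustness(predict_fn, inputs, augment_fn, n_rounds: int = 5) -> float:
    """Fraction of predictions that stay the same under augmentation.

    predict_fn : maps a batch of inputs to predicted labels
    augment_fn : returns a randomly augmented copy of a batch
    Higher agreement is taken here, as an illustrative assumption,
    to indicate better expected generalization.
    """
    base = predict_fn(inputs)
    agreements = []
    for _ in range(n_rounds):
        aug = predict_fn(augment_fn(inputs))
        agreements.append(np.mean(base == aug))
    return float(np.mean(agreements))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(100, 10))
    predict = lambda batch: (batch.sum(axis=1) > 0).astype(int)        # toy classifier
    augment = lambda batch: batch + rng.normal(scale=0.1, size=batch.shape)  # noise augmentation
    print(f"robustness score: {augmentation_robustness(predict, x, augment):.3f}")
```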