Compositional Program Generation for Few-Shot Systematic Generalization
- URL: http://arxiv.org/abs/2309.16467v2
- Date: Thu, 18 Jan 2024 18:25:38 GMT
- Title: Compositional Program Generation for Few-Shot Systematic Generalization
- Authors: Tim Klinger and Luke Liu and Soham Dan and Maxwell Crouse and
Parikshit Ram and Alexander Gray
- Abstract summary: This paper studies a neuro-symbolic architecture called the Compositional Program Generator (CPG).
CPG has three key features: modularity, composition, and abstraction, in the form of grammar rules.
It achieves perfect generalization on both the SCAN and COGS benchmarks using just 14 examples for SCAN and 22 examples for COGS.
- Score: 59.57656559816271
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compositional generalization is a key ability of humans that enables us to
learn new concepts from only a handful of examples. Neural machine learning
models, including the now ubiquitous Transformers, struggle to generalize in
this way, and typically require thousands of examples of a concept during
training in order to generalize meaningfully. This difference in ability
between humans and artificial neural architectures motivates this study on a
neuro-symbolic architecture called the Compositional Program Generator (CPG).
CPG has three key features: \textit{modularity}, \textit{composition}, and
\textit{abstraction}, in the form of grammar rules, that enable it to
generalize both systematically to new concepts in a few-shot manner and
productively by length on various sequence-to-sequence language tasks. For each
input, CPG uses a grammar of the input language and a parser to generate a
parse in which each grammar rule is assigned its own unique semantic module, a
probabilistic copy or substitution program. Instances with the same parse are
always processed with the same composed modules, while those with different
parses may be processed with different modules. CPG learns parameters for the
modules and is able to learn the semantics for new rules and types
incrementally, without forgetting or retraining on rules it's already seen. It
achieves perfect generalization on both the SCAN and COGS benchmarks using just
14 examples for SCAN and 22 examples for COGS -- state-of-the-art accuracy with
a 1000x improvement in sample efficiency.
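To make the rule-to-module composition described above concrete, the following is a minimal sketch: each grammar rule is mapped to its own semantic module (here a simple substitution or repetition function), and modules are composed bottom-up over the parse. The toy grammar, parse representation, and module definitions are illustrative assumptions, not the authors' implementation, and the probabilistic aspect of CPG's copy/substitution programs is omitted.

```python
# Minimal sketch (assumed toy grammar, not CPG's actual code): one module per
# grammar rule, composed along the parse tree of the input.

def leaf(rule, token):
    """Leaf parse node produced by a lexical rule."""
    return (rule, [], token)

def node(rule, *children):
    """Internal parse node produced by a non-terminal rule."""
    return (rule, list(children), None)

# Each rule gets its own semantic module: lexical rules substitute an output
# token, non-terminal rules compose the outputs of their children.
MODULES = {
    "V->walk":    lambda _children, _tok: ["WALK"],          # substitution
    "V->jump":    lambda _children, _tok: ["JUMP"],          # substitution
    "S->V twice": lambda children, _tok: children[0] * 2,    # composition
}

def interpret(tree):
    """Compose the rule-specific modules bottom-up over the parse."""
    rule, children, token = tree
    child_outputs = [interpret(child) for child in children]
    return MODULES[rule](child_outputs, token)

if __name__ == "__main__":
    parse = node("S->V twice", leaf("V->walk", "walk"))
    print(interpret(parse))  # ['WALK', 'WALK']
```

Because inputs with the same parse are routed through the same composed modules, adding a new rule only requires adding (and learning) one new module, which mirrors the incremental learning behaviour claimed in the abstract.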
Related papers
- ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis [54.18659323181771]
We characterize several different forms of compositional generalization that are desirable in program synthesis.
We propose ExeDec, a novel decomposition-based strategy that predicts execution subgoals to solve problems step-by-step informed by program execution at each step.
arXiv Detail & Related papers (2023-07-26T01:07:52Z)
- Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning [81.24269148865555]
A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability.
We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency.
Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically.
arXiv Detail & Related papers (2022-12-12T15:40:30Z)
- Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus.
We show structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z)
- Recursive Decoding: A Situated Cognition Approach to Compositional Generation in Grounded Language Understanding [0.0]
We present Recursive Decoding, a novel procedure for training and using seq2seq models.
Rather than generating an entire output sequence in one pass, models are trained to predict one token at a time.
RD yields dramatic improvement on two previously neglected generalization tasks in gSCAN.
arXiv Detail & Related papers (2022-01-27T19:13:42Z)
- Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions.
Existing neural models have been shown to lack this basic ability in learning symbolic structures.
We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z)
- Sequence-to-Sequence Learning with Latent Neural Grammars [12.624691611049341]
Sequence-to-sequence learning with neural networks has become the de facto standard for sequence prediction tasks.
While flexible and performant, these models often require large datasets for training and can fail spectacularly on benchmarks designed to test for compositional generalization.
This work explores an alternative, hierarchical approach to sequence-to-sequence learning with quasi-synchronous grammars.
arXiv Detail & Related papers (2021-09-02T17:58:08Z)
- Learning Compositional Rules via Neural Program Synthesis [67.62112086708859]
We present a neuro-symbolic model which learns entire rule systems from a small set of examples.
Instead of directly predicting outputs from inputs, we train our model to induce the explicit system of rules governing a set of previously seen examples.
arXiv Detail & Related papers (2020-03-12T01:06:48Z)
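To illustrate the rule-induction idea in the last entry, here is a minimal sketch under strong simplifying assumptions: a brute-force search over lexical rewrite rules (with a hard-coded repetition rule for "twice"), not the paper's neural program-synthesis model. The SCAN-like examples and helper names are hypothetical.

```python
# Minimal sketch (illustrative, not the paper's model): induce an explicit set
# of token-rewrite rules that explains a handful of input/output examples,
# then apply the induced rules to a new input.
from itertools import product

EXAMPLES = [("walk", "WALK"), ("jump", "JUMP"), ("walk twice", "WALK WALK")]

def apply_rules(rules, inp):
    """Apply lexical rules token by token; 'twice' repeats the last output."""
    out = []
    for tok in inp.split():
        if tok == "twice":
            out.extend(out[-1:])
        else:
            out.append(rules.get(tok, tok))
    return " ".join(out)

def induce_lexical_rules(examples):
    """Search token->token mappings consistent with every example."""
    primitives = sorted({t for inp, _ in examples for t in inp.split() if t != "twice"})
    targets = sorted({t for _, out in examples for t in out.split()})
    for assignment in product(targets, repeat=len(primitives)):
        rules = dict(zip(primitives, assignment))
        if all(apply_rules(rules, i) == o for i, o in examples):
            return rules
    return None

if __name__ == "__main__":
    rules = induce_lexical_rules(EXAMPLES)
    print(rules)                             # {'jump': 'JUMP', 'walk': 'WALK'}
    print(apply_rules(rules, "jump twice"))  # JUMP JUMP
```

The point of the sketch is only that an explicit, reusable rule system induced from a few examples generalizes to novel combinations such as "jump twice"; the cited paper learns such rule systems with a neural synthesizer rather than exhaustive search.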
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented here and is not responsible for any consequences of its use.