Compositional Generalization without Trees using Multiset Tagging and
Latent Permutations
- URL: http://arxiv.org/abs/2305.16954v1
- Date: Fri, 26 May 2023 14:09:35 GMT
- Title: Compositional Generalization without Trees using Multiset Tagging and
Latent Permutations
- Authors: Matthias Lindemann, Alexander Koller, Ivan Titov
- Abstract summary: We phrase semantic parsing as a two-step process: we first tag each input token with a multiset of output tokens.
Then we arrange the tokens into an output sequence using a new way of parameterizing and predicting permutations.
Our model outperforms pretrained seq2seq models and prior work on realistic semantic parsing tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Seq2seq models have been shown to struggle with compositional generalization
in semantic parsing, i.e. generalizing to unseen compositions of phenomena that
the model handles correctly in isolation.
We phrase semantic parsing as a two-step process: we first tag each input
token with a multiset of output tokens. Then we arrange the tokens into an
output sequence using a new way of parameterizing and predicting permutations.
We formulate predicting a permutation as solving a regularized linear program
and we backpropagate through the solver. In contrast to prior work, our
approach does not place a priori restrictions on possible permutations, making
it very expressive.
Our model outperforms pretrained seq2seq models and prior work on realistic
semantic parsing tasks that require generalization to longer examples. We also
outperform non-tree-based models on structural generalization on the COGS
benchmark. For the first time, we show that a model without an inductive bias
provided by trees achieves high accuracy on generalization to deeper recursion.
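The abstract does not spell out the exact parameterization, but one standard way to realize "a regularized linear program over permutations" that supports backpropagation is entropy regularization over the Birkhoff polytope, solved with Sinkhorn iterations. The sketch below is an illustrative toy, not the paper's model: the tagger output, the score matrix, and the `sinkhorn` function are all assumptions made for the example.

```python
import numpy as np

def sinkhorn(scores, n_iters=50, tau=1.0):
    """Approximately solve an entropy-regularized linear program over
    permutation matrices: alternating row/column normalization of
    exp(scores / tau) converges to a doubly-stochastic matrix, a
    differentiable relaxation of a permutation."""
    P = np.exp(scores / tau)
    for _ in range(n_iters):
        P /= P.sum(axis=1, keepdims=True)  # normalize rows
        P /= P.sum(axis=0, keepdims=True)  # normalize columns
    return P

# Step 1 (hypothetical tagger output): each input token is tagged
# with a multiset of output tokens, which are then flattened.
tagged = [["fn", "("], ["x"], [")"]]
flat = [tok for multiset in tagged for tok in multiset]

# Step 2: predict a permutation of the flattened tokens from a score
# matrix. Here the scores artificially favor the identity permutation.
rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 4))
scores[np.arange(4), np.arange(4)] += 10.0

P = sinkhorn(scores, tau=0.1)       # soft permutation, differentiable
order = P.argmax(axis=1)            # harden at inference time
output = [flat[j] for j in order]   # arrange tokens into a sequence
```

At training time the soft matrix `P` is used so gradients flow through the solver; the hard `argmax` is only taken at inference. Note the paper's relaxation may differ from plain Sinkhorn.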
Related papers
- Mutual Exclusivity Training and Primitive Augmentation to Induce
Compositionality [84.94877848357896]
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z) - Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one by one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z) - Translate First Reorder Later: Leveraging Monotonicity in Semantic
Parsing [4.396860522241306]
TPol is a two-step approach that translates input sentences monotonically and then reorders them to obtain the correct output.
We test our approach on two popular semantic parsing datasets.
arXiv Detail & Related papers (2022-10-10T17:50:42Z) - Recursive Decoding: A Situated Cognition Approach to Compositional
Generation in Grounded Language Understanding [0.0]
We present Recursive Decoding (RD), a novel procedure for training and using seq2seq models.
Rather than generating an entire output sequence in one pass, models are trained to predict one token at a time.
RD yields dramatic improvement on two previously neglected generalization tasks in gSCAN.
arXiv Detail & Related papers (2022-01-27T19:13:42Z) - Grounded Graph Decoding Improves Compositional Generalization in
Question Answering [68.72605660152101]
Question answering models struggle to generalize to novel compositions of training patterns, such as longer sequences or more complex test structures.
We propose Grounded Graph Decoding, a method to improve compositional generalization of language representations by grounding structured predictions with an attention mechanism.
Our model significantly outperforms state-of-the-art baselines on the Compositional Freebase Questions (CFQ) dataset, a challenging benchmark for compositional generalization in question answering.
arXiv Detail & Related papers (2021-11-05T17:50:14Z) - Structured Reordering for Modeling Latent Alignments in Sequence
Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z) - Span-based Semantic Parsing for Compositional Generalization [53.24255235340056]
SpanBasedSP predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input.
On GeoQuery, SCAN and CLOSURE, SpanBasedSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization.
arXiv Detail & Related papers (2020-09-13T16:42:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.