SLOG: A Structural Generalization Benchmark for Semantic Parsing
- URL: http://arxiv.org/abs/2310.15040v1
- Date: Mon, 23 Oct 2023 15:39:09 GMT
- Title: SLOG: A Structural Generalization Benchmark for Semantic Parsing
- Authors: Bingzhi Li, Lucia Donatelli, Alexander Koller, Tal Linzen, Yuekun Yao,
Najoung Kim
- Abstract summary: The goal of compositional generalization benchmarks is to evaluate how well models generalize to new complex linguistic expressions.
Existing benchmarks often focus on lexical generalization, the interpretation of novel lexical items in syntactic structures familiar from training; structural generalization tasks, where models must interpret syntactic structures unseen in training, are often underrepresented.
We introduce SLOG, a semantic parsing dataset that extends COGS with 17 structural generalization cases.
- Score: 68.19511282584304
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The goal of compositional generalization benchmarks is to evaluate how well
models generalize to new complex linguistic expressions. Existing benchmarks
often focus on lexical generalization, the interpretation of novel lexical
items in syntactic structures familiar from training; structural generalization
tasks, where a model needs to interpret syntactic structures that are
themselves unfamiliar from training, are often underrepresented, resulting in
overly optimistic perceptions of how well models can generalize. We introduce
SLOG, a semantic parsing dataset that extends COGS (Kim and Linzen, 2020) with
17 structural generalization cases. In our experiments, the generalization
accuracy of Transformer models, including pretrained ones, only reaches 40.6%,
while a structure-aware parser only achieves 70.8%. These results are far from
the near-perfect accuracy existing models achieve on COGS, demonstrating the
role of SLOG in foregrounding the large discrepancy between models' lexical and
structural generalization capacities.
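To make the lexical/structural contrast the abstract draws concrete, here is a minimal Python sketch. The sentence/logical-form pairs and the flattened predicate notation are illustrative assumptions rather than actual COGS or SLOG items; only the exact-match scoring reflects how COGS-style semantic parsing benchmarks are commonly evaluated.

```python
# Illustrative sketch of lexical vs. structural generalization in the
# COGS/SLOG setting. The pairs below are toy stand-ins, not actual
# dataset items; COGS uses a richer logic-based target format.

# Lexical generalization: a novel word ("zork") appears at test time in
# a syntactic position (subject) that is familiar from training.
lexical_case = {
    "train": ("The cat saw the zork .", "see(agent=cat, theme=zork)"),
    "test":  ("The zork saw the cat .", "see(agent=zork, theme=cat)"),
}

# Structural generalization (SLOG's focus): every word is familiar, but
# the structure itself (deeper PP recursion here) is unseen in training.
structural_case = {
    "train": ("The cat on the mat slept .",
              "sleep(agent=cat(on=mat))"),
    "test":  ("The cat on the mat on the rug slept .",
              "sleep(agent=cat(on=mat(on=rug)))"),
}

def exact_match(pred: str, gold: str) -> bool:
    """COGS-style scoring: predicted and gold logical forms must match
    exactly after whitespace normalization."""
    return " ".join(pred.split()) == " ".join(gold.split())

if __name__ == "__main__":
    _, gold = structural_case["test"]
    print(exact_match("sleep(agent=cat(on=mat(on=rug)))", gold))  # True
```

Under this metric, a single mis-parsed structural case scores zero, which is why the lexical/structural accuracy gap the abstract reports is so stark.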
Related papers
- Evaluating Structural Generalization in Neural Machine Translation [13.880151307013318]
We construct SGET, a dataset covering various types of compositional generalization with controlled words and sentence structures.
We show that neural machine translation models struggle more in structural generalization than in lexical generalization.
We also find different performance trends in semantic parsing and machine translation, which indicates the importance of evaluations across various tasks.
arXiv Detail & Related papers (2024-06-19T09:09:11Z)
- Learning Syntax Without Planting Trees: Understanding When and Why Transformers Generalize Hierarchically [74.96551626420188]
Transformers trained on natural language data have been shown to learn its hierarchical structure and generalize to sentences with unseen syntactic structures.
We investigate sources of inductive bias in transformer models and their training that could cause such generalization behavior to emerge.
arXiv Detail & Related papers (2024-04-25T07:10:29Z)
- Compositional Generalisation with Structured Reordering and Fertility Layers [121.37328648951993]
Seq2seq models have been shown to struggle with compositional generalisation.
We present a flexible end-to-end differentiable neural model that composes two structural operations.
arXiv Detail & Related papers (2022-10-06T19:51:31Z)
- Revisiting the Compositional Generalization Abilities of Neural Sequence Models [23.665350744415004]
We focus on one-shot primitive generalization as introduced by the popular SCAN benchmark (a minimal sketch of this setting follows after this list).
We demonstrate that modifying the training distribution in simple and intuitive ways enables standard seq-to-seq models to achieve near-perfect generalization performance.
arXiv Detail & Related papers (2022-03-14T18:03:21Z)
- Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus.
We show structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z)
- Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models [47.42249565529833]
Humans can learn structural properties about a word from minimal experience.
We assess the ability of modern neural language models to reproduce this behavior in English.
arXiv Detail & Related papers (2020-10-12T14:12:37Z)
- Improving Compositional Generalization in Semantic Parsing [54.4720965813889]
Generalization of models to out-of-distribution (OOD) data has captured tremendous attention recently.
We investigate this question in semantic parsing, a natural test bed for compositional generalization.
arXiv Detail & Related papers (2020-10-12T12:34:58Z)
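As a concrete illustration of the one-shot primitive generalization setting referenced in the "Revisiting ..." entry above, here is a minimal sketch. The tiny grammar is an assumed subset of SCAN (Lake and Baroni, 2018), and the rule-based interpreter stands in for the gold command-to-action mapping, not for the neural models evaluated in these papers.

```python
# Minimal sketch of SCAN-style one-shot primitive generalization.
# The grammar below is an illustrative subset of SCAN, not the full
# benchmark; the interpreter gives the gold command-to-action mapping.

PRIMITIVES = {"walk": "WALK", "run": "RUN", "jump": "JUMP"}

def interpret(command: str) -> str:
    """Gold interpretation for the subset 'X' and 'X twice'."""
    words = command.split()
    action = PRIMITIVES[words[0]]
    if len(words) == 2 and words[1] == "twice":
        return f"{action} {action}"
    return action

# One-shot split: "jump" occurs in training only in isolation, while the
# other primitives are seen both alone and composed with "twice".
train = [("walk", "WALK"), ("walk twice", "WALK WALK"),
         ("run", "RUN"), ("run twice", "RUN RUN"),
         ("jump", "JUMP")]
test = [("jump twice", "JUMP JUMP")]  # novel composition at test time

for cmd, gold in test:
    print(f"{cmd} -> {interpret(cmd)} (gold: {gold})")
```

A model succeeds on this split only if it composes the isolated primitive with modifiers seen elsewhere, which is what makes it a one-shot generalization test.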
This list is automatically generated from the titles and abstracts of the papers on this site.