A Benchmark for Systematic Generalization in Grounded Language
Understanding
- URL: http://arxiv.org/abs/2003.05161v2
- Date: Sat, 17 Oct 2020 17:02:02 GMT
- Title: A Benchmark for Systematic Generalization in Grounded Language
Understanding
- Authors: Laura Ruis, Jacob Andreas, Marco Baroni, Diane Bouchacourt, Brenden M.
Lake
- Abstract summary: Humans easily interpret expressions that describe unfamiliar situations composed from familiar parts.
Modern neural networks, by contrast, struggle to interpret novel compositions.
We introduce a new benchmark, gSCAN, for evaluating compositional generalization in situated language understanding.
- Score: 61.432407738682635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans easily interpret expressions that describe unfamiliar situations
composed from familiar parts ("greet the pink brontosaurus by the ferris
wheel"). Modern neural networks, by contrast, struggle to interpret novel
compositions. In this paper, we introduce a new benchmark, gSCAN, for
evaluating compositional generalization in situated language understanding.
Going beyond a related benchmark that focused on syntactic aspects of
generalization, gSCAN defines a language grounded in the states of a grid
world, facilitating novel evaluations of acquiring linguistically motivated
rules. For example, agents must understand how adjectives such as 'small' are
interpreted relative to the current world state or how adverbs such as
'cautiously' combine with new verbs. We test a strong multi-modal baseline
model and a state-of-the-art compositional method, finding that, in most cases,
they fail dramatically when generalization requires systematic compositional
rules.
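To make the context-dependence concrete, here is a minimal, hypothetical Python sketch of how an adjective like 'small' must be resolved against the current world state; the episode layout and names are illustrative assumptions, not the released gSCAN format:

```python
# Hypothetical sketch of a gSCAN-style episode. The data layout and names
# here are illustrative assumptions, not the released gSCAN format.
from dataclasses import dataclass

@dataclass
class Obj:
    shape: str
    color: str
    size: int          # relative size level, e.g. 1 (smallest) to 4
    pos: tuple

WORLD = [
    Obj("circle", "red", 2, (0, 3)),
    Obj("circle", "green", 4, (2, 1)),
    Obj("square", "red", 1, (3, 3)),
]

def resolve_size(shape: str, size_word: str, world: list) -> Obj:
    """Interpret a relative adjective ('small' / 'big') against the objects
    actually present in the grid, rather than as an absolute property."""
    candidates = [o for o in world if o.shape == shape]
    pick = min if size_word == "small" else max
    return pick(candidates, key=lambda o: o.size)

# "walk to the small circle": 'small' denotes the size-2 circle here,
# because the only circles present have sizes 2 and 4; in a different
# world state the same phrase could pick out a size-1 circle.
print(resolve_size("circle", "small", WORLD))
# Obj(shape='circle', color='red', size=2, pos=(0, 3))
```

The same phrase thus denotes different objects in different world states, which is exactly the kind of linguistically motivated rule the benchmark evaluates.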
Related papers
- Learning Symbolic Rules over Abstract Meaning Representations for
Textual Reinforcement Learning [63.148199057487226]
We propose a modular, NEuroSymbolic Textual Agent (NESTA) that combines generic semantic generalization with a rule induction system to learn interpretable rules as policies.
Our experiments show that the proposed NESTA method outperforms deep reinforcement learning-based techniques by achieving better generalization to unseen test games and learning from fewer training interactions.
arXiv Detail & Related papers (2023-07-05T23:21:05Z)
- On Evaluating Multilingual Compositional Generalization with Translated
Datasets [34.51457321680049]
We show that compositional generalization abilities differ across languages.
We craft a faithful rule-based translation of the MCWQ dataset from English to Chinese and Japanese.
Even with the resulting robust benchmark, which we call MCWQ-R, we show that the distribution of compositions still suffers due to linguistic divergences.
arXiv Detail & Related papers (2023-06-20T10:03:57Z)
- How Do In-Context Examples Affect Compositional Generalization? [86.57079616209474]
In this paper, we present CoFe, a test suite to investigate in-context compositional generalization.
We find that the compositional generalization performance can be easily affected by the selection of in-context examples.
Our systematic experiments indicate that in-context examples should be structurally similar to the test case, diverse from each other, and individually simple (see the toy selection sketch after this entry).
arXiv Detail & Related papers (2023-05-08T16:32:18Z)
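As referenced above, here is a toy greedy-selection sketch in Python consistent with the three criteria CoFe reports (structural similarity to the test case, mutual diversity, individual simplicity); the signature function and scoring weights are illustrative assumptions, not the paper's method:

```python
# Hypothetical greedy selection of in-context examples following the three
# criteria reported for CoFe. This is NOT the paper's code: 'structure' is a
# deliberately crude stand-in for a real structural signature (e.g. parse
# templates), and the scoring weights are arbitrary.
def structure(utterance: str) -> set:
    toks = utterance.split()
    return set(zip(toks[:-1], toks[1:]))  # word bigrams as a cheap proxy

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

def select_examples(test_case: str, pool: list, k: int = 3) -> list:
    """Greedily pick k demonstrations that are similar to the test case,
    diverse from each other, and individually simple."""
    chosen = []
    target = structure(test_case)
    for _ in range(k):
        def score(cand: str) -> float:
            similarity = jaccard(structure(cand), target)
            redundancy = max((jaccard(structure(cand), structure(c))
                              for c in chosen), default=0.0)
            simplicity = 1.0 / (1 + len(cand.split()))
            return similarity - redundancy + simplicity
        chosen.append(max((c for c in pool if c not in chosen), key=score))
    return chosen
```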
- Variational Cross-Graph Reasoning and Adaptive Structured Semantics
Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z)
- Compositional Generalization in Unsupervised Compositional
Representation Learning: A Study on Disentanglement and Emergent Language [48.37815764394315]
We study three unsupervised representation learning algorithms on two datasets that allow directly testing compositional generalization.
We find that directly using the bottleneck representation with simple models and few labels may lead to worse generalization than using representations from layers before or after the learned representation itself.
Surprisingly, we find that increasing the pressure to produce a disentangled representation yields representations with worse generalization, while representations from emergent language (EL) models show strong compositional generalization.
arXiv Detail & Related papers (2022-10-02T10:35:53Z)
- Compositional Temporal Grounding with Structured Variational Cross-Graph
Correspondence Learning [92.07643510310766]
Temporal grounding in videos aims to localize one target video segment that semantically corresponds to a given query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We empirically find that state-of-the-art methods fail to generalize to queries with novel combinations of seen words.
We propose a variational cross-graph reasoning framework that explicitly decomposes video and language into multiple structured hierarchies.
arXiv Detail & Related papers (2022-03-24T12:55:23Z)
- Compositional Networks Enable Systematic Generalization for Grounded
Language Understanding [21.481360281719006]
Humans are remarkably flexible when understanding new sentences that include combinations of concepts they have never encountered before.
Recent work has shown that while deep networks can mimic some human language abilities when presented with novel sentences, systematic variation uncovers the limitations in the language-understanding abilities of networks.
We demonstrate that these limitations can be overcome by addressing the generalization challenges in the gSCAN dataset.
arXiv Detail & Related papers (2020-08-06T16:17:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides (including all information) and is not responsible for any consequences.