How Do In-Context Examples Affect Compositional Generalization?
- URL: http://arxiv.org/abs/2305.04835v3
- Date: Fri, 9 Jun 2023 02:25:29 GMT
- Title: How Do In-Context Examples Affect Compositional Generalization?
- Authors: Shengnan An, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Jian-Guang Lou and Dongmei Zhang
- Abstract summary: In this paper, we present CoFe, a test suite to investigate in-context compositional generalization.
We find that the compositional generalization performance can be easily affected by the selection of in-context examples.
Our systematic experiments indicate that in-context examples should be structurally similar to the test case, diverse from each other, and individually simple.
- Score: 86.57079616209474
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Compositional generalization--understanding unseen combinations of seen
primitives--is an essential reasoning capability in human intelligence. The AI
community mainly studies this capability by fine-tuning neural networks on
large numbers of training samples, while it remains unclear whether and how in-context
learning--the prevailing few-shot paradigm based on large language
models--exhibits compositional generalization. In this paper, we present CoFe,
a test suite to investigate in-context compositional generalization. We find
that compositional generalization performance is easily affected by the
selection of in-context examples, raising the question of what factors make
good in-context examples for compositional generalization. We study three
potential factors: similarity, diversity, and
complexity. Our systematic experiments indicate that in-context examples should
be structurally similar to the test case, diverse from each other, and
individually simple. Furthermore, two strong limitations are observed:
in-context compositional generalization on fictional words is much weaker than
that on commonly used ones; it remains critical that the in-context examples
cover the required linguistic structures, even though the backbone model has
been pre-trained on a large corpus. We hope our analysis will facilitate the
understanding and use of the in-context learning paradigm.
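As an illustration of how these findings might be operationalized, below is a minimal sketch (not the paper's implementation) of an example-selection heuristic that favors in-context examples which are structurally similar to the test case, mutually diverse, and individually simple. The function names, the token-overlap similarity, and the length-based complexity proxy are hypothetical placeholders, not CoFe's actual metrics.

```python
# A minimal, hypothetical sketch (not the paper's implementation) of turning the
# three factors -- similarity, diversity, complexity -- into a selection heuristic.
from typing import List


def structural_similarity(example: str, reference: str) -> float:
    """Placeholder similarity: token overlap with the reference, normalized by the
    reference vocabulary. A real implementation might compare parse templates."""
    ref_tokens = set(reference.split())
    if not ref_tokens:
        return 0.0
    return len(set(example.split()) & ref_tokens) / len(ref_tokens)


def complexity(example: str) -> float:
    """Placeholder complexity: sentence length as a crude proxy for structural depth."""
    return float(len(example.split()))


def select_in_context_examples(pool: List[str], test_case: str, k: int = 4) -> List[str]:
    """Greedily pick k examples that are similar to the test case and individually
    simple, while skipping candidates too close to examples already selected."""
    ranked = sorted(
        pool,
        key=lambda ex: structural_similarity(ex, test_case) - 0.01 * complexity(ex),
        reverse=True,
    )
    selected: List[str] = []
    for candidate in ranked:
        if len(selected) == k:
            break
        # Diversity check: reject near-duplicates of already-selected examples.
        if all(structural_similarity(candidate, chosen) < 0.8 for chosen in selected):
            selected.append(candidate)
    return selected


if __name__ == "__main__":
    pool = [
        "jump twice after walk",
        "walk and jump",
        "run thrice after look around left twice",
        "jump after walk",
    ]
    print(select_in_context_examples(pool, test_case="walk twice after jump", k=2))
```

In practice, the similarity and complexity measures would likely be defined over linguistic structure (e.g., parse templates) rather than raw token overlap; the sketch only shows how the three factors could interact in a greedy selection loop.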
Related papers
- Toward Understanding In-context vs. In-weight Learning [50.24035812301655]
We identify simplified distributional properties that give rise to the emergence and disappearance of in-context learning.
We then extend the study to a full large language model, showing how fine-tuning on various collections of natural language prompts can elicit similar in-context and in-weight learning behaviour.
arXiv Detail & Related papers (2024-10-30T14:09:00Z)
- Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation [59.138470433237615]
We introduce statistical metrics that quantify both the linguistic and visual skew of a dataset for relational learning.
We show that systematically controlled metrics are strongly predictive of generalization performance.
This work points to enhancing data diversity and balance as an important direction, in addition to scaling up the absolute dataset size.
arXiv Detail & Related papers (2024-03-25T03:18:39Z)
- Towards Understanding the Relationship between In-context Learning and Compositional Generalization [7.843029855730508]
We train a causal Transformer in a setting that renders ordinary learning very difficult.
The model can solve the task, however, by utilizing earlier examples to generalize to later ones.
In evaluations on the SCAN, COGS, and GeoQuery datasets, models trained in this manner indeed show improved compositional generalization.
arXiv Detail & Related papers (2024-03-18T14:45:52Z)
- On Evaluating Multilingual Compositional Generalization with Translated Datasets [34.51457321680049]
We show that compositional generalization abilities differ across languages.
We craft a faithful rule-based translation of the MCWQ dataset from English to Chinese and Japanese.
Even with the resulting robust benchmark, which we call MCWQ-R, we show that the distribution of compositions still suffers due to linguistic divergences.
arXiv Detail & Related papers (2023-06-20T10:03:57Z)
- Vector-based Representation is the Key: A Study on Disentanglement and Compositional Generalization [77.57425909520167]
We show that it is possible to achieve both good concept recognition and novel concept composition.
We propose a method to reform the scalar-based disentanglement works to be vector-based to increase both capabilities.
arXiv Detail & Related papers (2023-05-29T13:05:15Z)
- Defending Compositionality in Emergent Languages [3.04585143845864]
We argue that some conclusions are too strong and/or incomplete.
In the context of a two-agent communication game, we show that compositionality indeed seems essential for successful generalization when the evaluation is done on a proper dataset.
arXiv Detail & Related papers (2022-06-09T20:13:46Z)
- Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus.
We show structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z)
- Meta-Learning to Compositionally Generalize [34.656819307701156]
We implement a meta-learning augmented version of supervised learning.
We construct pairs of tasks for meta-learning by sub-sampling existing training data.
Experimental results on the COGS and SCAN datasets show that our similarity-driven meta-learning can improve generalization performance.
arXiv Detail & Related papers (2021-06-08T11:21:48Z)
- A Benchmark for Systematic Generalization in Grounded Language Understanding [61.432407738682635]
Humans easily interpret expressions that describe unfamiliar situations composed from familiar parts.
Modern neural networks, by contrast, struggle to interpret novel compositions.
We introduce a new benchmark, gSCAN, for evaluating compositional generalization in situated language understanding.
arXiv Detail & Related papers (2020-03-11T08:40:15Z)