What makes Models Compositional? A Theoretical View: With Supplement
- URL: http://arxiv.org/abs/2405.02350v1
- Date: Thu, 2 May 2024 20:10:27 GMT
- Title: What makes Models Compositional? A Theoretical View: With Supplement
- Authors: Parikshit Ram, Tim Klinger, Alexander G. Gray,
- Abstract summary: We propose a general neuro-symbolic definition of compositional functions and their compositional complexity.
We show how various existing general and special purpose sequence processing models fit this definition and use it to analyze their compositional complexity.
- Score: 60.284698521569936
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compositionality is thought to be a key component of language, and various compositional benchmarks have been developed to empirically probe the compositional generalization of existing sequence processing models. These benchmarks often highlight failures of existing models, but it is not clear why these models fail in this way. In this paper, we seek to theoretically understand the role the compositional structure of the models plays in these failures and how this structure relates to their expressivity and sample complexity. We propose a general neuro-symbolic definition of compositional functions and their compositional complexity. We then show how various existing general and special purpose sequence processing models (such as recurrent, convolution and attention-based ones) fit this definition and use it to analyze their compositional complexity. Finally, we provide theoretical guarantees for the expressivity and systematic generalization of compositional models that explicitly depend on our proposed definition and highlighting factors which drive poor empirical performance.
Related papers
- When does compositional structure yield compositional generalization? A kernel theory [0.0]
We present a theory of compositional generalization in kernel models with fixed representations.
We identify novel failure modes in compositional generalization that arise from biases in the training data.
This work provides a theoretical perspective on how statistical structure in the training data can affect compositional generalization.
arXiv Detail & Related papers (2024-05-26T00:50:11Z) - SLOG: A Structural Generalization Benchmark for Semantic Parsing [68.19511282584304]
The goal of compositional generalization benchmarks is to evaluate how well models generalize to new complex linguistic expressions.
Existing benchmarks often focus on lexical generalization, the interpretation of novel lexical items in syntactic structures familiar from training, are often underrepresented.
We introduce SLOG, a semantic parsing dataset that extends COGS with 17 structural generalization cases.
arXiv Detail & Related papers (2023-10-23T15:39:09Z) - Dynamic MOdularized Reasoning for Compositional Structured Explanation
Generation [29.16040150962427]
We propose a dynamic modularized reasoning model, MORSE, to improve compositional generalization of neural models.
MORSE factorizes the inference process into a combination of modules, where each module represents a functional unit.
We conduct experiments for increasing lengths and shapes of reasoning trees on two benchmarks to test MORSE's compositional generalization abilities.
arXiv Detail & Related papers (2023-09-14T11:40:30Z) - Compositional Generalisation with Structured Reordering and Fertility
Layers [121.37328648951993]
Seq2seq models have been shown to struggle with compositional generalisation.
We present a flexible end-to-end differentiable neural model that composes two structural operations.
arXiv Detail & Related papers (2022-10-06T19:51:31Z) - Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus.
We show structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z) - Disentangled Sequence to Sequence Learning for Compositional
Generalization [62.954842223732435]
We propose an extension to sequence-to-sequence models which allows us to learn disentangled representations by adaptively re-encoding the source input.
Experimental results on semantic parsing and machine translation empirically show that our proposal yields more disentangled representations and better generalization.
arXiv Detail & Related papers (2021-10-09T22:27:19Z) - Compositional Generalization via Semantic Tagging [81.24269148865555]
We propose a new decoding framework that preserves the expressivity and generality of sequence-to-sequence models.
We show that the proposed approach consistently improves compositional generalization across model architectures, domains, and semantic formalisms.
arXiv Detail & Related papers (2020-10-22T15:55:15Z) - A Study of Compositional Generalization in Neural Models [22.66002315559978]
We introduce ConceptWorld, which enables the generation of images from compositional and relational concepts.
We perform experiments to test the ability of standard neural networks to generalize on relations with compositional arguments.
For simple problems, all models generalize well to close concepts but struggle with longer compositional chains.
arXiv Detail & Related papers (2020-06-16T18:29:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.