Composition, Attention, or Both?
- URL: http://arxiv.org/abs/2210.12958v3
- Date: Thu, 11 May 2023 02:39:00 GMT
- Title: Composition, Attention, or Both?
- Authors: Ryo Yoshida and Yohei Oseki
- Abstract summary: We propose a novel architecture called Composition Attention Grammars (CAGs).
We investigate whether the composition function and the self-attention mechanism can both induce human-like syntactic generalization.
- Score: 8.22379888383833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel architecture called Composition Attention
Grammars (CAGs) that recursively compose subtrees into a single vector
representation with a composition function, and selectively attend to previous
structural information with a self-attention mechanism. We investigate whether
these components -- the composition function and the self-attention mechanism
-- can both induce human-like syntactic generalization. Specifically, we train
language models (LMs) with and without these two components with the model
sizes carefully controlled, and evaluate their syntactic generalization
performance against six test circuits on the SyntaxGym benchmark. The results
demonstrated that the composition function and the self-attention mechanism
both play an important role in making LMs more human-like, and closer inspection
of individual linguistic phenomena implied that the composition function allowed
syntactic features, but not semantic features, to percolate into subtree
representations.
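To make the two components concrete, here is a minimal sketch, not the authors' implementation, of how a composition function and a self-attention mechanism over previously composed structural states might be combined. The class name, layer sizes, and the choice of a bidirectional LSTM as the composition function are illustrative assumptions.

```python
# Minimal sketch (assumptions noted below): compose child vectors into one
# subtree vector, then attend over a history of structural states.
import torch
import torch.nn as nn


class CompositionAttentionSketch(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        # Composition function: a bidirectional LSTM reduces the child vectors
        # of a completed constituent into a single subtree representation.
        # (Assumption: layer sizes 256/4 are illustrative, not from the paper.)
        self.compose = nn.LSTM(dim, dim // 2, bidirectional=True, batch_first=True)
        # Self-attention: selectively attends to previous structural states.
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def reduce(self, children: torch.Tensor) -> torch.Tensor:
        # children: (1, n_children, dim) -> (1, 1, dim) subtree vector
        _, (h, _) = self.compose(children)
        return torch.cat([h[0], h[1]], dim=-1).unsqueeze(1)

    def attend(self, history: torch.Tensor) -> torch.Tensor:
        # history: (1, n_states, dim); the newest state queries all earlier ones
        query = history[:, -1:, :]
        out, _ = self.attn(query, history, history)
        return out


# Usage: compose three child vectors into one subtree, append it to a history
# of structural states, and attend over that history.
model = CompositionAttentionSketch()
subtree = model.reduce(torch.randn(1, 3, 256))             # (1, 1, 256)
history = torch.cat([torch.randn(1, 4, 256), subtree], 1)  # (1, 5, 256)
context = model.attend(history)                            # (1, 1, 256)
```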
Related papers
- What makes Models Compositional? A Theoretical View: With Supplement [60.284698521569936]
We propose a general neuro-symbolic definition of compositional functions and their compositional complexity.
We show how various existing general and special purpose sequence processing models fit this definition and use it to analyze their compositional complexity.
arXiv Detail & Related papers (2024-05-02T20:10:27Z)
- Compositional learning of functions in humans and machines [23.583544271543033]
We develop a function learning paradigm to explore the capacity of humans and neural network models in learning and reasoning with compositional functions.
Our findings indicate that humans can make zero-shot generalizations on novel visual function compositions across interaction conditions.
A comparison with a neural network model on the same task reveals that, through the meta-learning for compositionality (MLC) approach, a standard sequence-to-sequence Transformer can mimic human generalization patterns in composing functions.
arXiv Detail & Related papers (2024-03-18T19:22:53Z)
- Im-Promptu: In-Context Composition from Image Prompts [10.079743487034762]
We investigate whether analogical reasoning can enable in-context composition over composable elements of visual stimuli.
We use Im-Promptu to train agents with different levels of compositionality, including vector representations, patch representations, and object slots.
Our experiments reveal tradeoffs between extrapolation abilities and the degree of compositionality, with non-compositional representations extending learned composition rules to unseen domains but performing poorly on tasks.
arXiv Detail & Related papers (2023-05-26T21:10:11Z)
- Compositional Generalization in Grounded Language Learning via Induced Model Sparsity [81.38804205212425]
We consider simple language-conditioned navigation problems in a grid world environment with disentangled observations.
We design an agent that encourages sparse correlations between words in the instruction and attributes of objects, composing them together to find the goal.
Our agent maintains a high level of performance on goals containing novel combinations of properties even when learning from a handful of demonstrations.
arXiv Detail & Related papers (2022-07-06T08:46:27Z)
- Compositional Generalization Requires Compositional Parsers [69.77216620997305]
We compare sequence-to-sequence models and models guided by compositional principles on the recent COGS corpus.
We show structural generalization is a key measure of compositional generalization and requires models that are aware of complex structure.
arXiv Detail & Related papers (2022-02-24T07:36:35Z)
- Zero-Shot Generalization using Intrinsically Motivated Compositional Emergent Protocols [0.0]
We show how compositionality can enable agents to not only interact with unseen objects but also transfer skills from one task to another in a zero-shot setting.
arXiv Detail & Related papers (2021-05-11T14:20:26Z)
- Decomposing lexical and compositional syntax and semantics with deep language models [82.81964713263483]
The activations of language transformers like GPT2 have been shown to linearly map onto brain activity during speech comprehension.
Here, we propose a taxonomy to factorize the high-dimensional activations of language models into four classes: lexical, compositional, syntactic, and semantic representations.
The results show that compositional representations recruit a more widespread cortical network than lexical ones, encompassing the bilateral temporal, parietal and prefrontal cortices.
arXiv Detail & Related papers (2021-03-02T10:24:05Z)
- Semantic Disentangling Generalized Zero-Shot Learning [50.259058462272435]
Generalized Zero-Shot Learning (GZSL) aims to recognize images from both seen and unseen categories.
In this paper, we propose a novel feature disentangling approach based on an encoder-decoder architecture.
The proposed model aims to distill quality semantic-consistent representations that capture intrinsic features of seen images.
arXiv Detail & Related papers (2021-01-20T05:46:21Z)
- Compositional Generalization via Semantic Tagging [81.24269148865555]
We propose a new decoding framework that preserves the expressivity and generality of sequence-to-sequence models.
We show that the proposed approach consistently improves compositional generalization across model architectures, domains, and semantic formalisms.
arXiv Detail & Related papers (2020-10-22T15:55:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.