Abstractors and relational cross-attention: An inductive bias for explicit relational reasoning in Transformers
- URL: http://arxiv.org/abs/2304.00195v4
- Date: Fri, 12 Apr 2024 22:49:28 GMT
- Title: Abstractors and relational cross-attention: An inductive bias for explicit relational reasoning in Transformers
- Authors: Awni Altabaa, Taylor Webb, Jonathan Cohen, John Lafferty
- Abstract summary: An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor.
At the core of the Abstractor is a variant of attention called relational cross-attention.
The approach is motivated by an architectural inductive bias for relational learning that disentangles relational information from object-level features.
- Score: 4.562331048595688
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor. At the core of the Abstractor is a variant of attention called relational cross-attention. The approach is motivated by an architectural inductive bias for relational learning that disentangles relational information from object-level features. This enables explicit relational reasoning, supporting abstraction and generalization from limited data. The Abstractor is first evaluated on simple discriminative relational tasks and compared to existing relational architectures. Next, the Abstractor is evaluated on purely relational sequence-to-sequence tasks, where dramatic improvements are seen in sample efficiency compared to standard Transformers. Finally, Abstractors are evaluated on a collection of tasks based on mathematical problem solving, where consistent improvements in performance and sample efficiency are observed.
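As a concrete illustration, here is a minimal NumPy sketch of relational cross-attention as the abstract describes it: attention scores are computed between the input objects themselves, while the values are learned, input-independent symbol vectors, so the output encodes how objects relate rather than what they are. The function and parameter names are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def relational_cross_attention(X, S, W_q, W_k):
    """Relational cross-attention (sketch).

    X        : (n, d) object-level input features
    S        : (n, d) learned, input-independent symbol vectors
    W_q, W_k : (d, d) learned query/key projections

    The relation matrix is computed from the objects, but the values
    are the symbols S, so the output is disentangled from the
    object-level features in X.
    """
    Q, K = X @ W_q, X @ W_k
    R = softmax(Q @ K.T / np.sqrt(X.shape[-1]))  # (n, n) pairwise relations
    return R @ S  # purely relational representation

# Toy usage with random parameters (these would be trained in practice).
rng = np.random.default_rng(0)
n, d = 5, 8
X = rng.normal(size=(n, d))
S = rng.normal(size=(n, d))
A = relational_cross_attention(X, S, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
print(A.shape)  # (5, 8)
```

Because S does not depend on X, two input sets with the same pairwise relations map to similar outputs even when their object-level features differ, which is the disentanglement the abstract credits for improved abstraction and sample efficiency.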
Related papers
- RESOLVE: Relational Reasoning with Symbolic and Object-Level Features Using Vector Symbolic Processing [1.3049516752695616]
We propose RESOLVE, a neuro-vector symbolic architecture that combines object-level features with relational representations in high-dimensional spaces.
By leveraging this design, the model achieves both low compute latency and memory efficiency.
arXiv Detail & Related papers (2024-11-13T02:17:03Z)
- Strengthening Structural Inductive Biases by Pre-training to Perform Syntactic Transformations [75.14793516745374]
We propose to strengthen the structural inductive bias of a Transformer by intermediate pre-training.
Our experiments confirm that this helps with few-shot learning of syntactic tasks such as chunking.
Our analysis shows that the intermediate pre-training leads to attention heads that keep track of which syntactic transformation needs to be applied to which token.
arXiv Detail & Related papers (2024-07-05T14:29:44Z)
- Slot Abstractors: Toward Scalable Abstract Visual Reasoning [5.262577780347204]
We propose Slot Abstractors, an approach to abstract visual reasoning that can be scaled to problems involving a large number of objects and multiple relations among them.
The approach displays state-of-the-art performance across four abstract visual reasoning tasks, as well as an abstract reasoning task involving real-world images.
arXiv Detail & Related papers (2024-03-06T04:49:02Z)
- On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores, which we name CoRelNet; a minimal sketch of the idea follows below.
We find that simple architectural choices can outperform existing models in out-of-distribution generalization.
arXiv Detail & Related papers (2022-06-09T16:24:01Z)
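Below is a minimal sketch of what "similarity-distribution scores" could look like: a row-wise softmax over the pairwise similarity matrix of object embeddings, so each object gets a distribution over its similarities to the others. This is a hedged reading of the one-line summary above, not CoRelNet's actual implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def similarity_distribution(X):
    """Row-wise softmax over pairwise inner products: each row is a
    distribution of similarity scores for one object."""
    return softmax(X @ X.T, axis=-1)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 16))    # four object embeddings
A = similarity_distribution(X)  # (4, 4); each row sums to 1
print(A.round(2))
```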
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction [60.80849503639896]
Unsupervised relation extraction aims to extract the relationship between entities from natural language sentences without prior information on relational scope or distribution.
We propose a novel contrastive learning framework named HiURE, which derives hierarchical signals from the relational feature space using cross-hierarchy attention.
Experimental results on two public datasets demonstrate the effectiveness and robustness of HiURE on unsupervised relation extraction compared with state-of-the-art models.
arXiv Detail & Related papers (2022-05-04T17:56:48Z)
- Dynamic Language Binding in Relational Visual Reasoning [67.85579756590478]
We present Language-binding Object Graph Network, the first neural reasoning method with dynamic relational structures across both visual and textual domains.
Our method outperforms other approaches on sophisticated question-answering tasks involving multiple object relations.
arXiv Detail & Related papers (2020-04-30T06:26:20Z)
- SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction [60.08464995629325]
Open relation extraction is the task of extracting open-domain relation facts from natural language sentences.
We propose a self-supervised framework named SelfORE, which exploits weak, self-supervised signals.
Experimental results on three datasets show the effectiveness and robustness of SelfORE.
arXiv Detail & Related papers (2020-04-06T07:23:17Z)
- Better Set Representations For Relational Reasoning [30.398348643632445]
Relational reasoning operates on a set of entities, as opposed to standard vector representations.
We propose a simple and general network module called a Set Refiner Network (SRN).
arXiv Detail & Related papers (2020-03-09T23:07:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.