Coordination Among Neural Modules Through a Shared Global Workspace
- URL: http://arxiv.org/abs/2103.01197v1
- Date: Mon, 1 Mar 2021 18:43:48 GMT
- Title: Coordination Among Neural Modules Through a Shared Global Workspace
- Authors: Anirudh Goyal, Aniket Didolkar, Alex Lamb, Kartikeya Badola, Nan
Rosemary Ke, Nasim Rahaman, Jonathan Binas, Charles Blundell, Michael Mozer,
Yoshua Bengio
- Abstract summary: In cognitive science, a global workspace architecture has been proposed in which functionally specialized components share information.
We show that capacity limitations have a rational basis in that they encourage specialization and compositionality.
- Score: 78.08062292790109
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has seen a movement away from representing examples with a
monolithic hidden state towards a richly structured state. For example,
Transformers segment by position, and object-centric architectures decompose
images into entities. In all these architectures, interactions between
different elements are modeled via pairwise interactions: Transformers make use
of self-attention to incorporate information from other positions;
object-centric architectures make use of graph neural networks to model
interactions among entities. However, pairwise interactions may not achieve
global coordination or a coherent, integrated representation that can be used
for downstream tasks. In cognitive science, a global workspace architecture has
been proposed in which functionally specialized components share information
through a common, bandwidth-limited communication channel. We explore the use
of such a communication channel in the context of deep learning for modeling
the structure of complex environments. The proposed method includes a shared
workspace through which communication among different specialist modules takes
place but due to limits on the communication bandwidth, specialist modules must
compete for access. We show that capacity limitations have a rational basis in
that (1) they encourage specialization and compositionality and (2) they
facilitate the synchronization of otherwise independent specialists.
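As a rough, hedged illustration of the mechanism the abstract describes (specialist modules competing for write access to a small, bandwidth-limited shared workspace, which then broadcasts back to every specialist), the sketch below uses attention with a top-k write competition. It is not the authors' implementation: the class and parameter names (`SharedWorkspace`, `n_slots`, `top_k`) are illustrative assumptions, and the workspace is modeled here as learned slots rather than a recurrent state carried across steps.

```python
# Minimal sketch (assumed names, not the paper's code) of a shared, bandwidth-limited
# workspace: slots attend over specialists but only the top-k specialists may write,
# then every specialist reads the updated workspace back via attention.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedWorkspace(nn.Module):
    def __init__(self, d_model: int, n_slots: int = 8, top_k: int = 2):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(n_slots, d_model))  # workspace memory slots
        self.top_k = top_k
        # write step: workspace slots query, specialists provide keys/values
        self.q_w, self.k_w, self.v_w = (nn.Linear(d_model, d_model) for _ in range(3))
        # broadcast step: specialists query the updated workspace
        self.q_b, self.k_b, self.v_b = (nn.Linear(d_model, d_model) for _ in range(3))

    def forward(self, specialists: torch.Tensor) -> torch.Tensor:
        # specialists: (batch, n_specialists, d_model)
        B, N, D = specialists.shape
        slots = self.slots.unsqueeze(0).expand(B, -1, -1)

        # write: slots attend over specialists; keep only the top-k writers per slot
        scores = self.q_w(slots) @ self.k_w(specialists).transpose(1, 2) / D ** 0.5
        topk_vals, topk_idx = scores.topk(self.top_k, dim=-1)
        masked = torch.full_like(scores, float("-inf")).scatter(-1, topk_idx, topk_vals)
        write_attn = F.softmax(masked, dim=-1)  # bandwidth limit: competition for access
        slots = slots + write_attn @ self.v_w(specialists)

        # broadcast: every specialist reads from the updated workspace
        scores = self.q_b(specialists) @ self.k_b(slots).transpose(1, 2) / D ** 0.5
        read_attn = F.softmax(scores, dim=-1)
        return specialists + read_attn @ self.v_b(slots)
```

The top-k mask is the part that plays the role of the limited communication bandwidth: specialists whose keys lose the attention competition contribute nothing to the workspace at that step, which is what the abstract argues pushes modules toward specialization.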
Related papers
- DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout.
DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder.
Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z) - LoginMEA: Local-to-Global Interaction Network for Multi-modal Entity Alignment [18.365849722239865]
Multi-modal entity alignment (MMEA) aims to identify equivalent entities between two multi-modal knowledge graphs.
We propose a novel local-to-global interaction network for MMEA, termed LoginMEA.
arXiv Detail & Related papers (2024-07-29T01:06:45Z) - REACT: Recognize Every Action Everywhere All At Once [8.10024991952397]
Group Activity Recognition (GAR) is a fundamental problem in computer vision, with diverse applications in sports analysis, surveillance, and social scene understanding.
We present REACT, an architecture inspired by the transformer encoder-decoder model.
Our method outperforms state-of-the-art GAR approaches in extensive experiments, demonstrating superior accuracy in recognizing and understanding group activities.
arXiv Detail & Related papers (2023-11-27T20:48:54Z) - Collective Relational Inference for learning heterogeneous interactions [8.215734914005845]
We propose a novel probabilistic method for relational inference, which possesses two distinctive characteristics compared to existing methods.
We evaluate the proposed methodology across several benchmark datasets and demonstrate that it outperforms existing methods in accurately inferring interaction types.
Overall the proposed model is data-efficient and generalizable to large systems when trained on smaller ones.
arXiv Detail & Related papers (2023-04-30T19:45:04Z) - Global-and-Local Collaborative Learning for Co-Salient Object Detection [162.62642867056385]
The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images.
We propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) module and a local correspondence modeling (LCM) module.
The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model trained on a small dataset (about 3k images) still outperforms eleven state-of-the-art competitors trained on much larger datasets (about 8k-200k images).
arXiv Detail & Related papers (2022-04-19T14:32:41Z) - Bidirectional Graph Reasoning Network for Panoptic Segmentation [126.06251745669107]
We introduce a Bidirectional Graph Reasoning Network (BGRNet) to mine the intra-modular and inter-modular relations within and between foreground things and background stuff classes.
BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level.
arXiv Detail & Related papers (2020-04-14T02:32:10Z) - Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z) - Learning Structured Communication for Multi-agent Reinforcement Learning [104.64584573546524]
This work explores the large-scale multi-agent communication mechanism under a multi-agent reinforcement learning (MARL) setting.
We propose a novel framework, termed Learning Structured Communication (LSC), that uses a more flexible and efficient communication topology.
arXiv Detail & Related papers (2020-02-11T07:19:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.