Coordination Among Neural Modules Through a Shared Global Workspace
- URL: http://arxiv.org/abs/2103.01197v1
- Date: Mon, 1 Mar 2021 18:43:48 GMT
- Title: Coordination Among Neural Modules Through a Shared Global Workspace
- Authors: Anirudh Goyal, Aniket Didolkar, Alex Lamb, Kartikeya Badola, Nan
Rosemary Ke, Nasim Rahaman, Jonathan Binas, Charles Blundell, Michael Mozer,
Yoshua Bengio
- Abstract summary: In cognitive science, a global workspace architecture has been proposed in which functionally specialized components share information.
We show that capacity limitations have a rational basis in that they encourage specialization and compositionality.
- Score: 78.08062292790109
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has seen a movement away from representing examples with a
monolithic hidden state towards a richly structured state. For example,
Transformers segment by position, and object-centric architectures decompose
images into entities. In all these architectures, interactions between
different elements are modeled via pairwise interactions: Transformers make use
of self-attention to incorporate information from other positions;
object-centric architectures make use of graph neural networks to model
interactions among entities. However, pairwise interactions may not achieve
global coordination or a coherent, integrated representation that can be used
for downstream tasks. In cognitive science, a global workspace architecture has
been proposed in which functionally specialized components share information
through a common, bandwidth-limited communication channel. We explore the use
of such a communication channel in the context of deep learning for modeling
the structure of complex environments. The proposed method includes a shared
workspace through which communication among different specialist modules takes
place but due to limits on the communication bandwidth, specialist modules must
compete for access. We show that capacity limitations have a rational basis in
that (1) they encourage specialization and compositionality and (2) they
facilitate the synchronization of otherwise independent specialists.
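As a rough, hedged illustration of the mechanism the abstract describes (specialist modules competing for write access to a small, bandwidth-limited shared workspace, which then broadcasts back to every specialist), the sketch below uses attention with a top-k write competition. It is not the authors' implementation: the class and parameter names (`SharedWorkspace`, `n_slots`, `top_k`) are illustrative assumptions, and the workspace is modeled here as learned slots rather than a recurrent state carried across steps.

```python
# Minimal sketch (assumed names, not the paper's code) of a shared, bandwidth-limited
# workspace: slots attend over specialists but only the top-k specialists may write,
# then every specialist reads the updated workspace back via attention.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedWorkspace(nn.Module):
    def __init__(self, d_model: int, n_slots: int = 8, top_k: int = 2):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(n_slots, d_model))  # workspace memory slots
        self.top_k = top_k
        # write step: workspace slots query, specialists provide keys/values
        self.q_w, self.k_w, self.v_w = (nn.Linear(d_model, d_model) for _ in range(3))
        # broadcast step: specialists query the updated workspace
        self.q_b, self.k_b, self.v_b = (nn.Linear(d_model, d_model) for _ in range(3))

    def forward(self, specialists: torch.Tensor) -> torch.Tensor:
        # specialists: (batch, n_specialists, d_model)
        B, N, D = specialists.shape
        slots = self.slots.unsqueeze(0).expand(B, -1, -1)

        # write: slots attend over specialists; keep only the top-k writers per slot
        scores = self.q_w(slots) @ self.k_w(specialists).transpose(1, 2) / D ** 0.5
        topk_vals, topk_idx = scores.topk(self.top_k, dim=-1)
        masked = torch.full_like(scores, float("-inf")).scatter(-1, topk_idx, topk_vals)
        write_attn = F.softmax(masked, dim=-1)  # bandwidth limit: competition for access
        slots = slots + write_attn @ self.v_w(specialists)

        # broadcast: every specialist reads from the updated workspace
        scores = self.q_b(specialists) @ self.k_b(slots).transpose(1, 2) / D ** 0.5
        read_attn = F.softmax(scores, dim=-1)
        return specialists + read_attn @ self.v_b(slots)
```

The top-k mask is the part that plays the role of the limited communication bandwidth: specialists whose keys lose the attention competition contribute nothing to the workspace at that step, which is what the abstract argues pushes modules toward specialization.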
Related papers
- DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout.
DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder.
Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z) - LoginMEA: Local-to-Global Interaction Network for Multi-modal Entity Alignment [18.365849722239865]
Multi-modal entity alignment (MMEA) aims to identify equivalent entities between two multi-modal knowledge graphs.
We propose a novel local-to-global interaction network for MMEA, termed LoginMEA.
arXiv Detail & Related papers (2024-07-29T01:06:45Z) - REACT: Recognize Every Action Everywhere All At Once [8.10024991952397]
Group Activity Recognition (GAR) is a fundamental problem in computer vision, with diverse applications in sports analysis, surveillance, and social scene understanding.
We present REACT, an architecture inspired by the transformer encoder-decoder model.
Our method outperforms state-of-the-art GAR approaches in extensive experiments, demonstrating superior accuracy in recognizing and understanding group activities.
arXiv Detail & Related papers (2023-11-27T20:48:54Z) - Collective Relational Inference for learning heterogeneous interactions [8.215734914005845]
We propose a novel probabilistic method for relational inference, which possesses two distinctive characteristics compared to existing methods.
We evaluate the proposed methodology across several benchmark datasets and demonstrate that it outperforms existing methods in accurately inferring interaction types.
Overall the proposed model is data-efficient and generalizable to large systems when trained on smaller ones.
arXiv Detail & Related papers (2023-04-30T19:45:04Z) - Global-and-Local Collaborative Learning for Co-Salient Object Detection [162.62642867056385]
The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images.
We propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) module and a local correspondence modeling (LCM) module.
The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model trained on a small dataset (about 3k images) still outperforms eleven state-of-the-art competitors trained on much larger datasets (about 8k-200k images).
arXiv Detail & Related papers (2022-04-19T14:32:41Z) - Bidirectional Graph Reasoning Network for Panoptic Segmentation [126.06251745669107]
We introduce a Bidirectional Graph Reasoning Network (BGRNet) to mine the intra-modular and inter-modular relations within and between foreground things and background stuff classes.
BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level.
arXiv Detail & Related papers (2020-04-14T02:32:10Z) - Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z) - Learning Structured Communication for Multi-agent Reinforcement Learning [104.64584573546524]
This work explores the large-scale multi-agent communication mechanism under a multi-agent reinforcement learning (MARL) setting.
We propose a novel framework, termed Learning Structured Communication (LSC), that uses a more flexible and efficient communication topology.
arXiv Detail & Related papers (2020-02-11T07:19:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.