Related papers: Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules

URL: http://arxiv.org/abs/2006.16981v3
Date: Sun, 15 Nov 2020 18:34:53 GMT
Title: Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules
Authors: Sarthak Mittal, Alex Lamb, Anirudh Goyal, Vikram Voleti, Murray Shanahan, Guillaume Lajoie, Michael Mozer, Yoshua Bengio
Abstract summary: Robust perception relies on both bottom-up and top-down signals. We explore deep recurrent neural net architectures in which bottom-up and top-down signals are dynamically combined using attention. We demonstrate on a variety of benchmarks in language modeling, sequential image classification, video prediction and reinforcement learning that the bidirectional information flow can improve results over strong baselines.
Score: 81.1967157385085
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Robust perception relies on both bottom-up and top-down signals. Bottom-up signals consist of what's directly observed through sensation. Top-down signals consist of beliefs and expectations based on past experience and short-term memory, such as how the phrase 'peanut butter and ...' will be completed. The optimal combination of bottom-up and top-down information remains an open question, but the manner of combination must be dynamic and both context and task dependent. To effectively utilize the wealth of potential top-down information available, and to prevent the cacophony of intermixed signals in a bidirectional architecture, mechanisms are needed to restrict information flow. We explore deep recurrent neural net architectures in which bottom-up and top-down signals are dynamically combined using attention. Modularity of the architecture further restricts the sharing and communication of information. Together, attention and modularity direct information flow, which leads to reliable performance improvements in perceptual and language tasks, and in particular improves robustness to distractions and noisy data. We demonstrate on a variety of benchmarks in language modeling, sequential image classification, video prediction and reinforcement learning that the bidirectional information flow can improve results over strong baselines.

Related papers

LLHA-Net: A Hierarchical Attention Network for Two-View Correspondence Learning [33.76961965760301]
We propose a novel method called Layer-by-Layer Hierarchical Attention Network. It enhances the precision of feature point matching in computer vision by addressing the issue of outliers. Our method incorporates stage fusion, hierarchical extraction, and an attention mechanism to improve the network's representation capability.
arXiv Detail & Related papers (2025-12-31T04:25:53Z)
QoSDiff: An Implicit Topological Embedding Learning Framework Leveraging Denoising Diffusion and Adversarial Attention for Robust QoS Prediction [5.632045399777709]
This paper introduces QoSDiff, a novel embedding learning framework that bypasses the prerequisite of explicit graph construction.
arXiv Detail & Related papers (2025-12-04T09:17:26Z)
Sensory robustness through top-down feedback and neural stochasticity in recurrent vision models [0.9188951403098383]
We trained convolutional recurrent neural networks (ConvRNN) on image classification in the presence or absence of top-down feedback projections. We found that ConvRNNs with top-down feedback exhibited a remarkable speed-accuracy trade-off and robustness to noise perturbations and adversarial attacks.
arXiv Detail & Related papers (2025-08-09T22:51:50Z)
Semantic Item Graph Enhancement for Multimodal Recommendation [49.66272783945571]
Multimodal recommendation systems have attracted increasing attention for their improved performance by leveraging items' multimodal information. Prior methods often build modality-specific item-item semantic graphs from raw modality features. These semantic graphs suffer from semantic deficiencies, including insufficient modeling of collaborative signals among items.
arXiv Detail & Related papers (2025-08-08T09:20:50Z)
Mitigating Attention Hacking in Preference-Based Reward Modeling via Interaction Distillation [62.14692332209628]
"Interaction Distillation" is a novel training framework for more adequate preference modeling through attention-level optimization. It provides more stable and generalizable reward signals compared to state-of-the-art RM optimization methods.
arXiv Detail & Related papers (2025-08-04T17:06:23Z)
Reversible Decoupling Network for Single Image Reflection Removal [15.763420129991255]
High-level semantic clues tend to be compressed or discarded during layer-by-layer propagation. We propose a novel architecture called Reversible Decoupling Network (RDNet). RDNet employs a reversible encoder to secure valuable information while flexibly decoupling transmission- and reflection-relevant features during the forward pass.
arXiv Detail & Related papers (2024-10-10T15:58:27Z)
Connectivity-Inspired Network for Context-Aware Recognition [1.049712834719005]
We focus on the effect of incorporating circuit motifs found in biological brains to address visual recognition. Our convolutional architecture is inspired by the connectivity of human cortical and subcortical streams. We present a new plug-and-play module to model context awareness.
arXiv Detail & Related papers (2024-09-06T15:42:10Z)
Self-Attention-Based Contextual Modulation Improves Neural System Identification [2.784365807133169]
Cortical neurons in the primary visual cortex are sensitive to contextual information mediated by horizontal and feedback connections. CNNs integrate global contextual information to model contextual modulation via two mechanisms: successive convolutions and a fully connected readout layer. We find that self-attention can improve neural response predictions over parameter-matched CNNs in two key metrics: tuning curve correlation and peak tuning.
arXiv Detail & Related papers (2024-06-12T03:21:06Z)
Self-Contrastive Graph Diffusion Network [1.14219428942199]
We propose a novel framework called the Self-Contrastive Graph Diffusion Network (SCGDN). Our framework consists of two main components: the Attentional Module (AttM) and the Diffusion Module (DiFM). Unlike existing methodologies, SCGDN is an augmentation-free approach that avoids "sampling bias" and semantic drift.
arXiv Detail & Related papers (2023-07-27T04:00:23Z)
Multi-Agent Feedback Enabled Neural Networks for Intelligent Communications [28.723523146324002]
In this paper, a novel multi-agent feedback enabled neural network (MAFENN) framework is proposed. The MAFENN framework is theoretically formulated into a three-player Feedback Stackelberg game, and the game is proved to converge to the Feedback Stackelberg equilibrium. To verify the MAFENN framework's feasibility in wireless communications, a multi-agent MAFENN based equalizer (MAFENN-E) is developed.
arXiv Detail & Related papers (2022-05-22T05:28:43Z)
Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies. A neural network is employed to act as structure prior and reveal the underlying signal interdependencies. Deep unrolling and Deep equilibrium based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
arXiv Detail & Related papers (2022-03-29T21:00:39Z)
On the benefits of robust models in modulation recognition [53.391095789289736]
Deep Neural Networks (DNNs) using convolutional layers are state-of-the-art in many tasks in communications. In other domains, like image classification, DNNs have been shown to be vulnerable to adversarial perturbations. We propose a novel framework to test the robustness of current state-of-the-art models.
arXiv Detail & Related papers (2021-03-27T19:58:06Z)
PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context. We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers [84.57980167400513]
Neural Function Modules (NFM) aims to introduce the same structural capability into deep learning. Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems. The key contribution of our work is to combine attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z)
Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning. We show how to extend the architecture of a simple RNN by separating its hidden state into different modules. We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.