Attention or memory? Neurointerpretable agents in space and time
- URL: http://arxiv.org/abs/2007.04862v2
- Date: Sun, 12 Jul 2020 15:32:16 GMT
- Title: Attention or memory? Neurointerpretable agents in space and time
- Authors: Lennart Bramlage and Aurelio Cortese
- Abstract summary: We design a model incorporating a self-attention mechanism that implements task-state representations in semantic feature-space.
To evaluate the agent's selective properties, we add a large volume of task-irrelevant features to observations.
In line with neuroscience predictions, self-attention leads to increased robustness to noise compared to benchmark models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In neuroscience, attention has been shown to bidirectionally interact with
reinforcement learning (RL) processes. This interaction is thought to support
dimensionality reduction of task representations, restricting computations to
relevant features. However, it remains unclear whether these properties can
translate into real algorithmic advantages for artificial agents, especially in
dynamic environments. We design a model incorporating a self-attention
mechanism that implements task-state representations in semantic feature-space,
and test it on a battery of Atari games. To evaluate the agent's selective
properties, we add a large volume of task-irrelevant features to observations.
In line with neuroscience predictions, self-attention leads to increased
robustness to noise compared to benchmark models. Strikingly, this
self-attention mechanism is general enough that it can be naturally
extended to implement a transient working memory, able to solve a partially
observable maze task. Lastly, we highlight the predictive quality of attended
stimuli. Because we use semantic observations, we can uncover not only which
features the agent elects to base decisions on, but also how it chooses to
compile more complex, relational features from simpler ones. These results
formally illustrate the benefits of attention in deep RL and provide evidence
for the interpretability of self-attention mechanisms.
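The abstract describes self-attention applied to a set of semantic feature vectors, with many task-irrelevant features mixed into the observation. The paper does not give its implementation here; the following is a minimal sketch of scaled dot-product self-attention over such a feature set, with all array shapes, weight matrices, and the distractor setup being our own illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(features, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a set of semantic feature vectors.

    features: (n, d) array -- one row per observed feature, distractors included.
    Returns an (n, d_k) array of attended representations.
    """
    Q, K, V = features @ Wq, features @ Wk, features @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (n, n) pairwise relevance
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V

rng = np.random.default_rng(0)
d, d_k = 8, 4
task_features = rng.normal(size=(3, d))              # task-relevant features
distractors = rng.normal(size=(20, d))               # task-irrelevant "noise" features
obs = np.concatenate([task_features, distractors])   # (23, 8) observation set

Wq, Wk, Wv = (rng.normal(size=(d, d_k)) for _ in range(3))
attended = self_attention(obs, Wq, Wk, Wv)
print(attended.shape)  # (23, 4)
```

With trained (rather than random) projections, the attention weights over this feature set are exactly the quantity one would inspect to see which features the agent bases its decisions on.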
Related papers
- Artificial Kuramoto Oscillatory Neurons [65.16453738828672]
We introduce Artificial Kuramoto Oscillatory Neurons (AKOrN) as a dynamical alternative to threshold units.
We show that this idea provides performance improvements across a wide spectrum of tasks.
We believe that these empirical results show the importance of our assumptions at the most basic neuronal level of neural representation.
arXiv Detail & Related papers (2024-10-17T17:47:54Z)
- Binding Dynamics in Rotating Features [72.80071820194273]
We propose an alternative "cosine binding" mechanism, which explicitly computes the alignment between features and adjusts weights accordingly.
This allows us to draw direct connections to self-attention and biological neural processes, and to shed light on the fundamental dynamics for object-centric representations to emerge in Rotating Features.
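The summary above says the "cosine binding" mechanism explicitly computes alignment between features and adjusts weights accordingly. As a toy reading of that idea (the function name and setup are ours, not the paper's implementation), a cosine-similarity matrix over feature vectors can serve as such an alignment-based gate:

```python
import numpy as np

def cosine_binding(features, eps=1e-8):
    """Toy cosine-alignment gate: weight each pairwise interaction by the
    cosine similarity between the two feature vectors.

    features: (n, d) array of feature vectors.
    Returns an (n, n) alignment matrix with entries in [-1, 1].
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, eps)   # normalize each row
    return unit @ unit.T                       # cosine similarity matrix

f = np.array([[1.0, 0.0],    # two orthogonal features
              [0.0, 1.0],
              [1.0, 1.0]])   # aligned 45 degrees to both
A = cosine_binding(f)
print(np.round(A, 3))
```

Orthogonal features get weight 0 and identical features weight 1, which is the sense in which alignment "binds" features together.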
arXiv Detail & Related papers (2024-02-08T12:31:08Z)
- Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning [114.36124979578896]
We design a dynamic mechanism using offline reinforcement learning algorithms.
Our algorithm is based on the pessimism principle and only requires a mild assumption on the coverage of the offline data set.
arXiv Detail & Related papers (2022-05-05T05:44:26Z)
- Learning Theory of Mind via Dynamic Traits Attribution [59.9781556714202]
We propose a new neural ToM architecture that learns to generate a latent trait vector of an actor from the past trajectories.
This trait vector then multiplicatively modulates the prediction mechanism via a fast-weights scheme in the prediction neural network.
We empirically show that the fast weights provide a good inductive bias for modeling the character traits of agents and hence improve mindreading ability.
arXiv Detail & Related papers (2022-04-17T11:21:18Z)
- Object Based Attention Through Internal Gating [4.941630596191806]
We propose an artificial neural network model of object-based attention.
Our model captures the way in which attention is both top-down and recurrent.
We find that our model replicates a range of findings from neuroscience.
arXiv Detail & Related papers (2021-06-08T17:20:50Z)
- Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution [97.50813120600026]
Spatial-temporal reasoning is a challenging task in Artificial Intelligence (AI).
Recent works have focused on an abstract reasoning task of this kind -- Raven's Progressive Matrices (RPM).
We propose a neuro-symbolic Probabilistic Abduction and Execution (PrAE) learner.
arXiv Detail & Related papers (2021-03-26T02:42:18Z)
- Slow manifolds in recurrent networks encode working memory efficiently and robustly [0.0]
Working memory is a cognitive function involving the storage and manipulation of latent information over brief intervals of time.
We use a top-down modeling approach to examine network-level mechanisms of working memory.
arXiv Detail & Related papers (2021-01-08T18:47:02Z)
- Untangling tradeoffs between recurrence and self-attention in neural networks [81.30894993852813]
We present a formal analysis of how self-attention affects gradient propagation in recurrent networks.
We prove that it mitigates the problem of vanishing gradients when trying to capture long-term dependencies.
We propose a relevancy screening mechanism that allows for a scalable use of sparse self-attention with recurrence.
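The summary above describes a relevancy screening mechanism that makes sparse self-attention scalable when combined with recurrence. A minimal sketch of the idea, assuming only that the most query-relevant past states are retained and attended over (the function and shapes are our own illustration, not the paper's exact mechanism):

```python
import numpy as np

def relevancy_screen(memory, query, k):
    """Attend over only the k past states most relevant to the current query.

    memory: (t, d) array of past hidden states; query: (d,) current state.
    Returns a (d,) context vector from the sparse attention readout.
    """
    scores = memory @ query                  # relevance of each past state
    top = np.argsort(scores)[-k:]            # indices of the k best matches
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                             # softmax over the survivors only
    return w @ memory[top]                   # sparse attention readout

rng = np.random.default_rng(1)
memory = rng.normal(size=(100, 16))          # 100 past states of dimension 16
query = rng.normal(size=16)                  # current recurrent state
context = relevancy_screen(memory, query, k=5)
print(context.shape)  # (16,)
```

Screening before attending cuts the per-step cost from O(t) to O(k) attended states, which is what makes the combination with recurrence scalable over long sequences.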
arXiv Detail & Related papers (2020-06-16T19:24:25Z)
- Neuroevolution of Self-Interpretable Agents [11.171154483167514]
Inattentional blindness is the psychological phenomenon that causes one to miss things in plain sight.
Motivated by selective attention, we study the properties of artificial agents that perceive the world through the lens of a self-attention bottleneck.
arXiv Detail & Related papers (2020-03-18T11:40:35Z)
- Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention [7.967230034960396]
We evaluate whether various active-memory mechanisms could replace self-attention in a Transformer.
Experiments suggest that active-memory alone achieves comparable results to the self-attention mechanism for language modelling.
For some specific algorithmic tasks, active-memory mechanisms alone outperform both self-attention and a combination of the two.
arXiv Detail & Related papers (2019-12-27T02:01:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.