Noisy Agents: Self-supervised Exploration by Predicting Auditory Events
- URL: http://arxiv.org/abs/2007.13729v1
- Date: Mon, 27 Jul 2020 17:59:08 GMT
- Title: Noisy Agents: Self-supervised Exploration by Predicting Auditory Events
- Authors: Chuang Gan, Xiaoyu Chen, Phillip Isola, Antonio Torralba, Joshua B.
Tenenbaum
- Abstract summary: We propose a novel type of intrinsic motivation for Reinforcement Learning (RL) that encourages the agent to understand the causal effect of its actions.
We train a neural network to predict the auditory events and use the prediction errors as intrinsic rewards to guide RL exploration.
Experimental results on Atari games show that our new intrinsic motivation significantly outperforms several state-of-the-art baselines.
- Score: 127.82594819117753
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans integrate multiple sensory modalities (e.g. visual and audio) to build
a causal understanding of the physical world. In this work, we propose a novel
type of intrinsic motivation for Reinforcement Learning (RL) that encourages
the agent to understand the causal effect of its actions through auditory event
prediction. First, we allow the agent to collect a small amount of acoustic
data and use K-means to discover underlying auditory event clusters. We then
train a neural network to predict the auditory events and use the prediction
errors as intrinsic rewards to guide RL exploration. Experimental results on
Atari games show that our new intrinsic motivation significantly outperforms
several state-of-the-art baselines. We further visualize our noisy agents'
behavior in a physics environment and demonstrate that our newly designed
intrinsic reward leads to the emergence of physical interaction behaviors (e.g.
contact with objects).
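
As a rough illustration of the pipeline described in the abstract (cluster a small batch of collected audio into auditory events with K-means, then reward the agent by how badly a learned network predicts the event its action triggers), here is a minimal sketch. It is not the authors' code: the audio feature size, cluster count, state/action dimensions, and network architecture below are placeholder assumptions.

```python
# Minimal sketch of "predicting auditory events" as an intrinsic reward.
# NOTE: all dimensions, K, and the network shape are assumptions for
# illustration, not the configuration used in the paper.
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

K = 8                                            # assumed number of auditory event clusters
AUDIO_DIM, STATE_DIM, N_ACTIONS = 64, 128, 18    # placeholder sizes

# 1) Discover auditory events: K-means on a small warm-up set of audio features.
warmup_audio = np.random.randn(2000, AUDIO_DIM).astype(np.float32)  # stand-in data
kmeans = KMeans(n_clusters=K, n_init=10).fit(warmup_audio)

# 2) Predictor: from (state embedding, action) guess which auditory event follows.
class AuditoryEventPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + N_ACTIONS, 256), nn.ReLU(),
            nn.Linear(256, K),
        )

    def forward(self, state, action_onehot):
        # Returns logits over the K auditory event clusters.
        return self.net(torch.cat([state, action_onehot], dim=-1))

predictor = AuditoryEventPredictor()
xent = nn.CrossEntropyLoss(reduction="none")

def intrinsic_reward(state, action_onehot, next_audio_feat):
    """Prediction error on the observed auditory event, used as exploration bonus."""
    event = torch.as_tensor(kmeans.predict(next_audio_feat.numpy()), dtype=torch.long)
    logits = predictor(state, action_onehot)
    return xent(logits, event)  # high error = surprising sound = larger bonus
```

In use, the predictor would be trained on the agent's own transitions, so frequently heard sounds become predictable and stop paying out reward, pushing the agent toward interactions that produce new auditory events; that training loop is omitted from the sketch.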
Related papers
- Variable-Agnostic Causal Exploration for Reinforcement Learning [56.52768265734155]
We introduce a novel framework, Variable-Agnostic Causal Exploration for Reinforcement Learning (VACERL).
Our approach automatically identifies crucial observation-action steps associated with key variables using attention mechanisms.
It constructs the causal graph connecting these steps, which guides the agent towards observation-action pairs with greater causal influence on task completion.
arXiv Detail & Related papers (2024-07-17T09:45:27Z)
- A Neural Active Inference Model of Perceptual-Motor Learning [62.39667564455059]
The active inference framework (AIF) is a promising new computational framework grounded in contemporary neuroscience.
In this study, we test the ability for the AIF to capture the role of anticipation in the visual guidance of action in humans.
We present a novel formulation of the prior function that maps a multi-dimensional world-state to a uni-dimensional distribution of free-energy.
arXiv Detail & Related papers (2022-11-16T20:00:38Z)
- Self-supervised Sequential Information Bottleneck for Robust Exploration in Deep Reinforcement Learning [28.75574762244266]
In this work, we introduce the sequential information bottleneck objective for learning compressed and temporally coherent representations.
For efficient exploration in noisy environments, we further construct intrinsic rewards that capture task-relevant state novelty.
arXiv Detail & Related papers (2022-09-12T15:41:10Z)
- Affect-Aware Deep Belief Network Representations for Multimodal Unsupervised Deception Detection [3.04585143845864]
This paper presents a novel affect-aware unsupervised Deep Belief Network (DBN) approach for detecting real-world, high-stakes deception in videos without requiring labels.
In addition to using facial affect as a feature on which DBN models are trained, we also introduce a DBN training procedure that uses facial affect as an aligner of audio-visual representations.
arXiv Detail & Related papers (2021-08-17T22:07:26Z)
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z)
- Agents that Listen: High-Throughput Reinforcement Learning with Multiple Sensory Systems [6.952659395337689]
We introduce a new version of the VizDoom simulator to create a highly efficient learning environment that provides raw audio observations.
We train our agent to play the full game of Doom and find that it can consistently defeat a traditional vision-based adversary.
arXiv Detail & Related papers (2021-07-05T18:00:50Z)
- Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning [24.163616087447874]
We introduce causal curiosity, a novel intrinsic reward.
We show that it allows our agents to learn optimal sequences of actions.
We also show that the knowledge of causal factor representations aids zero-shot learning for more complex tasks.
arXiv Detail & Related papers (2020-10-07T02:07:51Z)
- Tracking Emotions: Intrinsic Motivation Grounded on Multi-Level Prediction Error Dynamics [68.8204255655161]
We discuss how emotions arise when differences between expected and actual rates of progress towards a goal are experienced.
We present an intrinsic motivation architecture that generates behaviors towards self-generated and dynamic goals.
arXiv Detail & Related papers (2020-07-29T06:53:13Z)
- Attention or memory? Neurointerpretable agents in space and time [0.0]
We design a model incorporating a self-attention mechanism that implements task-state representations in semantic feature-space.
To evaluate the agent's selective properties, we add a large volume of task-irrelevant features to observations.
In line with neuroscience predictions, self-attention leads to increased robustness to noise compared to benchmark models.
arXiv Detail & Related papers (2020-07-09T15:04:26Z)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.