Learning as Reinforcement: Applying Principles of Neuroscience for More General Reinforcement Learning Agents
- URL: http://arxiv.org/abs/2004.09043v1
- Date: Mon, 20 Apr 2020 04:06:21 GMT
- Title: Learning as Reinforcement: Applying Principles of Neuroscience for More General Reinforcement Learning Agents
- Authors: Eric Zelikman, William Yin, Kenneth Wang
- Abstract summary: We implement an architecture founded in principles of experimental neuroscience, by combining computationally efficient abstractions of biological algorithms.
Our approach is inspired by research on spike-timing dependent plasticity, the transition between short and long term memory, and the role of various neurotransmitters in rewarding curiosity.
The Neurons-in-a-Box architecture can learn in a wholly generalizable manner, and demonstrates an efficient way to build and apply representations without explicitly optimizing over a set of criteria or actions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A significant challenge in developing AI that can generalize well is
designing agents that learn about their world without being told what to learn,
and apply that learning to challenges with sparse rewards. Moreover, most
traditional reinforcement learning approaches explicitly separate learning and
decision making in a way that does not correspond to biological learning. We
implement an architecture founded in principles of experimental neuroscience,
by combining computationally efficient abstractions of biological algorithms.
Our approach is inspired by research on spike-timing dependent plasticity, the
transition between short and long term memory, and the role of various
neurotransmitters in rewarding curiosity. The Neurons-in-a-Box architecture can
learn in a wholly generalizable manner, and demonstrates an efficient way to
build and apply representations without explicitly optimizing over a set of
criteria or actions. We find it performs well in many environments including
OpenAI Gym's Mountain Car, which has no reward besides touching a hard-to-reach
flag on a hill, Inverted Pendulum, where it learns simple strategies to improve
the time it holds a pendulum up, a video stream, where it spontaneously learns
to distinguish an open and closed hand, as well as other environments like
Google Chrome's Dinosaur Game.
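As a concrete illustration of the first mechanism the abstract names, the sketch below shows a generic pairwise spike-timing-dependent plasticity (STDP) update with exponential eligibility traces. The trace formulation, time constant, and amplitudes are textbook illustrative choices, not the paper's actual rule:

```python
import numpy as np

# Generic pairwise STDP with exponential eligibility traces. The constants
# below are illustrative textbook values, not taken from the paper.
TAU = 20.0       # trace decay time constant (ms)
A_PLUS = 0.010   # potentiation amplitude (pre fires before post)
A_MINUS = 0.012  # depression amplitude (post fires before pre)

def stdp_step(w, pre_spikes, post_spikes, pre_trace, post_trace, dt=1.0):
    """One timestep of pairwise STDP.

    w           : (n_pre, n_post) synaptic weight matrix
    pre_spikes  : (n_pre,)  binary spikes at this step
    post_spikes : (n_post,) binary spikes at this step
    pre_trace   : (n_pre,)  decaying record of recent presynaptic spikes
    post_trace  : (n_post,) decaying record of recent postsynaptic spikes
    """
    # Decay the traces, then register this step's spikes.
    pre_trace = pre_trace * np.exp(-dt / TAU) + pre_spikes
    post_trace = post_trace * np.exp(-dt / TAU) + post_spikes

    # Pre-before-post pairings potentiate; post-before-pre pairings depress.
    dw = (A_PLUS * np.outer(pre_trace, post_spikes)
          - A_MINUS * np.outer(pre_spikes, post_trace))
    return np.clip(w + dw, 0.0, 1.0), pre_trace, post_trace
```

The timing asymmetry is the point: a synapse strengthens when the presynaptic neuron tends to fire just before the postsynaptic one, and weakens in the opposite order.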
Related papers
- A Neuro-mimetic Realization of the Common Model of Cognition via Hebbian Learning and Free Energy Minimization [55.11642177631929]
Large neural generative models are capable of synthesizing semantically rich passages of text or producing complex images.
We discuss the COGnitive Neural GENerative system, an architecture that casts the Common Model of Cognition in terms of Hebbian learning and free energy minimization.
arXiv Detail & Related papers (2023-10-14T23:28:48Z)
- Incremental procedural and sensorimotor learning in cognitive humanoid robots [52.77024349608834]
This work presents a cognitive agent that can learn procedures incrementally.
We show the cognitive functions required in each substage and how adding new functions helps address tasks previously unsolved by the agent.
Results show that this approach is capable of solving complex tasks incrementally.
arXiv Detail & Related papers (2023-04-30T22:51:31Z)
- Generative Adversarial Neuroevolution for Control Behaviour Imitation [3.04585143845864]
We propose to explore whether deep neuroevolution can be used for behaviour imitation in popular simulation environments.
We introduce a simple co-evolutionary adversarial generation framework, and evaluate its capabilities by evolving standard deep recurrent networks.
Across all tasks, we find the final elite actor agents capable of achieving scores as high as those obtained by the pre-trained agents.
arXiv Detail & Related papers (2023-04-03T16:33:22Z)
- MARTI-4: new model of human brain, considering neocortex and basal ganglia -- learns to play Atari game by reinforcement learning on a single CPU [0.0]
We present MARTI, a new model of the human brain that considers the neocortex and basal ganglia.
We introduce a novel surprise mechanism that significantly improves the reinforcement learning process through inner rewards (sketched below).
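The summary does not specify how the surprise signal is computed; one common reading of "surprise as inner reward" is to add a learned forward model's prediction error to the environment reward. A minimal sketch under that assumption, where the linear model, learning rate, and scale are all hypothetical:

```python
import numpy as np

class SurpriseBonus:
    """Hypothetical surprise signal: the squared prediction error of a
    linear next-state model, scaled and added to the extrinsic reward.
    The model class, learning rate, and scale are illustrative choices."""

    def __init__(self, state_dim, lr=1e-2, scale=0.1):
        self.W = np.zeros((state_dim, state_dim))  # linear forward model
        self.lr = lr
        self.scale = scale

    def reward(self, state, next_state, extrinsic):
        err = next_state - self.W @ state         # prediction error = "surprise"
        self.W += self.lr * np.outer(err, state)  # online least-squares step
        return extrinsic + self.scale * float(err @ err)
```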
arXiv Detail & Related papers (2022-08-18T20:23:49Z)
- Open-Ended Reinforcement Learning with Neural Reward Functions [2.4366811507669115]
In high-dimensional robotic environments our approach learns a wide range of interesting skills including front-flips for Half-Cheetah and one-legged running for Humanoid.
In the pixel-based Montezuma's Revenge environment our method also works with minimal changes and it learns complex skills that involve interacting with items and visiting diverse locations.
arXiv Detail & Related papers (2022-02-16T15:55:22Z)
- Improving the sample-efficiency of neural architecture search with reinforcement learning [0.0]
In this work, we contribute to the area of Automated Machine Learning (AutoML).
Our focus is on one of its most promising research directions: reinforcement learning.
The validation accuracies of the child networks serve as a reward signal for training the controller.
We propose to replace this with a more modern and complex algorithm, PPO, which has been demonstrated to be faster and more stable in other environments (see the toy controller loop below).
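For context, here is a toy version of the controller loop this entry describes: a policy samples candidate architectures and is updated with validation accuracy as its reward. The four-way choice, the simulated accuracies, and the plain REINFORCE update are illustrative stand-ins; the paper proposes PPO in place of the policy-gradient step shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy controller: a softmax policy over four candidate "architectures".
# Real NAS would train a child network per sample; here each candidate's
# validation accuracy is simulated. The plain REINFORCE update below is
# the step the paper proposes replacing with PPO.
logits = np.zeros(4)
true_acc = np.array([0.62, 0.71, 0.90, 0.55])  # hypothetical accuracies
lr, baseline = 0.5, 0.0

for step in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    arch = rng.choice(4, p=probs)                  # sample a child network
    reward = true_acc[arch] + rng.normal(0, 0.02)  # noisy validation accuracy
    baseline = 0.9 * baseline + 0.1 * reward       # moving-average baseline
    grad = -probs
    grad[arch] += 1.0                              # d log pi(arch) / d logits
    logits += lr * (reward - baseline) * grad      # policy-gradient step

print("preferred architecture:", int(np.argmax(logits)))  # typically 2
```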
arXiv Detail & Related papers (2021-10-13T14:30:09Z)
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z)
- Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [135.0863818867184]
Artificial neural variability (ANV) helps artificial neural networks learn some advantages from "natural" neural networks.
ANV acts as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
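ANV's precise formulation is not given in this summary; one simple stand-in for "neural variability" is injecting zero-mean Gaussian noise into a layer's weights at each training forward pass, which acts as an implicit regularizer. A minimal sketch under that assumption:

```python
import numpy as np

def noisy_forward(W, x, sigma=0.01, rng=None):
    """Forward pass through one linear+ReLU layer with zero-mean Gaussian
    noise injected into the weights, a simple stand-in for variability-style
    implicit regularization. sigma is a hypothetical noise scale."""
    if rng is None:
        rng = np.random.default_rng()
    W_noisy = W + rng.normal(0.0, sigma, size=W.shape)
    return np.maximum(W_noisy @ x, 0.0)
```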
arXiv Detail & Related papers (2020-11-12T06:06:33Z)
- Hierarchical Affordance Discovery using Intrinsic Motivation [69.9674326582747]
We propose an algorithm using intrinsic motivation to guide the learning of affordances for a mobile robot.
This algorithm can autonomously discover, learn, and adapt interrelated affordances without pre-programmed actions.
Once learned, these affordances may be used by the algorithm to plan sequences of actions in order to perform tasks of various difficulties.
arXiv Detail & Related papers (2020-09-23T07:18:21Z)
- Reinforcement Learning and its Connections with Neuroscience and Psychology [0.0]
We review findings in both neuroscience and psychology that support reinforcement learning as a promising candidate for modeling learning and decision making in the brain.
We then discuss the implications of this observed relationship between RL, neuroscience and psychology and its role in advancing research in both AI and brain science.
arXiv Detail & Related papers (2020-06-25T04:29:15Z)
- Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.