Learning as Reinforcement: Applying Principles of Neuroscience for More General Reinforcement Learning Agents
- URL: http://arxiv.org/abs/2004.09043v1
- Date: Mon, 20 Apr 2020 04:06:21 GMT
- Title: Learning as Reinforcement: Applying Principles of Neuroscience for More General Reinforcement Learning Agents
- Authors: Eric Zelikman, William Yin, Kenneth Wang
- Abstract summary: We implement an architecture founded in principles of experimental neuroscience, by combining computationally efficient abstractions of biological algorithms.
Our approach is inspired by research on spike-timing dependent plasticity, the transition between short and long term memory, and the role of various neurotransmitters in rewarding curiosity.
The Neurons-in-a-Box architecture can learn in a wholly generalizable manner, and demonstrates an efficient way to build and apply representations without explicitly optimizing over a set of criteria or actions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A significant challenge in developing AI that can generalize well is
designing agents that learn about their world without being told what to learn,
and apply that learning to challenges with sparse rewards. Moreover, most
traditional reinforcement learning approaches explicitly separate learning and
decision making in a way that does not correspond to biological learning. We
implement an architecture founded in principles of experimental neuroscience,
by combining computationally efficient abstractions of biological algorithms.
Our approach is inspired by research on spike-timing dependent plasticity, the
transition between short and long term memory, and the role of various
neurotransmitters in rewarding curiosity. The Neurons-in-a-Box architecture can
learn in a wholly generalizable manner, and demonstrates an efficient way to
build and apply representations without explicitly optimizing over a set of
criteria or actions. We find it performs well in many environments including
OpenAI Gym's Mountain Car, which has no reward besides touching a hard-to-reach
flag on a hill, Inverted Pendulum, where it learns simple strategies to improve
the time it holds a pendulum up, a video stream, where it spontaneously learns
to distinguish an open and closed hand, as well as other environments like
Google Chrome's Dinosaur Game.
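As a concrete illustration of the first mechanism the abstract names, the sketch below shows a generic pairwise spike-timing-dependent plasticity (STDP) update with exponential eligibility traces. The trace formulation, time constant, and amplitudes are textbook illustrative choices, not the paper's actual rule:

```python
import numpy as np

# Generic pairwise STDP with exponential eligibility traces. The constants
# below are illustrative textbook values, not taken from the paper.
TAU = 20.0       # trace decay time constant (ms)
A_PLUS = 0.010   # potentiation amplitude (pre fires before post)
A_MINUS = 0.012  # depression amplitude (post fires before pre)

def stdp_step(w, pre_spikes, post_spikes, pre_trace, post_trace, dt=1.0):
    """One timestep of pairwise STDP.

    w           : (n_pre, n_post) synaptic weight matrix
    pre_spikes  : (n_pre,)  binary spikes at this step
    post_spikes : (n_post,) binary spikes at this step
    pre_trace   : (n_pre,)  decaying record of recent presynaptic spikes
    post_trace  : (n_post,) decaying record of recent postsynaptic spikes
    """
    # Decay the traces, then register this step's spikes.
    pre_trace = pre_trace * np.exp(-dt / TAU) + pre_spikes
    post_trace = post_trace * np.exp(-dt / TAU) + post_spikes

    # Pre-before-post pairings potentiate; post-before-pre pairings depress.
    dw = (A_PLUS * np.outer(pre_trace, post_spikes)
          - A_MINUS * np.outer(pre_spikes, post_trace))
    return np.clip(w + dw, 0.0, 1.0), pre_trace, post_trace
```

The timing asymmetry is the point: a synapse strengthens when the presynaptic neuron tends to fire just before the postsynaptic one, and weakens in the opposite order.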
Related papers
- A Neuro-mimetic Realization of the Common Model of Cognition via Hebbian Learning and Free Energy Minimization [55.11642177631929]
Large neural generative models are capable of synthesizing semantically rich passages of text or producing complex images.
We discuss the COGnitive Neural GENerative system, an architecture that casts the Common Model of Cognition in terms of Hebbian learning and free energy minimization.
arXiv Detail & Related papers (2023-10-14T23:28:48Z)
- Incremental procedural and sensorimotor learning in cognitive humanoid robots [52.77024349608834]
This work presents a cognitive agent that can learn procedures incrementally.
We show the cognitive functions required in each substage and how adding new functions helps address tasks previously unsolved by the agent.
Results show that this approach is capable of solving complex tasks incrementally.
arXiv Detail & Related papers (2023-04-30T22:51:31Z)
- Generative Adversarial Neuroevolution for Control Behaviour Imitation [3.04585143845864]
We propose to explore whether deep neuroevolution can be used for behaviour imitation in popular simulation environments.
We introduce a simple co-evolutionary adversarial generation framework, and evaluate its capabilities by evolving standard deep recurrent networks.
Across all tasks, we find the final elite actor agents capable of achieving scores as high as those obtained by the pre-trained agents.
arXiv Detail & Related papers (2023-04-03T16:33:22Z)
- MARTI-4: new model of human brain, considering neocortex and basal ganglia -- learns to play Atari game by reinforcement learning on a single CPU [0.0]
We present MARTI, a new model of the human brain that considers the neocortex and basal ganglia.
We introduce a novel surprise mechanism that significantly improves the reinforcement learning process through inner rewards (sketched below).
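The summary does not specify how the surprise signal is computed; one common reading of "surprise as inner reward" is to add a learned forward model's prediction error to the environment reward. A minimal sketch under that assumption, where the linear model, learning rate, and scale are all hypothetical:

```python
import numpy as np

class SurpriseBonus:
    """Hypothetical surprise signal: the squared prediction error of a
    linear next-state model, scaled and added to the extrinsic reward.
    The model class, learning rate, and scale are illustrative choices."""

    def __init__(self, state_dim, lr=1e-2, scale=0.1):
        self.W = np.zeros((state_dim, state_dim))  # linear forward model
        self.lr = lr
        self.scale = scale

    def reward(self, state, next_state, extrinsic):
        err = next_state - self.W @ state         # prediction error = "surprise"
        self.W += self.lr * np.outer(err, state)  # online least-squares step
        return extrinsic + self.scale * float(err @ err)
```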
arXiv Detail & Related papers (2022-08-18T20:23:49Z)
- Open-Ended Reinforcement Learning with Neural Reward Functions [2.4366811507669115]
In high-dimensional robotic environments our approach learns a wide range of interesting skills including front-flips for Half-Cheetah and one-legged running for Humanoid.
In the pixel-based Montezuma's Revenge environment our method also works with minimal changes and it learns complex skills that involve interacting with items and visiting diverse locations.
arXiv Detail & Related papers (2022-02-16T15:55:22Z)
- Improving the sample-efficiency of neural architecture search with reinforcement learning [0.0]
In this work, we contribute to the area of Automated Machine Learning (AutoML).
Our focus is on one of its most promising research directions: reinforcement learning.
The validation accuracies of the child networks serve as a reward signal for training the controller.
We propose to replace this with a more modern and complex algorithm, PPO, which has been demonstrated to be faster and more stable in other environments (see the toy controller loop below).
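For context, here is a toy version of the controller loop this entry describes: a policy samples candidate architectures and is updated with validation accuracy as its reward. The four-way choice, the simulated accuracies, and the plain REINFORCE update are illustrative stand-ins; the paper proposes PPO in place of the policy-gradient step shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy controller: a softmax policy over four candidate "architectures".
# Real NAS would train a child network per sample; here each candidate's
# validation accuracy is simulated. The plain REINFORCE update below is
# the step the paper proposes replacing with PPO.
logits = np.zeros(4)
true_acc = np.array([0.62, 0.71, 0.90, 0.55])  # hypothetical accuracies
lr, baseline = 0.5, 0.0

for step in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    arch = rng.choice(4, p=probs)                  # sample a child network
    reward = true_acc[arch] + rng.normal(0, 0.02)  # noisy validation accuracy
    baseline = 0.9 * baseline + 0.1 * reward       # moving-average baseline
    grad = -probs
    grad[arch] += 1.0                              # d log pi(arch) / d logits
    logits += lr * (reward - baseline) * grad      # policy-gradient step

print("preferred architecture:", int(np.argmax(logits)))  # typically 2
```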
arXiv Detail & Related papers (2021-10-13T14:30:09Z)
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z)
- Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [135.0863818867184]
Artificial neural variability (ANV) helps artificial neural networks learn some advantages from "natural" neural networks.
ANV acts as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
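ANV's precise formulation is not given in this summary; one simple stand-in for "neural variability" is injecting zero-mean Gaussian noise into a layer's weights at each training forward pass, which acts as an implicit regularizer. A minimal sketch under that assumption:

```python
import numpy as np

def noisy_forward(W, x, sigma=0.01, rng=None):
    """Forward pass through one linear+ReLU layer with zero-mean Gaussian
    noise injected into the weights, a simple stand-in for variability-style
    implicit regularization. sigma is a hypothetical noise scale."""
    if rng is None:
        rng = np.random.default_rng()
    W_noisy = W + rng.normal(0.0, sigma, size=W.shape)
    return np.maximum(W_noisy @ x, 0.0)
```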
arXiv Detail & Related papers (2020-11-12T06:06:33Z)
- Hierarchical Affordance Discovery using Intrinsic Motivation [69.9674326582747]
We propose an algorithm using intrinsic motivation to guide the learning of affordances for a mobile robot.
This algorithm can autonomously discover, learn, and adapt interrelated affordances without pre-programmed actions.
Once learned, these affordances may be used by the algorithm to plan sequences of actions in order to perform tasks of various difficulties.
arXiv Detail & Related papers (2020-09-23T07:18:21Z)
- Reinforcement Learning and its Connections with Neuroscience and Psychology [0.0]
We review findings in both neuroscience and psychology that support reinforcement learning as a promising candidate for modeling learning and decision making in the brain.
We then discuss the implications of this observed relationship between RL, neuroscience and psychology and its role in advancing research in both AI and brain science.
arXiv Detail & Related papers (2020-06-25T04:29:15Z)
- Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.