Learning fast changing slow in spiking neural networks
- URL: http://arxiv.org/abs/2402.10069v2
- Date: Tue, 9 Apr 2024 15:47:15 GMT
- Title: Learning fast changing slow in spiking neural networks
- Authors: Cristiano Capone, Paolo Muratore,
- Abstract summary: Reinforcement learning (RL) faces substantial challenges when applied to real-life problems.
Life-long learning machines must resolve the plasticity-stability paradox.
Striking a balance between acquiring new knowledge and maintaining stability is crucial for artificial agents.
- Score: 3.069335774032178
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) faces substantial challenges when applied to real-life problems, primarily stemming from the scarcity of available data due to limited interactions with the environment. This limitation is exacerbated by the fact that RL often demands a considerable volume of data for effective learning. The complexity escalates further when implementing RL in recurrent spiking networks, where inherent noise introduced by spikes adds a layer of difficulty. Life-long learning machines must inherently resolve the plasticity-stability paradox. Striking a balance between acquiring new knowledge and maintaining stability is crucial for artificial agents. To address this challenge, we draw inspiration from machine learning technology and introduce a biologically plausible implementation of proximal policy optimization, referred to as lf-cs (learning fast changing slow). Our approach results in two notable advancements: firstly, the capacity to assimilate new information into a new policy without requiring alterations to the current policy; and secondly, the capability to replay experiences without experiencing policy divergence. Furthermore, when contrasted with other experience replay (ER) techniques, our method demonstrates the added advantage of being computationally efficient in an online setting. We demonstrate that the proposed methodology enhances the efficiency of learning, showcasing its potential impact on neuromorphic and real-world applications.
Related papers
- SHIRE: Enhancing Sample Efficiency using Human Intuition in REinforcement Learning [11.304750795377657]
We propose SHIRE, a framework for encoding human intuition using Probabilistic Graphical Models (PGMs)
SHIRE achieves 25-78% sample efficiency gains across the environments we evaluate at negligible overhead cost.
arXiv Detail & Related papers (2024-09-16T04:46:22Z) - A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning [7.767611890997147]
We show that plasticity loss is pervasive under domain shift in on-policy deep RL.
We find that a class of regenerative'' methods are able to consistently mitigate plasticity loss in a variety of contexts.
arXiv Detail & Related papers (2024-05-29T14:59:49Z) - Efficient Imitation Learning with Conservative World Models [54.52140201148341]
We tackle the problem of policy learning from expert demonstrations without a reward function.
We re-frame imitation learning as a fine-tuning problem, rather than a pure reinforcement learning one.
arXiv Detail & Related papers (2024-05-21T20:53:18Z) - Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks.
arXiv Detail & Related papers (2024-02-29T00:02:33Z) - REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous
Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills withReinforcement learning.
The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping.
Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
arXiv Detail & Related papers (2023-09-06T19:05:31Z) - Human-Inspired Framework to Accelerate Reinforcement Learning [1.6317061277457001]
Reinforcement learning (RL) is crucial for data science decision-making but suffers from sample inefficiency.
This paper introduces a novel human-inspired framework to enhance RL algorithm sample efficiency.
arXiv Detail & Related papers (2023-02-28T13:15:04Z) - IQ-Learn: Inverse soft-Q Learning for Imitation [95.06031307730245]
imitation learning from a small amount of expert data can be challenging in high-dimensional environments with complex dynamics.
Behavioral cloning is a simple method that is widely used due to its simplicity of implementation and stable convergence.
We introduce a method for dynamics-aware IL which avoids adversarial training by learning a single Q-function.
arXiv Detail & Related papers (2021-06-23T03:43:10Z) - Deep RL With Information Constrained Policies: Generalization in
Continuous Control [21.46148507577606]
We show that a natural constraint on information flow might confer onto artificial agents in continuous control tasks.
We implement a novel Capacity-Limited Actor-Critic (CLAC) algorithm.
Our experiments show that compared to alternative approaches, CLAC offers improvements in generalization between training and modified test environments.
arXiv Detail & Related papers (2020-10-09T15:42:21Z) - Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity.
Our method leverages latent variable models to learn a representation of the environment from current and past experiences.
We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z) - AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.