Memristor Hardware-Friendly Reinforcement Learning
- URL: http://arxiv.org/abs/2001.06930v1
- Date: Mon, 20 Jan 2020 01:08:44 GMT
- Title: Memristor Hardware-Friendly Reinforcement Learning
- Authors: Nan Wu, Adrien Vincent, Dmitri Strukov, Yuan Xie
- Abstract summary: We propose a memristive neuromorphic hardware implementation for the actor-critic algorithm in reinforcement learning.
We consider the task of balancing an inverted pendulum, a classical problem in both RL and control theory.
We believe that this study shows the promise of using memristor-based hardware neural networks for handling complex tasks through in-situ reinforcement learning.
- Score: 14.853739554366351
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, significant progress has been made in solving sophisticated
problems across various domains by using reinforcement learning (RL), which
allows machines or agents to learn from interactions with environments rather
than from explicit supervision. As the end of Moore's law appears imminent,
emerging technologies that enable high-performance neuromorphic hardware
systems are attracting increasing attention. In particular, neuromorphic
architectures that leverage memristors, programmable and nonvolatile
two-terminal devices, as synaptic weights in hardware neural networks are
candidates of choice for realizing such highly energy-efficient and complex
nervous systems.
However, one of the challenges for memristive hardware with integrated learning
capabilities is the prohibitively large number of write cycles that might be
required during the learning process, a situation that is exacerbated further
in RL settings. In this work we propose a memristive neuromorphic hardware
implementation of the actor-critic algorithm in RL. By introducing a two-fold
training procedure (i.e., ex-situ pre-training and in-situ re-training) and
several training techniques, the number of weight updates can be significantly
reduced, making the approach suitable for efficient in-situ learning
implementations. As a case study, we consider the task of balancing an inverted
pendulum, a classical problem in both RL and control theory. We believe that
this study shows the promise of using memristor-based hardware neural networks
to handle complex tasks through in-situ reinforcement learning.
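The abstract describes the approach only in prose; the following is a minimal sketch of the two-fold training idea, where ex-situ pre-training runs with unrestricted updates in simulation and in-situ re-training commits only updates above a write threshold to limit memristor write cycles. All names, sizes, and the exact form of the threshold trick are illustrative assumptions, not the authors' implementation.
```python
# Minimal actor-critic sketch for an inverted-pendulum (CartPole-style) task.
# Illustrative assumptions throughout: network sizes, learning rates, and the
# write-threshold trick are stand-ins for the paper's actual techniques.
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, N_ACTIONS = 4, 2

W_actor = rng.normal(0.0, 0.1, (STATE_DIM, N_ACTIONS))  # "memristor" weights
W_critic = rng.normal(0.0, 0.1, (STATE_DIM, 1))

def policy(s):
    logits = s @ W_actor
    p = np.exp(logits - logits.max())
    return p / p.sum()                      # softmax action probabilities

def td_update(s, a, r, s_next, done, lr=0.01, gamma=0.99, write_threshold=0.0):
    """One actor-critic step; updates below write_threshold are skipped,
    emulating a cap on in-situ memristor write cycles."""
    global W_actor, W_critic
    v = (s @ W_critic).item()
    v_next = 0.0 if done else (s_next @ W_critic).item()
    delta = r + gamma * v_next - v          # TD error drives both networks
    grad_logp = -policy(s)
    grad_logp[a] += 1.0                     # d log pi(a|s) / d logits
    dW_a = lr * delta * np.outer(s, grad_logp)
    dW_c = lr * delta * s[:, None]
    W_actor += np.where(np.abs(dW_a) > write_threshold, dW_a, 0.0)
    W_critic += np.where(np.abs(dW_c) > write_threshold, dW_c, 0.0)
```
Under this sketch, ex-situ pre-training would run with write_threshold=0.0 in simulation, while in-situ re-training would raise the threshold so that only significant updates are written to the devices, trading a little learning signal for far fewer write cycles.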
Related papers
- Enhancing Hardware Fault Tolerance in Machines with Reinforcement Learning Policy Gradient Algorithms [2.473948454680334]
Reinforcement learning-based robotic control offers a new perspective on achieving hardware fault tolerance.
This paper investigates the potential of two state-of-the-art reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC).
We show that PPO exhibits the fastest adaptation when retaining the knowledge within its models, while SAC performs best when discarding all acquired knowledge.
arXiv Detail & Related papers (2024-07-21T22:24:16Z)
- Neuro-mimetic Task-free Unsupervised Online Learning with Continual Self-Organizing Maps [56.827895559823126]
The self-organizing map (SOM) is a neural model often used for clustering and dimensionality reduction; a generic SOM update step is sketched after this entry.
We propose a generalization of the SOM, the continual SOM, which is capable of online unsupervised learning under a low memory budget.
Our results on benchmarks including MNIST, Kuzushiji-MNIST, and Fashion-MNIST show almost a twofold increase in accuracy.
arXiv Detail & Related papers (2024-02-19T19:11:22Z)
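For context on the base model in the entry above, here is a plain-vanilla SOM update step. This is the classic algorithm, not the continual SOM the paper proposes, and the parameter names are generic.
```python
# Generic self-organizing map (SOM) update step: find the best-matching unit,
# then pull it and its grid neighbors toward the input. Not the continual SOM.
import numpy as np

def som_step(weights, x, lr=0.1, sigma=1.0):
    """weights: (rows, cols, dim) grid of units; x: (dim,) input sample."""
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(dists.argmin(), dists.shape)   # best-matching unit
    rows, cols = np.indices(dists.shape)
    grid_dist2 = (rows - bmu[0]) ** 2 + (cols - bmu[1]) ** 2
    h = np.exp(-grid_dist2 / (2 * sigma ** 2))            # neighborhood kernel
    weights += lr * h[..., None] * (x - weights)          # pull units toward x
    return weights

W = som_step(np.random.rand(8, 8, 3), np.random.rand(3))  # 8x8 map, 3-d data
```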
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals but evaluated on arbitrarily large signals with little to no performance degradation; a sketch of this train-small, evaluate-large property follows this entry.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
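The train-small, evaluate-large claim in the entry above rests on a standard property of fully convolutional networks: with no fixed-size dense layer, the same weights apply to inputs of any length. A toy PyTorch illustration; the architecture here is an assumption for this sketch, not the paper's model.
```python
# A fully convolutional 1-D network: trained on short windows, it accepts
# arbitrarily long signals at evaluation time with the same weights.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=1),
)

small = torch.randn(8, 1, 64)      # training windows of length 64
large = torch.randn(1, 1, 10000)   # much longer signal at evaluation
assert net(small).shape[-1] == 64
assert net(large).shape[-1] == 10000   # same weights, arbitrary length
```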
- Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics [77.34726150561087]
Recent developments in artificial neural networks, particularly deep learning (DL), are reviewed in detail.
Both hybrid and pure machine learning (ML) methods are discussed.
The history and limitations of AI are recounted and discussed, with particular attention to pointing out misstatements and misconceptions about the classics.
arXiv Detail & Related papers (2022-12-18T02:03:00Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch; basic snnTorch usage is sketched after this entry.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
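snnTorch is an open-source, pip-installable package; a minimal leaky integrate-and-fire simulation with its standard API looks roughly like this. This is generic usage, and it is assumed here that the IPU-optimized release described above keeps the same interface.
```python
# Minimal snnTorch usage: simulate a leaky integrate-and-fire (LIF) layer
# over a sequence of time steps.
import torch
import snntorch as snn

lif = snn.Leaky(beta=0.9)           # beta: membrane decay factor
mem = lif.init_leaky()              # initial membrane potential
inputs = torch.rand(100, 32) * 0.5  # 100 time steps, 32 neurons

spikes = []
for t in range(inputs.shape[0]):
    spk, mem = lif(inputs[t], mem)  # integrate input, spike at threshold
    spikes.append(spk)
print(torch.stack(spikes).mean())   # average firing rate
```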
- Learning to Modulate Random Weights: Neuromodulation-inspired Neural Networks For Efficient Continual Learning [1.9580473532948401]
We introduce a novel neural network architecture inspired by neuromodulation in biological nervous systems.
We show that this approach has strong learning performance per task despite the very small number of learnable parameters.
arXiv Detail & Related papers (2022-04-08T21:12:13Z)
- Online Training of Spiking Recurrent Neural Networks with Phase-Change Memory Synapses [1.9809266426888898]
Training spiking recurrent neural networks (RNNs) on dedicated neuromorphic hardware is still an open challenge.
We present a simulation framework of differential-architecture arrays based on an accurate and comprehensive Phase-Change Memory (PCM) device model.
We train a spiking RNN whose weights are emulated in the presented simulation framework, using a recently proposed e-prop learning rule.
arXiv Detail & Related papers (2021-08-04T01:24:17Z)
- Deep Reinforcement Learning with Population-Coded Spiking Neural Network for Continuous Control [0.0]
We propose a population-coded spiking actor network (PopSAN) trained in conjunction with a deep critic network using deep reinforcement learning (DRL); a sketch of population coding follows this entry.
We deployed the trained PopSAN on Intel's Loihi neuromorphic chip and benchmarked our method against the mainstream DRL algorithms for continuous control.
Our results support the efficiency of neuromorphic controllers and suggest our hybrid RL method as an alternative to deep learning when both energy efficiency and robustness are important.
arXiv Detail & Related papers (2020-10-19T16:20:45Z)
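The core idea behind a population-coded actor is to represent each continuous dimension with a group of neurons whose Gaussian receptive fields tile the value range; a decoder reverses this to produce continuous actions. A simplified encoder sketch, with all parameters assumed for illustration rather than taken from the released PopSAN code:
```python
# Population coding sketch: each continuous state dimension is encoded by a
# population of neurons with Gaussian receptive fields tiling its range.
import numpy as np

def population_encode(state, pop_size=10, low=-1.0, high=1.0, sigma=0.15):
    """Map each state dimension to pop_size firing intensities in [0, 1]."""
    centers = np.linspace(low, high, pop_size)       # neurons' preferred values
    # (dims, pop_size) matrix of Gaussian activations
    acts = np.exp(-((state[:, None] - centers) ** 2) / (2 * sigma ** 2))
    return acts.reshape(-1)                          # flatten for the SNN input

print(population_encode(np.array([0.2, -0.7])))      # 2 dims -> 20 intensities
```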
- Surrogate gradients for analog neuromorphic computing [2.6475944316982942]
We show that learning self-corrects for device mismatch, resulting in competitive spiking network performance on vision and speech benchmarks; the surrogate-gradient trick itself is sketched after this entry.
Our work sets several new benchmarks for low-energy spiking network processing on analog neuromorphic hardware.
arXiv Detail & Related papers (2020-06-12T14:45:12Z)
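The surrogate-gradient technique referenced above replaces the undefined derivative of the hard spiking threshold with a smooth stand-in during backpropagation. A minimal PyTorch sketch using a fast-sigmoid-style surrogate; the specific surrogate shape and slope are assumptions, and the paper's contribution is applying the idea on analog hardware rather than this generic form.
```python
# Surrogate gradient: forward pass uses a hard threshold, backward pass
# substitutes a smooth derivative so gradients can flow through spikes.
import torch

class SpikeFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()              # non-differentiable spike

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        surrogate = 1.0 / (1.0 + 10.0 * v.abs()) ** 2  # fast-sigmoid derivative
        return grad_out * surrogate

v = torch.randn(5, requires_grad=True)
SpikeFn.apply(v).sum().backward()
print(v.grad)                               # nonzero despite the hard threshold
```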
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey [53.429871539789445]
Spiking neural networks are cognitive algorithms that mimic the operational principles of neurons and synapses.
We present the state of the art of hardware implementations of spiking neural networks.
We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
arXiv Detail & Related papers (2020-05-04T13:24:00Z)
- Learn2Perturb: an End-to-end Feature Perturbation Learning to Improve Adversarial Robustness [79.47619798416194]
Learn2Perturb is an end-to-end feature perturbation learning approach for improving the adversarial robustness of deep neural networks.
Inspired by Expectation-Maximization, an alternating back-propagation training algorithm is introduced to train the network and noise parameters in turn; a sketch of this alternating scheme follows this entry.
arXiv Detail & Related papers (2020-03-02T18:27:35Z)
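The alternating scheme mentioned in the entry above can be pictured as two optimizers taking turns: one updates the network on the task loss, the other updates learnable noise scales with a term that rewards larger perturbations. This is a schematic sketch only; the loss terms and their weighting are assumptions, not the Learn2Perturb objective.
```python
# Alternating training of network weights and injected-noise parameters.
import torch
import torch.nn as nn

net = nn.Linear(10, 2)
log_sigma = nn.Parameter(torch.zeros(10))    # learnable per-feature noise scale
opt_net = torch.optim.SGD(net.parameters(), lr=0.1)
opt_noise = torch.optim.SGD([log_sigma], lr=0.1)

def noisy_loss(x, y):
    x_pert = x + torch.randn_like(x) * log_sigma.exp()   # feature perturbation
    return nn.functional.cross_entropy(net(x_pert), y)

x, y = torch.randn(16, 10), torch.randint(0, 2, (16,))
for step in range(10):
    opt_net.zero_grad(); opt_noise.zero_grad()
    if step % 2 == 0:                        # network step: fit the task
        noisy_loss(x, y).backward()
        opt_net.step()
    else:                                    # noise step: stay accurate while
        (noisy_loss(x, y) - 0.01 * log_sigma.mean()).backward()
        opt_noise.step()                     # encouraging larger perturbations
```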
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.