Efficient Off-Policy Reinforcement Learning via Brain-Inspired Computing
- URL: http://arxiv.org/abs/2205.06978v3
- Date: Wed, 21 Jun 2023 09:29:11 GMT
- Title: Efficient Off-Policy Reinforcement Learning via Brain-Inspired Computing
- Authors: Yang Ni, Danny Abraham, Mariam Issa, Yeseong Kim, Pietro Mercati,
Mohsen Imani
- Abstract summary: We propose QHD, an off-policy, value-based Hyperdimensional Reinforcement Learning method that mimics brain properties toward robust and real-time learning.
QHD relies on a lightweight brain-inspired model to learn an optimal policy in an unknown environment.
Our evaluation shows QHD's capability for real-time learning, providing a 34.6x speedup and significantly better quality of learning than DQN.
- Score: 9.078553427792183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement Learning (RL) has opened up new opportunities to enhance
existing smart systems that generally include a complex decision-making
process. However, modern RL algorithms, e.g., Deep Q-Networks (DQN), are based
on deep neural networks, resulting in high computational costs. In this paper,
we propose QHD, an off-policy, value-based Hyperdimensional Reinforcement
Learning method that mimics brain properties toward robust and real-time learning.
QHD relies on a lightweight brain-inspired model to learn an optimal policy in
an unknown environment. On both desktop and power-limited embedded platforms,
QHD achieves significantly better overall efficiency than DQN while providing
higher or comparable rewards. QHD is also well suited to highly efficient
reinforcement learning, with great potential for online and real-time learning.
Our solution supports a small experience replay batch size that provides a
12.3x speedup over DQN while ensuring minimal quality loss. Our evaluation
shows QHD's capability for real-time learning, providing a 34.6x speedup and
significantly better quality of learning than DQN.
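The abstract does not spell out the model, but the general shape of hyperdimensional Q-learning can be sketched as follows: states are encoded into high-dimensional vectors, one model hypervector per action stores Q-value information, and a delta-rule update moves the acting model toward the TD target. Everything below (the cosine encoder, dimension, and learning rate) is an illustrative assumption, not the paper's exact design.

```python
# Minimal sketch of a hyperdimensional (HDC) Q-model. The random-projection
# encoder and all hyperparameters are illustrative assumptions.
import numpy as np

class HDQModel:
    def __init__(self, state_dim, n_actions, hd_dim=10_000, lr=0.05, gamma=0.99, seed=0):
        rng = np.random.default_rng(seed)
        self.proj = rng.normal(size=(hd_dim, state_dim))    # fixed random projection
        self.bias = rng.uniform(0, 2 * np.pi, size=hd_dim)  # random phase shifts
        self.models = np.zeros((n_actions, hd_dim))         # one hypervector per action
        self.lr, self.gamma = lr, gamma

    def encode(self, state):
        # Nonlinear random-projection encoding of the raw state into hyperspace.
        return np.cos(self.proj @ state + self.bias)

    def q_values(self, state):
        # Q(s, a) is the similarity between the encoded state and each action model.
        return self.models @ self.encode(state)

    def update(self, s, a, r, s_next, done):
        h = self.encode(s)
        target = r + (0.0 if done else self.gamma * self.q_values(s_next).max())
        td_error = target - self.models[a] @ h
        # Delta-rule (bundling) update: nudge the chosen action's model toward the target.
        self.models[a] += self.lr * td_error * h
```

Because the encoder is fixed and the update is a single scaled vector addition, each learning step avoids backpropagation entirely, which is consistent with the efficiency claims above.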
Related papers
- Lifting the Veil: Unlocking the Power of Depth in Q-learning [31.700583180829106]
Deep Q-learning has been widely used in operations research and management science.
This paper theoretically verifies the power of depth in deep Q-learning.
arXiv Detail & Related papers (2023-10-27T06:15:33Z)
- Quantum Imitation Learning [74.15588381240795]
We propose quantum imitation learning (QIL) with the hope of utilizing quantum advantage to speed up IL.
We develop two QIL algorithms, quantum behavioural cloning (Q-BC) and quantum generative adversarial imitation learning (Q-GAIL).
Experiment results demonstrate that both Q-BC and Q-GAIL achieve performance comparable to their classical counterparts.
arXiv Detail & Related papers (2023-04-04T12:47:35Z)
- Extreme Q-Learning: MaxEnt RL without Entropy [88.97516083146371]
Modern Deep Reinforcement Learning (RL) algorithms require estimates of the maximal Q-value, which are difficult to compute in continuous domains.
We introduce a new update rule for online and offline RL that directly models the maximal value using Extreme Value Theory (EVT).
Using EVT, we derive the Extreme Q-Learning framework and, consequently, online and, for the first time, offline MaxEnt Q-learning algorithms; a minimal sketch of the underlying idea follows.
arXiv Detail & Related papers (2023-01-05T23:14:38Z)
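As a rough illustration of how EVT lets one model the maximal value without an explicit max: if Q-value errors are Gumbel-distributed, the convex "Gumbel regression" loss below is minimized exactly at the soft-maximum beta * log E[exp(q / beta)]. The loss form matches the EVT motivation described above, but the gradient-descent estimator and all constants are illustrative assumptions, not the paper's training setup.

```python
# Sketch of Gumbel regression for estimating a soft-maximum of Q samples.
import numpy as np

def gumbel_regression_loss(q_samples, v, beta=1.0):
    # L(v) = E[exp(z) - z - 1], z = (q - v) / beta; convex in v,
    # minimized when v equals the soft-maximum of the q samples.
    z = (q_samples - v) / beta
    return np.mean(np.exp(z) - z - 1.0)

def soft_max_estimate(q_samples, beta=1.0, lr=0.1, steps=500):
    v = float(np.mean(q_samples))  # crude initialization
    for _ in range(steps):
        z = (q_samples - v) / beta
        grad = np.mean(-(np.exp(z) - 1.0) / beta)  # dL/dv
        v -= lr * grad
    return v  # approaches beta * log(mean(exp(q_samples / beta)))
```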
- M$^2$DQN: A Robust Method for Accelerating Deep Q-learning Network [6.689964384669018]
We propose a framework that uses a max-mean loss in the Deep Q-Network (M$^2$DQN).
Instead of sampling one batch of experiences in the training step, we sample several batches from the experience replay and update the parameters such that the maximum TD-error of these batches is minimized.
We verify the effectiveness of this framework with one of the most widely used techniques, Double DQN (DDQN), in several Gym games; a sketch of the batch-selection step follows.
arXiv Detail & Related papers (2022-09-16T09:20:35Z)
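A minimal sketch of the update described above, assuming standard PyTorch-style Q-networks and a replay buffer with a sample() method (both placeholders, not the paper's code): several batches are drawn, and the gradient step is taken on the batch with the largest mean TD loss.

```python
# Sketch of a max-mean update: step on the worst of several sampled batches.
import torch
import torch.nn.functional as F

def max_mean_update(qnet, target_net, optimizer, replay, n_batches=4,
                    batch_size=32, gamma=0.99):
    losses = []
    for _ in range(n_batches):
        s, a, r, s2, done = replay.sample(batch_size)  # assumed replay API
        with torch.no_grad():
            target = r + gamma * (1 - done) * target_net(s2).max(dim=1).values
        q = qnet(s).gather(1, a.unsqueeze(1)).squeeze(1)
        losses.append(F.smooth_l1_loss(q, target))
    loss = torch.stack(losses).max()  # minimize the worst batch's TD loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```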
- Deep Reinforcement Learning with Spiking Q-learning [51.386945803485084]
Spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption.
Combining SNNs with deep reinforcement learning (RL) provides a promising, energy-efficient approach to realistic control tasks.
arXiv Detail & Related papers (2022-01-21T16:42:11Z)
- Human-Level Control through Directly-Trained Deep Spiking Q-Networks [16.268397551693862]
Spiking Neural Networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency.
We propose a directly-trained deep spiking reinforcement learning architecture based on Leaky Integrate-and-Fire (LIF) neurons and the Deep Q-Network.
Our work is the first to achieve state-of-the-art performance on multiple Atari games with a directly-trained SNN; a minimal sketch of the LIF dynamics appears below.
arXiv Detail & Related papers (2021-12-13T09:46:17Z)
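For readers unfamiliar with the neuron model, here is a minimal discrete-time leaky integrate-and-fire layer of the kind such architectures build on; the decay constant, threshold, and hard reset are illustrative choices, and the surrogate gradient needed for direct training is omitted for brevity.

```python
# Sketch of a discrete-time LIF layer: leak, integrate, fire, reset.
import numpy as np

def lif_forward(currents, decay=0.9, threshold=1.0):
    # currents: array of shape (T, n_neurons), input current per timestep.
    v = np.zeros(currents.shape[1])
    spikes = np.zeros_like(currents)
    for t, i_t in enumerate(currents):
        v = decay * v + i_t                   # leaky integration
        spikes[t] = (v >= threshold)          # fire on crossing the threshold
        v = np.where(spikes[t] > 0, 0.0, v)   # hard reset after a spike
    return spikes  # a Q-head would read values out of downstream spike rates
```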
- Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs [50.75812033462294]
We bridge the gap between the practical success of Q-learning and pessimistic theoretical results.
We present two novel methods, Q-Rex and Q-RexDaRe.
We show that Q-Rex efficiently finds the optimal policy for linear MDPs; a sketch of the reverse-replay ordering follows.
arXiv Detail & Related papers (2021-10-16T01:47:41Z)
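The summary does not detail Q-Rex itself, but the core reverse-experience-replay idea can be illustrated on tabular Q-learning: replaying an episode last-to-first lets the terminal reward propagate through the whole trajectory in a single pass. The tabular setting is an assumption for illustration; Q-Rex itself targets linear MDPs and uses target networks.

```python
# Sketch of reverse experience replay on a tabular Q-function.
import numpy as np

def reverse_replay_update(Q, episode, alpha=0.1, gamma=0.99):
    # episode: list of (state, action, reward, next_state, done) tuples,
    # in the order they were experienced.
    for s, a, r, s2, done in reversed(episode):
        target = r + (0.0 if done else gamma * np.max(Q[s2]))
        Q[s, a] += alpha * (target - Q[s, a])  # value flows backward in one sweep
    return Q
```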
- Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning [114.35801511501639]
We present DrQ-v2, a model-free reinforcement learning algorithm for visual continuous control.
DrQ-v2 builds on DrQ, an off-policy actor-critic approach that uses data augmentation to learn directly from pixels.
Notably, DrQ-v2 is able to solve complex humanoid locomotion tasks directly from pixel observations; a sketch of the random-shift augmentation follows.
arXiv Detail & Related papers (2021-07-20T17:29:13Z)
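The data augmentation at the heart of the DrQ line of work is a random shift: pad the image with replicated edge pixels, then crop back to the original size at a random offset. The sketch below assumes channel-first numpy images and a pad of 4 pixels, which is common practice but not confirmed by the summary.

```python
# Sketch of random-shift augmentation for pixel observations.
import numpy as np

def random_shift(obs, pad=4, rng=np.random.default_rng()):
    # obs: (C, H, W) pixel observation.
    c, h, w = obs.shape
    padded = np.pad(obs, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    top = rng.integers(0, 2 * pad + 1)   # random vertical offset
    left = rng.integers(0, 2 * pad + 1)  # random horizontal offset
    return padded[:, top:top + h, left:left + w]
```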
- Accelerating Real-Time Question Answering via Question Generation [98.43852668033595]
Ocean-Q introduces a new question generation (QG) model to generate a large pool of QA pairs offline.
At run time, it matches an input question against the candidate QA pool to predict the answer without question encoding.
Ocean-Q can be readily deployed in existing distributed database systems or search engines for large-scale query usage; a sketch of the pool-matching idea follows.
arXiv Detail & Related papers (2020-09-10T22:44:29Z)
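A toy sketch of the offline-pool idea: QA pairs are generated ahead of time, and serving reduces to matching the incoming question against the pool. The Jaccard-overlap scorer below is a hypothetical stand-in for Ocean-Q's matcher, used only to show why no question encoding is needed at run time.

```python
# Sketch of answering by lexical match against a precomputed QA pool.
def best_answer(question, qa_pool):
    # qa_pool: list of (question_text, answer_text) pairs built offline.
    q_tokens = set(question.lower().split())
    def overlap(cand):
        c_tokens = set(cand.lower().split())
        return len(q_tokens & c_tokens) / (len(q_tokens | c_tokens) or 1)
    # Pick the pool question with the highest token overlap; return its answer.
    return max(qa_pool, key=lambda qa: overlap(qa[0]))[1]
```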
- Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents [1.8782750537161614]
Deep Q-Network (DQN) based multi-agent systems (MAS) for reinforcement learning (RL) use various schemes in which the agents have to learn and communicate.
We propose a simple but efficient DQN-based MAS for RL that uses a shared state and rewards but agent-specific actions.
The benefits of the approach are overall simplicity, faster convergence, and better performance compared to conventional DQN-based approaches; a minimal sketch of the shared-state scheme follows this entry.
arXiv Detail & Related papers (2020-08-06T15:16:05Z)
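A minimal sketch of the scheme the summary describes, under the assumption of one small Q-network per agent, a fully shared global state and reward, and binary per-agent actions; the networks and optimizers are placeholders, and the target network a full implementation would use is omitted.

```python
# Sketch of shared-state, shared-reward multi-agent DQN updates.
import torch

def joint_step(agents, optimizers, state, actions, reward, next_state, done,
               gamma=0.99):
    # agents: one Q-network per agent, each mapping the shared state to 2 Q-values
    # actions: each agent's chosen binary action (0 or 1) for this transition
    losses = []
    for qnet, opt, a in zip(agents, optimizers, actions):
        with torch.no_grad():
            # Bootstrapped target from the shared reward and shared next state.
            target = reward + gamma * (1.0 - done) * qnet(next_state).max()
        q = qnet(state)[a]          # Q-value of this agent's own action
        loss = (q - target).pow(2)  # per-agent TD loss
        opt.zero_grad(); loss.backward(); opt.step()
        losses.append(loss.item())
    return losses
```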
This list is automatically generated from the titles and abstracts of the papers on this site.