Deep Reinforcement Learning with Quantum-inspired Experience Replay
- URL: http://arxiv.org/abs/2101.02034v1
- Date: Wed, 6 Jan 2021 13:52:04 GMT
- Title: Deep Reinforcement Learning with Quantum-inspired Experience Replay
- Authors: Qing Wei, Hailan Ma, Chunlin Chen, Daoyi Dong
- Abstract summary: A novel training paradigm inspired by quantum computation is proposed for deep reinforcement learning (DRL) with experience replay.
The proposed deep reinforcement learning with quantum-inspired experience replay (DRL-QER) adaptively chooses experiences from the replay buffer according to the complexity and the number of times each experience (also called a transition) has been replayed.
The experimental results on Atari 2600 games show that DRL-QER outperforms state-of-the-art algorithms such as DRL-PER and DCRL on most of these games, with improved training efficiency.
- Score: 6.833294755109369
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, a novel training paradigm inspired by quantum computation is proposed for deep reinforcement learning (DRL) with experience replay. In contrast to the traditional experience replay mechanism in DRL, the proposed deep reinforcement learning with quantum-inspired experience replay (DRL-QER) adaptively chooses experiences from the replay buffer according to the complexity and the number of replays of each experience (also called a transition), to achieve a balance between exploration and exploitation. In DRL-QER, transitions are first formulated in quantum representations, and then the preparation operation and the depreciation operation are performed on the transitions. In this process, the preparation operation reflects the relationship between the temporal-difference errors (TD-errors) and the importance of the experiences, while the depreciation operation is taken into account to ensure the diversity of the transitions. The experimental results on Atari 2600 games show that DRL-QER outperforms state-of-the-art algorithms such as DRL-PER and DCRL on most of these games with improved training efficiency, and is also applicable to memory-based DRL approaches such as the double network and the dueling network.
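The two operations above suggest a concrete sampling rule. Below is a minimal Python sketch that mirrors their roles only: a "preparation" map from |TD-error| to a sampling amplitude, and a "depreciation" decay with replay count. The tanh/exponential forms and the Born-rule-style normalisation are illustrative assumptions, not the paper's exact quantum operators.

```python
import numpy as np

class QuantumInspiredReplayBuffer:
    """Illustrative priority replay: 'preparation' raises priority with
    |TD-error|, 'depreciation' lowers it with replay count. The exact
    operators in DRL-QER differ; this only mirrors the two roles."""

    def __init__(self, capacity, k=0.5):
        self.capacity = capacity
        self.k = k  # depreciation strength (assumed form)
        self.data, self.amp, self.replays = [], [], []

    def add(self, transition, td_error):
        if len(self.data) >= self.capacity:  # drop the oldest transition
            self.data.pop(0); self.amp.pop(0); self.replays.pop(0)
        self.data.append(transition)
        self.amp.append(self._prepare(abs(td_error)))
        self.replays.append(0)

    def _prepare(self, abs_td):
        # Preparation: monotone map from |TD-error| to a positive amplitude.
        return np.tanh(abs_td) + 1e-3

    def sample(self, batch_size):
        # Depreciation: amplitudes shrink with how often a transition has
        # been replayed, keeping rarely seen transitions in play (diversity).
        a = np.array(self.amp) * np.exp(-self.k * np.array(self.replays))
        probs = a**2 / np.sum(a**2)  # Born-rule-style normalisation
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        for i in idx:
            self.replays[i] += 1
        return [self.data[i] for i in idx], idx

    def update(self, idx, td_errors):
        # Re-prepare amplitudes after new TD-errors are computed.
        for i, td in zip(idx, td_errors):
            self.amp[i] = self._prepare(abs(td))
```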
Related papers
- CIER: A Novel Experience Replay Approach with Causal Inference in Deep Reinforcement Learning [11.13226491866178]
We propose a novel approach to segment time series into meaningful subsequences and represent the time series based on these subsequences.
The subsequences are employed for causal inference to identify fundamental causal factors that significantly impact training outcomes.
Several experiments demonstrate the feasibility of our approach in common environments, confirming its ability to enhance the effectiveness of DRL training and impart a certain level of explainability to the training process.
arXiv Detail & Related papers (2024-05-14T07:23:10Z)
- Replay across Experiments: A Natural Extension of Off-Policy RL [18.545939667810565]
We present an effective yet simple framework to extend the use of replays across multiple experiments.
At its core, Replay Across Experiments (RaE) involves reusing experience from previous experiments to improve exploration and bootstrap learning.
We empirically show benefits across a number of RL algorithms and challenging control domains spanning both locomotion and manipulation.
arXiv Detail & Related papers (2023-11-27T15:57:11Z)
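Reusing experience across experiments reduces to mixing two buffers at sampling time. A minimal sketch, assuming a fixed mixing ratio between reloaded and fresh data (the ratio and the loading mechanism are assumptions, not RaE's exact scheme):

```python
import random

class MixedReplay:
    """Sketch of replay across experiments: draw part of each batch from a
    buffer reloaded from earlier runs, the rest from the current run."""

    def __init__(self, prior_transitions, mix=0.5):
        self.prior = list(prior_transitions)  # reloaded from past experiments
        self.fresh = []                       # filled by the current run
        self.mix = mix                        # fraction of batch from prior data

    def add(self, transition):
        self.fresh.append(transition)

    def sample(self, batch_size):
        n_prior = int(self.mix * batch_size) if self.prior else 0
        batch = random.sample(self.prior, min(n_prior, len(self.prior)))
        batch += random.sample(self.fresh,
                               min(batch_size - len(batch), len(self.fresh)))
        return batch
```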
- Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning [69.19840497497503]
It is argued that the commonly used action-matching principle is more an explanation of deep neural networks (DNNs) than an interpretation of RL agents.
We propose to take rewards, the essential objective of RL agents, as the objective for interpreting RL agents.
We verify and evaluate our method on the Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment.
arXiv Detail & Related papers (2023-09-04T09:09:54Z)
- Temporal Difference Learning with Experience Replay [3.5823366350053325]
Temporal-difference (TD) learning is widely regarded as one of the most popular algorithms in reinforcement learning (RL).
We present a simple decomposition of the Markovian noise terms and provide finite-time error bounds for TD-learning with experience replay.
arXiv Detail & Related papers (2023-06-16T10:25:43Z)
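For concreteness, here is a tabular TD(0) variant that updates from a sampled mini-batch of stored transitions rather than only the latest one. This is an illustrative setup under an assumed `env_step` interface, not the exact process analysed in the paper:

```python
import random
import numpy as np

def td0_with_replay(env_step, n_states, episodes=200, buffer_size=10_000,
                    batch=32, alpha=0.1, gamma=0.99):
    """Tabular TD(0) value estimation with replayed updates. `env_step(s)`
    is an assumed interface returning (next_state, reward, done), with
    state 0 as the start state."""
    V = np.zeros(n_states)
    buffer = []
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            s2, r, done = env_step(s)
            buffer.append((s, r, s2, done))
            if len(buffer) > buffer_size:
                buffer.pop(0)
            # Replayed TD(0) updates on a sampled mini-batch.
            for (si, ri, si2, d) in random.sample(buffer,
                                                  min(batch, len(buffer))):
                target = ri + (0 if d else gamma * V[si2])
                V[si] += alpha * (target - V[si])
            s = s2
    return V
```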
- Convergence Results For Q-Learning With Experience Replay [51.11953997546418]
We provide a convergence rate guarantee, and discuss how it compares to the convergence of Q-learning depending on important parameters such as the frequency and number of iterations of replay.
We also provide theoretical evidence showing when we might expect this to strictly improve performance, by introducing and analyzing a simple class of MDPs.
arXiv Detail & Related papers (2021-12-08T10:22:49Z)
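The two parameters the analysis highlights, how often replay happens and how many replayed updates are done each time, appear as explicit knobs in this tabular sketch (the `env.reset()`/`env.step(a)` interface returning `(next_state, reward, done)` is an assumption):

```python
import random
import numpy as np

def q_learning_with_replay(env, n_states, n_actions, steps=50_000,
                           replay_every=4, replay_iters=8,
                           alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning with replay, exposing replay frequency
    (replay_every) and replayed iterations (replay_iters) as knobs."""
    Q = np.zeros((n_states, n_actions))
    buffer, s = [], env.reset()
    for t in range(steps):
        # Epsilon-greedy action selection.
        a = random.randrange(n_actions) if random.random() < eps \
            else int(Q[s].argmax())
        s2, r, done = env.step(a)
        buffer.append((s, a, r, s2, done))
        if t % replay_every == 0:
            # Replayed Q-updates on transitions drawn with replacement.
            for (si, ai, ri, si2, d) in random.choices(buffer, k=replay_iters):
                target = ri + (0 if d else gamma * Q[si2].max())
                Q[si, ai] += alpha * (target - Q[si, ai])
        s = env.reset() if done else s2
    return Q
```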
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER), a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z)
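The core trick is to freeze the convolutional encoder once its features stabilise and then store low-dimensional embeddings instead of raw frames. A PyTorch sketch of the storage side (the freezing criterion and the architecture are assumptions here):

```python
import torch

class FrozenEncoderBuffer:
    """Sketch of the SEER idea: after the encoder is frozen, store its
    embeddings instead of raw image observations, shrinking replay memory
    and skipping repeated encoder forward passes during updates."""

    def __init__(self, encoder: torch.nn.Module):
        self.encoder = encoder.eval()
        for p in self.encoder.parameters():   # freeze: no more gradients
            p.requires_grad_(False)
        self.storage = []

    @torch.no_grad()
    def add(self, obs, action, reward, next_obs, done):
        # Encode once at insertion time; later updates reuse the embedding.
        z = self.encoder(obs.unsqueeze(0)).squeeze(0)
        z2 = self.encoder(next_obs.unsqueeze(0)).squeeze(0)
        self.storage.append((z, action, reward, z2, done))
```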
- Stratified Experience Replay: Correcting Multiplicity Bias in Off-Policy Reinforcement Learning [17.3794999533024]
We show that deep RL appears to struggle in the presence of extraneous data.
Recent works have shown that the performance of Deep Q-Network (DQN) degrades when its replay memory becomes too large.
We re-examine the motivation for sampling uniformly over a replay memory, and find that it may be flawed when using function approximation.
arXiv Detail & Related papers (2021-02-22T19:29:18Z)
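One way to correct such multiplicity bias is to sample uniformly over strata rather than over stored transitions, so duplicated experiences are not replayed proportionally more often. A sketch, assuming transitions are grouped by their state-action pair (the grouping key is an illustrative choice):

```python
import random
from collections import defaultdict

class StratifiedReplay:
    """Sketch of stratified sampling: group transitions by state-action key
    and sample uniformly over groups, so a pair stored many times does not
    dominate the replayed batches."""

    def __init__(self):
        self.groups = defaultdict(list)

    def add(self, s, a, r, s2, done):
        self.groups[(s, a)].append((s, a, r, s2, done))

    def sample(self, batch_size):
        keys = random.choices(list(self.groups), k=batch_size)  # uniform over strata
        return [random.choice(self.groups[k]) for k in keys]
```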
- Learning to Sample with Local and Global Contexts in Experience Replay Buffer [135.94190624087355]
We propose a new learning-based sampling method that can compute the relative importance of each transition.
We show that our framework can significantly improve the performance of various off-policy reinforcement learning methods.
arXiv Detail & Related papers (2020-07-14T21:12:56Z)
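A learned sampler can be as small as a scoring network whose softmax over the buffer yields sampling probabilities. The feature construction and the training signal below are assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

class TransitionScorer(nn.Module):
    """Sketch of learning-based replay sampling: a small network scores each
    transition's feature vector; a softmax over all scores gives sampling
    probabilities for the whole buffer (the 'global' context)."""

    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def sampling_probs(self, feats):
        # feats: [buffer_size, feat_dim] per-transition features (assumed).
        scores = self.net(feats).squeeze(-1)
        return torch.softmax(scores, dim=0)

# usage: probs = scorer.sampling_probs(buffer_feats)
#        idx = torch.multinomial(probs, num_samples=32, replacement=True)
```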
- Revisiting Fundamentals of Experience Replay [91.24213515992595]
We present a systematic and extensive analysis of experience replay in Q-learning methods.
We focus on two fundamental properties: the replay capacity and the ratio of learning updates to experience collected.
arXiv Detail & Related papers (2020-07-13T21:22:17Z)
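The two properties are plain hyperparameters, and a derived quantity ties them together. A short snippet with assumed names:

```python
# The two properties as plain hyperparameters (names are assumptions):
replay_capacity = 1_000_000  # max transitions kept in the buffer
replay_ratio = 0.25          # gradient updates per environment step

# With a full buffer, the oldest stored transition was collected
# replay_capacity environment steps ago, i.e. this many updates ago:
oldest_policy_age = replay_capacity * replay_ratio
```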
- Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning [67.34810824996887]
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments.
We propose Iterated Relearning (ITER) to improve generalisation of deep RL agents.
arXiv Detail & Related papers (2020-06-10T13:26:31Z)
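ITER's relearning step can be pictured as distilling the current agent into a freshly initialised network, so the student never trains through the transient non-stationarity the teacher experienced. A hedged PyTorch sketch (the loss choice and schedule are assumptions):

```python
import torch
import torch.nn.functional as F

def iterated_relearning(make_net, teacher, replay_states, distill_steps=1_000):
    """Sketch of the ITER idea: periodically re-initialise the network and
    distill the current agent's outputs into it. `make_net` builds a fresh
    network; `replay_states()` returns a batch of states (assumed helpers)."""
    student = make_net()  # fresh random initialisation
    opt = torch.optim.Adam(student.parameters(), lr=3e-4)
    for _ in range(distill_steps):
        s = replay_states()
        with torch.no_grad():
            target = teacher(s)                # teacher outputs as targets
        loss = F.mse_loss(student(s), target)  # match the teacher
        opt.zero_grad(); loss.backward(); opt.step()
    return student
```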