Revisiting Fundamentals of Experience Replay
- URL: http://arxiv.org/abs/2007.06700v1
- Date: Mon, 13 Jul 2020 21:22:17 GMT
- Title: Revisiting Fundamentals of Experience Replay
- Authors: William Fedus, Prajit Ramachandran, Rishabh Agarwal, Yoshua Bengio,
Hugo Larochelle, Mark Rowland, Will Dabney
- Abstract summary: We present a systematic and extensive analysis of experience replay in Q-learning methods.
We focus on two fundamental properties: the replay capacity and the ratio of learning updates to experience collected.
- Score: 91.24213515992595
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Experience replay is central to off-policy algorithms in deep reinforcement
learning (RL), but there remain significant gaps in our understanding. We
therefore present a systematic and extensive analysis of experience replay in
Q-learning methods, focusing on two fundamental properties: the replay capacity
and the ratio of learning updates to experience collected (replay ratio). Our
additive and ablative studies upend conventional wisdom around experience
replay -- greater capacity is found to substantially increase the performance
of certain algorithms, while leaving others unaffected. Counterintuitively, we
show that theoretically ungrounded, uncorrected n-step returns are uniquely
beneficial while other techniques confer limited benefit for sifting through
larger memory. Separately, by directly controlling the replay ratio we
contextualize previous observations in the literature and empirically measure
its importance across a variety of deep RL algorithms. Finally, we conclude by
testing a set of hypotheses on the nature of these performance benefits.
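The two quantities the abstract studies, replay capacity and replay ratio, together with uncorrected n-step returns, can be made concrete in a few lines of code. The following is a minimal, hypothetical sketch and is not taken from the paper; names such as ReplayBuffer, n_step_return, and the replay_ratio bookkeeping in train are illustrative assumptions for a DQN-style agent.

```python
# Illustrative sketch of the quantities discussed above (not the paper's code):
#   * replay capacity -> maximum number of stored transitions
#   * replay ratio    -> gradient updates per environment transition collected
#   * uncorrected n-step return -> discounted reward sum with no off-policy correction
import random
from collections import deque


class ReplayBuffer:
    def __init__(self, capacity):
        # Replay capacity: once full, the oldest transitions are evicted (FIFO).
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        # transition = (state, action, reward, next_state, done)
        self.storage.append(transition)

    def sample(self, batch_size):
        # Uniform sampling, as in vanilla DQN.
        return random.sample(self.storage, batch_size)

    def __len__(self):
        return len(self.storage)


def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Uncorrected n-step return: discounted sum of the next n rewards plus a
    bootstrapped value at the n-th successor state, with no importance-sampling
    correction for the off-policy gap."""
    g = bootstrap_value
    for r in reversed(rewards):  # rewards = [r_t, r_{t+1}, ..., r_{t+n-1}]
        g = r + gamma * g
    return g


def train(env, agent, buffer, total_steps, replay_ratio=0.25, batch_size=32):
    """Replay ratio = learning updates per transition collected;
    replay_ratio=0.25 means one gradient step every 4 environment steps."""
    updates_owed = 0.0
    state = env.reset()
    for _ in range(total_steps):
        action = agent.act(state)
        next_state, reward, done, _ = env.step(action)
        buffer.add((state, action, reward, next_state, done))
        state = env.reset() if done else next_state
        updates_owed += replay_ratio
        while updates_owed >= 1.0 and len(buffer) >= batch_size:
            agent.update(buffer.sample(batch_size))
            updates_owed -= 1.0
```

Under this framing, increasing the deque's `maxlen` corresponds to the paper's replay-capacity axis, and varying `replay_ratio` corresponds to its update-to-data axis.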
Related papers
- Look Back When Surprised: Stabilizing Reverse Experience Replay for
Neural Approximation [7.6146285961466]
We consider the recently developed and theoretically rigorous reverse experience replay (RER).
We show via experiments that RER performs better than techniques like prioritized experience replay (PER) on various tasks.
arXiv Detail & Related papers (2022-06-07T10:42:02Z) - Practical Recommendations for Replay-based Continual Learning Methods [18.559132470835937]
Continual Learning requires the model to learn from a stream of dynamic, non-stationary data without forgetting previous knowledge.
Replay approaches have empirically proved to be the most effective ones.
arXiv Detail & Related papers (2022-03-19T12:44:44Z) - Replay For Safety [51.11953997546418]
In experience replay, past transitions are stored in a memory buffer and re-used during learning.
We show that using an appropriately biased sampling scheme can allow us to achieve a safe policy.
arXiv Detail & Related papers (2021-12-08T11:10:57Z) - Convergence Results For Q-Learning With Experience Replay [51.11953997546418]
We provide a convergence rate guarantee, and discuss how it compares to the convergence of Q-learning depending on important parameters such as the frequency and number of iterations of replay.
We also provide theoretical evidence showing when we might expect this to strictly improve performance, by introducing and analyzing a simple class of MDPs.
arXiv Detail & Related papers (2021-12-08T10:22:49Z) - Improving Experience Replay with Successor Representation [0.0]
Prioritized experience replay is a reinforcement learning technique shown to speed up learning.
Recent work in neuroscience suggests that, in biological organisms, replay is prioritized by both gain and need.
arXiv Detail & Related papers (2021-11-29T05:25:54Z) - Reducing Representation Drift in Online Continual Learning [87.71558506591937]
We study the online continual learning paradigm, where agents must learn from a changing distribution with constrained memory and compute.
In this work we instead focus on the change in representations of previously observed data due to the introduction of previously unobserved class samples in the incoming data stream.
arXiv Detail & Related papers (2021-04-11T15:19:30Z) - Stratified Experience Replay: Correcting Multiplicity Bias in Off-Policy
Reinforcement Learning [17.3794999533024]
We show that deep RL appears to struggle in the presence of extraneous data.
Recent works have shown that the performance of Deep Q-Network (DQN) degrades when its replay memory becomes too large.
We re-examine the motivation for sampling uniformly over a replay memory, and find that it may be flawed when using function approximation.
arXiv Detail & Related papers (2021-02-22T19:29:18Z) - Learning to Sample with Local and Global Contexts in Experience Replay
Buffer [135.94190624087355]
We propose a new learning-based sampling method that can compute the relative importance of each transition.
We show that our framework can significantly improve the performance of various off-policy reinforcement learning methods.
arXiv Detail & Related papers (2020-07-14T21:12:56Z) - Experience Replay with Likelihood-free Importance Weights [123.52005591531194]
We propose to reweight experiences based on their likelihood under the stationary distribution of the current policy.
We apply the proposed approach empirically on two competitive methods, Soft Actor Critic (SAC) and Twin Delayed Deep Deterministic policy gradient (TD3); a generic sketch of importance-weighted replay sampling follows the list below.
arXiv Detail & Related papers (2020-06-23T17:17:44Z)
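Several entries above (prioritized experience replay, learning-based sampling, likelihood-free importance weights) share a common pattern: sample transitions non-uniformly and correct the resulting bias. The sketch below is a generic, hypothetical illustration of that pattern, not the method of any specific paper listed; the `weights` argument stands in for whatever priority signal a given method uses (TD-error magnitude, learned scores, or estimated likelihood ratios).

```python
# Generic non-uniform replay sampling with an importance-sampling correction
# (illustrative pattern only; the weight source is method-specific).
import numpy as np


def weighted_sample(transitions, weights, batch_size, rng=None):
    """Sample indices with probability proportional to `weights` and return
    normalized importance-sampling corrections relative to uniform replay."""
    rng = rng or np.random.default_rng()
    w = np.asarray(weights, dtype=np.float64)
    probs = w / w.sum()
    idx = rng.choice(len(transitions), size=batch_size, p=probs)
    # Correct for the bias introduced by non-uniform sampling.
    is_weights = 1.0 / (len(transitions) * probs[idx])
    is_weights /= is_weights.max()  # normalize for stable update magnitudes
    return [transitions[i] for i in idx], is_weights
```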