Bootstrapping a DQN Replay Memory with Synthetic Experiences
- URL: http://arxiv.org/abs/2002.01370v1
- Date: Tue, 4 Feb 2020 15:36:36 GMT
- Title: Bootstrapping a DQN Replay Memory with Synthetic Experiences
- Authors: Wenzel Baron Pilar von Pilchau, Anthony Stein, and Jörg Hähner
- Abstract summary: We present an algorithm that creates synthetic experiences in a nondeterministic discrete environment to assist the learner.
The Interpolated Experience Replay is evaluated on the FrozenLake environment, and we show that it helps the agent learn faster, and even better, than with the classic version.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An important component of many Deep Reinforcement Learning algorithms is the
Experience Replay, which serves as a storage mechanism, or memory, for the experiences
the agent has made. These experiences are used for training and help the agent to
reliably find an optimal trajectory through the problem space. The classic Experience
Replay, however, only makes use of the experiences the agent actually collected, even
though the stored samples bear great potential in the form of knowledge about the
problem that can be extracted. We present an algorithm that creates synthetic
experiences in a nondeterministic discrete environment to assist the learner. The
resulting Interpolated Experience Replay is evaluated on the FrozenLake environment,
and we show that it helps the agent learn faster, and even better, than the classic
version.
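
The abstract describes the idea only at a high level. The Python sketch below illustrates one plausible reading of an interpolated replay memory for a stochastic discrete environment such as FrozenLake: real transitions are stored as usual, and synthetic transitions are created by averaging the rewards observed so far for the same (state, action) pair. The class name, the averaging rule, and the `synthetic_ratio` mixing parameter are illustrative assumptions, not the paper's exact formulation.

```python
import random
from collections import defaultdict, deque


class InterpolatedReplayBuffer:
    """Replay memory that stores real transitions and, in addition, creates
    synthetic ones whose reward is interpolated (here: averaged) over all
    outcomes observed so far for the same (state, action) pair.

    Minimal sketch based on the abstract; the paper's exact interpolation
    scheme and buffer management may differ.
    """

    def __init__(self, capacity=10_000):
        self.real = deque(maxlen=capacity)       # (s, a, r, s_next, done)
        self.synthetic = deque(maxlen=capacity)
        self.outcomes = defaultdict(list)        # (s, a) -> [(r, s_next, done), ...]

    def add(self, s, a, r, s_next, done):
        """Store a real transition and derive a synthetic one when possible."""
        self.real.append((s, a, r, s_next, done))
        self.outcomes[(s, a)].append((r, s_next, done))
        self._interpolate(s, a)

    def _interpolate(self, s, a):
        seen = self.outcomes[(s, a)]
        if len(seen) < 2:
            return  # need at least two observed outcomes before interpolating
        avg_r = sum(r for r, _, _ in seen) / len(seen)
        # Pair the averaged reward with one of the follow-up states actually observed.
        _, s_next, done = random.choice(seen)
        self.synthetic.append((s, a, avg_r, s_next, done))

    def sample(self, batch_size, synthetic_ratio=0.25):
        """Draw a mini-batch that mixes real and synthetic experiences."""
        n_syn = min(int(batch_size * synthetic_ratio), len(self.synthetic))
        n_real = min(batch_size - n_syn, len(self.real))
        batch = (random.sample(list(self.real), n_real)
                 + random.sample(list(self.synthetic), n_syn))
        random.shuffle(batch)
        return batch
```

A DQN agent would call `add` after every environment step and draw its training mini-batches from `sample`, exactly as with a conventional replay buffer.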
Related papers
- Synthetic Experience Replay [48.601879260071655]
We propose Synthetic Experience Replay (SynthER), a diffusion-based approach to flexibly upsample an agent's collected experience.
We show that SynthER is an effective method for training RL agents across offline and online settings.
We believe that synthetic training data could open the door to realizing the full potential of deep learning for replay-based RL algorithms from limited data.
arXiv Detail & Related papers (2023-03-12T09:10:45Z) - Replay For Safety [51.11953997546418]
In experience replay, past transitions are stored in a memory buffer and re-used during learning.
We show that using an appropriate biased sampling scheme allows us to achieve a safe policy (a minimal sketch of biased replay sampling appears after this list).
arXiv Detail & Related papers (2021-12-08T11:10:57Z) - Saliency Guided Experience Packing for Replay in Continual Learning [6.417011237981518]
We propose a new approach for experience replay, where we select the past experiences by looking at the saliency maps.
While learning a new task, we replay these memory patches with appropriate zero-padding to remind the model about its past decisions.
arXiv Detail & Related papers (2021-09-10T15:54:58Z) - Reinforcement Learning with Videos: Combining Offline Observations with Interaction [151.73346150068866]
Reinforcement learning is a powerful framework for robots to acquire skills from experience.
Videos of humans are a readily available source of broad and interesting experiences.
We propose a framework for reinforcement learning with videos.
arXiv Detail & Related papers (2020-11-12T17:15:48Z) - Lucid Dreaming for Experience Replay: Refreshing Past States with the Current Policy [48.8675653453076]
We introduce Lucid Dreaming for Experience Replay (LiDER), a framework that allows replay experiences to be refreshed by leveraging the agent's current policy.
LiDER consistently improves performance over the baseline in six Atari 2600 games.
arXiv Detail & Related papers (2020-09-29T02:54:11Z) - Revisiting Fundamentals of Experience Replay [91.24213515992595]
We present a systematic and extensive analysis of experience replay in Q-learning methods.
We focus on two fundamental properties: the replay capacity and the ratio of learning updates to experience collected.
arXiv Detail & Related papers (2020-07-13T21:22:17Z) - Double Prioritized State Recycled Experience Replay [3.42658286826597]
We develop a method called double-prioritized state-recycled (DPSR) experience replay.
We use this method in Deep Q-Networks (DQN) and achieve a state-of-the-art result.
arXiv Detail & Related papers (2020-07-08T08:36:41Z) - Experience Replay with Likelihood-free Importance Weights [123.52005591531194]
We propose to reweight experiences based on their likelihood under the stationary distribution of the current policy.
We apply the proposed approach empirically to two competitive methods, Soft Actor-Critic (SAC) and Twin Delayed Deep Deterministic Policy Gradient (TD3).
arXiv Detail & Related papers (2020-06-23T17:17:44Z) - Dynamic Experience Replay [6.062589413216726]
We build upon Ape-X DDPG and demonstrate our approach on robotic tight-fitting joint assembly tasks.
In particular, we run experiments on two different tasks: peg-in-hole and lap-joint.
Our ablation studies show that Dynamic Experience Replay is a crucial ingredient that largely shortens the training time in these challenging environments.
arXiv Detail & Related papers (2020-03-04T23:46:45Z)
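
Several of the papers above bias how transitions are drawn from the replay memory rather than what is stored in it (see the Replay For Safety entry, which references this sketch). The hypothetical example below shows one simple weighting rule: transitions with negative reward are treated as safety-relevant and sampled more often. The `failure_weight` parameter and the reward-based criterion are illustrative assumptions, not the scheme of any specific paper listed here.

```python
import random
from collections import deque


class WeightedReplayBuffer:
    """Replay memory with a biased (weighted) sampling scheme.

    Hypothetical illustration: transitions that ended with negative reward
    are up-weighted during sampling, one simple way to bias replay toward
    safety-relevant experience.
    """

    def __init__(self, capacity=10_000, failure_weight=5.0):
        self.buffer = deque(maxlen=capacity)   # (s, a, r, s_next, done)
        self.failure_weight = failure_weight

    def add(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sample(self, batch_size):
        # Up-weight transitions with negative reward (treated as "unsafe" here).
        population = list(self.buffer)
        weights = [self.failure_weight if r < 0 else 1.0
                   for (_, _, r, _, _) in population]
        return random.choices(population, weights=weights, k=batch_size)
```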