Synthetic Experience Replay
- URL: http://arxiv.org/abs/2303.06614v4
- Date: Fri, 27 Oct 2023 01:02:22 GMT
- Title: Synthetic Experience Replay
- Authors: Cong Lu, Philip J. Ball, Yee Whye Teh, Jack Parker-Holder
- Abstract summary: We propose Synthetic Experience Replay (SynthER), a diffusion-based approach to flexibly upsample an agent's collected experience.
We show that SynthER is an effective method for training RL agents across offline and online settings.
We believe that synthetic training data could open the door to realizing the full potential of deep learning for replay-based RL algorithms from limited data.
- Score: 48.601879260071655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key theme in the past decade has been that when large neural networks and
large datasets combine they can produce remarkable results. In deep
reinforcement learning (RL), this paradigm is commonly made possible through
experience replay, whereby a dataset of past experiences is used to train a
policy or value function. However, unlike in supervised or self-supervised
learning, an RL agent has to collect its own data, which is often limited.
Thus, it is challenging to reap the benefits of deep learning, and even small
neural networks can overfit at the start of training. In this work, we leverage
the tremendous recent progress in generative modeling and propose Synthetic
Experience Replay (SynthER), a diffusion-based approach to flexibly upsample an
agent's collected experience. We show that SynthER is an effective method for
training RL agents across offline and online settings, in both proprioceptive
and pixel-based environments. In offline settings, we observe drastic
improvements when upsampling small offline datasets and see that additional
synthetic data also allows us to effectively train larger networks.
Furthermore, SynthER enables online agents to train with a much higher
update-to-data ratio than before, leading to a significant increase in sample
efficiency, without any algorithmic changes. We believe that synthetic training
data could open the door to realizing the full potential of deep learning for
replay-based RL algorithms from limited data. Finally, we open-source our code
at https://github.com/conglu1997/SynthER.
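The recipe described in the abstract — fit a generative model to the transitions the agent has already collected, then sample additional synthetic transitions to enlarge the replay buffer — can be sketched in a few lines. The snippet below is a simplified, hedged illustration only: it uses a plain DDPM-style denoiser over flattened (state, action, reward, next state, done) vectors, and all class names, network sizes, and schedule values are assumptions for illustration rather than the authors' implementation (see the repository linked above for that).
```python
# Hedged sketch of the SynthER idea, NOT the paper's implementation:
# train a small denoising diffusion model on flattened transitions collected
# by the agent, then sample synthetic transitions to upsample the replay buffer.
import torch
import torch.nn as nn

T = 100                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)     # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class TransitionDenoiser(nn.Module):
    """MLP that predicts the noise added to a flattened transition vector."""
    def __init__(self, transition_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, transition_dim),
        )
    def forward(self, x_t, t):
        # condition on the normalised timestep by simple concatenation
        t_feat = (t.float() / T).unsqueeze(-1)
        return self.net(torch.cat([x_t, t_feat], dim=-1))

def diffusion_loss(model, x0):
    """Standard DDPM noise-prediction loss on a batch of real transitions."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,))
    noise = torch.randn_like(x0)
    a_bar = alpha_bars[t].unsqueeze(-1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise
    return nn.functional.mse_loss(model(x_t, t), noise)

@torch.no_grad()
def sample_synthetic_transitions(model, n, transition_dim):
    """Run the reverse process to generate n synthetic transitions."""
    x = torch.randn(n, transition_dim)
    for step in reversed(range(T)):
        t = torch.full((n,), step, dtype=torch.long)
        eps = model(x, t)
        a, a_bar = alphas[step], alpha_bars[step]
        x = (x - (1 - a) / (1 - a_bar).sqrt() * eps) / a.sqrt()
        if step > 0:
            x = x + betas[step].sqrt() * torch.randn_like(x)
    return x  # un-normalise and split back into (s, a, r, s', done) before use

# Usage sketch: fit the model on real replay data, then mix synthetic samples
# into the buffer so the agent can train at a higher update-to-data ratio.
# real = replay_buffer_as_flat_tensor()            # hypothetical helper
# model = TransitionDenoiser(real.shape[1])
# opt = torch.optim.Adam(model.parameters(), lr=3e-4)
# for _ in range(10_000):
#     batch = real[torch.randint(0, len(real), (256,))]
#     opt.zero_grad(); diffusion_loss(model, batch).backward(); opt.step()
# synthetic = sample_synthetic_transitions(model, 100_000, real.shape[1])
```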
Related papers
- REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation [61.7171775202833]
We introduce an efficient system for learning dexterous manipulation skills with reinforcement learning.
The main idea of our approach is the integration of recent advances in sample-efficient RL and replay buffer bootstrapping.
Our system completes the real-world training cycle by incorporating learned resets via an imitation-based pickup policy.
arXiv Detail & Related papers (2023-09-06T19:05:31Z)
- Learn, Unlearn and Relearn: An Online Learning Paradigm for Deep Neural Networks [12.525959293825318]
We introduce Learn, Unlearn, and Relearn (LURE), an online learning paradigm for deep neural networks (DNNs).
LURE interchanges between the unlearning phase, which selectively forgets the undesirable information in the model, and the relearning phase, which emphasizes learning on generalizable features.
We show that our training paradigm provides consistent performance gains across datasets in both classification and few-shot settings.
arXiv Detail & Related papers (2023-03-18T16:45:54Z)
- A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation [42.2398858786125]
Deep learning in computer vision has achieved great success with the price of large-scale labeled training data.
The uncontrollable data collection process produces non-IID training and test data, where undesired duplication may exist.
To circumvent them, an alternative is to generate synthetic data via 3D rendering with domain randomization.
arXiv Detail & Related papers (2023-03-16T09:03:52Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z)
- Decoupling Representation Learning from Reinforcement Learning [89.82834016009461]
We introduce an unsupervised learning task called Augmented Temporal Contrast (ATC).
ATC trains a convolutional encoder to associate pairs of observations separated by a short time difference.
In online RL experiments, we show that training the encoder exclusively using ATC matches or outperforms end-to-end RL.
arXiv Detail & Related papers (2020-09-14T19:11:13Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
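As a concrete illustration of the last entry, the following is a hedged sketch of the advantage-weighted actor update that AWAC is built around, not the authors' code: the policy is regressed toward actions already stored in the replay buffer, weighted by the exponentiated advantage, so prior demonstrations and fresh online experience are handled by the same update. The module definitions, the policy API, and hyperparameters such as lam are illustrative assumptions.
```python
# Hedged sketch of an AWAC-style advantage-weighted actor update.
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, act_dim))
        self.log_std = nn.Parameter(torch.zeros(act_dim))
    def dist(self, obs):
        return torch.distributions.Normal(self.body(obs), self.log_std.exp())

class QNet(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))
    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def awac_actor_loss(policy, q_net, obs, act, lam=1.0, n_value_samples=4):
    """Advantage-weighted regression on a replay-buffer batch (obs, act)."""
    with torch.no_grad():
        # V(s) is approximated by averaging Q over actions sampled from the policy.
        pi = policy.dist(obs)
        v = torch.stack([q_net(obs, pi.sample())
                         for _ in range(n_value_samples)]).mean(0)
        adv = q_net(obs, act) - v                        # A(s, a)
        weights = torch.exp(adv / lam).clamp(max=100.0)  # clamp for stability
    log_prob = policy.dist(obs).log_prob(act).sum(-1)
    return -(weights * log_prob).mean()

# The critic (q_net) is trained with an ordinary TD backup on the same buffer,
# which initially holds only the prior demonstration data and is then grown
# with online rollouts, matching the offline-to-online setting described above.
```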
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.