RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement
Learning
- URL: http://arxiv.org/abs/2111.02767v1
- Date: Thu, 4 Nov 2021 11:48:19 GMT
- Title: RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement
Learning
- Authors: Sabela Ramos, Sertan Girgin, Léonard Hussenot, Damien Vincent, Hanna
Yakubovich, Daniel Toyama, Anita Gergely, Piotr Stanczyk, Raphael Marinier,
Jeremiah Harmsen, Olivier Pietquin, Nikola Momchev
- Abstract summary: RLDS is an ecosystem for recording, replaying, manipulating, annotating and sharing data in the context of Sequential Decision Making (SDM).
RLDS not only enables reproducibility of existing research and easy generation of new datasets, but also accelerates novel research.
The RLDS ecosystem makes it easy to share datasets without any loss of information and to be agnostic to the underlying original format.
- Score: 17.87592413742589
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce RLDS (Reinforcement Learning Datasets), an ecosystem for
recording, replaying, manipulating, annotating and sharing data in the context
of Sequential Decision Making (SDM) including Reinforcement Learning (RL),
Learning from Demonstrations, Offline RL or Imitation Learning. RLDS not only
enables reproducibility of existing research and easy generation of new
datasets, but also accelerates novel research. By providing a standard and
lossless dataset format, it makes it possible to quickly test new algorithms on a
wider range of tasks. The RLDS ecosystem makes it easy to share datasets
without any loss of information and to be agnostic to the underlying original
format when applying various data processing pipelines to large collections of
datasets. In addition, RLDS provides tools for collecting data generated by either
synthetic agents or humans, as well as for inspecting and manipulating the
collected data. Ultimately, integration with TFDS facilitates the sharing of RL
datasets with the research community.
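For context, here is a minimal sketch of how an RLDS-formatted dataset can be consumed once it is published through TFDS. The dataset name and the field constants are assumptions based on the public `rlds` and `tensorflow_datasets` packages, not a prescribed usage from the paper; substitute any RLDS dataset available in your TFDS catalog.

```python
# Minimal sketch: reading an RLDS-formatted dataset through TFDS.
# Assumes the `tensorflow_datasets` and `rlds` packages are installed and
# that the dataset name below exists in the TFDS catalog; swap in any
# RLDS-formatted dataset you have access to.
import rlds
import tensorflow_datasets as tfds

# In RLDS, each top-level element is an episode; every episode contains a
# nested tf.data.Dataset of steps (observation, action, reward, ...).
episodes = tfds.load('d4rl_mujoco_halfcheetah/v0-medium', split='train')

for episode in episodes.take(1):
    steps = episode[rlds.STEPS]  # nested dataset of per-step dictionaries
    episode_return = 0.0
    for step in steps:
        episode_return += float(step[rlds.REWARD])
    print('Episode return:', episode_return)
```

Because every episode preserves the full step structure (observations, actions, rewards, and episode-boundary flags), pipelines written against this schema remain agnostic to how the data was originally generated, which is the lossless, format-agnostic sharing the abstract describes.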
Related papers
- A Survey on Data Synthesis and Augmentation for Large Language Models [35.59526251210408] (arXiv 2024-10-16)
This paper reviews and summarizes data generation techniques throughout the lifecycle of Large Language Models.
We discuss the current constraints faced by these methods and investigate potential pathways for future development and research.
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861] (arXiv 2024-08-15)
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
- Behaviour Distillation [10.437472004180883] (arXiv 2024-06-21)
We formalize behaviour distillation, a setting that aims to discover and condense information required for training an expert policy into a synthetic dataset.
We then introduce Hallucinating datasets with Evolution Strategies (HaDES), a method for behaviour distillation that can discover datasets of just four state-action pairs.
We show that these datasets generalize out of distribution to training policies with a wide range of architectures.
We also demonstrate application to a downstream task, namely training multi-task agents in a zero-shot fashion.
- Retrieval-Augmented Data Augmentation for Low-Resource Domain Tasks [66.87070857705994] (arXiv 2024-02-21)
In low-resource settings, only a very small number of seed samples is available for data augmentation.
We propose a novel method that augments training data by incorporating a wealth of examples from other datasets.
This approach can ensure that the generated data is not only relevant but also more diverse than what could be achieved using the limited seed data alone.
- STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models [56.27786433792638] (arXiv 2023-05-24)
STAR is a data generation method that leverages Large Language Models (LLMs) to synthesize data instances.
We design fine-grained step-by-step instructions to obtain the initial data instances.
Our experiments show that the data generated by STAR significantly improve the performance of low-resource event extraction and relation extraction tasks.
- The Challenges of Exploration for Offline Reinforcement Learning [8.484491887821473] (arXiv 2022-01-27)
We study the two processes of reinforcement learning: collecting informative experience and inferring optimal behaviour.
The task-agnostic setting for data collection, where the task is not known a priori, is of particular interest.
We use this decoupled framework to strengthen intuitions about exploration and the data prerequisites for effective offline RL.
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704] (arXiv 2020-06-24)
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
- D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288] (arXiv 2020-04-15)
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
- DeGAN: Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier [58.979104709647295] (arXiv 2019-12-27)
We bridge the gap between the abundance of available data and the lack of relevant data for the future learning tasks of a trained network.
We use the available data, that may be an imbalanced subset of the original training dataset, or a related domain dataset, to retrieve representative samples.
We demonstrate that data from a related domain can be leveraged to achieve state-of-the-art performance.
This list is automatically generated from the titles and abstracts of the papers on this site.