B2RL: An open-source Dataset for Building Batch Reinforcement Learning
- URL: http://arxiv.org/abs/2209.15626v1
- Date: Fri, 30 Sep 2022 17:54:42 GMT
- Title: B2RL: An open-source Dataset for Building Batch Reinforcement Learning
- Authors: Hsin-Yu Liu (1), Xiaohan Fu (1), Bharathan Balaji (2), Rajesh Gupta
(1), and Dezhi Hong (2) ((1) University of California, San Diego, (2) Amazon)
- Abstract summary: Batch reinforcement learning (BRL) is an emerging research area in the RL community.
We are the first to open-source building datasets for the purpose of BRL research.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Batch reinforcement learning (BRL) is an emerging research area in the RL
community. It learns exclusively from static datasets (i.e. replay buffers)
without interaction with the environment. In the offline settings, existing
replay experiences are used as prior knowledge for BRL models to find the
optimal policy. Thus, generating replay buffers is crucial for benchmarking BRL
models. In our B2RL (Building Batch RL) dataset, we collected real-world data
from our building management systems, as well as buffers generated by several
behavioral policies in simulation environments. We believe it can help building
experts with BRL research. To the best of our knowledge, we are the first to
open-source building datasets for the purpose of BRL research.
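The abstract does not describe the on-disk format of the buffers, so purely as an illustration of the batch setting it describes, here is a minimal sketch of learning exclusively from a static replay buffer with no environment interaction; the file name, array layout, discrete-action assumption, and network sizes are all hypothetical, not part of B2RL.

```python
# Hypothetical sketch: batch (offline) Q-learning on a static replay buffer.
# The buffer format below (a .npz with s, a, r, s2, done arrays) is an
# assumption for illustration, not the documented B2RL layout.
import numpy as np
import torch
import torch.nn as nn

data = np.load("b2rl_buffer.npz")                        # assumed file name / format
s    = torch.tensor(data["s"],    dtype=torch.float32)   # states
a    = torch.tensor(data["a"],    dtype=torch.int64)     # discrete actions (assumed)
r    = torch.tensor(data["r"],    dtype=torch.float32)   # rewards
s2   = torch.tensor(data["s2"],   dtype=torch.float32)   # next states
done = torch.tensor(data["done"], dtype=torch.float32)   # episode-end flags

obs_dim, n_actions, gamma = s.shape[1], int(a.max()) + 1, 0.99
q = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
q_target = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
q_target.load_state_dict(q.state_dict())
opt = torch.optim.Adam(q.parameters(), lr=1e-3)

for step in range(10_000):                     # learn purely from the fixed buffer
    idx = torch.randint(0, len(s), (256,))
    with torch.no_grad():                      # bootstrap targets from the static data
        target = r[idx] + gamma * (1 - done[idx]) * q_target(s2[idx]).max(1).values
    pred = q(s[idx]).gather(1, a[idx].unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 500 == 0:                        # periodically refresh the target network
        q_target.load_state_dict(q.state_dict())
```

Naive batch Q-learning of this kind is known to suffer from extrapolation error on actions outside the data distribution, which is one motivation for benchmarking dedicated BRL methods on fixed buffers such as these.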
Related papers
- A Benchmark Environment for Offline Reinforcement Learning in Racing Games [54.83171948184851]
Offline Reinforcement Learning (ORL) is a promising approach to reduce the high sample complexity of traditional Reinforcement Learning (RL).
This paper introduces OfflineMania, a novel environment for ORL research.
It is inspired by the iconic TrackMania series and developed using the Unity 3D game engine.
arXiv Detail & Related papers (2024-07-12T16:44:03Z)
- TeaMs-RL: Teaching LLMs to Generate Better Instruction Datasets via Reinforcement Learning [7.9961739811640244]
The development of Large Language Models often confronts challenges stemming from heavy reliance on human annotators.
In this work, we pivot to Reinforcement Learning -- but with a twist.
We use RL to directly generate the foundational instruction dataset that alone suffices for fine-tuning.
arXiv Detail & Related papers (2024-03-13T16:57:57Z)
- Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning [41.971465819626005]
We present Open RL Benchmark, a set of fully tracked RL experiments.
Open RL Benchmark is community-driven: anyone can download, use, and contribute to the data.
Special care is taken to ensure that each experiment is precisely reproducible.
arXiv Detail & Related papers (2024-02-05T14:32:00Z)
- A Survey on Model-based Reinforcement Learning [21.85904195671014]
Reinforcement learning (RL) solves sequential decision-making problems via a trial-and-error process interacting with the environment.
Model-based reinforcement learning (MBRL) is believed to be a promising direction; it builds environment models in which trial and error can take place without real-world cost.
arXiv Detail & Related papers (2022-06-19T05:28:03Z)
- When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? [86.43517734716606]
Offline reinforcement learning (RL) algorithms can acquire effective policies by utilizing previously collected experience, without any online interaction.
Behavioral cloning (BC) algorithms mimic a subset of the dataset via supervised learning.
We show that offline RL policies trained on sufficiently noisy suboptimal data can attain better performance than even BC algorithms with expert data.
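For reference, the BC baseline mentioned above amounts to plain supervised learning on the dataset's state-action pairs; the sketch below is a generic illustration under assumed tensor shapes and a discrete action space, not code from the cited paper.

```python
# Hypothetical sketch of behavioral cloning: supervised learning on the
# (state, action) pairs of an offline dataset, ignoring rewards entirely.
import torch
import torch.nn as nn

def behavioral_cloning(states, actions, n_actions, epochs=20):
    """states: float tensor [N, obs_dim]; actions: int tensor [N] (assumed shapes)."""
    policy = nn.Sequential(nn.Linear(states.shape[1], 64), nn.ReLU(),
                           nn.Linear(64, n_actions))
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    for _ in range(epochs):
        logits = policy(states)
        loss = nn.functional.cross_entropy(logits, actions)  # imitate the dataset actions
        opt.zero_grad(); loss.backward(); opt.step()
    return policy
```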
arXiv Detail & Related papers (2022-04-12T08:25:34Z)
- Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning [147.61075994259807]
We propose Exploratory data for Offline RL (ExORL), a data-centric approach to offline RL.
ExORL first generates data with unsupervised reward-free exploration, then relabels this data with a downstream reward before training a policy with offline RL.
We find that exploratory data allows vanilla off-policy RL algorithms, without any offline-specific modifications, to outperform or match state-of-the-art offline RL algorithms on downstream tasks.
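The two-stage ExORL recipe (reward-free exploration, then relabeling with a downstream reward, then standard offline RL) can be pictured with the toy relabeling step below; the transition format and reward function are assumptions for illustration, not the ExORL implementation.

```python
# Hypothetical sketch of the ExORL-style relabeling step: transitions collected
# without rewards are stamped with a downstream task reward before offline RL.
import numpy as np

def relabel(transitions, reward_fn):
    """transitions: list of (state, action, next_state) tuples collected reward-free."""
    return [(s, a, reward_fn(s, a, s2), s2) for (s, a, s2) in transitions]

# Example downstream task: reach the origin (an assumed reward, for illustration).
goal_reward = lambda s, a, s2: -float(np.linalg.norm(s2))

exploratory_data = [(np.random.randn(3), np.random.randn(1), np.random.randn(3))
                    for _ in range(1000)]
labeled_data = relabel(exploratory_data, goal_reward)
# labeled_data can now be fed to any off-policy or offline RL algorithm.
```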
arXiv Detail & Related papers (2022-01-31T18:39:27Z)
- RvS: What is Essential for Offline RL via Supervised Learning? [77.91045677562802]
Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.
In every environment suite we consider, simply maximizing likelihood with a two-layer feedforward MLP is competitive.
The results also probe the limits of existing RvS methods, which are comparatively weak on random data.
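A minimal reading of "maximizing likelihood with a two-layer feedforward MLP" is sketched below: a policy conditioned on the state and an outcome variable (e.g., a goal or reward-to-go) is trained by maximum likelihood on dataset actions; the conditioning choice, dimensions, and discrete action space are assumptions, not details from the paper.

```python
# Hypothetical RvS-style sketch: a two-layer MLP policy trained by maximum
# likelihood on dataset actions, conditioned on the state and an outcome
# variable (e.g., goal or reward-to-go); the conditioning choice is assumed.
import torch
import torch.nn as nn

obs_dim, cond_dim, n_actions = 10, 1, 4              # assumed dimensions
policy = nn.Sequential(nn.Linear(obs_dim + cond_dim, 256), nn.ReLU(),
                       nn.Linear(256, n_actions))    # two feedforward layers
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def rvs_update(states, outcomes, actions):
    """states: [N, obs_dim]; outcomes: [N, cond_dim]; actions: int tensor [N]."""
    logits = policy(torch.cat([states, outcomes], dim=-1))
    loss = nn.functional.cross_entropy(logits, actions)  # maximize action likelihood
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```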
arXiv Detail & Related papers (2021-12-20T18:55:16Z)
- RL4RS: A Real-World Dataset for Reinforcement Learning based Recommender System [26.097154801770245]
Reinforcement learning based recommender systems (RL-based RS) aim to learn a good policy from a batch of collected data.
Current RL-based RS research commonly has a large reality gap.
We introduce the first open-source real-world dataset, RL4RS, hoping to replace the artificial datasets and semi-simulated RS datasets.
arXiv Detail & Related papers (2021-10-18T12:48:02Z)
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704]
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
arXiv Detail & Related papers (2020-06-24T17:14:51Z)
- D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288]
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
arXiv Detail & Related papers (2020-04-15T17:18:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.