Katakomba: Tools and Benchmarks for Data-Driven NetHack
- URL: http://arxiv.org/abs/2306.08772v2
- Date: Thu, 26 Oct 2023 18:51:10 GMT
- Title: Katakomba: Tools and Benchmarks for Data-Driven NetHack
- Authors: Vladislav Kurenkov, Alexander Nikulin, Denis Tarasov, Sergey
Kolesnikov
- Abstract summary: NetHack is known as the frontier of reinforcement learning research.
We argue that there are three major obstacles for adoption: resource-wise, implementation-wise, and benchmark-wise.
We develop an open-source library that provides workflow fundamentals familiar to the offline reinforcement learning community.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: NetHack is known as the frontier of reinforcement learning research where
learning-based methods still need to catch up to rule-based solutions. One of
the promising directions for a breakthrough is using pre-collected datasets
similar to recent developments in robotics, recommender systems, and more under
the umbrella of offline reinforcement learning (ORL). Recently, a large-scale
NetHack dataset was released; while it was a necessary step forward, it has yet
to gain wide adoption in the ORL community. In this work, we argue that there
are three major obstacles for adoption: resource-wise, implementation-wise, and
benchmark-wise. To address them, we develop an open-source library that
provides workflow fundamentals familiar to the ORL community: pre-defined
D4RL-style tasks, uncluttered baseline implementations, and reliable evaluation
tools with accompanying configs and logs synced to the cloud.
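The D4RL-style tasks mentioned above imply the standard offline RL workflow: a static dataset of transitions stored under fixed keys, from which baselines sample training batches. The sketch below illustrates that layout and access pattern; the array shapes, the discrete-action assumption, and all helper names are illustrative placeholders, not the actual Katakomba interface.

```python
# A sketch of the D4RL-style data layout referenced above. Keys follow the
# D4RL convention (observations, actions, rewards, terminals); the shapes,
# the discrete-action assumption, and the helper names are placeholders,
# not the Katakomba API itself.
import numpy as np

def make_dummy_dataset(num_transitions=1000, obs_dim=32, num_actions=16):
    """Build a dict shaped like a D4RL dataset with a discrete action space."""
    rng = np.random.default_rng(0)
    return {
        "observations": rng.normal(size=(num_transitions, obs_dim)).astype(np.float32),
        "actions": rng.integers(0, num_actions, size=num_transitions),
        "rewards": rng.normal(size=num_transitions).astype(np.float32),
        "terminals": rng.random(num_transitions) < 0.01,
    }

def sample_batch(dataset, batch_size=256, rng=np.random.default_rng(1)):
    """Uniform transition sampling, the access pattern most ORL baselines expect."""
    idx = rng.integers(0, len(dataset["rewards"]), size=batch_size)
    return {key: value[idx] for key, value in dataset.items()}

dataset = make_dummy_dataset()
batch = sample_batch(dataset)
print({key: value.shape for key, value in batch.items()})
```

An offline baseline (behavioral cloning, CQL, IQL, and so on) would consume batches from such a dictionary in the same way it does for the original D4RL locomotion tasks.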
Related papers
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement
Learning [85.21378553454672]
We develop a library containing a sample-efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates and extreme robustness even under perturbations, and exhibit emergent recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z) - A Real-World Quadrupedal Locomotion Benchmark for Offline Reinforcement
Learning [27.00483962026472]
We benchmark 11 offline reinforcement learning algorithms on a realistic quadrupedal locomotion dataset.
Experiments show that the best-performing ORL algorithms can achieve competitive performance compared with model-free RL.
Our proposed benchmark will serve as a development platform for testing and evaluating the performance of ORL algorithms in real-world legged locomotion tasks.
arXiv Detail & Related papers (2023-09-13T13:18:29Z) - Benchmarking Offline Reinforcement Learning on Real-Robot Hardware [35.29390454207064]
Dexterous manipulation in particular remains an open problem in its general form.
We propose a benchmark including a large collection of data for offline learning from a dexterous manipulation platform on two tasks.
We evaluate prominent open-sourced offline reinforcement learning algorithms on the datasets and provide a reproducible experimental setup for offline reinforcement learning on real systems.
arXiv Detail & Related papers (2023-07-28T17:29:49Z) - Efficient Online Reinforcement Learning with Offline Data [78.92501185886569]
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a $\mathbf{2.5\times}$ improvement over existing approaches (a minimal replay-mixing sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-02-06T17:30:22Z) - CORL: Research-oriented Deep Offline Reinforcement Learning Library [48.47248460865739]
CORL is an open-source library that provides thoroughly benchmarked single-file implementations of reinforcement learning algorithms.
It emphasizes a simple development experience with a straightforward codebase and a modern analysis tracking tool.
arXiv Detail & Related papers (2022-10-13T15:40:11Z) - Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z) - Single-Shot Pruning for Offline Reinforcement Learning [47.886329599997474]
Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems.
One way to tackle the computational cost of the large networks used in deep RL is to prune them, leaving only the necessary parameters.
We close the gap between RL and single-shot pruning techniques and present a general pruning approach for offline RL.
arXiv Detail & Related papers (2021-12-31T18:10:02Z) - URLB: Unsupervised Reinforcement Learning Benchmark [82.36060735454647]
We introduce the Unsupervised Reinforcement Learning Benchmark (URLB).
URLB consists of two phases: reward-free pre-training and downstream task adaptation with extrinsic rewards.
We provide twelve continuous control tasks from three domains for evaluation and open-source code for eight leading unsupervised RL methods.
arXiv Detail & Related papers (2021-10-28T15:07:01Z) - D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288]
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
arXiv Detail & Related papers (2020-04-15T17:18:19Z)
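As referenced in the "Efficient Online Reinforcement Learning with Offline Data" entry above, the core recipe is to let an unmodified off-policy learner draw part of every training batch from pre-collected offline data. The sketch below illustrates that replay-mixing idea under stated assumptions: the 50/50 split, the class name, and the buffer layout are illustrative, not the paper's exact design.

```python
# A sketch of the replay-mixing idea: keep pre-collected offline transitions
# next to the online replay buffer and draw each training batch from both.
# The 50/50 split and the data layout are illustrative assumptions.
import numpy as np

class MixedReplayBuffer:
    def __init__(self, offline_transitions, seed=0):
        self.offline = list(offline_transitions)  # fixed, pre-collected transitions
        self.online = []                          # filled while interacting online
        self.rng = np.random.default_rng(seed)

    def add(self, transition):
        """Store a transition collected by the current online policy."""
        self.online.append(transition)

    def sample(self, batch_size=256):
        """Draw roughly half of each batch from offline data and half from
        online data; until any online data exists, the whole batch is offline."""
        n_online = min(batch_size // 2, len(self.online))
        n_offline = batch_size - n_online
        idx_off = self.rng.integers(0, len(self.offline), size=n_offline)
        batch = [self.offline[i] for i in idx_off]
        if n_online > 0:
            idx_on = self.rng.integers(0, len(self.online), size=n_online)
            batch += [self.online[i] for i in idx_on]
        return batch

# Illustrative usage with dummy transitions.
buffer = MixedReplayBuffer(offline_transitions=[{"reward": 0.0}] * 1000)
buffer.add({"reward": 1.0})            # transition gathered by the online policy
batch = buffer.sample(batch_size=8)
```

An existing off-policy algorithm (e.g., SAC) can then train on these mixed batches without further modification, which is the point the entry above makes.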
This list is automatically generated from the titles and abstracts of the papers in this site.