Related papers: Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning

URL: http://arxiv.org/abs/2402.16801v2
Date: Mon, 3 Jun 2024 14:12:27 GMT
Title: Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
Authors: Michael Matthews, Michael Beukman, Benjamin Ellis, Mikayel Samvelyan, Matthew Jackson, Samuel Coward, Jakob Foerster
Abstract summary: Craftax is a ground-up rewrite of Crafter in JAX that runs up to 250x faster than the Python-native original. A run of PPO using 1 billion environment interactions finishes in under an hour using only a single GPU. We show that existing methods, including global and episodic exploration as well as unsupervised environment design, fail to make material progress on the benchmark.
Score: 4.067733179628694
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Benchmarks play a crucial role in the development and analysis of reinforcement learning (RL) algorithms. We identify that existing benchmarks used for research into open-ended learning fall into one of two categories. Either they are too slow for meaningful research to be performed without enormous computational resources, like Crafter, NetHack and Minecraft, or they are not complex enough to pose a significant challenge, like Minigrid and Procgen. To remedy this, we first present Craftax-Classic: a ground-up rewrite of Crafter in JAX that runs up to 250x faster than the Python-native original. A run of PPO using 1 billion environment interactions finishes in under an hour using only a single GPU and averages 90% of the optimal reward. To provide a more compelling challenge, we present the main Craftax benchmark, a significant extension of the Crafter mechanics with elements inspired by NetHack. Solving Craftax requires deep exploration, long-term planning and memory, as well as continual adaptation to novel situations as more of the world is discovered. We show that existing methods, including global and episodic exploration as well as unsupervised environment design, fail to make material progress on the benchmark. We believe that Craftax can, for the first time, allow researchers to experiment in a complex, open-ended environment with limited computational resources.

Related papers

Craftium: An Extensible Framework for Creating Reinforcement Learning Environments [0.5461938536945723]
This paper presents Craftium, a novel framework for exploring and creating rich 3D visual RL environments. Craftium builds upon the Minetest game engine and the popular Gymnasium API.
arXiv Detail & Related papers (2024-07-04T14:38:02Z)
JaxMARL: Multi-Agent RL Environments and Algorithms in JAX [105.343918678781]
We present JaxMARL, the first open-source, Python-based library that combines GPU-enabled efficiency with support for a large number of commonly used MARL environments. Our experiments show that, in terms of wall clock time, our JAX-based training pipeline is around 14 times faster than existing approaches. We also introduce and benchmark SMAX, a JAX-based approximate reimplementation of the popular StarCraft Multi-Agent Challenge.
arXiv Detail & Related papers (2023-11-16T18:58:43Z)
ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design [52.57999109204569]
ArchGym is an open-source framework that connects diverse search algorithms to architecture simulators. We evaluate ArchGym across multiple vanilla and domain-specific search algorithms in designing custom memory controllers, deep neural network accelerators, and custom SoCs for AR/VR workloads.
arXiv Detail & Related papers (2023-06-15T06:41:23Z)
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory [97.87093169454431]
Ghost in the Minecraft (GITM) is a novel framework that integrates Large Language Models (LLMs) with text-based knowledge and memory. We develop a set of structured actions and leverage LLMs to generate action plans for the agents to execute. The resulting LLM-based agent markedly surpasses previous methods, achieving a remarkable improvement of +47.5% in success rate.
arXiv Detail & Related papers (2023-05-25T17:59:49Z)
SPRING: Studying the Paper and Reasoning to Play Games [102.5587155284795]
We propose a novel approach, SPRING, to read the game's original academic paper and use the knowledge learned to reason and play the game through a large language model (LLM). In experiments, we study the quality of in-context "reasoning" induced by different forms of prompts under the setting of the Crafter open-world environment. Our experiments suggest that LLMs, when prompted with consistent chain-of-thought, have great potential in completing sophisticated high-level trajectories.
arXiv Detail & Related papers (2023-05-24T18:14:35Z)
Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks [31.084848672383185]
We study building multi-task agents in open-world environments. We convert the multi-task learning problem into learning basic skills and planning over the skills. Our method accomplishes 40 diverse Minecraft tasks, many of which require sequentially executing more than 10 skills.
arXiv Detail & Related papers (2023-03-29T09:45:50Z)
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research [24.9044606044585]
MiniHack is a powerful sandbox framework for easily designing novel deep reinforcement learning environments. By leveraging the full set of entities and environment dynamics from NetHack, MiniHack allows designing custom RL testbeds. In addition to a variety of RL tasks and baselines, MiniHack can wrap existing RL benchmarks and provide ways to seamlessly add additional complexity.
arXiv Detail & Related papers (2021-09-27T17:22:42Z)
Benchmarking the Spectrum of Agent Capabilities [7.088856621650764]
We introduce Crafter, an open-world survival game with visual inputs that evaluates a wide range of general abilities within a single environment. Agents learn from the provided reward signal or through intrinsic objectives and are evaluated by semantically meaningful achievements. We experimentally verify that Crafter is of appropriate difficulty to drive future research and provide baseline scores of reward agents and unsupervised agents.
arXiv Detail & Related papers (2021-09-14T15:49:31Z)
Scaling Imitation Learning in Minecraft [114.6964571273486]
We apply imitation learning to attain state-of-the-art performance on hard exploration problems in the Minecraft environment. An early version of our approach reached second place in the MineRL competition at NeurIPS 2019.
arXiv Detail & Related papers (2020-07-06T12:47:01Z)
The NetHack Learning Environment [79.06395964379107]
We present the NetHack Learning Environment (NLE), a procedurally generated rogue-like environment for Reinforcement Learning research. We argue that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL. We demonstrate empirical success for early stages of the game using a distributed Deep RL baseline and Random Network Distillation exploration.
arXiv Detail & Related papers (2020-06-24T14:12:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.