Craftium: An Extensible Framework for Creating Reinforcement Learning Environments
- URL: http://arxiv.org/abs/2407.03969v1
- Date: Thu, 4 Jul 2024 14:38:02 GMT
- Title: Craftium: An Extensible Framework for Creating Reinforcement Learning Environments
- Authors: Mikel Malagón, Josu Ceberio, Jose A. Lozano
- Abstract summary: This paper presents Craftium, a novel framework for exploring and creating rich 3D visual RL environments.
Craftium builds upon the Minetest game engine and the popular Gymnasium API.
- Score: 0.5461938536945723
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Most Reinforcement Learning (RL) environments are created by adapting existing physics simulators or video games. However, they usually lack the flexibility required for analyzing specific characteristics of RL methods often relevant to research. This paper presents Craftium, a novel framework for exploring and creating rich 3D visual RL environments that builds upon the Minetest game engine and the popular Gymnasium API. Minetest is built to be extended and can be used to easily create voxel-based 3D environments (often similar to Minecraft), while Gymnasium offers a simple and common interface for RL research. Craftium provides a platform that allows practitioners to create fully customized environments to suit their specific research requirements, ranging from simple visual tasks to infinite and procedurally generated worlds. We also provide five ready-to-use environments for benchmarking and as examples of how to develop new ones. The code and documentation are available at https://github.com/mikelma/craftium/.
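Since Craftium exposes its environments through the Gymnasium interface, interacting with one follows the standard reset/step rollout loop. The sketch below illustrates that loop with a stand-in stub environment rather than a real Craftium world (the observation contents, episode length, and reward are illustrative assumptions; a real Craftium environment would return rendered voxel-world observations from Minetest):

```python
import random

class StubVoxelEnv:
    """Stand-in for a Craftium/Gymnasium environment (illustrative only).

    A real Craftium environment follows the same reset/step contract but
    returns rendered voxel-world observations instead of dummy values.
    """
    def __init__(self, episode_length=50):
        self.episode_length = episode_length
        self._t = 0

    def reset(self, seed=None):
        if seed is not None:
            random.seed(seed)
        self._t = 0
        observation = [0.0] * 4  # placeholder for an image observation
        return observation, {}   # (observation, info)

    def step(self, action):
        self._t += 1
        observation = [random.random() for _ in range(4)]
        reward = 1.0 if action == 1 else 0.0           # toy reward
        terminated = False                              # no terminal state here
        truncated = self._t >= self.episode_length      # time-limit truncation
        return observation, reward, terminated, truncated, {}

def run_episode(env):
    """Standard Gymnasium-style rollout with a random policy."""
    obs, info = env.reset(seed=0)
    total_reward, done = 0.0, False
    while not done:
        action = random.choice([0, 1])
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        done = terminated or truncated
    return total_reward

print(run_episode(StubVoxelEnv()))
```

Because the loop only relies on the Gymnasium contract, the same driver code works unchanged whether the environment is this stub or a fully customized Craftium world.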
Related papers
- EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence [8.987157387248317]
EmbodiedGen is a foundational platform for interactive 3D world generation. It enables the scalable generation of high-quality, controllable, and photorealistic 3D assets at low cost.
arXiv Detail & Related papers (2025-06-12T11:43:50Z)
- Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts [54.21319853862452]
We present Optimus-3, a general-purpose agent for Minecraft. We propose a knowledge-enhanced data generation pipeline to provide scalable and high-quality training data for agent development. We develop a Multimodal Reasoning-Augmented Reinforcement Learning approach to enhance the agent's reasoning ability under visual diversity.
arXiv Detail & Related papers (2025-06-12T05:29:40Z)
- MambaNeXt-YOLO: A Hybrid State Space Model for Real-time Object Detection [4.757840725810513]
YOLO-series models have set strong benchmarks by balancing speed and accuracy. Transformers have high computational complexity because of their self-attention mechanism. We propose MambaNeXt-YOLO, a novel object detection framework that balances accuracy and efficiency.
arXiv Detail & Related papers (2025-06-04T07:46:24Z)
- Exploration-Driven Generative Interactive Environments [53.05314852577144]
We focus on using many virtual environments for inexpensive, automatically collected interaction data. We propose a training framework that uses only a random agent in virtual environments. Our agent is fully independent of environment-specific rewards and thus adapts easily to new environments.
arXiv Detail & Related papers (2025-04-03T12:01:41Z)
- SynCity: Training-Free Generation of 3D Worlds [107.69875149880679]
We propose SynCity, a training- and optimization-free approach to generating 3D worlds from textual descriptions. We show how 3D and 2D generators can be combined to generate ever-expanding scenes.
arXiv Detail & Related papers (2025-03-20T17:59:40Z)
- UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI [37.47562766916571]
We introduce UnrealZoo, a rich collection of photo-realistic 3D virtual worlds built on Unreal Engine. We offer a variety of playable entities for embodied AI agents.
arXiv Detail & Related papers (2024-12-30T14:31:01Z)
- GenEx: Generating an Explorable World [59.0666303068111]
We introduce GenEx, a system capable of planning complex embodied world exploration, guided by its generative imagination. GenEx generates an entire 3D-consistent imaginative environment from as little as a single RGB image. GPT-assisted agents are equipped to perform complex embodied tasks, including both goal-agnostic exploration and goal-driven navigation.
arXiv Detail & Related papers (2024-12-12T18:59:57Z)
- Proc-GS: Procedural Building Generation for City Assembly with 3D Gaussians [65.09942210464747]
Building asset creation is labor-intensive and requires specialized skills to develop design rules. Recent generative models for building creation often overlook these patterns, leading to low visual fidelity and limited scalability. By manipulating procedural code, we can streamline this process and generate an infinite variety of buildings.
arXiv Detail & Related papers (2024-12-10T16:45:32Z)
- Gymnasium: A Standard Interface for Reinforcement Learning Environments [5.7144222327514616]
Reinforcement Learning (RL) is a growing field that has the potential to revolutionize many areas of artificial intelligence.
Despite its promise, RL research is often hindered by the lack of standardization in environment and algorithm implementations.
Gymnasium is an open-source library that provides a standard API for RL environments.
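The API that Gymnasium standardizes is the reset/step contract that downstream frameworks (including Craftium) build on: `reset` returns `(observation, info)` and `step` returns the five-tuple `(observation, reward, terminated, truncated, info)`. A minimal sketch of an environment implementing that contract, using a stdlib-only toy task (a real environment would subclass `gymnasium.Env` and declare `observation_space` and `action_space`; the task itself is invented for illustration):

```python
class GridWalkEnv:
    """Toy 1-D walk following the Gymnasium env contract (illustrative only)."""
    SIZE = 5  # positions 0..4; reaching position 4 ends the episode

    def reset(self, seed=None):
        self.pos = 0
        return self.pos, {}  # (observation, info)

    def step(self, action):
        # action: 0 = move left, 1 = move right (clamped to the grid)
        delta = 1 if action == 1 else -1
        self.pos = max(0, min(self.SIZE - 1, self.pos + delta))
        terminated = self.pos == self.SIZE - 1  # reached the goal state
        reward = 1.0 if terminated else 0.0
        truncated = False                       # no time limit in this toy
        return self.pos, reward, terminated, truncated, {}

env = GridWalkEnv()
obs, info = env.reset(seed=0)
done, steps = False, 0
while not done:
    obs, reward, terminated, truncated, info = env.step(1)  # always move right
    steps += 1
    done = terminated or truncated
print(steps)  # 4 steps to walk from position 0 to position 4
```

Because every conforming environment exposes this same surface, agents, wrappers, and logging utilities can be written once and reused across environments.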
arXiv Detail & Related papers (2024-07-24T06:35:05Z)
- LEGENT: Open Platform for Embodied Agents [60.71847900126832]
We introduce LEGENT, an open, scalable platform for developing embodied agents using Large Language Models (LLMs) and Large Multimodal Models (LMMs).
LEGENT offers a rich, interactive 3D environment with communicable and actionable agents, paired with a user-friendly interface.
In experiments, an embryonic vision-language-action model trained on LEGENT-generated data surpasses GPT-4V in embodied tasks.
arXiv Detail & Related papers (2024-04-28T16:50:12Z)
- Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning [4.067733179628694]
Craftax is a ground-up rewrite of Crafter in JAX that runs up to 250x faster than the Python-native original.
A run of PPO using 1 billion environment interactions finishes in under an hour using only a single GPU.
We show that existing methods including global and episodic exploration, as well as unsupervised environment design fail to make material progress on the benchmark.
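The headline claim implies a concrete throughput target. A quick back-of-the-envelope check, assuming the stated one-hour wall clock as the upper bound (the exact hardware and timing are from the abstract, not reproduced here):

```python
interactions = 1_000_000_000  # 1B environment interactions (stated)
wall_clock_s = 3600           # "under an hour" upper bound, in seconds

# Minimum sustained throughput needed to hit the stated wall-clock time
steps_per_second = interactions / wall_clock_s
print(round(steps_per_second))  # 277778 steps/s
```

Sustaining hundreds of thousands of environment steps per second on one GPU is the payoff of the JAX rewrite: the environment itself is JIT-compiled and vectorized alongside the learner, rather than stepped one process at a time in Python.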
arXiv Detail & Related papers (2024-02-26T18:19:07Z)
- Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory [97.87093169454431]
Ghost in the Minecraft (GITM) is a novel framework that integrates Large Language Models (LLMs) with text-based knowledge and memory.
We develop a set of structured actions and leverage LLMs to generate action plans for the agents to execute.
The resulting LLM-based agent markedly surpasses previous methods, achieving a remarkable improvement of +47.5% in success rate.
arXiv Detail & Related papers (2023-05-25T17:59:49Z)
- SPRING: Studying the Paper and Reasoning to Play Games [102.5587155284795]
We propose a novel approach, SPRING, which reads the game's original academic paper and uses the knowledge learned to reason and play the game through a large language model (LLM).
In experiments, we study the quality of in-context "reasoning" induced by different forms of prompts under the setting of the Crafter open-world environment.
Our experiments suggest that LLMs, when prompted with consistent chain-of-thought, have great potential in completing sophisticated high-level trajectories.
arXiv Detail & Related papers (2023-05-24T18:14:35Z)
- WILD-SCAV: Benchmarking FPS Gaming AI on Unity3D-based Environments [5.020816812380825]
Recent advances in deep reinforcement learning (RL) have demonstrated complex decision-making capabilities in simulation environments.
However, they hardly transfer to more complicated problems, due to the lack of complexity and variation in the environments they are trained and tested on.
We developed WILD-SCAV, a powerful and open-world environment based on a 3D open-world FPS game to bridge the gap.
It provides realistic 3D environments of variable complexity, various tasks, and multiple modes of interaction, where agents can learn to perceive 3D environments, navigate and plan, and compete and cooperate in a human-like manner.
arXiv Detail & Related papers (2022-10-14T13:39:41Z)
- MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge [70.47759528596711]
We introduce MineDojo, a new framework built on the popular Minecraft game.
We propose a novel agent learning algorithm that leverages large pre-trained video-language models as a learned reward function.
Our agent is able to solve a variety of open-ended tasks specified in free-form language without any manually designed dense shaping reward.
arXiv Detail & Related papers (2022-06-17T15:53:05Z)
- OPEn: An Open-ended Physics Environment for Learning Without a Task [132.6062618135179]
We study whether models of the world learned in an open-ended physics environment, without any specific task, can be reused for downstream physics reasoning tasks.
We build a benchmark Open-ended Physics ENvironment (OPEn) and also design several tasks to test learning representations in this environment explicitly.
We find that an agent using unsupervised contrastive learning for representation learning, and impact-driven learning for exploration, achieved the best results.
arXiv Detail & Related papers (2021-10-13T17:48:23Z)
- MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research [24.9044606044585]
MiniHack is a powerful sandbox framework for easily designing novel deep reinforcement learning environments.
By leveraging the full set of entities and environment dynamics from NetHack, MiniHack allows designing custom RL testbeds.
In addition to a variety of RL tasks and baselines, MiniHack can wrap existing RL benchmarks and provide ways to seamlessly add additional complexity.
arXiv Detail & Related papers (2021-09-27T17:22:42Z)
- Evaluating Continual Learning Algorithms by Generating 3D Virtual Environments [66.83839051693695]
Continual learning refers to the ability of humans and animals to incrementally learn over time in a given environment.
We propose to leverage recent advances in 3D virtual environments in order to approach the automatic generation of potentially life-long dynamic scenes with photo-realistic appearance.
A novel element of this paper is that scenes are described in a parametric way, thus allowing the user to fully control the visual complexity of the input stream the agent perceives.
arXiv Detail & Related papers (2021-09-16T10:37:21Z)
- The NetHack Learning Environment [79.06395964379107]
We present the NetHack Learning Environment (NLE), a procedurally generated rogue-like environment for Reinforcement Learning research.
We argue that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL.
We demonstrate empirical success for early stages of the game using a distributed Deep RL baseline and Random Network Distillation exploration.
arXiv Detail & Related papers (2020-06-24T14:12:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.