Related papers: CrafterDojo: A Suite of Foundation Models for Building Open-Ended Embodied Agents in Crafter

URL: http://arxiv.org/abs/2508.13530v1
Date: Tue, 19 Aug 2025 05:43:19 GMT
Title: CrafterDojo: A Suite of Foundation Models for Building Open-Ended Embodied Agents in Crafter
Authors: Junyeong Park, Hyeonseo Cho, Sungjin Ahn
Abstract summary: Minecraft provides rich complexity and internet-scale data, but its slow speed and engineering overhead make it unsuitable for rapid prototyping. Crafter offers a lightweight alternative that retains key challenges from Minecraft, yet its use has remained limited to narrow tasks. We present CrafterDojo, a suite of foundation models and tools that unlock the Crafter environment as a lightweight, prototyping-friendly, and Minecraft-like testbed for embodied agent research.
Score: 14.859398858994302
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Developing general-purpose embodied agents is a core challenge in AI. Minecraft provides rich complexity and internet-scale data, but its slow speed and engineering overhead make it unsuitable for rapid prototyping. Crafter offers a lightweight alternative that retains key challenges from Minecraft, yet its use has remained limited to narrow tasks due to the absence of foundation models that have driven progress in the Minecraft setting. In this paper, we present CrafterDojo, a suite of foundation models and tools that unlock the Crafter environment as a lightweight, prototyping-friendly, and Minecraft-like testbed for general-purpose embodied agent research. CrafterDojo addresses this by introducing CrafterVPT, CrafterCLIP, and CrafterSteve-1 for behavior priors, vision-language grounding, and instruction following, respectively. In addition, we provide toolkits for generating behavior and caption datasets (CrafterPlay and CrafterCaption), reference agent implementations, benchmark evaluations, and a complete open-source codebase.

Related papers

SkillCraft: Can LLM Agents Learn to Use Tools Skillfully? [67.69996753743129]
We introduce SkillCraft, a benchmark that explicitly stress-tests agents' ability to form and reuse higher-level tool compositions. SkillCraft features realistic, highly compositional tool-use scenarios with difficulty scaled along both quantitative and structural dimensions. We propose a lightweight evaluation protocol that enables agents to auto-compose atomic tools into executable Skills, then cache and reuse them within and across tasks.
arXiv Detail & Related papers (2026-02-28T15:44:31Z)
SC2Tools: StarCraft II Toolset and Dataset API [0.26097841018267615]
Gaming and esports are key areas influenced by the application of Artificial Intelligence (AI) and Machine Learning (ML) solutions at scale. In this work, we present "SC2Tools", a toolset containing multiple submodules responsible for working with and producing larger datasets. The tools we present were leveraged in creating one of the largest StarCraft II tournament datasets to date, with a separate PyTorch and PyTorch Lightning application programming interface (API) for easy access to the data.
arXiv Detail & Related papers (2025-09-22T22:25:21Z)
Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts [54.21319853862452]
We present Optimus-3, a general-purpose agent for Minecraft. We propose a knowledge-enhanced data generation pipeline to provide scalable and high-quality training data for agent development. We develop a Multimodal Reasoning-Augmented Reinforcement Learning approach to enhance the agent's reasoning ability for visual diversity.
arXiv Detail & Related papers (2025-06-12T05:29:40Z)
MineStudio: A Streamlined Package for Minecraft AI Agent Development [12.327116914644627]
This paper presents MineStudio, an open-source software package designed to streamline the development of autonomous agents in Minecraft. MineStudio represents the first comprehensive integration of seven critical engineering components: simulator, data, model, offline pre-training, online fine-tuning, inference, and benchmark. We provide a user-friendly API design accompanied by comprehensive documentation and tutorials.
arXiv Detail & Related papers (2024-12-24T09:01:43Z)
Odyssey: Empowering Minecraft Agents with Open-World Skills [26.537984734738764]
We introduce Odyssey, a new framework that empowers Large Language Model (LLM)-based agents with open-world skills to explore the vast Minecraft world. Odyssey comprises three key parts: (1) An interactive agent with an open-world skill library that consists of 40 primitive skills and 183 compositional skills; (2) A fine-tuned LLaMA-3 model trained on a large question-answering dataset with 390k+ instruction entries derived from the Minecraft Wiki; and (3) A new agent capability benchmark.
arXiv Detail & Related papers (2024-07-22T02:06:59Z)
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning [4.067733179628694]
Craftax is a ground-up rewrite of Crafter in JAX that runs up to 250x faster than the Python-native original. A run of PPO using 1 billion environment interactions finishes in under an hour using only a single GPU. We show that existing methods including global and episodic exploration, as well as unsupervised environment design fail to make material progress on the benchmark.
arXiv Detail & Related papers (2024-02-26T18:19:07Z)
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft [88.80684763462384]
This paper introduces an advanced learning system, named Auto MC-Reward, that leverages Large Language Models (LLMs) to automatically design dense reward functions. Experiments demonstrate a significant improvement in the success rate and learning efficiency of our agents in complex tasks in Minecraft.
arXiv Detail & Related papers (2023-12-14T18:58:12Z)
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory [97.87093169454431]
Ghost in the Minecraft (GITM) is a novel framework that integrates Large Language Models (LLMs) with text-based knowledge and memory. We develop a set of structured actions and leverage LLMs to generate action plans for the agents to execute. The resulting LLM-based agent markedly surpasses previous methods, achieving a remarkable improvement of +47.5% in success rate.
arXiv Detail & Related papers (2023-05-25T17:59:49Z)
Learning to Generalize with Object-centric Agents in the Open World Survival Game Crafter [72.80855376702746]
Reinforcement learning agents must generalize beyond their training experience. We introduce a new set of environments suitable for evaluating some agent's ability to generalize. We show that current agents struggle to generalize, and introduce novel object-centric agents that improve over strong baselines.
arXiv Detail & Related papers (2022-08-05T20:05:46Z)
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge [70.47759528596711]
We introduce MineDojo, a new framework built on the popular Minecraft game. We propose a novel agent learning algorithm that leverages large pre-trained video-language models as a learned reward function. Our agent is able to solve a variety of open-ended tasks specified in free-form language without any manually designed dense shaping reward.
arXiv Detail & Related papers (2022-06-17T15:53:05Z)
EvoCraft: A New Challenge for Open-Endedness [7.927206441149002]
EvoCraft is a framework for Minecraft designed to study open-ended algorithms. EvoCraft offers a challenging new environment for automated search methods (such as evolution) to find complex artifacts.
arXiv Detail & Related papers (2020-12-08T21:36:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.