Dreaming in Code for Curriculum Learning in Open-Ended Worlds
- URL: http://arxiv.org/abs/2602.08194v1
- Date: Mon, 09 Feb 2026 01:24:40 GMT
- Title: Dreaming in Code for Curriculum Learning in Open-Ended Worlds
- Authors: Konstantinos Mitsides, Maxence Faldor, Antoine Cully
- Abstract summary: Dreaming in Code (DiCode) is a framework in which foundation models synthesize executable environment code to scaffold learning toward increasing competence. We instantiate DiCode in Craftax, a challenging open-ended benchmark characterized by rich mechanics and long-horizon progression. Our results suggest that code-level environment design provides a practical mechanism for curriculum control, enabling the construction of intermediate environments that bridge competence gaps in open-ended worlds.
- Score: 11.954246951892905
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open-ended learning frames intelligence as emerging from continual interaction with an ever-expanding space of environments. While recent advances have utilized foundation models to programmatically generate diverse environments, these approaches often focus on discovering isolated behaviors rather than orchestrating sustained progression. In complex open-ended worlds, the large combinatorial space of possible challenges makes it difficult for agents to discover sequences of experiences that remain consistently learnable. To address this, we propose Dreaming in Code (DiCode), a framework in which foundation models synthesize executable environment code to scaffold learning toward increasing competence. In DiCode, "dreaming" takes the form of materializing code-level variations of the world. We instantiate DiCode in Craftax, a challenging open-ended benchmark characterized by rich mechanics and long-horizon progression. Empirically, DiCode enables agents to acquire long-horizon skills, achieving a $16\%$ improvement in mean return over the strongest baseline and non-zero success on late-game combat tasks where prior methods fail. Our results suggest that code-level environment design provides a practical mechanism for curriculum control, enabling the construction of intermediate environments that bridge competence gaps in open-ended worlds. Project page and source code are available at https://konstantinosmitsides.github.io/dreaming-in-code and https://github.com/konstantinosmitsides/dreaming-in-code.
Related papers
- CODE-SHARP: Continuous Open-ended Discovery and Evolution of Skills as Hierarchical Reward Programs [8.81909423168606]
We introduce Continuous Open-ended Discovery and Evolution of Skills as Hierarchical Reward Programs (CODE-SHARP). We show that a goal-conditioned agent trained exclusively on the rewards generated by the discovered skills learns to solve increasingly long-horizon goals. When composed by a high-level FM-based planner, the discovered skills enable a single goal-conditioned agent to solve complex, long-horizon tasks, outperforming both pretrained agents and task-specific expert policies by over $134$% on average.
arXiv Detail & Related papers (2026-02-10T18:51:39Z) - Web World Models [60.208836336654315]
We introduce the Web World Model (WWM), a middle ground where world state and "physics" are implemented in ordinary web code. We build a suite of WWMs on a realistic web stack, including an infinite travel atlas grounded in real geography, fictional galaxy explorers, web-scale encyclopedic and narrative worlds, and simulation- and game-like environments. Our results suggest that web stacks themselves can serve as a scalable substrate for world models, enabling controllable yet open-ended environments.
arXiv Detail & Related papers (2025-12-29T18:31:45Z) - CodeClash: Benchmarking Goal-Oriented Software Engineering [63.65464283837602]
We run 1680 tournaments (25,200 rounds total) to evaluate 8 LMs across 6 arenas. Our results reveal that while models exhibit diverse development styles, they share fundamental limitations in strategic reasoning. We open-source CodeClash to advance the study of autonomous, goal-oriented code development.
arXiv Detail & Related papers (2025-11-02T07:42:51Z) - OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code [6.067502582087556]
Open-ended and AI-generating algorithms aim to continuously generate and solve increasingly complex tasks indefinitely. To accomplish this grand vision, learning must occur within a vast array of potential tasks. We introduce a novel framework, OMNI-EPIC, that augments previous work in open-endedness.
arXiv Detail & Related papers (2024-05-24T13:57:32Z) - A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond [84.95530356322621]
This survey presents a systematic review of the advancements in code intelligence. It covers over 50 representative models and their variants, more than 20 categories of tasks, and over 680 related works. Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence.
arXiv Detail & Related papers (2024-03-21T08:54:56Z) - Octopus: Embodied Vision-Language Programmer from Environmental Feedback [58.04529328728999]
Embodied vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning.
To bridge this gap, we introduce Octopus, an embodied vision-language programmer that uses executable code generation as a medium to connect planning and manipulation.
Octopus is designed to 1) proficiently comprehend an agent's visual and textual task objectives, 2) formulate intricate action sequences, and 3) generate executable code.
arXiv Detail & Related papers (2023-10-12T17:59:58Z) - Mastering Diverse Domains through World Models [43.382115013586535]
We present DreamerV3, a general algorithm that outperforms specialized methods across over 150 diverse tasks, with a single configuration.
Dreamer is the first algorithm to collect diamonds in Minecraft from scratch without human data or curricula.
arXiv Detail & Related papers (2023-01-10T18:12:16Z) - WILD-SCAV: Benchmarking FPS Gaming AI on Unity3D-based Environments [5.020816812380825]
Recent advances in deep reinforcement learning (RL) have demonstrated complex decision-making capabilities in simulation environments.
However, these methods hardly extend to more complicated problems, owing to the lack of complexity and variation in the environments they are trained and tested on.
We developed WILD-SCAV, a powerful and open-world environment based on a 3D open-world FPS game to bridge the gap.
It provides realistic 3D environments of variable complexity, various tasks, and multiple modes of interaction, where agents can learn to perceive 3D environments, navigate and plan, and compete and cooperate in a human-like manner.
arXiv Detail & Related papers (2022-10-14T13:39:41Z) - Towards Top-Down Automated Development in Limited Scopes: A Neuro-Symbolic Framework from Expressibles to Executables [4.844958528198992]
We build a taxonomy on code data, namely code taxonomy, leveraging the categorization of code information.
We introduce a three-layer semantic pyramid (SP) to associate text data and code data.
We propose a semantic pyramid framework (SPF) as the approach, focusing on software of high modularity and low complexity.
arXiv Detail & Related papers (2022-09-04T08:35:16Z) - MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge [70.47759528596711]
We introduce MineDojo, a new framework built on the popular Minecraft game.
We propose a novel agent learning algorithm that leverages large pre-trained video-language models as a learned reward function.
Our agent is able to solve a variety of open-ended tasks specified in free-form language without any manually designed dense shaping reward.
arXiv Detail & Related papers (2022-06-17T15:53:05Z) - COSEA: Convolutional Code Search with Layer-wise Attention [90.35777733464354]
We propose a new deep learning architecture, COSEA, which leverages convolutional neural networks with layer-wise attention to capture the code's intrinsic structural logic.
COSEA can achieve significant improvements over state-of-the-art methods on code search tasks.
arXiv Detail & Related papers (2020-10-19T13:53:38Z)