Related papers: Generative World Models of Tasks: LLM-Driven Hierarchical Scaffolding for Embodied Agents

Generative World Models of Tasks: LLM-Driven Hierarchical Scaffolding for Embodied Agents

URL: http://arxiv.org/abs/2509.04731v3
Date: Tue, 04 Nov 2025 06:25:59 GMT
Title: Generative World Models of Tasks: LLM-Driven Hierarchical Scaffolding for Embodied Agents
Authors: Brennen Hill,
Abstract summary: We propose an effective world model for decision-making that models the world's physics and its task semantics.<n>A systematic review of 2024 research in low-resource multi-agent soccer reveals a clear trend towards integrating symbolic and hierarchical methods.<n>We formalize this trend into a framework for Hierarchical Task Environments (HTEs), which are essential for bridging the gap between simple, reactive behaviors and sophisticated, strategic team play.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in agent development have focused on scaling model size and raw interaction data, mirroring successes in large language models. However, for complex, long-horizon multi-agent tasks such as robotic soccer, this end-to-end approach often fails due to intractable exploration spaces and sparse rewards. We propose that an effective world model for decision-making must model the world's physics and also its task semantics. A systematic review of 2024 research in low-resource multi-agent soccer reveals a clear trend towards integrating symbolic and hierarchical methods, such as Hierarchical Task Networks (HTNs) and Bayesian Strategy Networks (BSNs), with multi-agent reinforcement learning (MARL). These methods decompose complex goals into manageable subgoals, creating an intrinsic curriculum that shapes agent learning. We formalize this trend into a framework for Hierarchical Task Environments (HTEs), which are essential for bridging the gap between simple, reactive behaviors and sophisticated, strategic team play. Our framework incorporates the use of Large Language Models (LLMs) as generative world models of tasks, capable of dynamically generating this scaffolding. We argue that HTEs provide a mechanism to guide exploration, generate meaningful learning signals, and train agents to internalize hierarchical structure, enabling the development of more capable and general-purpose agents with greater sample efficiency than purely end-to-end approaches.

Related papers

MARTI-MARS$^2$: Scaling Multi-Agent Self-Search via Reinforcement Learning for Code Generation [64.2621682259008]
Multi-Agent Reinforced Training and Inference Framework with Self-Search Scaling (MARTI-MARS2)<n>We propose a Multi-Agent Reinforced Training and Inference Framework with Self-Search Scaling (MARTI-MARS2) to integrate policy learning with multi-agent tree search.<n>We show that MARTI-MARS2 achieves 77.7%, outperforming strong baselines like GPT-5.1 on challenging code generation benchmarks.
arXiv Detail & Related papers (2026-02-08T07:28:44Z)
SYMPHONY: Synergistic Multi-agent Planning with Heterogeneous Language Model Assembly [6.444704310331922]
We propose Synergistic Multi-agent Planning with Heterogeneous langauge model assembly (SYMPHONY), a novel multi-agent planning framework.<n>By leveraging diverse reasoning patterns across agents, SYMPHONY enhances rollout diversity and facilitates more effective exploration.<n> Empirical results show that SYMPHONY achieves strong performance even when instantiated with open-source LLMs deployable on consumer-grade hardware.
arXiv Detail & Related papers (2026-01-30T06:26:34Z)
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models [78.73992315826035]
We introduce Youtu-LLM, a lightweight language model that harmonizes high computational efficiency with native agentic intelligence.<n>Youtu-LLM is pre-trained from scratch to systematically cultivate reasoning and planning capabilities.
arXiv Detail & Related papers (2025-12-31T04:25:11Z)
Designing Domain-Specific Agents via Hierarchical Task Abstraction Mechanism [61.01709143437043]
We introduce a novel agent design framework centered on a Hierarchical Task Abstraction Mechanism (HTAM)<n>Specifically, HTAM moves beyond emulating social roles, instead structuring multi-agent systems into a logical hierarchy that mirrors the intrinsic task-dependency graph of a given domain.<n>We instantiate this framework as EarthAgent, a multi-agent system tailored for complex geospatial analysis.
arXiv Detail & Related papers (2025-11-21T12:25:47Z)
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey [104.31926740841128]
The emergence of agentic reinforcement learning (Agentic RL) marks a paradigm shift from conventional reinforcement learning applied to large language models (LLM RL)<n>This survey formalizes this conceptual shift by contrasting the degenerate single-step Markov Decision Processes (MDPs) of LLM-RL with the temporally extended, partially observable Markov decision processes (POMDPs) that define Agentic RL.
arXiv Detail & Related papers (2025-09-02T17:46:26Z)
CREW-WILDFIRE: Benchmarking Agentic Multi-Agent Collaborations at Scale [4.464959191643012]
We introduce CREW-Wildfire, an open-source benchmark designed to evaluate next-generation multi-agent Agentic AI frameworks.<n> CREW-Wildfire offers procedurally generated wildfire response scenarios featuring large maps, heterogeneous agents, partial observability, dynamics, and long-horizon planning objectives.<n>We implement and evaluate several state-of-the-art LLM-based multi-agent Agentic AI frameworks, uncovering significant performance gaps.
arXiv Detail & Related papers (2025-07-07T16:33:42Z)
Continual Learning for Generative AI: From LLMs to MLLMs and Beyond [56.29231194002407]
We present a comprehensive survey of continual learning methods for mainstream generative AI models.<n>We categorize these approaches into three paradigms: architecture-based, regularization-based, and replay-based.<n>We analyze continual learning setups for different generative models, including training objectives, benchmarks, and core backbones.
arXiv Detail & Related papers (2025-06-16T02:27:25Z)
World Models for Cognitive Agents: Transforming Edge Intelligence in Future Networks [55.90051810762702]
We present a comprehensive overview of world models, highlighting their architecture, training paradigms, and applications across prediction, generation, planning, and causal reasoning.<n>We propose Wireless Dreamer, a novel world model-based reinforcement learning framework tailored for wireless edge intelligence optimization.
arXiv Detail & Related papers (2025-05-31T06:43:00Z)
Large Language Model Agent: A Survey on Methodology, Applications and Challenges [88.3032929492409]
Large Language Model (LLM) agents, with goal-driven behaviors and dynamic adaptation capabilities, potentially represent a critical pathway toward artificial general intelligence.<n>This survey systematically deconstructs LLM agent systems through a methodology-centered taxonomy.<n>Our work provides a unified architectural perspective, examining how agents are constructed, how they collaborate, and how they evolve over time.
arXiv Detail & Related papers (2025-03-27T12:50:17Z)
APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents [8.479128275067742]
We present an advanced Large Language Model (LLM)-driven framework that enables autonomous agents to construct complex structures in Minecraft.<n>By employing chain-of-thought decomposition along with multimodal inputs, the framework generates detailed architectural layouts and blueprints.<n>Our agent incorporates both memory and reflection modules to facilitate lifelong learning, adaptive refinement, and error correction throughout the building process.
arXiv Detail & Related papers (2024-11-26T09:31:28Z)
Toward Universal and Interpretable World Models for Open-ended Learning Agents [0.0]
We introduce a generic, compositional and interpretable class of generative world models that supports open-ended learning agents. This is a sparse class of Bayesian networks capable of approximating a broad range of processes, which provide agents with the ability to learn world models in a manner that may be both interpretable and computationally scalable.
arXiv Detail & Related papers (2024-09-27T12:03:15Z)
Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models [106.35361897941898]
We propose a novel world model for Multi-Agent RL (MARL) that learns decentralized local dynamics for scalability.<n>We also introduce a Perceiver Transformer as an effective solution to enable centralized representation aggregation.<n>Results on Starcraft Multi-Agent Challenge (SMAC) show that it outperforms strong model-free approaches and existing model-based methods in both sample efficiency and overall performance.
arXiv Detail & Related papers (2024-06-22T12:40:03Z)
VDFD: Multi-Agent Value Decomposition Framework with Disentangled World Model [10.36125908359289]
We propose a novel model-based multi-agent reinforcement learning approach named Value Decomposition Framework with Disentangled World Model.<n>Our method achieves high sample efficiency and exhibits superior performance compared to other baselines across a wide range of multi-agent learning tasks.
arXiv Detail & Related papers (2023-09-08T22:12:43Z)
Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning [28.619845209653274]
We investigate the use of natural language to drive the generalization of policies in multi-agent settings. We propose a novel framework for language grounding in multi-agent reinforcement learning, entity divider (EnDi) EnDi enables agents to independently learn subgoal division at the entity level and act in the environment based on the associated entities.
arXiv Detail & Related papers (2022-10-25T11:53:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.