LODGE: Joint Hierarchical Task Planning and Learning of Domain Models with Grounded Execution
- URL: http://arxiv.org/abs/2505.13497v1
- Date: Thu, 15 May 2025 20:23:21 GMT
- Authors: Claudius Kienle, Benjamin Alt, Oleg Arenz, Jan Peters
- Abstract summary: Large Language Models (LLMs) enable planning from natural language instructions using implicit world knowledge. Recent methods aim to learn a problem domain that can be solved for different goal states using classical planners. We address this shortcoming by learning hierarchical domains, where low-level predicates and actions are composed into higher-level counterparts.
- Score: 16.16223684887115
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large Language Models (LLMs) enable planning from natural language instructions using implicit world knowledge, but often produce flawed plans that require refinement. Instead of directly predicting plans, recent methods aim to learn a problem domain that can be solved for different goal states using classical planners. However, these approaches require significant human feedback to obtain useful models. We address this shortcoming by learning hierarchical domains, where low-level predicates and actions are composed into higher-level counterparts, and by leveraging simulation to validate their preconditions and effects. This hierarchical approach is particularly powerful for long-horizon planning, where LLM-based planning approaches typically struggle. Furthermore, we introduce a central error reasoner to ensure consistency among the different planning levels. Evaluation on two challenging International Planning Competition (IPC) domains and a long-horizon robot manipulation task demonstrates higher planning success rates than state-of-the-art domain synthesis and LLM-modulo planning methods, while constructing high-quality models of the domain. Resources, videos and detailed experiment results are available at https://claudius-kienle.github.io/lodge/.
Related papers
- OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems. Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm. We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z)
- HELP: Hierarchical Embodied Language Planner for Household Tasks [75.38606213726906]
Embodied agents tasked with complex scenarios rely heavily on robust planning capabilities. Large language models equipped with extensive linguistic knowledge can play this role. We propose a Hierarchical Embodied Language Planner, called HELP, consisting of a set of LLM-based agents.
arXiv Detail & Related papers (2025-12-25T15:54:08Z)
- Designing Domain-Specific Agents via Hierarchical Task Abstraction Mechanism [61.01709143437043]
We introduce a novel agent design framework centered on a Hierarchical Task Abstraction Mechanism (HTAM). Specifically, HTAM moves beyond emulating social roles, instead structuring multi-agent systems into a logical hierarchy that mirrors the intrinsic task-dependency graph of a given domain. We instantiate this framework as EarthAgent, a multi-agent system tailored for complex geospatial analysis.
arXiv Detail & Related papers (2025-11-21T12:25:47Z)
- World Model Implanting for Test-time Adaptation of Embodied Agents [29.514831254621438]
In embodied AI, a persistent challenge is enabling agents to robustly adapt to novel domains without requiring extensive data collection or retraining. We present a world model implanting framework (WorMI) that combines the reasoning capabilities of large language models with independently learned, domain-specific world models. We evaluate WorMI on the VirtualHome and ALFWorld benchmarks, demonstrating superior zero-shot and few-shot performance compared to several LLM-based approaches.
arXiv Detail & Related papers (2025-09-04T07:32:16Z)
- Can LLM-Reasoning Models Replace Classical Planning? A Benchmark Study [0.0]
Large Language Models have sparked interest in their potential for robotic task planning. While these models demonstrate strong generative capabilities, their effectiveness in producing structured and executable plans remains uncertain. This paper presents a systematic evaluation of a broad spectrum of current state-of-the-art language models.
arXiv Detail & Related papers (2025-07-31T14:25:54Z)
- CREW-WILDFIRE: Benchmarking Agentic Multi-Agent Collaborations at Scale [4.464959191643012]
We introduce CREW-Wildfire, an open-source benchmark designed to evaluate next-generation multi-agent Agentic AI frameworks. CREW-Wildfire offers procedurally generated wildfire response scenarios featuring large maps, heterogeneous agents, partial observability, dynamics, and long-horizon planning objectives. We implement and evaluate several state-of-the-art LLM-based multi-agent Agentic AI frameworks, uncovering significant performance gaps.
arXiv Detail & Related papers (2025-07-07T16:33:42Z)
- Adaptive Domain Modeling with Language Models: A Multi-Agent Approach to Task Planning [5.638621244710438]
TAPAS employs specialized LLM-based agents that collaboratively generate and adapt domain models. A ReAct (Reason+Act)-style execution agent, coupled with natural language plan translation, bridges the gap between dynamically generated plans and real-world robot capabilities.
arXiv Detail & Related papers (2025-06-24T13:02:06Z)
- Hierarchical Language Models for Semantic Navigation and Manipulation in an Aerial-Ground Robotic System [8.88014241557266]
Heterogeneous multirobot systems show great potential in complex tasks requiring coordinated hybrid cooperation. Existing methods that rely on static or task-specific models often lack generalizability across diverse tasks and dynamic environments. We propose a hierarchical multimodal framework that integrates a prompted large language model (LLM) with a fine-tuned vision-language model (VLM).
arXiv Detail & Related papers (2025-06-05T13:27:41Z)
- OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation [65.15955645757705]
We introduce Workforce, a hierarchical multi-agent framework that decouples strategic planning from specialized execution. During inference, Workforce seamlessly adapts to new domains by adding or modifying worker agents. For training, we introduce Optimized Workforce Learning (OWL), which improves generalization across domains.
arXiv Detail & Related papers (2025-05-29T17:51:58Z)
- Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks [36.63527489464188]
Plan-and-Act is a framework that incorporates explicit planning into large language models (LLMs). Plan-and-Act consists of a Planner model, which generates structured, high-level plans to achieve user goals, and an Executor model, which translates these plans into environment-specific actions. We present a state-of-the-art 57.58% success rate on the WebArena-Lite benchmark as well as a text-only state-of-the-art 81.36% success rate on WebVoyager.
arXiv Detail & Related papers (2025-03-12T17:40:52Z)
- LLM-Generated Heuristics for AI Planning: Do We Even Need Domain-Independence Anymore? [87.71321254733384]
Large language models (LLMs) can generate planning approaches tailored to specific planning problems. LLMs can achieve state-of-the-art performance on some standard IPC domains. We discuss whether these results signify a paradigm shift and how they can complement existing planning approaches.
arXiv Detail & Related papers (2025-01-30T22:21:12Z)
- Nl2Hltl2Plan: Scaling Up Natural Language Understanding for Multi-Robots Through Hierarchical Temporal Logic Task Representation [8.180994118420053]
Nl2Hltl2Plan is a framework that translates natural language commands into hierarchical Linear Temporal Logic (LTL). First, an LLM transforms instructions into a Hierarchical Task Tree, capturing logical and temporal relations. Next, a fine-tuned LLM converts sub-tasks into flat formulas, which are aggregated into hierarchical specifications.
arXiv Detail & Related papers (2024-08-15T14:46:13Z)
- Exploring and Benchmarking the Planning Capabilities of Large Language Models [57.23454975238014]
This work lays the foundations for improving the planning capabilities of large language models (LLMs).
We construct a comprehensive benchmark suite encompassing both classical planning benchmarks and natural language scenarios.
We investigate the use of many-shot in-context learning to enhance LLM planning, exploring the relationship between increased context length and improved planning performance.
arXiv Detail & Related papers (2024-06-18T22:57:06Z)
- LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning [65.86754998249224]
We develop a novel hybrid planner that leverages a conventional rule-based planner in conjunction with an LLM-based planner.
Our approach navigates complex scenarios that existing planners struggle with and produces well-reasoned outputs while remaining grounded by working alongside the rule-based approach.
arXiv Detail & Related papers (2023-12-30T02:53:45Z)
- Learning adaptive planning representations with natural language guidance [90.24449752926866]
This paper describes Ada, a framework for automatically constructing task-specific planning representations.
Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks.
arXiv Detail & Related papers (2023-12-13T23:35:31Z)
- Compositional Foundation Models for Hierarchical Planning [52.18904315515153]
We propose a foundation model that leverages multiple expert foundation models, trained individually on language, vision, and action data, to jointly solve long-horizon tasks.
We use a large language model to construct symbolic plans that are grounded in the environment through a large video diffusion model.
Generated video plans are then grounded to visual-motor control, through an inverse dynamics model that infers actions from generated videos.
arXiv Detail & Related papers (2023-09-15T17:44:05Z)
- AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
arXiv Detail & Related papers (2023-08-16T22:47:16Z)
- Learning to Reason over Scene Graphs: A Case Study of Finetuning GPT-2 into a Robot Language Model for Grounded Task Planning [45.51792981370957]
We investigate the applicability of a smaller class of large language models (LLMs) in robotic task planning by learning to decompose tasks into subgoal specifications for a planner to execute sequentially.
Our method grounds the input of the LLM on the domain that is represented as a scene graph, enabling it to translate human requests into executable robot plans.
Our findings suggest that the knowledge stored in an LLM can be effectively grounded to perform long-horizon task planning, demonstrating the promising potential for the future application of neuro-symbolic planning methods in robotics.
arXiv Detail & Related papers (2023-05-12T18:14:32Z)
- A Framework for Neurosymbolic Robot Action Planning using Large Language Models [3.0501524254444767]
We present a framework aimed at bridging the gap between symbolic task planning and machine learning approaches.
The rationale is to train Large Language Models (LLMs) into a neurosymbolic task planner compatible with the Planning Domain Definition Language (PDDL).
Preliminary results in selected domains show that our method can: (i) solve 95.5% of problems in a test data set of 1,000 samples; (ii) produce plans up to 13.5% shorter than a traditional symbolic planner; (iii) reduce the average overall waiting time for plan availability by up to 61.4%.
arXiv Detail & Related papers (2023-03-01T11:54:22Z)
- Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning [78.65083326918351]
We consider alternatives to an implicit sequential planning assumption.
We propose Divide-and-Conquer Monte Carlo Tree Search (DC-MCTS) for approximating the optimal plan.
We show that this algorithmic flexibility over planning order leads to improved results in navigation tasks in grid-worlds.
arXiv Detail & Related papers (2020-04-23T18:08:58Z)