L3M+P: Lifelong Planning with Large Language Models
- URL: http://arxiv.org/abs/2508.01917v1
- Date: Sun, 03 Aug 2025 21:01:50 GMT
- Title: L3M+P: Lifelong Planning with Large Language Models
- Authors: Krish Agarwal, Yuqian Jiang, Jiaheng Hu, Bo Liu, Peter Stone
- Abstract summary: This paper introduces L3M+P, a framework that uses an external knowledge graph as a representation of the world state. At planning time, given a natural language description of a task, L3M+P retrieves context from the knowledge graph and generates a problem definition for classical planners.
- Score: 33.88987644905278
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: By combining classical planning methods with large language models (LLMs), recent research such as LLM+P has enabled agents to plan for general tasks given in natural language. However, scaling these methods to general-purpose service robots remains challenging: (1) classical planning algorithms generally require a detailed and consistent specification of the environment, which is not always readily available; and (2) existing frameworks mainly focus on isolated planning tasks, whereas robots are often meant to serve in long-term continuous deployments, and therefore must maintain a dynamic memory of the environment which can be updated with multi-modal inputs and extracted as planning knowledge for future tasks. To address these two issues, this paper introduces L3M+P (Lifelong LLM+P), a framework that uses an external knowledge graph as a representation of the world state. The graph can be updated from multiple sources of information, including sensory input and natural language interactions with humans. L3M+P enforces rules for the expected format of the absolute world state graph to maintain consistency between graph updates. At planning time, given a natural language description of a task, L3M+P retrieves context from the knowledge graph and generates a problem definition for classical planners. Evaluated on household robot simulators and on a real-world service robot, L3M+P achieves significant improvement over baseline methods both on accurately registering natural language state changes and on correctly generating plans, thanks to the knowledge graph retrieval and verification.
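To make the pipeline in the abstract concrete, below is a minimal sketch of the planning-time flow: retrieve facts relevant to a natural-language task from the knowledge graph, then ask an LLM to emit a PDDL problem for a classical planner. The graph structure, function names, and prompt are illustrative assumptions, not the authors' implementation, and the consistency rules and verification the paper describes are omitted here.

```python
# Minimal sketch of the L3M+P planning-time flow described above; not the
# authors' code. Assumptions: the knowledge graph is a networkx.DiGraph whose
# edges carry a "relation" label, and `llm` is any callable mapping a prompt
# string to text. The returned PDDL problem string would be handed to an
# off-the-shelf classical planner (not shown).
import networkx as nx


def retrieve_context(graph: nx.DiGraph, task: str, hops: int = 1) -> list[str]:
    """Collect graph facts whose entities are mentioned in the task (plus neighbors)."""
    frontier = {n for n in graph.nodes if str(n).lower() in task.lower()}
    for _ in range(hops):
        frontier |= {m for n in frontier for m in graph.successors(n)}
    return [
        f"({data.get('relation', 'related_to')} {u} {v})"
        for u, v, data in graph.edges(data=True)
        if u in frontier or v in frontier
    ]


def generate_pddl_problem(llm, task: str, facts: list[str], domain: str) -> str:
    """Ask the LLM to turn retrieved facts plus the task into a PDDL problem file."""
    prompt = (
        f"Domain: {domain}\n"
        "Known facts:\n" + "\n".join(facts) + "\n"
        f"Task: {task}\n"
        "Write a PDDL problem file (:objects, :init, :goal) consistent with the facts."
    )
    return llm(prompt)


if __name__ == "__main__":
    g = nx.DiGraph()
    g.add_edge("mug", "kitchen_table", relation="on")
    g.add_edge("robot", "living_room", relation="in")
    task = "Bring the mug to the living room"
    facts = retrieve_context(g, task)                        # -> ['(on mug kitchen_table)']
    stub_llm = lambda p: "(define (problem bring-mug) ...)"  # stand-in for a real LLM
    print(generate_pddl_problem(stub_llm, task, facts, domain="household"))
```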
Related papers
- LODGE: Joint Hierarchical Task Planning and Learning of Domain Models with Grounded Execution [16.16223684887115]
Large Language Models (LLMs) enable planning from natural language instructions using implicit world knowledge. Recent methods aim to learn a problem domain that can be solved for different goal states using classical planners. We address this shortcoming by learning hierarchical domains, where low-level predicates and actions are composed into higher-level counterparts.
arXiv Detail & Related papers (2025-05-15T20:23:21Z)
- A Temporal Planning Framework for Multi-Agent Systems via LLM-Aided Knowledge Base Management [5.548477348501636]
This paper presents a novel framework, called PLANTOR, that integrates Large Language Models (LLMs) with Prolog-based knowledge management and planning for multi-robot tasks. Results demonstrate that LLMs can produce accurate knowledge bases with modest human feedback, while Prolog guarantees formal correctness and explainability. This approach underscores the potential of LLM integration for advanced robotics tasks requiring flexible, scalable, and human-understandable planning.
arXiv Detail & Related papers (2025-02-26T13:51:28Z)
- Plan-over-Graph: Towards Parallelable LLM Agent Schedule [53.834646147919436]
Large Language Models (LLMs) have demonstrated exceptional abilities in reasoning for task planning. This paper introduces a novel paradigm, plan-over-graph, in which the model first decomposes a real-life textual task into executable subtasks and constructs an abstract task graph. The model then takes this task graph as input and generates a plan for parallel execution.
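As a rough illustration of the parallel-execution idea (not the paper's code), the snippet below assumes an LLM has already produced a dependency graph of subtasks (here hand-written) and simply batches independent subtasks by topological level:

```python
# Illustrative sketch, not the paper's implementation: subtasks in the same
# topological generation have no dependencies on each other and can run in parallel.
import networkx as nx


def parallel_schedule(task_graph: nx.DiGraph) -> list[list[str]]:
    """Group subtasks into batches; each batch depends only on earlier batches."""
    return [sorted(level) for level in nx.topological_generations(task_graph)]


# Toy task graph, standing in for an LLM-generated decomposition of "make dinner".
g = nx.DiGraph([
    ("chop_vegetables", "cook_stew"),
    ("boil_water", "cook_pasta"),
    ("cook_stew", "serve"),
    ("cook_pasta", "serve"),
])
print(parallel_schedule(g))
# [['boil_water', 'chop_vegetables'], ['cook_pasta', 'cook_stew'], ['serve']]
```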
arXiv Detail & Related papers (2025-02-20T13:47:51Z)
- EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios [53.26658545922884]
We introduce EgoPlan-Bench2, a benchmark designed to assess the planning capabilities of MLLMs across a wide range of real-world scenarios. We evaluate 21 competitive MLLMs and provide an in-depth analysis of their limitations, revealing that they face significant challenges in real-world planning. Our approach enhances the performance of GPT-4V by 10.24 on EgoPlan-Bench2 without additional training.
arXiv Detail & Related papers (2024-12-05T18:57:23Z)
- DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning [9.31108717722043]
Vision-language models (VLMs) have been applied to robot task planning problems.
DKPROMPT automates VLM prompting using domain knowledge in PDDL for classical planning in open worlds.
arXiv Detail & Related papers (2024-06-25T15:49:47Z)
- Learning adaptive planning representations with natural language guidance [90.24449752926866]
This paper describes Ada, a framework for automatically constructing task-specific planning representations.
Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks.
arXiv Detail & Related papers (2023-12-13T23:35:31Z)
- Embodied Task Planning with Large Language Models [86.63533340293361]
We propose a TAsk Planing Agent (TaPA) in embodied tasks for grounded planning with physical scene constraint.
During inference, we discover the objects in the scene by extending open-vocabulary object detectors to multi-view RGB images collected in different achievable locations.
Experimental results show that the generated plan from our TaPA framework can achieve a higher success rate than LLaVA and GPT-3.5 by a sizable margin.
arXiv Detail & Related papers (2023-07-04T17:58:25Z)
- Learning to Reason over Scene Graphs: A Case Study of Finetuning GPT-2 into a Robot Language Model for Grounded Task Planning [45.51792981370957]
We investigate the applicability of a smaller class of large language models (LLMs) in robotic task planning by learning to decompose tasks into subgoal specifications for a planner to execute sequentially.
Our method grounds the input of the LLM on the domain that is represented as a scene graph, enabling it to translate human requests into executable robot plans.
Our findings suggest that the knowledge stored in an LLM can be effectively grounded to perform long-horizon task planning, demonstrating the promising potential for the future application of neuro-symbolic planning methods in robotics.
arXiv Detail & Related papers (2023-05-12T18:14:32Z)
- Sequential Manipulation Planning on Scene Graph [90.28117916077073]
We devise a 3D scene graph representation, contact graph+ (cg+), for efficient sequential task planning.
Goal configurations, naturally specified on contact graphs, can be produced by a genetic algorithm with an optimization method.
A task plan is then produced by computing the Graph Editing Distance (GED) between the initial contact graphs and the goal configurations, which yields graph edit operations corresponding to possible robot actions.
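A rough sketch of the edit-distance step, using networkx rather than the authors' implementation: the toy contact graphs, node labels, and unit edit costs below are assumptions, but they show how edge edits (delete one contact, insert another) can be read as candidate pick-and-place actions.

```python
# Illustrative only: compute an optimal edit path between an initial and a goal
# contact graph and treat the edge edits as candidate robot actions.
import networkx as nx

nodes = ["mug", "table", "book", "shelf"]

init_g = nx.Graph()
init_g.add_nodes_from((n, {"label": n}) for n in nodes)
init_g.add_edges_from([("mug", "table"), ("book", "shelf")])   # mug starts on the table

goal_g = nx.Graph()
goal_g.add_nodes_from((n, {"label": n}) for n in nodes)
goal_g.add_edges_from([("mug", "shelf"), ("book", "shelf")])   # goal: mug on the shelf

# Match nodes by label so each object maps onto itself.
same = lambda a, b: a["label"] == b["label"]
paths, cost = nx.optimal_edit_paths(init_g, goal_g, node_match=same)
node_ops, edge_ops = paths[0]

print(cost)      # 2 edits: delete the (mug, table) contact, insert (mug, shelf)
print(edge_ops)  # edge edits that map onto a pick-and-place action for the mug
```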
arXiv Detail & Related papers (2022-07-10T02:01:33Z)
- iCORPP: Interleaved Commonsense Reasoning and Probabilistic Planning on Robots [46.13039152809055]
We present a novel algorithm, called iCORPP, to simultaneously estimate the current world state, reason about world dynamics, and construct task-oriented controllers.
Results show significant improvements in scalability, efficiency, and adaptiveness, compared to competitive baselines.
arXiv Detail & Related papers (2020-04-18T17:46:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.