Toward PDDL Planning Copilot
- URL: http://arxiv.org/abs/2509.12987v1
- Date: Tue, 16 Sep 2025 11:51:07 GMT
- Title: Toward PDDL Planning Copilot
- Authors: Yarin Benyamin, Argaman Mordoch, Shahaf S. Shperberg, Roni Stern,
- Abstract summary: The Planning Copilot integrates multiple planning tools and allows users to invoke them through instructions in natural language.<n>We empirically evaluate the ability of our Planning Copilot to perform planning tasks using three open-source LLMs.<n>Our results show that our Planning Copilot significantly outperforms GPT-5 despite relying on a much smaller LLM.
- Score: 7.9654550247344895
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large Language Models (LLMs) are increasingly being used as autonomous agents capable of performing complicated tasks. However, they lack the ability to perform reliable long-horizon planning on their own. This paper bridges this gap by introducing the Planning Copilot, a chatbot that integrates multiple planning tools and allows users to invoke them through instructions in natural language. The Planning Copilot leverages the Model Context Protocol (MCP), a recently developed standard for connecting LLMs with external tools and systems. This approach allows using any LLM that supports MCP without domain-specific fine-tuning. Our Planning Copilot supports common planning tasks such as checking the syntax of planning problems, selecting an appropriate planner, calling it, validating the plan it generates, and simulating their execution. We empirically evaluate the ability of our Planning Copilot to perform these tasks using three open-source LLMs. The results show that the Planning Copilot highly outperforms using the same LLMs without the planning tools. We also conducted a limited qualitative comparison of our tool against Chat GPT-5, a very recent commercial LLM. Our results shows that our Planning Copilot significantly outperforms GPT-5 despite relying on a much smaller LLM. This suggests dedicated planning tools may be an effective way to enable LLMs to perform planning tasks.
Related papers
- Learning to Reason and Navigate: Parameter Efficient Action Planning with Large Language Models [63.765846080050906]
This paper proposes a novel parameter-efficient action planner using large language models (PEAP-LLM) to generate a single-step instruction at each location.<n>Experiments show the superiority of our proposed model on REVERIE compared to the previous state-of-the-art.
arXiv Detail & Related papers (2025-05-12T12:38:20Z) - SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation [22.71952631383849]
SmallPlan is a novel framework leveraging Large Language Models as teacher models to train lightweight Small Language Models (SLMs) for high-level path planning tasks.<n>SLMs are trained in a simulation-powered, interleaved manner with LLM-guided supervised fine-tuning (SFT) and reinforcement learning (RL)<n>We demonstrate that the fine-tuned SLMs perform competitively with larger models like GPT-4o on sequential path planning, without suffering from hallucination and overfitting.
arXiv Detail & Related papers (2025-05-01T19:44:36Z) - LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language [17.914580097058106]
Bimanual robotic manipulation presents an inherent challenge due to the complexity involved in the spatial and temporal coordination between two hands.<n>Existing works predominantly focus on attaining human-level manipulation skills for robotic hands, yet little attention has been paid to task planning on long-horizon timescales.<n>This paper introduces LLM+MAP, a bimanual planning framework that integrates LLM reasoning and multi-agent planning.
arXiv Detail & Related papers (2025-03-21T17:04:01Z) - Hindsight Planner: A Closed-Loop Few-Shot Planner for Embodied Instruction Following [62.10809033451526]
This work focuses on building a task planner for Embodied Instruction Following (EIF) using Large Language Models (LLMs)<n>We frame the task as a Partially Observable Markov Decision Process (POMDP) and aim to develop a robust planner under a few-shot assumption.<n>Our experiments on the ALFRED dataset indicate that our planner achieves competitive performance under a few-shot assumption.
arXiv Detail & Related papers (2024-12-27T10:05:45Z) - Query-Efficient Planning with Language Models [8.136901056728945]
Planning in complex environments requires an agent to efficiently query a world model to find a sequence of actions from start to goal.<n>Recent work has shown that Large Language Models (LLMs) can potentially help with planning by searching over promising states and adapting to feedback from the world.<n>We show that while both approaches improve upon comparable baselines, using an LLM as a generative planner results in significantly fewer interactions.
arXiv Detail & Related papers (2024-12-09T02:51:21Z) - Interactive and Expressive Code-Augmented Planning with Large Language Models [62.799579304821826]
Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making.
Recent techniques have sought to structure LLM outputs using control flow and other code-adjacent techniques to improve planning performance.
We propose REPL-Plan, an LLM planning approach that is fully code-expressive and dynamic.
arXiv Detail & Related papers (2024-11-21T04:23:17Z) - NATURAL PLAN: Benchmarking LLMs on Natural Language Planning [109.73382347588417]
We introduce NATURAL PLAN, a realistic planning benchmark in natural language containing 3 key tasks: Trip Planning, Meeting Planning, and Calendar Scheduling.
We focus our evaluation on the planning capabilities of LLMs with full information on the task, by providing outputs from tools such as Google Flights, Google Maps, and Google Calendar as contexts to the models.
arXiv Detail & Related papers (2024-06-06T21:27:35Z) - LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning [65.86754998249224]
We develop a novel hybrid planner that leverages a conventional rule-based planner in conjunction with an LLM-based planner.
Our approach navigates complex scenarios which existing planners struggle with, produces well-reasoned outputs while also remaining grounded through working alongside the rule-based approach.
arXiv Detail & Related papers (2023-12-30T02:53:45Z) - Automating the Generation of Prompts for LLM-based Action Choice in PDDL Planning [59.543858889996024]
Large language models (LLMs) have revolutionized a large variety of NLP tasks.<n>We show how to leverage an LLM to automatically generate NL prompts from PDDL input.<n>Our NL prompts yield better performance than PDDL prompts and simple template-based NL prompts.
arXiv Detail & Related papers (2023-11-16T11:55:27Z) - AdaPlanner: Adaptive Planning from Feedback with Language Models [56.367020818139665]
Large language models (LLMs) have recently demonstrated the potential in acting as autonomous agents for sequential decision-making tasks.
We propose a closed-loop approach, AdaPlanner, which allows the LLM agent to refine its self-generated plan adaptively in response to environmental feedback.
To mitigate hallucination, we develop a code-style LLM prompt structure that facilitates plan generation across a variety of tasks, environments, and agent capabilities.
arXiv Detail & Related papers (2023-05-26T05:52:27Z) - LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large
Language Models [27.318186938382233]
This study focuses on using large language models (LLMs) as a planner for embodied agents.
We propose a novel method, LLM-Planner, that harnesses the power of large language models to do few-shot planning.
arXiv Detail & Related papers (2022-12-08T05:46:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.