Related papers: A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks

A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks

URL: http://arxiv.org/abs/2510.05608v1
Date: Tue, 07 Oct 2025 06:10:53 GMT
Title: A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks
Authors: Shuzheng Si, Haozhe Zhao, Kangyang Luo, Gang Chen, Fanchao Qi, Minjia Zhang, Baobao Chang, Maosong Sun,
Abstract summary: Agents based on large language models (LLMs) struggle with brainless trial-and-error and generating hallucinatory actions due to a lack of global planning in long-horizon tasks.<n>We introduce a plan-and-execute framework and propose a planner training method to enhance the executor agent's planning abilities without human effort.<n>Experiments show that executor agents equipped with our planner outperform existing methods, achieving new state-of-the-art performance.
Score: 66.86312354478478
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Agents based on large language models (LLMs) struggle with brainless trial-and-error and generating hallucinatory actions due to a lack of global planning in long-horizon tasks. In this paper, we introduce a plan-and-execute framework and propose EAGLET, an efficient and effective planner training method to enhance the executor agent's planning abilities without human effort. Specifically, we train a plug-and-play global planner through a two-step process: we first synthesize high-quality plans from an advanced LLM using our proposed homologous consensus filtering strategy, and apply fine-tuning as a cold start. Moreover, we further improve the planner with a rule-based reinforcement learning stage using a novel executor capability gain reward, ensuring it can handle task instructions of varying difficulty. Experiments on three long-horizon agent tasks show that executor agents equipped with our planner outperform existing methods, achieving new state-of-the-art performance. Meanwhile, EAGLET reduces training costs by 8x compared to RL-based baselines, and it does not require manual effort or extra training data, offering an efficient and effective solution.

Related papers

Training Task Reasoning LLM Agents for Multi-turn Task Planning via Single-turn Reinforcement Learning [15.393743659727926]
Large Language Models (LLMs) have demonstrated remarkable capabilities in knowledge acquisition, reasoning, and tool use.<n>This paper introduces a novel approach that transforms multi-turn task planning into single-turn task reasoning problems.
arXiv Detail & Related papers (2025-09-24T23:47:36Z)
Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents [35.79575378215309]
Training large language models (LLMs) to reason via reinforcement learning (RL) significantly improves their problem-solving capabilities.<n>We introduce a conceptual framework formalizing dynamic planning for LLM agents, enabling them to flexibly decide when to allocate test-time compute for planning.<n>Experiments on the Crafter environment show that dynamic planning agents trained with this approach are more sample-efficient and consistently achieve more complex objectives.
arXiv Detail & Related papers (2025-09-03T18:00:13Z)
Leveraging Pre-trained Large Language Models with Refined Prompting for Online Task and Motion Planning [24.797220935378057]
We present a closed-loop task planning and acting system, LLM-PAS, which is assisted by a pre-trained Large Language Model (LLM)<n>We demonstrate the effectiveness and robustness of LLM-PAS in handling anomalous conditions during task execution.
arXiv Detail & Related papers (2025-04-30T12:53:53Z)
World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning [60.100794160682646]
We propose a new learning framework that jointly optimize state prediction and action selection through preference learning.<n>To automatically collect trajectories and stepwise preference data without human annotation, we introduce a tree search mechanism for extensive exploration via trial-and-error.<n>Our method significantly outperforms existing methods and GPT-4o when applied to Qwen2-VL (7B), LLaVA-1.6 (7B), and LLaMA-3.2 (11B)
arXiv Detail & Related papers (2025-03-13T15:49:56Z)
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks [36.63527489464188]
Plan-and-Act is a framework that incorporates explicit planning into large language models (LLMs)<n>Plan-and-Act consists of a Planner model which generates structured, high-level plans to achieve user goals, and an Executor model that translates these plans into environment-specific actions.<n>We present a state-of-the-art 57.58% success rate on the WebArena-Lite benchmark as well as a text-only state-of-the-art 81.36% success rate on WebVoyager.
arXiv Detail & Related papers (2025-03-12T17:40:52Z)
Hindsight Planner: A Closed-Loop Few-Shot Planner for Embodied Instruction Following [62.10809033451526]
This work focuses on building a task planner for Embodied Instruction Following (EIF) using Large Language Models (LLMs)<n>We frame the task as a Partially Observable Markov Decision Process (POMDP) and aim to develop a robust planner under a few-shot assumption.<n>Our experiments on the ALFRED dataset indicate that our planner achieves competitive performance under a few-shot assumption.
arXiv Detail & Related papers (2024-12-27T10:05:45Z)
AdaPlanner: Adaptive Planning from Feedback with Language Models [56.367020818139665]
Large language models (LLMs) have recently demonstrated the potential in acting as autonomous agents for sequential decision-making tasks. We propose a closed-loop approach, AdaPlanner, which allows the LLM agent to refine its self-generated plan adaptively in response to environmental feedback. To mitigate hallucination, we develop a code-style LLM prompt structure that facilitates plan generation across a variety of tasks, environments, and agent capabilities.
arXiv Detail & Related papers (2023-05-26T05:52:27Z)
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought [95.37585041654535]
Embodied AI is capable of planning and executing action sequences for robots to accomplish long-horizon tasks in physical environments. In this work, we introduce EmbodiedGPT, an end-to-end multi-modal foundation model for embodied AI. Experiments show the effectiveness of EmbodiedGPT on embodied tasks, including embodied planning, embodied control, visual captioning, and visual question answering.
arXiv Detail & Related papers (2023-05-24T11:04:30Z)
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models [27.318186938382233]
This study focuses on using large language models (LLMs) as a planner for embodied agents. We propose a novel method, LLM-Planner, that harnesses the power of large language models to do few-shot planning.
arXiv Detail & Related papers (2022-12-08T05:46:32Z)
DL-DRL: A double-level deep reinforcement learning approach for large-scale task scheduling of multi-UAV [65.07776277630228]
We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide and conquer framework (DCF) Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs. We also exploit another attention based policy network in our lower-level DRL model to construct the route for each UAV, with the objective to maximize the number of executed tasks.
arXiv Detail & Related papers (2022-08-04T04:35:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.