Hindsight Planner: A Closed-Loop Few-Shot Planner for Embodied Instruction Following
- URL: http://arxiv.org/abs/2412.19562v1
- Date: Fri, 27 Dec 2024 10:05:45 GMT
- Title: Hindsight Planner: A Closed-Loop Few-Shot Planner for Embodied Instruction Following
- Authors: Yuxiao Yang, Shenao Zhang, Zhihan Liu, Huaxiu Yao, Zhaoran Wang
- Abstract summary: This work focuses on building a task planner for Embodied Instruction Following (EIF) using Large Language Models (LLMs).
We frame the task as a Partially Observable Markov Decision Process (POMDP) and aim to develop a robust planner under a few-shot assumption.
Our experiments on the ALFRED dataset indicate that our planner achieves competitive performance under a few-shot assumption.
- Score: 62.10809033451526
- License:
- Abstract: This work focuses on building a task planner for Embodied Instruction Following (EIF) using Large Language Models (LLMs). Previous works typically train a planner to imitate expert trajectories, treating this as a supervised task. While these methods achieve competitive performance, they often lack sufficient robustness. When a suboptimal action is taken, the planner may encounter an out-of-distribution state, which can lead to task failure. In contrast, we frame the task as a Partially Observable Markov Decision Process (POMDP) and aim to develop a robust planner under a few-shot assumption. Thus, we propose a closed-loop planner with an adaptation module and a novel hindsight method, aiming to use as much information as possible to assist the planner. Our experiments on the ALFRED dataset indicate that our planner achieves competitive performance under a few-shot assumption. For the first time, our few-shot agent's performance approaches and even surpasses that of the full-shot supervised agent.
Related papers
- DHP: Discrete Hierarchical Planning for Hierarchical Reinforcement Learning Agents [2.1438108757511958]
Our key contribution is a Discrete Hierarchical Planning (DHP) method, an alternative to traditional distance-based approaches.
We provide theoretical foundations for the method and demonstrate its effectiveness through extensive empirical evaluations.
We evaluate our method on long-horizon visual planning tasks in a 25-room environment, where it significantly outperforms previous benchmarks in success rate and average episode length.
arXiv Detail & Related papers (2025-02-04T03:05:55Z)
- GenPlan: Generative Sequence Models as Adaptive Planners [0.0]
Sequence models have demonstrated remarkable success in behavioral planning by leveraging previously collected demonstrations.
However, solving multi-task missions remains a significant challenge, particularly when the planner must adapt to unseen constraints and tasks.
We propose GenPlan: a discrete-flow model for adaptive planning, enabling sample-generative exploration and exploitation.
arXiv Detail & Related papers (2024-12-11T17:32:33Z)
- Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos [48.15438373870542]
VidAssist is an integrated framework designed for zero/few-shot goal-oriented planning in instructional videos.
It employs a breadth-first search algorithm for optimal plan generation.
Experiments demonstrate that VidAssist offers a unified framework for different goal-oriented planning setups.
arXiv Detail & Related papers (2024-09-30T17:57:28Z)
- Ask-before-Plan: Proactive Language Agents for Real-World Planning [68.08024918064503]
Proactive Agent Planning requires language agents to predict clarification needs based on user-agent conversation and agent-environment interaction.
We propose a novel multi-agent framework, Clarification-Execution-Planning (CEP), which consists of three agents specialized in clarification, execution, and planning.
arXiv Detail & Related papers (2024-06-18T14:07:28Z)
- Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following [17.608330952846075]
Embodied Instruction Following (EIF) is the task of executing natural language instructions by navigating and interacting with objects in 3D environments.
One of the primary challenges in EIF is compositional task planning, which is often addressed with supervised or in-context learning with labeled data.
We introduce the Socratic Planner, the first zero-shot planning method that infers without the need for any training data.
arXiv Detail & Related papers (2024-04-21T08:10:20Z)
- Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty [56.30846158280031]
Task planning for embodied AI has been one of the most challenging problems.
We propose a task-agnostic method named 'planning as in-painting'.
The proposed framework achieves promising performance in various embodied AI tasks.
arXiv Detail & Related papers (2023-12-02T10:07:17Z)
- Skip-Plan: Procedure Planning in Instructional Videos via Condensed Action Space Learning [85.84504287685884]
Skip-Plan is a condensed action space learning method for procedure planning in instructional videos.
By skipping uncertain nodes and edges in action chains, we transfer long and complex sequence functions into short but reliable ones.
Our model explores all sorts of reliable sub-relations within an action sequence in the condensed action space.
arXiv Detail & Related papers (2023-10-01T08:02:33Z)
- AdaPlanner: Adaptive Planning from Feedback with Language Models [56.367020818139665]
Large language models (LLMs) have recently demonstrated potential to act as autonomous agents for sequential decision-making tasks.
We propose a closed-loop approach, AdaPlanner, which allows the LLM agent to refine its self-generated plan adaptively in response to environmental feedback.
To mitigate hallucination, we develop a code-style LLM prompt structure that facilitates plan generation across a variety of tasks, environments, and agent capabilities.
arXiv Detail & Related papers (2023-05-26T05:52:27Z)