Related papers: Egocentric Planning for Scalable Embodied Task Achievement

Egocentric Planning for Scalable Embodied Task Achievement

URL: http://arxiv.org/abs/2306.01295v1
Date: Fri, 2 Jun 2023 06:41:24 GMT
Title: Egocentric Planning for Scalable Embodied Task Achievement
Authors: Xiaotian Liu, Hector Palacios, Christian Muise
Abstract summary: Egocentric Planning is an innovative approach that combines symbolic planning and Object-oriented POMDPs to solve tasks in complex environments. We evaluated our approach in ALFRED, a simulated environment designed for domestic tasks, and demonstrated its high scalability. Our method requires reliable perception and the specification or learning of a symbolic description of the preconditions and effects of the agent's actions.
Score: 6.870094263016224
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Embodied agents face significant challenges when tasked with performing actions in diverse environments, particularly in generalizing across object types and executing suitable actions to accomplish tasks. Furthermore, agents should exhibit robustness, minimizing the execution of illegal actions. In this work, we present Egocentric Planning, an innovative approach that combines symbolic planning and Object-oriented POMDPs to solve tasks in complex environments, harnessing existing models for visual perception and natural language processing. We evaluated our approach in ALFRED, a simulated environment designed for domestic tasks, and demonstrated its high scalability, achieving an impressive 36.07% unseen success rate in the ALFRED benchmark and winning the ALFRED challenge at CVPR Embodied AI workshop. Our method requires reliable perception and the specification or learning of a symbolic description of the preconditions and effects of the agent's actions, as well as what object types reveal information about others. It is capable of naturally scaling to solve new tasks beyond ALFRED, as long as they can be solved using the available skills. This work offers a solid baseline for studying end-to-end and hybrid methods that aim to generalize to new tasks, including recent approaches relying on LLMs, but often struggle to scale to long sequences of actions or produce robust plans for novel tasks.

Related papers

Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks [4.374837991804085]
Task-Aware Virtual Training (TAVT) is a novel algorithm that captures task characteristics for both training and out-of-distribution (OOD) scenarios. Numerical results demonstrate that TAVT significantly enhances generalization to OOD tasks across various MuJoCo and MetaWorld environments.
arXiv Detail & Related papers (2025-02-05T02:31:50Z)
Planning with affordances: Integrating learned affordance models and symbolic planning [0.0]
We augment an existing task and motion planning framework with learned affordance models of objects in the world. Each task can be seen as changing the current state of the world to a given goal state. A symbolic planning algorithm uses this information and the starting and goal state to create a feasible plan to reach the desired goal state.
arXiv Detail & Related papers (2025-02-04T23:15:38Z)
Anticipate & Act : Integrating LLMs and Classical Planning for Efficient Task Execution in Household Environments [16.482992646001996]
We develop a framework for anticipating household tasks and computing an action sequence that jointly achieves these tasks. We demonstrate a 31% reduction in execution time compared with a system that does not consider upcoming tasks.
arXiv Detail & Related papers (2025-02-04T07:31:55Z)
Adaptformer: Sequence models as adaptive iterative planners [0.0]
Decision-making in multi-task missions is a challenging problem for autonomous systems. We propose Adaptformer, an adaptive planner that utilizes sequence models for sample-efficient exploration and exploitation. We show that Adaptformer outperforms the state-of-the-art method by up to 25% in multi-goal maze reachability tasks.
arXiv Detail & Related papers (2024-11-30T00:34:41Z)
ET-Plan-Bench: Embodied Task-level Planning Benchmark Towards Spatial-Temporal Cognition with Foundation Models [39.606908488885125]
ET-Plan-Bench is a benchmark for embodied task planning using Large Language Models (LLMs) It features a controllable and diverse set of embodied tasks varying in different levels of difficulties and complexities. Our benchmark distinguishes itself as a large-scale, quantifiable, highly automated, and fine-grained diagnostic framework.
arXiv Detail & Related papers (2024-10-02T19:56:38Z)
Spatial Reasoning and Planning for Deep Embodied Agents [2.7195102129095003]
This thesis explores the development of data-driven techniques for spatial reasoning and planning tasks. It focuses on enhancing learning efficiency, interpretability, and transferability across novel scenarios.
arXiv Detail & Related papers (2024-09-28T23:05:56Z)
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation [89.68433168477227]
Large Language Model (LLM) based agents have garnered significant attention and are becoming increasingly popular. This paper investigates enhancing the planning abilities of LLMs through instruction tuning. To address this limitation, this paper explores the automated synthesis of diverse environments and a gradual range of planning tasks.
arXiv Detail & Related papers (2024-08-01T17:59:46Z)
Embodied Instruction Following in Unknown Environments [66.60163202450954]
We propose an embodied instruction following (EIF) method for complex tasks in the unknown environment. We build a hierarchical embodied instruction following framework including the high-level task planner and the low-level exploration controller. For the task planner, we generate the feasible step-by-step plans for human goal accomplishment according to the task completion process and the known visual clues.
arXiv Detail & Related papers (2024-06-17T17:55:40Z)
Embodied Task Planning with Large Language Models [86.63533340293361]
We propose a TAsk Planing Agent (TaPA) in embodied tasks for grounded planning with physical scene constraint. During inference, we discover the objects in the scene by extending open-vocabulary object detectors to multi-view RGB images collected in different achievable locations. Experimental results show that the generated plan from our TaPA framework can achieve higher success rate than LLaVA and GPT-3.5 by a sizable margin.
arXiv Detail & Related papers (2023-07-04T17:58:25Z)
Automaton-Guided Curriculum Generation for Reinforcement Learning Agents [14.20447398253189]
Automaton-guided Curriculum Learning (AGCL) is a novel method for automatically generating curricula for the target task in the form of Directed Acyclic Graphs (DAGs) AGCL encodes the specification in the form of a deterministic finite automaton (DFA), and then uses the DFA along with the Object-Oriented MDP representation to generate a curriculum as a DAG. Experiments in gridworld and physics-based simulated robotics domains show that the curricula produced by AGCL achieve improved time-to-threshold performance.
arXiv Detail & Related papers (2023-04-11T15:14:31Z)
Knowledge Retrieval using Functional Object-Oriented Network [0.0]
The functional object-oriented network (FOON) is a knowledge representation for symbolic task planning that takes the shape of a graph. A graph retrieval methodology is shown to produce manipulation motion sequences from the FOON to accomplish a desired aim. The outcomes are illustrated using motion sequences created by the FOON to complete the desired objectives in a simulated environment.
arXiv Detail & Related papers (2022-11-06T06:02:29Z)
Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment [73.9469267445146]
First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as the AI2Thor pose significant sample-efficiency challenges for reinforcement learning agents. We show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task.
arXiv Detail & Related papers (2020-10-28T19:27:26Z)
Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy. We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space. We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
Adaptive Procedural Task Generation for Hard-Exploration Problems [78.20918366839399]
We introduce Adaptive Procedural Task Generation (APT-Gen) to facilitate reinforcement learning in hard-exploration problems. At the heart of our approach is a task generator that learns to create tasks from a parameterized task space via a black-box procedural generation module. To enable curriculum learning in the absence of a direct indicator of learning progress, we propose to train the task generator by balancing the agent's performance in the generated tasks and the similarity to the target tasks.
arXiv Detail & Related papers (2020-07-01T09:38:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.