Egocentric Planning for Scalable Embodied Task Achievement
- URL: http://arxiv.org/abs/2306.01295v1
- Date: Fri, 2 Jun 2023 06:41:24 GMT
- Title: Egocentric Planning for Scalable Embodied Task Achievement
- Authors: Xiaotian Liu, Hector Palacios, Christian Muise
- Abstract summary: Egocentric Planning is an innovative approach that combines symbolic planning and Object-oriented POMDPs to solve tasks in complex environments.
We evaluated our approach in ALFRED, a simulated environment designed for domestic tasks, and demonstrated its high scalability.
Our method requires reliable perception and the specification or learning of a symbolic description of the preconditions and effects of the agent's actions.
- Score: 6.870094263016224
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Embodied agents face significant challenges when tasked with performing
actions in diverse environments, particularly in generalizing across object
types and executing suitable actions to accomplish tasks. Furthermore, agents
should exhibit robustness, minimizing the execution of illegal actions. In this
work, we present Egocentric Planning, an innovative approach that combines
symbolic planning and Object-oriented POMDPs to solve tasks in complex
environments, harnessing existing models for visual perception and natural
language processing. We evaluated our approach in ALFRED, a simulated
environment designed for domestic tasks, and demonstrated its high scalability,
achieving an impressive 36.07% unseen success rate in the ALFRED benchmark and
winning the ALFRED challenge at CVPR Embodied AI workshop. Our method requires
reliable perception and the specification or learning of a symbolic description
of the preconditions and effects of the agent's actions, as well as what object
types reveal information about others. It is capable of naturally scaling to
solve new tasks beyond ALFRED, as long as they can be solved using the
available skills. This work offers a solid baseline for studying end-to-end and
hybrid methods that aim to generalize to new tasks, including recent approaches
relying on LLMs, which often struggle to scale to long sequences of actions or
to produce robust plans for novel tasks.
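The abstract's requirement of "a symbolic description of the preconditions and effects of the agent's actions" can be illustrated with a minimal sketch. This is not the authors' planner, and it omits the Object-oriented POMDP and perception components entirely. It is a generic STRIPS-style breadth-first search over a fully known state, and every name in it (`Action`, `plan`, and the mug, counter, and table facts) is a hypothetical chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A symbolic action: facts it needs, facts it adds, facts it deletes."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset

    def applicable(self, state):
        # An action is legal only if all its preconditions hold in the state.
        return self.preconditions <= state

    def apply(self, state):
        # Successor state: remove deleted facts, then add new ones.
        return (state - self.del_effects) | self.add_effects


def plan(initial, goal, actions):
    """Breadth-first search for a shortest action sequence reaching the goal.

    Returns a list of action names, or None if the goal is unreachable.
    """
    start = frozenset(initial)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for action in actions:
            if action.applicable(state):
                nxt = action.apply(state)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, steps + [action.name]))
    return None


# Hypothetical household domain (not taken from ALFRED's action set).
ACTIONS = [
    Action("goto_counter", frozenset({"at_table"}),
           frozenset({"at_counter"}), frozenset({"at_table"})),
    Action("goto_table", frozenset({"at_counter"}),
           frozenset({"at_table"}), frozenset({"at_counter"})),
    Action("pickup_mug",
           frozenset({"at_counter", "mug_on_counter", "hand_empty"}),
           frozenset({"holding_mug"}),
           frozenset({"mug_on_counter", "hand_empty"})),
    Action("put_mug_on_table", frozenset({"at_table", "holding_mug"}),
           frozenset({"mug_on_table", "hand_empty"}),
           frozenset({"holding_mug"})),
]

INITIAL = {"at_table", "mug_on_counter", "hand_empty"}
GOAL = {"mug_on_table"}

result = plan(INITIAL, GOAL, ACTIONS)
print(result)
```

Because the search expands only actions whose preconditions hold, it never proposes an illegal action. This is one simple way to read the abstract's point about minimizing illegal actions, under the strong assumption that the symbolic state is accurate.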
Related papers
- ET-Plan-Bench: Embodied Task-level Planning Benchmark Towards Spatial-Temporal Cognition with Foundation Models [39.606908488885125]
ET-Plan-Bench is a benchmark for embodied task planning using Large Language Models (LLMs).
It features a controllable and diverse set of embodied tasks of varying difficulty and complexity.
Our benchmark distinguishes itself as a large-scale, quantifiable, highly automated, and fine-grained diagnostic framework.
(2024-10-02T19:56:38Z)
- Spatial Reasoning and Planning for Deep Embodied Agents [2.7195102129095003]
This thesis explores the development of data-driven techniques for spatial reasoning and planning tasks.
It focuses on enhancing learning efficiency, interpretability, and transferability across novel scenarios.
(2024-09-28T23:05:56Z)
- AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation [89.68433168477227]
Large Language Model (LLM) based agents have garnered significant attention and are becoming increasingly popular.
This paper investigates enhancing the planning abilities of LLMs through instruction tuning.
It explores the automated synthesis of diverse environments and a graded range of planning tasks.
(2024-08-01T17:59:46Z)
- Embodied Instruction Following in Unknown Environments [66.60163202450954]
We propose an embodied instruction following (EIF) method for complex tasks in unknown environments.
We build a hierarchical embodied instruction following framework that includes a high-level task planner and a low-level exploration controller.
For the task planner, we generate feasible step-by-step plans for human goal accomplishment according to the task completion process and the known visual clues.
(2024-06-17T17:55:40Z)
- Embodied Task Planning with Large Language Models [86.63533340293361]
We propose a TAsk Planning Agent (TaPA) in embodied tasks for grounded planning with physical scene constraints.
During inference, we discover the objects in the scene by extending open-vocabulary object detectors to multi-view RGB images collected in different achievable locations.
Experimental results show that the plans generated by our TaPA framework achieve a higher success rate than LLaVA and GPT-3.5 by a sizable margin.
(2023-07-04T17:58:25Z)
- Automaton-Guided Curriculum Generation for Reinforcement Learning Agents [14.20447398253189]
Automaton-guided Curriculum Learning (AGCL) is a novel method for automatically generating curricula for the target task in the form of Directed Acyclic Graphs (DAGs).
AGCL encodes the specification in the form of a deterministic finite automaton (DFA), and then uses the DFA along with the Object-Oriented MDP representation to generate a curriculum as a DAG.
Experiments in gridworld and physics-based simulated robotics domains show that the curricula produced by AGCL achieve improved time-to-threshold performance.
(2023-04-11T15:14:31Z)
- Knowledge Retrieval using Functional Object-Oriented Network [0.0]
The functional object-oriented network (FOON) is a knowledge representation for symbolic task planning that takes the shape of a graph.
A graph retrieval methodology is shown to produce manipulation motion sequences from the FOON to accomplish a desired aim.
The outcomes are illustrated using motion sequences created by the FOON to complete the desired objectives in a simulated environment.
(2022-11-06T06:02:29Z)
- Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment [73.9469267445146]
First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as AI2Thor pose significant sample-efficiency challenges for reinforcement learning agents.
We show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task.
(2020-10-28T19:27:26Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
(2020-07-14T16:42:59Z)
- Adaptive Procedural Task Generation for Hard-Exploration Problems [78.20918366839399]
We introduce Adaptive Procedural Task Generation (APT-Gen) to facilitate reinforcement learning in hard-exploration problems.
At the heart of our approach is a task generator that learns to create tasks from a parameterized task space via a black-box procedural generation module.
To enable curriculum learning in the absence of a direct indicator of learning progress, we propose to train the task generator by balancing the agent's performance in the generated tasks and the similarity to the target tasks.
(2020-07-01T09:38:51Z)