Integrating Action Knowledge and LLMs for Task Planning and Situation
Handling in Open Worlds
- URL: http://arxiv.org/abs/2305.17590v2
- Date: Thu, 5 Oct 2023 18:05:51 GMT
- Title: Integrating Action Knowledge and LLMs for Task Planning and Situation
Handling in Open Worlds
- Authors: Yan Ding, Xiaohan Zhang, Saeid Amiri, Nieqing Cao, Hao Yang, Andy
Kaminski, Chad Esselink, Shiqi Zhang
- Abstract summary: This paper introduces a novel framework, called COWP, for open-world task planning and situation handling.
COWP dynamically augments the robot's action knowledge, including the preconditions and effects of actions, with task-oriented commonsense knowledge.
Experimental results show that our approach outperforms competitive baselines from the literature in the success rate of service tasks.
- Score: 10.077350377962482
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Task planning systems have been developed to help robots use human knowledge
(about actions) to complete long-horizon tasks. Most of them have been
developed for "closed worlds" while assuming the robot is provided with
complete world knowledge. However, the real world is generally open, and the
robots frequently encounter unforeseen situations that can potentially break
the planner's completeness. Could we leverage the recent advances on
pre-trained Large Language Models (LLMs) to enable classical planning systems
to deal with novel situations?
This paper introduces a novel framework, called COWP, for open-world task
planning and situation handling. COWP dynamically augments the robot's action
knowledge, including the preconditions and effects of actions, with
task-oriented commonsense knowledge. COWP embraces the openness from LLMs, and
is grounded to specific domains via action knowledge. For systematic
evaluations, we collected a dataset that includes 1,085 execution-time
situations. Each situation corresponds to a state instance wherein a robot is
potentially unable to complete a task using a solution that normally works.
Experimental results show that our approach outperforms competitive baselines
from the literature in the success rate of service tasks. Additionally, we have
demonstrated COWP using a mobile manipulator. Supplementary materials are
available at: https://cowplanning.github.io/
Related papers
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate our model in terms of its ability to perform tasks in zero shot after pre-training, follow language instructions from people, and its ability to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z) - Grounding Language Models in Autonomous Loco-manipulation Tasks [3.8363685417355557]
We propose a novel framework that learns, selects, and plans behaviors based on tasks in different scenarios.
We leverage the planning and reasoning features of the large language model (LLM), constructing a hierarchical task graph.
Experiments in simulation and real-world using the CENTAURO robot show that the language model based planner can efficiently adapt to new loco-manipulation tasks.
arXiv Detail & Related papers (2024-09-02T15:27:48Z) - GRUtopia: Dream General Robots in a City at Scale [65.08318324604116]
This paper introduces project GRUtopia, the first simulated interactive 3D society designed for various robots.
GRScenes includes 100k interactive, finely annotated scenes, which can be freely combined into city-scale environments.
GRResidents is a Large Language Model (LLM) driven Non-Player Character (NPC) system that is responsible for social interaction.
arXiv Detail & Related papers (2024-07-15T17:40:46Z) - WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence.
WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z) - Interactive Planning Using Large Language Models for Partially
Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z) - Look Before You Leap: Unveiling the Power of GPT-4V in Robotic
Vision-Language Planning [32.045840007623276]
We introduce Robotic Vision-Language Planning (ViLa), a novel approach for long-horizon robotic planning.
ViLa directly integrates perceptual data into its reasoning and planning process.
Our evaluation, conducted in both real-robot and simulated environments, demonstrates ViLa's superiority over existing LLM-based planners.
arXiv Detail & Related papers (2023-11-29T17:46:25Z) - Robot Task Planning and Situation Handling in Open Worlds [10.077350377962482]
This paper introduces a novel algorithm for open-world task planning and situation handling.
COWP dynamically augments the robot's action knowledge with task-oriented common sense.
This version has been accepted for publication in Autonomous Robots.
arXiv Detail & Related papers (2022-10-04T00:21:00Z) - ProgPrompt: Generating Situated Robot Task Plans using Large Language
Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning.
We present a programmatic LLM prompt structure that enables plan generation functional across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z) - COG: Connecting New Skills to Past Experience with Offline Reinforcement
Learning [78.13740204156858]
We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
arXiv Detail & Related papers (2020-10-27T17:57:29Z) - iCORPP: Interleaved Commonsense Reasoning and Probabilistic Planning on
Robots [46.13039152809055]
We present a novel algorithm, called iCORPP, to simultaneously estimate the current world state, reason about world dynamics, and construct task-oriented controllers.
Results show significant improvements in scalability, efficiency, and adaptiveness, compared to competitive baselines.
arXiv Detail & Related papers (2020-04-18T17:46:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.