Related papers: REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?

REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?

URL: http://arxiv.org/abs/2505.10872v2
Date: Mon, 19 May 2025 17:21:49 GMT
Title: REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?
Authors: Chenxi Jiang, Chuhao Zhou, Jianfei Yang,
Abstract summary: Linguists suggest that such vagueness frequently arises from referring expressions (REs)<n>This paper studies how such vagueness in REs within human instructions affects LLM-based robot task planning.<n>We propose the first robot task planning benchmark with vague REs (REI-Bench), where we discover that the vagueness of REs can severely degrade robot planning performance.
Score: 12.490512012911635
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Robot task planning decomposes human instructions into executable action sequences that enable robots to complete a series of complex tasks. Although recent large language model (LLM)-based task planners achieve amazing performance, they assume that human instructions are clear and straightforward. However, real-world users are not experts, and their instructions to robots often contain significant vagueness. Linguists suggest that such vagueness frequently arises from referring expressions (REs), whose meanings depend heavily on dialogue context and environment. This vagueness is even more prevalent among the elderly and children, who robots should serve more. This paper studies how such vagueness in REs within human instructions affects LLM-based robot task planning and how to overcome this issue. To this end, we propose the first robot task planning benchmark with vague REs (REI-Bench), where we discover that the vagueness of REs can severely degrade robot planning performance, leading to success rate drops of up to 77.9%. We also observe that most failure cases stem from missing objects in planners. To mitigate the REs issue, we propose a simple yet effective approach: task-oriented context cognition, which generates clear instructions for robots, achieving state-of-the-art performance compared to aware prompt and chains of thought. This work contributes to the research community of human-robot interaction (HRI) by making robot task planning more practical, particularly for non-expert users, e.g., the elderly and children.

Related papers

RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation [90.81956345363355]
RoBridge is a hierarchical intelligent architecture for general robotic manipulation.<n>It consists of a high-level cognitive planner (HCP) based on a large-scale pre-trained vision-language model (VLM)<n>It unleashes the procedural skill of reinforcement learning, effectively bridging the gap between cognition and execution.
arXiv Detail & Related papers (2025-05-03T06:17:18Z)
REMAC: Self-Reflective and Self-Evolving Multi-Agent Collaboration for Long-Horizon Robot Manipulation [57.628771707989166]
We propose an adaptive multi-agent planning framework, termed REMAC, that enables efficient, scene-agnostic multi-robot long-horizon task planning and execution.<n>ReMAC incorporates two key modules: a self-reflection module performing pre-conditions and post-condition checks in the loop to evaluate progress and refine plans, and a self-evolvement module dynamically adapting plans based on scene-specific reasoning.
arXiv Detail & Related papers (2025-03-28T03:51:40Z)
COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models [49.24666980374751]
COHERENT is a novel LLM-based task planning framework for collaboration of heterogeneous multi-robot systems.<n>A Proposal-Execution-Feedback-Adjustment mechanism is designed to decompose and assign actions for individual robots.<n>The experimental results show that our work surpasses the previous methods by a large margin in terms of success rate and execution efficiency.
arXiv Detail & Related papers (2024-09-23T15:53:41Z)
Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model [6.9268843428933025]
Large language models (LLMs) have demonstrated powerful planning and reasoning capabilities for comprehension and processing of semantic information. We propose a novel language-model based framework that enables robots to autonomously plan behaviors and low-level execution under given textual instructions.
arXiv Detail & Related papers (2024-08-15T17:33:32Z)
Robots Can Multitask Too: Integrating a Memory Architecture and LLMs for Enhanced Cross-Task Robot Action Generation [13.181465089984567]
Large Language Models (LLMs) have been recently used in robot applications for grounding common-sense reasoning with the robot's perception and physical abilities. In this paper, we address incorporating memory processes with LLMs for generating cross-task robot actions, while the robot effectively switches between tasks. Our results show a significant improvement in performance over a baseline of five robotic tasks, demonstrating the potential of integrating memory with LLMs for combining the robot's action and perception for adaptive task execution.
arXiv Detail & Related papers (2024-07-18T13:38:21Z)
Probabilistically Correct Language-based Multi-Robot Planning using Conformal Prediction [11.614036749291216]
We introduce a new distributed multi-robot planner called S-ATLAS for Safe plAnning for Teams of Language-instructed AgentS. We show that the proposed planner can achieve user-specified task success rates, assuming successful plan execution. We provide comparative experiments against related works showing that our method is significantly more computational efficient and achieves lower help rates.
arXiv Detail & Related papers (2024-02-23T15:02:44Z)
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning [66.09880459084901]
Embodied Instruction Following (EIF) requires agents to complete human instruction by interacting objects in complicated surrounding environments. We propose ThinkBot that reasons the thought chain in human instruction to recover the missing action descriptions. Our ThinkBot outperforms the state-of-the-art EIF methods by a sizable margin in both success rate and execution efficiency.
arXiv Detail & Related papers (2023-12-12T08:30:09Z)
Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks. We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning. We present a programmatic LLM prompt structure that enables plan generation functional across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z)
Enabling human-like task identification from natural conversation [7.00597813134145]
We provide a non-trivial method to combine an NLP engine and a planner such that a robot can successfully identify tasks and all the relevant parameters and generate an accurate plan for the task. This work makes a significant stride towards enabling a human-like task understanding capability in a robot.
arXiv Detail & Related papers (2020-08-23T17:19:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.