Multi-Robot Task Planning for Multi-Object Retrieval Tasks with Distributed On-Site Knowledge via Large Language Models
- URL: http://arxiv.org/abs/2509.12838v2
- Date: Tue, 30 Sep 2025 12:31:07 GMT
- Title: Multi-Robot Task Planning for Multi-Object Retrieval Tasks with Distributed On-Site Knowledge via Large Language Models
- Authors: Kento Murata, Shoichi Hasegawa, Tomochika Ishikawa, Yoshinobu Hagiwara, Akira Taniguchi, Lotfi El Hafi, Tadahiro Taniguchi
- Abstract summary: It is crucial to efficiently execute instructions such as "Find an apple and a banana" or "Get ready for a field trip." This study addresses the challenging problem of determining which robot should be assigned to which part of a task when each robot possesses different situational on-site knowledge. We propose a task planning framework that leverages large language models (LLMs) and spatial concepts to decompose natural language instructions into subtasks.
- Score: 6.0783502693642495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is crucial to efficiently execute instructions such as "Find an apple and a banana" or "Get ready for a field trip," which require searching for multiple objects or understanding context-dependent commands. This study addresses the challenging problem of determining which robot should be assigned to which part of a task when each robot possesses different situational on-site knowledge-specifically, spatial concepts learned from the area designated to it by the user. We propose a task planning framework that leverages large language models (LLMs) and spatial concepts to decompose natural language instructions into subtasks and allocate them to multiple robots. We designed a novel few-shot prompting strategy that enables LLMs to infer required objects from ambiguous commands and decompose them into appropriate subtasks. In our experiments, the proposed method achieved 47/50 successful assignments, outperforming random (28/50) and commonsense-based assignment (26/50). Furthermore, we conducted qualitative evaluations using two actual mobile manipulators. The results demonstrated that our framework could handle instructions, including those involving ad hoc categories such as "Get ready for a field trip," by successfully performing task decomposition, assignment, sequential planning, and execution.
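The abstract describes two steps: an LLM decomposes an ambiguous instruction into object-retrieval subtasks, and each subtask is allocated to the robot whose on-site knowledge covers the target object. The following is a minimal illustrative sketch of that allocation idea; the function names, the rule-table stand-in for the LLM few-shot step, and the data structures are hypothetical, not the authors' implementation.

```python
# Illustrative sketch of the paper's two-step idea (decompose, then assign).
# The LLM few-shot decomposition is stubbed with a lookup table; in the
# actual framework this step is performed by a prompted LLM.

def decompose_instruction(instruction, few_shot_rules):
    """Stand-in for the LLM few-shot step: map an ambiguous command
    (e.g. "Get ready for a field trip") to a list of required objects.
    Unknown instructions fall back to a naive word split."""
    return few_shot_rules.get(instruction, instruction.split())

def assign_subtasks(required_objects, robot_knowledge):
    """Assign each object-retrieval subtask to a robot that has observed
    the object in its designated area; fall back to an arbitrary robot
    when no robot's knowledge covers the object."""
    assignment = {}
    for obj in required_objects:
        candidates = [r for r, known in robot_knowledge.items() if obj in known]
        assignment[obj] = candidates[0] if candidates else next(iter(robot_knowledge))
    return assignment

if __name__ == "__main__":
    rules = {"Get ready for a field trip": ["water bottle", "snack", "hat"]}
    knowledge = {
        "robot_a": {"water bottle", "apple"},   # learned the kitchen area
        "robot_b": {"hat", "snack", "banana"},  # learned the bedroom area
    }
    objects = decompose_instruction("Get ready for a field trip", rules)
    print(assign_subtasks(objects, knowledge))
```

In the paper, the decomposition and assignment are both driven by few-shot LLM prompting over learned spatial concepts rather than a fixed lookup; this sketch only shows the shape of the resulting plan.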
Related papers
- HELP: Hierarchical Embodied Language Planner for Household Tasks [75.38606213726906]
Embodied agents tasked with complex scenarios rely heavily on robust planning capabilities. Large language models equipped with extensive linguistic knowledge can play this role. We propose a Hierarchical Embodied Language Planner, called HELP, consisting of a set of LLM-based agents.
arXiv Detail & Related papers (2025-12-25T15:54:08Z) - Embodied Instruction Following in Unknown Environments [64.57388036567461]
We propose an embodied instruction following (EIF) method for complex tasks in unknown environments. We build a hierarchical embodied instruction following framework that includes a high-level task planner and a low-level exploration controller. For the task planner, we generate feasible step-by-step plans for human goal accomplishment according to the task completion process and the known visual clues.
arXiv Detail & Related papers (2024-06-17T17:55:40Z) - Probabilistically Correct Language-based Multi-Robot Planning using Conformal Prediction [11.614036749291216]
We introduce a new distributed multi-robot planner called S-ATLAS for Safe plAnning for Teams of Language-instructed AgentS.
We show that the proposed planner can achieve user-specified task success rates, assuming successful plan execution.
We provide comparative experiments against related works showing that our method is significantly more computationally efficient and achieves lower help rates.
arXiv Detail & Related papers (2024-02-23T15:02:44Z) - Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z) - Fully Automated Task Management for Generation, Execution, and Evaluation: A Framework for Fetch-and-Carry Tasks with Natural Language Instructions in Continuous Space [1.2691047660244337]
This paper aims to develop a framework that enables a robot to execute tasks based on visual information.
We propose a framework for the full automation of the generation, execution, and evaluation of FCOG tasks.
In addition, we introduce an approach to solving the FCOG tasks by dividing them into four distinct subtasks.
arXiv Detail & Related papers (2023-11-07T15:38:09Z) - LEMMA: Learning Language-Conditioned Multi-Robot Manipulation [21.75163634731677]
LanguagE-Conditioned Multi-robot MAnipulation (LEMMA) features 8 types of procedurally generated tasks with varying degrees of complexity. For each task, we provide 800 expert demonstrations and human instructions for training and evaluation.
arXiv Detail & Related papers (2023-08-02T04:37:07Z) - Robot Task Planning Based on Large Language Model Representing Knowledge with Directed Graph Structures [2.3698227130544547]
We propose a task planning method that combines human expertise with an LLM and design an LLM prompt template, Think_Net_Prompt.
We further propose a method to progressively decompose tasks and generate a task tree to reduce the planning volume for each task.
arXiv Detail & Related papers (2023-06-08T13:10:00Z) - Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model [63.66204449776262]
Instruct2Act is a framework that maps multi-modal instructions to sequential actions for robotic manipulation tasks.
Our approach is adjustable and flexible in accommodating various instruction modalities and input types.
Our zero-shot method outperformed many state-of-the-art learning-based policies in several tasks.
arXiv Detail & Related papers (2023-05-18T17:59:49Z) - Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks [21.65346551790888]
DeL-TaCo is a method for conditioning a robotic policy on task embeddings comprised of two components: a visual demonstration and a language instruction.
To our knowledge, this is the first work to show that simultaneously conditioning a multi-task robotic manipulation policy on both demonstration and language embeddings improves sample efficiency and generalization over conditioning on either modality alone.
arXiv Detail & Related papers (2022-10-10T08:06:58Z) - ProgPrompt: Generating Situated Robot Task Plans using Large Language Models [68.57918965060787]
Large language models (LLMs) can be used to score potential next actions during task planning.
We present a programmatic LLM prompt structure that enables plan generation that functions across situated environments.
arXiv Detail & Related papers (2022-09-22T20:29:49Z) - Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.