Interaction is all You Need? A Study of Robots Ability to Understand and
Execute
- URL: http://arxiv.org/abs/2311.07150v1
- Date: Mon, 13 Nov 2023 08:39:06 GMT
- Title: Interaction is all You Need? A Study of Robots Ability to Understand and
Execute
- Authors: Kushal Koshti and Nidhir Bhavsar
- Abstract summary: We equip robots with the ability to understand and execute complex instructions in coherent dialogs.
We observe that our best configuration outperforms the baseline with a success rate score of 8.85.
We introduce a new task by expanding the EDH task and making predictions about game plans instead of individual actions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper aims to address a critical challenge in robotics:
enabling robots to operate seamlessly in human environments through natural
language interactions. Our primary focus is to equip robots with the ability to
understand and execute complex instructions in coherent dialogs to facilitate
intricate task-solving scenarios. To explore this, we build upon the Execution
from Dialog History (EDH) task from the Teach benchmark. We employ a
multi-transformer model with BART LM. We observe that our best configuration
outperforms the baseline with a success rate score of 8.85 and a
goal-conditioned success rate score of 14.02. In addition, we suggest an
alternative methodology for completing this task. Moreover, we introduce a new
task by expanding the EDH task and making predictions about game plans instead
of individual actions. We have evaluated multiple BART models and an LLaMA2
LLM, which achieved a ROUGE-L score of 46.77 for this task.
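The ROUGE-L score reported above is a standard text-similarity metric based on the longest common subsequence (LCS) between a generated sequence and a reference. The sketch below is illustrative only, not the paper's evaluation code; the token split, the beta weighting, and the example strings are assumptions following the common LCS-based F-measure formulation.

```python
def lcs_len(a, b):
    # Classic dynamic-programming longest common subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference, beta=1.2):
    # Token-level ROUGE-L F-measure; beta > 1 weights recall more heavily.
    # Whitespace tokenization is a simplifying assumption for illustration.
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(c), lcs / len(r)
    return (1 + beta ** 2) * prec * rec / (rec + beta ** 2 * prec)
```

For identical candidate and reference strings the score is 1.0; partial token overlap yields a value between 0 and 1, which is how a plan-prediction output would be compared against a gold game plan.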
Related papers
- Plan-and-Act using Large Language Models for Interactive Agreement [8.07285448283823]
Recent large language models (LLMs) are capable of planning robot actions.
A key problem in applying LLMs to situational HRI is balancing "respecting the current human's activity" against "prioritizing the robot's task".
arXiv Detail & Related papers (2025-04-01T23:41:05Z)
- COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models [49.24666980374751]
COHERENT is a novel LLM-based task planning framework for collaboration of heterogeneous multi-robot systems.
A Proposal-Execution-Feedback-Adjustment mechanism is designed to decompose and assign actions for individual robots.
The experimental results show that our work surpasses the previous methods by a large margin in terms of success rate and execution efficiency.
arXiv Detail & Related papers (2024-09-23T15:53:41Z)
- Continual Skill and Task Learning via Dialogue [3.3511259017219297]
Continual and interactive robot learning is a challenging problem, as the robot learns while operating alongside human users.
We present a framework for robots to query and learn visuo-motor robot skills and task relevant information via natural language dialog interactions with human users.
arXiv Detail & Related papers (2024-09-05T01:51:54Z)
- WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence.
WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z)
- Large Language Models for Orchestrating Bimanual Robots [19.60907949776435]
We present LAnguage-model-based Bimanual ORchestration (LABOR) to analyze task configurations and devise coordination control policies.
We evaluate our method through simulated experiments involving two classes of long-horizon tasks using the NICOL humanoid robot.
arXiv Detail & Related papers (2024-04-02T15:08:35Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- TaskBench: Benchmarking Large Language Models for Task Automation [82.2932794189585]
We introduce TaskBench, a framework to evaluate the capability of large language models (LLMs) in task automation.
Specifically, task decomposition, tool selection, and parameter prediction are assessed.
Our approach combines automated construction with rigorous human verification, ensuring high consistency with human evaluation.
arXiv Detail & Related papers (2023-11-30T18:02:44Z)
- Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning [49.92517970237088]
We tackle the problem of training a robot to understand multimodal prompts.
This type of task poses a major challenge to robots' capability to understand the interconnection and complementarity between vision and language signals.
We introduce an effective framework that learns a policy to perform robot manipulation with multimodal prompts.
arXiv Detail & Related papers (2023-10-14T22:24:58Z)
- Interactively Robot Action Planning with Uncertainty Analysis and Active Questioning by Large Language Model [6.695536752781623]
The application of Large Language Models (LLMs) to robot action planning has been actively studied.
Instructions given to the LLM in natural language may be ambiguous or lack information, depending on the task context.
We propose an interactive robot action planning method that allows the LLM to analyze and gather missing information by asking questions to humans.
arXiv Detail & Related papers (2023-08-30T00:54:44Z)
- AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation [50.737355245505334]
We propose a novel framework for learning high-level cognitive capabilities in robot manipulation tasks.
The resulting dataset, AlphaBlock, consists of 35 comprehensive high-level tasks with multi-step text plans and paired observations.
arXiv Detail & Related papers (2023-05-30T09:54:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.