Dobby: A Conversational Service Robot Driven by GPT-4
- URL: http://arxiv.org/abs/2310.06303v1
- Date: Tue, 10 Oct 2023 04:34:00 GMT
- Title: Dobby: A Conversational Service Robot Driven by GPT-4
- Authors: Carson Stark, Bohkyung Chun, Casey Charleston, Varsha Ravi, Luis
Pabon, Surya Sunkari, Tarun Mohan, Peter Stone, and Justin Hart
- Abstract summary: This work introduces a robotics platform which embeds a conversational AI agent in an embodied system for service tasks.
The agent is derived from a large language model, which has learned from a vast corpus of general knowledge.
In addition to generating dialogue, this agent can interface with the physical world by invoking commands on the robot.
- Score: 22.701223191699412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work introduces a robotics platform which embeds a conversational AI
agent in an embodied system for natural language understanding and intelligent
decision-making for service tasks; integrating task planning and human-like
conversation. The agent is derived from a large language model, which has
learned from a vast corpus of general knowledge. In addition to generating
dialogue, this agent can interface with the physical world by invoking commands
on the robot; seamlessly merging communication and behavior. This system is
demonstrated in a free-form tour-guide scenario, in an HRI study combining
robots with and without conversational AI capabilities. Performance is measured
along five dimensions: overall effectiveness, exploration abilities,
scrutinization abilities, receptiveness to personification, and adaptability.
Related papers
- Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community [57.56212633174706]
The ability to interact with machines using natural human language is becoming commonplace, but expected.
In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots.
We offer the community three proposals, the first focused on education, the second on benchmarks, and the third on the modeling of language when it comes to spoken interaction with robots.
arXiv Detail & Related papers (2024-04-01T15:03:27Z) - RoboScript: Code Generation for Free-Form Manipulation Tasks across Real
and Simulation [77.41969287400977]
This paper presents textbfRobotScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for a code generation benchmark for robot manipulation tasks in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z) - Exploring Large Language Models to Facilitate Variable Autonomy for Human-Robot Teaming [4.779196219827508]
We introduce a novel framework for a GPT-powered multi-robot testbed environment, based on a Unity Virtual Reality (VR) setting.
This system allows users to interact with robot agents through natural language, each powered by individual GPT cores.
A user study with 12 participants explores the effectiveness of GPT-4 and, more importantly, user strategies when being given the opportunity to converse in natural language within a multi-robot environment.
arXiv Detail & Related papers (2023-12-12T12:26:48Z) - A Sign Language Recognition System with Pepper, Lightweight-Transformer,
and LLM [0.9775599530257609]
This research explores using lightweight deep neural network architectures to enable the humanoid robot Pepper to understand American Sign Language (ASL)
We introduce a lightweight and efficient model for ASL understanding optimized for embedded systems, ensuring rapid sign recognition while conserving computational resources.
We tailor interactions to allow the Pepper Robot to generate natural Co-Speech Gesture responses, laying the foundation for more organic and intuitive humanoid-robot dialogues.
arXiv Detail & Related papers (2023-09-28T23:54:41Z) - WALL-E: Embodied Robotic WAiter Load Lifting with Large Language Model [92.90127398282209]
This paper investigates the potential of integrating the most recent Large Language Models (LLMs) and existing visual grounding and robotic grasping system.
We introduce the WALL-E (Embodied Robotic WAiter load lifting with Large Language model) as an example of this integration.
We deploy this LLM-empowered system on the physical robot to provide a more user-friendly interface for the instruction-guided grasping task.
arXiv Detail & Related papers (2023-08-30T11:35:21Z) - Natural Language Instructions for Intuitive Human Interaction with
Robotic Assistants in Field Construction Work [4.223718588030052]
This paper proposes a framework to allow human workers to interact with construction robots based on natural language instructions.
The proposed method consists of three stages: Natural Language Understanding (NLU), Information Mapping (IM), and Robot Control (RC)
arXiv Detail & Related papers (2023-07-09T15:02:34Z) - Surfer: Progressive Reasoning with World Models for Robotic Manipulation [51.26109827779267]
We introduce a novel and simple robot manipulation framework, called Surfer.
Surfer treats robot manipulation as a state transfer of the visual scene, and decouples it into two parts: action and scene.
It is based on the world model, treats robot manipulation as a state transfer of the visual scene, and decouples it into two parts: action and scene.
arXiv Detail & Related papers (2023-06-20T07:06:04Z) - Understanding Natural Language in Context [13.112390442564442]
We focus on cognitive robots, which have some knowledge-based models of the world and operate by reasoning and planning with this model.
Our goal in this research is to translate natural language utterances into this robot's formalism.
We do so by combining off-the-shelf SOTA language models, planning tools, and the robot's knowledge-base for better communication.
arXiv Detail & Related papers (2022-05-25T11:52:16Z) - Building Human-like Communicative Intelligence: A Grounded Perspective [1.0152838128195465]
After making astounding progress in language learning, AI systems seem to approach the ceiling that does not reflect important aspects of human communicative capacities.
This paper suggests that the dominant cognitively-inspired AI directions, based on nativist and symbolic paradigms, lack necessary substantiation and concreteness to guide progress in modern AI.
I propose a list of concrete, implementable components for building "grounded" linguistic intelligence.
arXiv Detail & Related papers (2022-01-02T01:43:24Z) - Self-supervised reinforcement learning for speaker localisation with the
iCub humanoid robot [58.2026611111328]
Looking at a person's face is one of the mechanisms that humans rely on when it comes to filtering speech in noisy environments.
Having a robot that can look toward a speaker could benefit ASR performance in challenging environments.
We propose a self-supervised reinforcement learning-based framework inspired by the early development of humans.
arXiv Detail & Related papers (2020-11-12T18:02:15Z) - Joint Mind Modeling for Explanation Generation in Complex Human-Robot
Collaborative Tasks [83.37025218216888]
We propose a novel explainable AI (XAI) framework for achieving human-like communication in human-robot collaborations.
The robot builds a hierarchical mind model of the human user and generates explanations of its own mind as a form of communications.
Results show that the generated explanations of our approach significantly improves the collaboration performance and user perception of the robot.
arXiv Detail & Related papers (2020-07-24T23:35:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.