Spoken Language Interaction with Robots: Research Issues and
Recommendations, Report from the NSF Future Directions Workshop
- URL: http://arxiv.org/abs/2011.05533v1
- Date: Wed, 11 Nov 2020 03:45:34 GMT
- Title: Spoken Language Interaction with Robots: Research Issues and
Recommendations, Report from the NSF Future Directions Workshop
- Authors: Matthew Marge, Carol Espy-Wilson, Nigel Ward
- Abstract summary: Meeting human needs requires addressing new challenges in speech technology and user experience design.
More powerful adaptation methods are needed, without extensive re-engineering or the collection of massive training data.
Since robots operate in real time, their speech and language processing components must also.
- Score: 0.819605661841562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With robotics rapidly advancing, more effective human-robot interaction is
increasingly needed to realize the full potential of robots for society. While
spoken language must be part of the solution, our ability to provide spoken
language interaction capabilities is still very limited. The National Science
Foundation accordingly convened a workshop, bringing together speech, language,
and robotics researchers to discuss what needs to be done. The result is this
report, in which we identify key scientific and engineering advances needed.
Our recommendations broadly relate to eight general themes. First, meeting
human needs requires addressing new challenges in speech technology and user
experience design. Second, this requires better models of the social and
interactive aspects of language use. Third, for robustness, robots need
higher-bandwidth communication with users and better handling of uncertainty,
including simultaneous consideration of multiple hypotheses and goals. Fourth,
more powerful adaptation methods are needed, to enable robots to communicate in
new environments, for new tasks, and with diverse user populations, without
extensive re-engineering or the collection of massive training data. Fifth,
since robots are embodied, speech should function together with other
communication modalities, such as gaze, gesture, posture, and motion. Sixth,
since robots operate in complex environments, speech components need access to
rich yet efficient representations of what the robot knows about objects,
locations, noise sources, the user, and other humans. Seventh, since robots
operate in real time, their speech and language processing components must
also. Eighth, in addition to more research, we need more work on infrastructure
and resources, including shareable software modules and internal interfaces,
inexpensive hardware, baseline systems, and diverse corpora.
Related papers
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate our model in terms of its ability to perform tasks in zero shot after pre-training, follow language instructions from people, and its ability to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
- Human-Robot Mutual Learning through Affective-Linguistic Interaction and Differential Outcomes Training [Pre-Print] [0.3811184252495269]
We test how affective-linguistic communication, in combination with differential outcomes training, affects mutual learning in a human-robot context.
Taking inspiration from child-caregiver dynamics, our human-robot interaction setup consists of a (simulated) robot attempting to learn how best to communicate internal, homeostatically-controlled needs.
arXiv Detail & Related papers (2024-07-01T13:35:08Z)
- Socially Pertinent Robots in Gerontological Healthcare [78.35311825198136]
This paper is an attempt to partially answer the question, via two waves of experiments with patients and companions in a day-care gerontological facility in Paris with a full-sized humanoid robot endowed with social and conversational interaction capabilities.
Overall, the users are receptive to this technology, especially when the robot's perception and action skills are robust to environmental clutter and flexible enough to handle a plethora of different interactions.
arXiv Detail & Related papers (2024-04-11T08:43:37Z)
- Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community [57.56212633174706]
The ability to interact with machines using natural human language is becoming not just commonplace, but expected.
In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots.
We offer the community three proposals, the first focused on education, the second on benchmarks, and the third on the modeling of language when it comes to spoken interaction with robots.
arXiv Detail & Related papers (2024-04-01T15:03:27Z)
- Exploring Large Language Models to Facilitate Variable Autonomy for Human-Robot Teaming [4.779196219827508]
We introduce a novel framework for a GPT-powered multi-robot testbed environment, based on a Unity Virtual Reality (VR) setting.
This system allows users to interact with robot agents through natural language, each powered by individual GPT cores.
A user study with 12 participants explores the effectiveness of GPT-4 and, more importantly, user strategies when being given the opportunity to converse in natural language within a multi-robot environment.
arXiv Detail & Related papers (2023-12-12T12:26:48Z)
- A Human-Robot Mutual Learning System with Affect-Grounded Language Acquisition and Differential Outcomes Training [0.1812164955222814]
The paper presents a novel human-robot interaction setup for identifying robot homeostatic needs.
We adopted a differential outcomes training protocol whereby the robot provides feedback specific to its internal needs.
We found evidence that DOT can enhance the human's learning efficiency, which in turn enables more efficient robot language acquisition.
arXiv Detail & Related papers (2023-10-20T09:41:31Z)
- WALL-E: Embodied Robotic WAiter Load Lifting with Large Language Model [92.90127398282209]
This paper investigates the potential of integrating the most recent Large Language Models (LLMs) with an existing visual grounding and robotic grasping system.
We introduce the WALL-E (Embodied Robotic WAiter load lifting with Large Language model) as an example of this integration.
We deploy this LLM-empowered system on the physical robot to provide a more user-friendly interface for the instruction-guided grasping task.
arXiv Detail & Related papers (2023-08-30T11:35:21Z)
- Semantic-Aware Environment Perception for Mobile Human-Robot Interaction [2.309914459672557]
We present a vision-based system for mobile robots to enable a semantic-aware environment without additional a-priori knowledge.
We deploy our system on a mobile humanoid robot that enables us to test our methods in real-world applications.
arXiv Detail & Related papers (2022-11-07T08:49:45Z)
- Understanding Natural Language in Context [13.112390442564442]
We focus on cognitive robots, which have some knowledge-based models of the world and operate by reasoning and planning with this model.
Our goal in this research is to translate natural language utterances into this robot's formalism.
We do so by combining off-the-shelf SOTA language models, planning tools, and the robot's knowledge-base for better communication.
arXiv Detail & Related papers (2022-05-25T11:52:16Z)
- Semantics for Robotic Mapping, Perception and Interaction: A Survey [93.93587844202534]
The study of understanding, often referred to as semantics, dictates what the world "means" to a robot.
With humans and robots increasingly operating in the same world, the prospects of human-robot interaction also bring semantics into the picture.
Driven by need, as well as by enablers like increasing availability of training data and computational resources, semantics is a rapidly growing research area in robotics.
arXiv Detail & Related papers (2021-01-02T12:34:39Z)
- SAPIEN: A SimulAted Part-based Interactive ENvironment [77.4739790629284]
SAPIEN is a realistic and physics-rich simulated environment that hosts a large-scale set of articulated objects.
We evaluate state-of-the-art vision algorithms for part detection and motion attribute recognition as well as demonstrate robotic interaction tasks.
arXiv Detail & Related papers (2020-03-19T00:11:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.