Whither the Priors for (Vocal) Interactivity?
- URL: http://arxiv.org/abs/2203.08578v1
- Date: Wed, 16 Mar 2022 12:06:46 GMT
- Title: Whither the Priors for (Vocal) Interactivity?
- Authors: Roger K. Moore
- Abstract summary: Speech-based communication is often cited as one of the most `natural' ways in which humans and robots might interact.
Despite this, the resulting interactions are anything but `natural'.
It is argued here that such communication failures are indicative of a deeper malaise.
- Score: 6.709659274527638
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Voice-based communication is often cited as one of the most `natural' ways in
which humans and robots might interact, and the recent availability of accurate
automatic speech recognition and intelligible speech synthesis has enabled
researchers to integrate advanced off-the-shelf spoken language technology
components into their robot platforms. Despite this, the resulting interactions
are anything but `natural'. It transpires that simply giving a robot a voice
doesn't mean that a user will know how (or when) to talk to it, and the
resulting `conversations' tend to be stilted, one-sided and short. On the
surface, it might appear that these difficulties are fairly trivial
consequences of users' unfamiliarity with robots (and \emph{vice versa}), and
that any problems would be mitigated by long-term use by the human, coupled with `deep learning'
by the robot. However, it is argued here that such communication failures are
indicative of a deeper malaise: a fundamental lack of basic principles --
\emph{priors} -- underpinning not only speech-based interaction in particular,
but (vocal) interactivity in general. This is evidenced not only by the fact
that contemporary spoken language systems already require training data sets
that are orders of magnitude greater than those experienced by a young child,
but also by the lack of design principles for creating effective communicative
human-robot interaction. This short position paper identifies some of the key
areas where theoretical insights might help overcome these shortfalls.
Related papers
- No More Mumbles: Enhancing Robot Intelligibility through Speech Adaptation [7.675340768192281]
We conduct a speech comprehension study involving 39 participants.
The experiment's primary outcome shows that spaces with good acoustic quality positively correlate with intelligibility and user experience.
We develop a convolutional neural network model to adapt the robot's speech parameters to different users and spaces.
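A minimal sketch of this idea (not the authors' code): a small convolutional network that maps a mel spectrogram of the room's acoustic response to bounded adjustments of the robot's speech parameters. The architecture, feature dimensions, and the three output parameters (volume, rate, pitch) are illustrative assumptions.

```python
# Illustrative sketch of a speech-adaptation CNN; dimensions are invented.
import torch
import torch.nn as nn

class SpeechAdaptationCNN(nn.Module):
    def __init__(self, n_params: int = 3):  # e.g. volume, rate, pitch
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(16, n_params)

    def forward(self, spectrogram: torch.Tensor) -> torch.Tensor:
        # spectrogram: (batch, 1, mel_bins, frames)
        x = self.features(spectrogram).flatten(1)
        return torch.tanh(self.head(x))  # bounded parameter adjustments

# Usage: one 64x100 mel spectrogram of the room's acoustic response.
model = SpeechAdaptationCNN()
adjustments = model(torch.randn(1, 1, 64, 100))
print(adjustments)  # relative changes to volume, rate, pitch
```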
arXiv Detail & Related papers (2024-05-15T21:28:55Z)
- Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community [57.56212633174706]
The ability to interact with machines using natural human language is becoming not just commonplace, but expected.
In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots.
We offer the community three proposals, the first focused on education, the second on benchmarks, and the third on the modeling of language when it comes to spoken interaction with robots.
arXiv Detail & Related papers (2024-04-01T15:03:27Z) - Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation [0.6964027823688135]
Modern conversational systems lack the emotional depth and disfluent characteristics of human interactions.
To address this shortcoming, we have designed an innovative speech synthesis pipeline.
Within this framework, a cutting-edge language model introduces both human-like emotion and disfluencies in a zero-shot setting.
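A minimal sketch of the pipeline shape described here, assuming a two-stage design in which a language model first rewrites neutral text with an emotion and disfluencies before text-to-speech. The prompt wording and the stub functions are hypothetical placeholders, not the authors' components.

```python
# Sketch of a two-stage "humane" synthesis pipeline. The LLM and TTS calls
# are stubs standing in for real components; only the data flow is the point.

def add_emotion_and_disfluency(text: str, emotion: str) -> str:
    """Build a zero-shot rewrite prompt; a real system would send this prompt
    to a language model. Here a canned rewrite stands in for the LLM."""
    prompt = (f"Rewrite the following with a {emotion} tone and natural "
              f"disfluencies (fillers, pauses, restarts): {text}")
    print("LLM prompt:", prompt)
    return f"Um, {text.lower()}... you know, gladly."  # stand-in LLM output

def synthesize(text: str) -> bytes:
    """Stand-in for a TTS engine that would return audio."""
    return text.encode("utf-8")

audio = synthesize(add_emotion_and_disfluency("I can help with that", "warm"))
```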
arXiv Detail & Related papers (2024-03-31T00:38:02Z) - Real-time Addressee Estimation: Deployment of a Deep-Learning Model on
the iCub Robot [52.277579221741746]
Addressee Estimation is a skill essential for social robots to interact smoothly with humans.
Inspired by human perceptual skills, a deep-learning model for Addressee Estimation is designed, trained, and deployed on an iCub robot.
The study presents the implementation procedure and the performance of the model deployed in real-time human-robot interaction.
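As an illustration only (the deployed system is a trained deep network over face and pose cues), the decision the model has to make can be pictured as classifying whether an utterance is addressed to the robot; the feature, threshold, and labels below are invented.

```python
# Toy stand-in for addressee estimation: classify who is being addressed from
# the speaker's head yaw relative to the robot. The real model learns this
# mapping from visual cues; the 20-degree threshold is an assumption.
def estimate_addressee(head_yaw_deg: float) -> str:
    return "robot" if abs(head_yaw_deg) < 20.0 else "other"

for yaw in (5.0, 45.0):
    print(yaw, "->", estimate_addressee(yaw))
```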
arXiv Detail & Related papers (2023-11-09T13:01:21Z)
- A Human-Robot Mutual Learning System with Affect-Grounded Language Acquisition and Differential Outcomes Training [0.1812164955222814]
The paper presents a novel human-robot interaction setup for identifying robot homeostatic needs.
We adopted a differential outcomes training (DOT) protocol whereby the robot provides feedback specific to its internal needs.
We found evidence that DOT can enhance the human's learning efficiency, which in turn enables more efficient robot language acquisition.
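A toy formalisation of the DOT principle, under stated assumptions: each robot need is paired with a unique feedback signal, so correct responses to different needs are reinforced distinctly. All names and signals below are invented for illustration.

```python
# Toy differential outcomes training (DOT) trial: need-specific feedback
# only when the human's response matches the robot's internal need.
import random
from typing import Optional

CORRECT_ACTION = {"energy": "feed", "temperature": "cool"}
OUTCOME_FOR_NEED = {"energy": "green_light", "temperature": "blue_light"}

def robot_feedback(need: str, human_action: str) -> Optional[str]:
    """Return the need-specific outcome for a correct response (the DOT
    principle); errors get no feedback, i.e. a non-differential outcome."""
    if human_action == CORRECT_ACTION[need]:
        return OUTCOME_FOR_NEED[need]
    return None

need = random.choice(list(CORRECT_ACTION))
action = random.choice(["feed", "cool"])
print(need, action, "->", robot_feedback(need, action))
```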
arXiv Detail & Related papers (2023-10-20T09:41:31Z)
- SACSoN: Scalable Autonomous Control for Social Navigation [62.59274275261392]
We develop methods for training policies for socially unobtrusive navigation.
By minimizing this counterfactual perturbation, i.e. the difference between how humans behave around the robot and how they would have behaved had it not been there, we can induce robots to behave in ways that do not alter the natural behavior of humans in the shared space.
We collect a large dataset where an indoor mobile robot interacts with human bystanders.
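A minimal sketch (not the SACSoN implementation) of scoring a policy by its counterfactual perturbation: how far the observed human trajectory deviates from a prediction of what the human would have done with no robot present. The trajectories and the displacement below are toy stand-ins.

```python
# Counterfactual perturbation as mean trajectory deviation (illustrative).
import numpy as np

def counterfactual_perturbation(observed: np.ndarray,
                                predicted_no_robot: np.ndarray) -> float:
    """Mean Euclidean deviation between the human's observed 2D trajectory
    and the counterfactual (robot-absent) trajectory, both (T, 2) arrays."""
    return float(np.linalg.norm(observed - predicted_no_robot, axis=1).mean())

# Toy example: the robot's presence pushes the human 0.3 m to the side.
t = np.linspace(0, 1, 50)
no_robot = np.stack([t, np.zeros_like(t)], axis=1)   # straight path
with_robot = no_robot + np.array([0.0, 0.3])         # displaced path
penalty = counterfactual_perturbation(with_robot, no_robot)
print(f"perturbation penalty: {penalty:.2f} m")      # -> 0.30 m
# A navigation policy would be trained to minimise this penalty alongside
# its own goal-reaching objective.
```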
arXiv Detail & Related papers (2023-06-02T19:07:52Z)
- "No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy [70.45420918526926]
We present LILAC, a framework for incorporating and adapting to natural language corrections online during execution.
Instead of discrete turn-taking between a human and robot, LILAC splits agency between the human and robot.
We show that our corrections-aware approach obtains higher task completion rates and is subjectively preferred by users.
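A minimal sketch of the idea as described in this abstract: rather than discrete turns, the human can inject a language correction during execution, and agency is shared by blending the robot's planned action with a correction-derived adjustment. The phrase-to-delta table and blend weight are illustrative assumptions, not LILAC's learned components.

```python
# Online language corrections blended into execution (illustrative only).
import numpy as np

CORRECTIONS = {"no, to the right": np.array([0.0, -0.05]),
               "no, to the left": np.array([0.0, 0.05]),
               "a bit higher": np.array([0.05, 0.0])}

def parse_correction(utterance: str) -> np.ndarray:
    """Toy phrase-to-delta mapping; LILAC uses learned language embeddings."""
    return CORRECTIONS.get(utterance.lower(), np.zeros(2))

def shared_autonomy_step(robot_action, utterance, alpha=0.5):
    """Blend the robot's planned action with the human's correction."""
    return robot_action + alpha * parse_correction(utterance)

action = np.array([0.10, 0.00])  # robot's planned step
print(shared_autonomy_step(action, "No, to the right"))
```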
arXiv Detail & Related papers (2023-01-06T15:03:27Z)
- Robots with Different Embodiments Can Express and Influence Carefulness in Object Manipulation [104.5440430194206]
This work investigates the perception of object manipulations performed with a communicative intent by two robots.
We designed the robots' movements to communicate either carefulness or its absence during the transportation of objects.
arXiv Detail & Related papers (2022-08-03T13:26:52Z)
- Understanding Natural Language in Context [13.112390442564442]
We focus on cognitive robots, which have some knowledge-based models of the world and operate by reasoning and planning with this model.
Our goal in this research is to translate natural language utterances into this robot's formalism.
We do so by combining off-the-shelf SOTA language models, planning tools, and the robot's knowledge base for better communication.
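A minimal sketch of that combination under stated assumptions: a (stubbed) language model proposes a formal goal, which is validated against the robot's knowledge base before being handed to a planner. The predicate shape and stub functions are illustrative, not the paper's actual formalism.

```python
# Sketch: utterance -> candidate formal goal -> knowledge-base check -> plan.
# The LLM is a stub; only the validate-before-planning structure matters.

KNOWLEDGE_BASE = {"objects": {"cup", "book"}, "locations": {"table", "shelf"}}

def llm_translate(utterance: str):
    """Stand-in for an off-the-shelf language model emitting (verb, obj, loc)."""
    return ("move", "cup", "table")  # canned parse of the utterance

def grounded(goal) -> bool:
    """Reject goals mentioning entities unknown to the robot's world model."""
    _, obj, loc = goal
    return obj in KNOWLEDGE_BASE["objects"] and loc in KNOWLEDGE_BASE["locations"]

goal = llm_translate("Please put the cup on the table")
if grounded(goal):
    print("send to planner:", goal)  # e.g. a PDDL-style goal (move cup table)
else:
    print("ask user to rephrase")
```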
arXiv Detail & Related papers (2022-05-25T11:52:16Z)
- Introducing the Talk Markup Language (TalkML): Adding a little social intelligence to industrial speech interfaces [0.0]
Natural language understanding is one of the more disappointing failures of AI research.
This paper describes how we have taken ideas from other disciplines and implemented them.
arXiv Detail & Related papers (2021-05-24T14:25:35Z)
- Self-supervised reinforcement learning for speaker localisation with the iCub humanoid robot [58.2026611111328]
Looking at a person's face is one of the mechanisms humans rely on to filter speech in noisy environments.
Having a robot that can look toward a speaker could benefit ASR performance in challenging environments.
We propose a self-supervised reinforcement learning-based framework inspired by the early development of humans.
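A minimal sketch, assuming a simple self-supervised reward of the kind the abstract suggests: the robot turns its head toward a sound source and is rewarded when a face detector subsequently fires, so no manual labels are needed. The detector and angles below are stand-ins, not the iCub stack.

```python
# Self-supervised reward for speaker localisation (illustrative only).
import random

def face_detected_after_turn(head_angle: float, speaker_angle: float) -> bool:
    """Stand-in for a visual face detector: fires when the speaker falls
    inside the camera's field of view (+/- 15 degrees, an assumption)."""
    return abs(head_angle - speaker_angle) < 15.0

def self_supervised_reward(head_angle: float, speaker_angle: float) -> float:
    # The detection itself supplies the training signal (self-supervision).
    return 1.0 if face_detected_after_turn(head_angle, speaker_angle) else 0.0

speaker = random.uniform(-90, 90)  # true direction of the voice
action = random.uniform(-90, 90)   # policy's chosen head angle
print(self_supervised_reward(action, speaker))
```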
arXiv Detail & Related papers (2020-11-12T18:02:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.