Introducing the Talk Markup Language (TalkML): Adding a little social
intelligence to industrial speech interfaces
- URL: http://arxiv.org/abs/2105.11294v1
- Date: Mon, 24 May 2021 14:25:35 GMT
- Title: Introducing the Talk Markup Language (TalkML): Adding a little social
intelligence to industrial speech interfaces
- Authors: Peter Wallis
- Abstract summary: Natural language understanding is one of the more disappointing failures of AI research.
This paper describes how we have taken ideas from other disciplines and implemented them.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Virtual Personal Assistants like Siri have great potential but such
developments hit the fundamental problem of how to make computational devices
that understand human speech. Natural language understanding is one of the more
disappointing failures of AI research and it seems there is something we
computer scientists don't get about the nature of language. Of course
philosophers and linguists think quite differently about language and this
paper describes how we have taken ideas from other disciplines and implemented
them. The background to the work is to take seriously the notion of language as
action and look at what people actually do with language using the techniques
of Conversation Analysis. The observation has been that human communication is
(behind the scenes) about the management of social relations as well as the
(foregrounded) passing of information. To claim this is one thing but to
implement it requires a mechanism. The mechanism described here is based on the
notion of language being intentional - we think in terms of intentions, talk about them
and recognise them in others - and cooperative in that we are compelled to help
out. The way we are compelled points to a solution to the ever present problem
of keeping the human on topic. The approach has led to a recent success in
which we significantly improve user satisfaction independent of task
completion. Talk Markup Language (TalkML) is a draft alternative to VoiceXML
that, we propose, greatly simplifies the scripting of interaction by providing
default behaviours for no input and not recognised speech events.
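As a rough illustration of what "default behaviours for no input and not recognised speech events" could mean in practice, the following Python sketch wraps a single question in a loop that falls back to built-in reactions unless a script overrides them. This is not TalkML syntax, which the abstract does not show: the event names are borrowed from VoiceXML's `noinput` and `nomatch` events, and the `ask` function, the handler table and the reprompt wordings are all hypothetical.

```python
# Hypothetical event names, modelled on VoiceXML's noinput / nomatch events.
NO_INPUT = "noinput"
NO_MATCH = "nomatch"

# Hypothetical default reactions; the wording is an assumption, not TalkML's.
DEFAULT_HANDLERS = {
    NO_INPUT: lambda prompt: f"Sorry, I didn't hear anything. {prompt}",
    NO_MATCH: lambda prompt: f"Sorry, I didn't catch that. {prompt}",
}


def ask(prompt, expected, heard, handlers=None, max_turns=3):
    """Ask `prompt` until an expected answer is heard or turns run out.

    `heard` is an iterator of recogniser results: None stands for silence,
    any string stands for recognised text.
    Returns (answer or None, list of everything the system said).
    """
    # Script-level handlers, if given, override the defaults per event.
    active = {**DEFAULT_HANDLERS, **(handlers or {})}
    said = [prompt]
    for _, result in zip(range(max_turns), heard):
        if result is None:
            event = NO_INPUT
        elif result.lower() in expected:
            return result.lower(), said
        else:
            event = NO_MATCH
        # The script author did not have to write this branch explicitly.
        said.append(active[event](prompt))
    return None, said


if __name__ == "__main__":
    answer, said = ask("Which floor?", {"one", "two"}, iter([None, "banana", "Two"]))
    print(answer)
    for line in said:
        print(line)
```

The point of the sketch is only that a script supplies the prompt and the expected answers, while silence and unrecognised speech are handled without extra scripting unless an override is passed in `handlers`.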
Related papers
- Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion.
We use this fact to rewrite and augment existing suboptimal data, and train via offline reinforcement learning (RL) an agent that outperforms both prompting and learning from unaltered human demonstrations.
Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z)
- SIFToM: Robust Spoken Instruction Following through Theory of Mind [51.326266354164716]
We present a cognitively inspired model, Speech Instruction Following through Theory of Mind (SIFToM), to enable robots to pragmatically follow human instructions under diverse speech conditions.
Results show that the SIFToM model outperforms state-of-the-art speech and language models, approaching human-level accuracy on challenging speech instruction following tasks.
arXiv Detail & Related papers (2024-09-17T02:36:10Z)
- Revisiting the DARPA Communicator Data using Conversation Analysis [0.0]
This paper describes an approach to identifying "opportunities for improvement" in computer systems by looking for abuse in the form of swear words.
The premise is that humans swear at computers as a sanction and, as such, swear words represent a point of failure where the system did not behave as it should.
I hope to demonstrate that there is an alternative future for computational linguistics that does not rely on larger and larger text corpora.
arXiv Detail & Related papers (2023-07-13T15:33:01Z)
- Computational Language Acquisition with Theory of Mind [84.2267302901888]
We build language-learning agents equipped with Theory of Mind (ToM) and measure its effects on the learning process.
We find that training speakers with a highly weighted ToM listener component leads to performance gains in our image referential game setting.
arXiv Detail & Related papers (2023-03-02T18:59:46Z)
- Whither the Priors for (Vocal) Interactivity? [6.709659274527638]
Speech-based communication is often cited as one of the most 'natural' ways in which humans and robots might interact.
Despite this, the resulting interactions are anything but 'natural'.
It is argued here that such communication failures are indicative of a deeper malaise.
arXiv Detail & Related papers (2022-03-16T12:06:46Z)
- Learning Language-Conditioned Robot Behavior from Offline Data and
Crowd-Sourced Annotation [80.29069988090912]
We study the problem of learning a range of vision-based manipulation tasks from a large offline dataset of robot interaction.
We propose to leverage offline robot datasets with crowd-sourced natural language labels.
We find that our approach outperforms both goal-image specifications and language conditioned imitation techniques by more than 25%.
arXiv Detail & Related papers (2021-09-02T17:42:13Z)
- Few-shot Language Coordination by Modeling Theory of Mind [95.54446989205117]
We study the task of few-shot language coordination.
We require the lead agent to coordinate with a population of agents with different linguistic abilities.
This requires the ability to model the partner's beliefs, a vital component of human communication.
arXiv Detail & Related papers (2021-07-12T19:26:11Z)
- SocialAI 0.1: Towards a Benchmark to Stimulate Research on
Socio-Cognitive Abilities in Deep Reinforcement Learning Agents [23.719833581321033]
Building embodied autonomous agents capable of participating in social interactions with humans is one of the main challenges in AI.
Current approaches focus on language as a communication tool in very simplified and non diverse social situations.
We argue that aiming towards human-level AI requires a broader set of key social skills.
arXiv Detail & Related papers (2021-04-27T14:16:29Z)
- Self-play for Data Efficient Language Acquisition [20.86261546611472]
We exploit the symmetric nature of communication in order to improve the efficiency and quality of language acquisition in learning agents.
We show that using self-play as a substitute for direct supervision enables the agent to transfer its knowledge across roles.
arXiv Detail & Related papers (2020-10-10T02:09:19Z)
- Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking
Head Generation Using Phonetic Posteriorgrams [58.617181880383605]
In this work, we propose a novel approach using phonetic posteriorgrams.
Our method doesn't need hand-crafted features and is more robust to noise compared to recent approaches.
Our model is the first to support multilingual/mixlingual speech as input with convincing results.
arXiv Detail & Related papers (2020-06-20T16:32:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.