Related papers: Introducing the Talk Markup Language (TalkML):Adding a little social intelligence to industrial speech interfaces

Introducing the Talk Markup Language (TalkML):Adding a little social intelligence to industrial speech interfaces

URL: http://arxiv.org/abs/2105.11294v1
Date: Mon, 24 May 2021 14:25:35 GMT
Title: Introducing the Talk Markup Language (TalkML):Adding a little social intelligence to industrial speech interfaces
Authors: Peter Wallis
Abstract summary: Natural language understanding is one of the more disappointing failures of AI research. This paper describes how we have taken ideas from other disciplines and implemented them.
Score: 0.0
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Virtual Personal Assistants like Siri have great potential but such developments hit the fundamental problem of how to make computational devices that understand human speech. Natural language understanding is one of the more disappointing failures of AI research and it seems there is something we computer scientists don't get about the nature of language. Of course philosophers and linguists think quite differently about language and this paper describes how we have taken ideas from other disciplines and implemented them. The background to the work is to take seriously the notion of language as action and look at what people actually do with language using the techniques of Conversation Analysis. The observation has been that human communication is (behind the scenes) about the management of social relations as well as the (foregrounded) passing of information. To claim this is one thing but to implement it requires a mechanism. The mechanism described here is based on the notion of language being intentional - we think intentionally, talk about them and recognise them in others - and cooperative in that we are compelled to help out. The way we are compelled points to a solution to the ever present problem of keeping the human on topic. The approach has led to a recent success in which we significantly improve user satisfaction independent of task completion. Talk Markup Language (TalkML) is a draft alternative to VoiceXML that, we propose, greatly simplifies the scripting of interaction by providing default behaviours for no input and not recognised speech events.

Related papers

Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion. We use this fact to rewrite and augment existing suboptimal data, and train via offline reinforcement learning (RL) an agent that outperforms both prompting and learning from unaltered human demonstrations. Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z)
SIFToM: Robust Spoken Instruction Following through Theory of Mind [51.326266354164716]
We present a cognitively inspired model, Speech Instruction Following through Theory of Mind (SIFToM), to enable robots to pragmatically follow human instructions under diverse speech conditions. Results show that the SIFToM model outperforms state-of-the-art speech and language models, approaching human-level accuracy on challenging speech instruction following tasks.
arXiv Detail & Related papers (2024-09-17T02:36:10Z)
Revisiting the DARPA Communicator Data using Conversation Analysis [0.0]
This paper describes an approach to identifying opportunities for improvement'' in computer systems by looking for abuse in the form of swear words. The premise is that humans swear at computers as a sanction and, as such, swear words represent a point of failure where the system did not behave as it should. I hope to demonstrate that there is an alternative future for computational linguistics that does not rely on larger and larger text corpora.
arXiv Detail & Related papers (2023-07-13T15:33:01Z)
Computational Language Acquisition with Theory of Mind [84.2267302901888]
We build language-learning agents equipped with Theory of Mind (ToM) and measure its effects on the learning process. We find that training speakers with a highly weighted ToM listener component leads to performance gains in our image referential game setting.
arXiv Detail & Related papers (2023-03-02T18:59:46Z)
Whither the Priors for (Vocal) Interactivity? [6.709659274527638]
Speech-based communication is often cited as one of the most natural' ways in which humans and robots might interact. Despite this, the resulting interactions are anything but natural' It is argued here that such communication failures are indicative of a deeper malaise.
arXiv Detail & Related papers (2022-03-16T12:06:46Z)
Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation [80.29069988090912]
We study the problem of learning a range of vision-based manipulation tasks from a large offline dataset of robot interaction. We propose to leverage offline robot datasets with crowd-sourced natural language labels. We find that our approach outperforms both goal-image specifications and language conditioned imitation techniques by more than 25%.
arXiv Detail & Related papers (2021-09-02T17:42:13Z)
Few-shot Language Coordination by Modeling Theory of Mind [95.54446989205117]
We study the task of few-shot $textitlanguage coordination$. We require the lead agent to coordinate with a $textitpopulation$ of agents with different linguistic abilities. This requires the ability to model the partner's beliefs, a vital component of human communication.
arXiv Detail & Related papers (2021-07-12T19:26:11Z)
SocialAI 0.1: Towards a Benchmark to Stimulate Research on Socio-Cognitive Abilities in Deep Reinforcement Learning Agents [23.719833581321033]
Building embodied autonomous agents capable of participating in social interactions with humans is one of the main challenges in AI. Current approaches focus on language as a communication tool in very simplified and non diverse social situations. We argue that aiming towards human-level AI requires a broader set of key social skills.
arXiv Detail & Related papers (2021-04-27T14:16:29Z)
Self-play for Data Efficient Language Acquisition [20.86261546611472]
We exploit the symmetric nature of communication in order to improve the efficiency and quality of language acquisition in learning agents. We show that using self-play as a substitute for direct supervision enables the agent to transfer its knowledge across roles.
arXiv Detail & Related papers (2020-10-10T02:09:19Z)
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams [58.617181880383605]
In this work, we propose a novel approach using phonetic posteriorgrams. Our method doesn't need hand-crafted features and is more robust to noise compared to recent approaches. Our model is the first to support multilingual/mixlingual speech as input with convincing results.
arXiv Detail & Related papers (2020-06-20T16:32:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.