Detecting Distrust Towards the Skills of a Virtual Assistant Using Speech
- URL: http://arxiv.org/abs/2007.15711v1
- Date: Thu, 30 Jul 2020 19:56:17 GMT
- Title: Detecting Distrust Towards the Skills of a Virtual Assistant Using Speech
- Authors: Leonardo Pepino, Pablo Riera, Lara Gauder, Agustín Gravano, Luciana Ferrer
- Abstract summary: We study the feasibility of automatically detecting the level of trust that a user has in a virtual assistant (VA) based on their speech.
We find that the subject's speech can be used to detect which type of VA they were using, which could be considered a proxy for the user's trust toward the VA's abilities.
- Score: 8.992916975952477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Research has shown that trust is an essential aspect of human-computer
interaction, directly determining the degree to which the person is willing to
use the system. An automatic prediction of the level of trust that a user has
in a certain system could be used to attempt to correct potential distrust by
having the system take relevant actions, such as explaining its
actions more thoroughly. In this work, we explore the feasibility of
automatically detecting the level of trust that a user has in a virtual
assistant (VA) based on their speech. We use a dataset collected for this
purpose, containing human-computer speech interactions where subjects were
asked to answer various factual questions with the help of a virtual assistant,
which they were led to believe was either very reliable or unreliable. We find
that the subject's speech can be used to detect which type of VA they were
using, which could be considered a proxy for the user's trust toward the VA's
abilities, with an accuracy of up to 76%, compared to a random baseline of 50%.
These results are obtained using features that have been previously found
useful for detecting speech directed to infants and non-native speakers.
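The abstract describes a standard audio-classification setup: utterance-level acoustic/prosodic features fed to a binary classifier that separates the reliable-VA condition from the unreliable-VA condition. The sketch below is only a minimal illustration of that kind of pipeline, assuming hypothetical wav files and labels; the feature set (pitch and energy statistics) and the logistic-regression classifier are illustrative choices, not the authors' exact features or model.

```python
# Minimal sketch (not the authors' exact pipeline): prosodic features per
# utterance + a linear classifier separating the two VA conditions.
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def extract_features(wav_path, sr=16000):
    """Illustrative utterance-level features: pitch and energy statistics."""
    y, sr = librosa.load(wav_path, sr=sr)
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
    )
    f0 = f0[~np.isnan(f0)]                      # keep voiced frames only
    rms = librosa.feature.rms(y=y)[0]           # frame-level energy
    return np.array([
        f0.mean() if f0.size else 0.0,
        f0.std() if f0.size else 0.0,
        rms.mean(), rms.std(),
        voiced_flag.mean(),                     # proportion of voiced frames
    ])

def evaluate(wav_files, labels):
    # wav_files: utterance paths; labels: 1 = unreliable-VA, 0 = reliable-VA
    # condition (hypothetical data layout).
    X = np.stack([extract_features(p) for p in wav_files])
    y = np.array(labels)
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    print(f"cross-validated accuracy: {scores.mean():.2f} (chance = 0.50)")
```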
Related papers
- Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems [55.99999020778169]
We study a function that can predict the forthcoming words and estimate the time remaining until the end of an utterance.
We develop a cross-attention-based algorithm that incorporates both acoustic and linguistic information.
Results demonstrate the proposed model's ability to predict upcoming words and estimate future EOU events up to 300ms prior to the actual EOU.
arXiv Detail & Related papers (2024-09-30T06:29:58Z)
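As a rough illustration of the fusion step described above (cross-attention over acoustic and linguistic information), the following sketch lets text-token states attend to acoustic frames and regresses a time-to-EOU value from the last token; all dimensions, layer choices, and the regression head are assumptions, not the paper's architecture.

```python
# Sketch of cross-attention fusion of acoustic and linguistic streams
# (illustrative dimensions; not the paper's exact architecture).
import torch
import torch.nn as nn

class CrossAttentionEOU(nn.Module):
    def __init__(self, acoustic_dim=80, text_dim=256, d_model=256, n_heads=4):
        super().__init__()
        self.acoustic_proj = nn.Linear(acoustic_dim, d_model)
        self.text_proj = nn.Linear(text_dim, d_model)
        # Text tokens attend to acoustic frames.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.eou_head = nn.Linear(d_model, 1)   # time-to-EOU regression (assumed head)

    def forward(self, acoustic, text):
        # acoustic: (B, T_frames, acoustic_dim), text: (B, T_tokens, text_dim)
        a = self.acoustic_proj(acoustic)
        t = self.text_proj(text)
        fused, _ = self.cross_attn(query=t, key=a, value=a)
        return self.eou_head(fused[:, -1])      # predict from the last token state

model = CrossAttentionEOU()
pred = model(torch.randn(2, 200, 80), torch.randn(2, 12, 256))
print(pred.shape)  # torch.Size([2, 1])
```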
- An Evaluation of Explanation Methods for Black-Box Detectors of Machine-Generated Text [2.1439084103679273]
This study conducts the first systematic evaluation of explanation quality for detectors of machine-generated text.
We use a dataset of ChatGPT-generated and human-written documents, and pair predictions of three existing language-model-based detectors with the corresponding explanations.
We find that SHAP performs best in terms of faithfulness, stability, and in helping users to predict the detector's behavior.
arXiv Detail & Related papers (2024-08-26T13:14:26Z)
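To show concretely what a SHAP explanation of a machine-generated-text detector looks like, here is a toy sketch pairing a simple TF-IDF + logistic-regression detector with SHAP attributions. The detectors evaluated in the paper are language-model-based; the corpus, labels, and model here are placeholder assumptions that only illustrate the SHAP step.

```python
# Toy sketch: a bag-of-words "machine-generated text" detector paired with
# SHAP attributions (placeholder corpus and model; illustrates the SHAP step only).
import shap
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["a human written news paragraph", "a paragraph generated by chatgpt"] * 20
labels = [0, 1] * 20                      # 1 = machine-generated (placeholder labels)

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts).toarray()
detector = LogisticRegression(max_iter=1000).fit(X, labels)

# Per-term attributions explaining each detector prediction.
explainer = shap.LinearExplainer(detector, X)
shap_values = explainer.shap_values(X[:2])
terms = vectorizer.get_feature_names_out()
print(dict(zip(terms, shap_values[0])))   # attribution of each term, first document
```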
- Explainable Attribute-Based Speaker Verification [12.941187430993796]
We propose an attribute-based explainable speaker verification (SV) system.
It identifies speakers by comparing personal attributes such as gender, nationality, and age extracted automatically from voice recordings.
We believe this approach better aligns with human reasoning, making it more understandable than traditional methods.
arXiv Detail & Related papers (2024-05-30T08:04:28Z)
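A toy illustration of the comparison idea: each recording is reduced to interpretable attribute estimates (gender, age, nationality) and two recordings are scored by how well those estimates agree. The attribute extractor and the scoring rule below are hypothetical placeholders, not the proposed system.

```python
# Toy illustration of attribute-based comparison for speaker verification.
# `extract_attributes` and `attribute_similarity` are hypothetical placeholders.
import math

def extract_attributes(wav_path):
    """Hypothetical extractor returning interpretable attribute estimates."""
    # Placeholder output; in practice these would come from per-attribute models.
    return {"p_female": 0.92, "age_years": 34.0, "p_nationality_match": 0.71}

def attribute_similarity(a, b, age_scale=20.0):
    """Higher = more likely the same speaker; each term stays human-readable."""
    s_gender = 1.0 - abs(a["p_female"] - b["p_female"])
    s_age = math.exp(-abs(a["age_years"] - b["age_years"]) / age_scale)
    s_nat = 1.0 - abs(a["p_nationality_match"] - b["p_nationality_match"])
    return (s_gender + s_age + s_nat) / 3.0
```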
- User-Centered Security in Natural Language Processing [0.7106986689736825]
This dissertation proposes a framework of user-centered security in Natural Language Processing (NLP).
It focuses on two security domains within NLP of great public interest.
arXiv Detail & Related papers (2023-01-10T22:34:19Z)
- Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings [56.93025161787725]
Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing local data.
We propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters.
We show that the attribute inference attack is achievable for SER systems trained using FL.
arXiv Detail & Related papers (2021-12-26T16:50:42Z)
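A schematic sketch of the attack idea: treat each client's shared gradients as a feature vector and train an attacker classifier to predict a sensitive attribute such as gender. The update shapes, the synthetic data, and the logistic-regression attacker are assumptions made only for illustration.

```python
# Schematic sketch of an attribute-inference attack on shared FL updates:
# flatten each client's shared gradients into a vector and train an attacker
# classifier to predict a sensitive attribute (shapes are illustrative).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def flatten_update(gradients):
    """gradients: list of per-layer numpy arrays shared by one client."""
    return np.concatenate([g.ravel() for g in gradients])

# Synthetic stand-in data: one shared update per client, plus the sensitive
# attribute the attacker wants to infer (e.g. speaker gender). With random
# updates the attack is at chance; real shared updates can leak this signal.
rng = np.random.default_rng(0)
updates = [[rng.normal(size=(64, 40)), rng.normal(size=(64,))] for _ in range(200)]
gender = rng.integers(0, 2, size=200)

X = np.stack([flatten_update(u) for u in updates])
X_tr, X_te, y_tr, y_te = train_test_split(X, gender, test_size=0.3, random_state=0)

attacker = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)
print("attack accuracy on held-out clients:", attacker.score(X_te, y_te))
```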
- Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples.
Our code will be made open source so that future work can compare against it.
arXiv Detail & Related papers (2021-07-01T08:58:16Z)
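The detection statistic in this summary can be sketched directly: score the original audio and its vocoder re-synthesis with the ASV system and flag trials whose score changes too much. `asv_score` and `vocoder_resynthesize` below are hypothetical placeholders for a real ASV system and neural vocoder.

```python
# Sketch of the score-difference detector: compare ASV scores for an input and
# its neural-vocoder re-synthesis (placeholder ASV and vocoder calls).
def asv_score(enrollment_wav, test_wav):
    """Placeholder: similarity score from an ASV system (e.g. embedding cosine)."""
    raise NotImplementedError

def vocoder_resynthesize(wav):
    """Placeholder: analysis/synthesis of the waveform with a neural vocoder."""
    raise NotImplementedError

def is_adversarial(enrollment_wav, test_wav, threshold):
    """Flag inputs whose ASV score changes a lot after re-synthesis."""
    score_orig = asv_score(enrollment_wav, test_wav)
    score_resyn = asv_score(enrollment_wav, vocoder_resynthesize(test_wav))
    return abs(score_orig - score_resyn) > threshold
```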
- Voting for the right answer: Adversarial defense for speaker verification [79.10523688806852]
ASV is vulnerable to adversarial attacks, which are perceptually similar to their original counterparts.
We propose the idea of "voting for the right answer" to prevent risky decisions of ASV in blind spot areas.
Experimental results show that our proposed method improves robustness against limited-knowledge attackers.
arXiv Detail & Related papers (2021-06-15T04:05:28Z)
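One plausible reading of the voting idea, sketched below: score several randomly perturbed copies of the test utterance and accept only if the majority of ASV decisions agree, the intuition being that adversarial inputs sitting in a blind spot are less stable under perturbation. The placeholder `asv_decision`, the Gaussian perturbation, and the noise level are assumptions, not the paper's exact sampling scheme.

```python
# Rough sketch of a voting defense: decide from the majority of ASV decisions
# over randomly perturbed copies of the test utterance (placeholder ASV call).
import numpy as np

def asv_decision(enrollment_wav, test_wav):
    """Placeholder: True if the ASV system accepts the trial."""
    raise NotImplementedError

def vote_for_the_right_answer(enrollment_wav, test_wav, n_votes=15, noise_std=0.002):
    rng = np.random.default_rng(0)
    votes = []
    for _ in range(n_votes):
        perturbed = test_wav + rng.normal(scale=noise_std, size=test_wav.shape)
        votes.append(asv_decision(enrollment_wav, perturbed))
    return sum(votes) > n_votes / 2      # accept only if most copies are accepted
```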
- A Study on the Manifestation of Trust in Speech [12.057694908317991]
We explore the feasibility of automatically detecting the level of trust that a user has in a virtual assistant (VA) based on their speech.
We developed a novel protocol for collecting speech data from subjects induced to have different degrees of trust in the skills of a VA.
We show clear evidence that the protocol effectively succeeded in influencing subjects into the desired mental state of either trusting or distrusting the agent's skills.
arXiv Detail & Related papers (2021-02-09T13:08:54Z)
- Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation [17.344080729609026]
We introduce the concept of attribute-driven privacy preservation in speaker voice representation.
It allows a person to hide one or more personal aspects from a potential malicious interceptor and from the application provider.
We propose an adversarial autoencoding method that disentangles a given speaker attribute within the voice representation, thus allowing its concealment.
arXiv Detail & Related papers (2020-12-08T14:47:23Z)
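A generic way to implement this kind of adversarial disentanglement is an autoencoder over the speaker representation with an attribute discriminator trained through a gradient-reversal layer, so the code is pushed to drop the attribute while remaining reconstructable. The sketch below follows that generic recipe with illustrative dimensions; it is not the paper's exact model.

```python
# Generic sketch of adversarial disentanglement: an autoencoder over the
# speaker representation plus an attribute discriminator trained through a
# gradient-reversal layer (illustrative, not the paper's exact model).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DisentanglingAE(nn.Module):
    def __init__(self, emb_dim=192, code_dim=128, lam=1.0):
        super().__init__()
        self.lam = lam
        self.encoder = nn.Sequential(nn.Linear(emb_dim, code_dim), nn.ReLU())
        self.decoder = nn.Linear(code_dim, emb_dim)
        self.attr_clf = nn.Linear(code_dim, 2)   # e.g. a binary attribute to conceal

    def forward(self, x):
        z = self.encoder(x)
        recon = self.decoder(z)
        attr_logits = self.attr_clf(GradReverse.apply(z, self.lam))
        return recon, attr_logits

model = DisentanglingAE()
x = torch.randn(8, 192)                  # batch of speaker embeddings
recon, attr_logits = model(x)
loss = nn.functional.mse_loss(recon, x) + nn.functional.cross_entropy(
    attr_logits, torch.randint(0, 2, (8,)))
loss.backward()
```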
- Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
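The reported effect is an increase in the equal error rate (EER) of a speaker verification system evaluated on de-identified speech. For reference, here is a small sketch of how EER is typically computed from trial labels and scores; the score arrays below are placeholder values.

```python
# How the reported metric is typically computed: equal error rate (EER) of a
# speaker verification system from trial scores (placeholder score arrays).
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(labels, scores):
    """labels: 1 = same speaker, 0 = different speaker; scores: ASV similarity."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))   # operating point where FPR ~= FNR
    return (fpr[idx] + fnr[idx]) / 2

# Higher EER after de-identification means the ASV system can no longer
# reliably link de-identified speech back to the original speaker.
labels = np.array([1, 1, 0, 0, 1, 0])
scores_before = np.array([0.9, 0.8, 0.2, 0.1, 0.7, 0.3])
scores_after = np.array([0.6, 0.4, 0.5, 0.3, 0.5, 0.4])
print(equal_error_rate(labels, scores_before), equal_error_rate(labels, scores_after))
```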
- Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention [70.82604384963679]
This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features.
We extract a speaker representation used for adaptation directly from the test utterance.
arXiv Detail & Related papers (2020-02-14T05:05:36Z)
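A minimal sketch of the self-adaptation idea: a speaker representation extracted from the test utterance itself is appended to every noisy frame before the enhancement network. The dimensions, the LSTM mask estimator, and the way the embedding is obtained are assumptions for illustration only.

```python
# Minimal sketch of speaker-aware self-adaptation for enhancement: a speaker
# embedding taken from the test utterance is concatenated to every noisy frame
# before the mask-estimation network (dimensions are illustrative).
import torch
import torch.nn as nn

class SpeakerAwareEnhancer(nn.Module):
    def __init__(self, feat_dim=257, spk_dim=128, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim + spk_dim, hidden, batch_first=True)
        self.mask = nn.Sequential(nn.Linear(hidden, feat_dim), nn.Sigmoid())

    def forward(self, noisy_spec, spk_emb):
        # noisy_spec: (B, T, feat_dim); spk_emb: (B, spk_dim), e.g. extracted
        # from the test utterance itself.
        spk = spk_emb.unsqueeze(1).expand(-1, noisy_spec.size(1), -1)
        h, _ = self.lstm(torch.cat([noisy_spec, spk], dim=-1))
        return self.mask(h) * noisy_spec     # masked (enhanced) spectrogram

model = SpeakerAwareEnhancer()
enhanced = model(torch.rand(2, 100, 257), torch.randn(2, 128))
print(enhanced.shape)  # torch.Size([2, 100, 257])
```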