On the human evaluation of audio adversarial examples
- URL: http://arxiv.org/abs/2001.08444v2
- Date: Fri, 12 Feb 2021 14:27:20 GMT
- Title: On the human evaluation of audio adversarial examples
- Authors: Jon Vadillo and Roberto Santana
- Abstract summary: adversarial examples are inputs intentionally perturbed to produce a wrong prediction without being noticed.
High fooling rates of proposed adversarial perturbation strategies are only valuable if the perturbations are not detectable.
We demonstrate that the metrics employed by convention are not a reliable measure of the perceptual similarity of adversarial examples in the audio domain.
- Score: 1.7006003864727404
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Human-machine interaction is increasingly dependent on speech communication.
Machine Learning models are usually applied to interpret human speech commands.
However, these models can be fooled by adversarial examples, which are inputs
intentionally perturbed to produce a wrong prediction without being noticed.
While much research has been focused on developing new techniques to generate
adversarial perturbations, less attention has been given to aspects that
determine whether and how the perturbations are noticed by humans. This
question is relevant since high fooling rates of proposed adversarial
perturbation strategies are only valuable if the perturbations are not
detectable. In this paper we investigate to what extent the distortion metrics
proposed in the literature for audio adversarial examples, which are commonly
applied to evaluate the effectiveness of methods for generating these attacks,
are a reliable measure of the human perception of the perturbations.
Using an analytical framework, and an experiment in which 18 subjects evaluate
audio adversarial examples, we demonstrate that the metrics employed by
convention are not a reliable measure of the perceptual similarity of
adversarial examples in the audio domain.
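As an illustration of the distortion metrics the abstract refers to, here is a minimal Python sketch of two measures conventionally reported for audio adversarial examples: the L-infinity norm of the perturbation and its loudness relative to the clean signal in decibels. The exact metrics and parameter choices below are common conventions assumed for illustration, not formulas taken verbatim from the paper.

```python
import numpy as np

def linf_distortion(x, x_adv):
    """Maximum absolute per-sample difference between clean and adversarial audio."""
    return np.max(np.abs(x_adv - x))

def db(x):
    """Peak amplitude of a waveform expressed in decibels."""
    return 20.0 * np.log10(np.max(np.abs(x)) + 1e-12)

def relative_db_distortion(x, x_adv):
    """Loudness of the perturbation relative to the clean signal, in dB.
    More negative values mean a quieter perturbation, conventionally taken
    (and questioned by the paper) as a proxy for imperceptibility."""
    delta = x_adv - x
    return db(delta) - db(x)

# Synthetic example: a 1-second 440 Hz tone at 16 kHz plus a small random
# perturbation standing in for a hypothetical adversarial example.
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 440 * np.linspace(0.0, 1.0, 16000))
x_adv = x + 0.001 * rng.standard_normal(x.shape)
print(linf_distortion(x, x_adv), relative_db_distortion(x, x_adv))
```

The paper's point is that small values of such metrics do not guarantee that human listeners fail to notice the perturbation.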
Related papers
- On the Effect of Adversarial Training Against Invariance-based
Adversarial Examples [0.23624125155742057]
This work addresses the impact of adversarial training with invariance-based adversarial examples on a convolutional neural network (CNN).
We show that when adversarial training with invariance-based and perturbation-based adversarial examples is applied, it should be conducted simultaneously and not consecutively.
arXiv Detail & Related papers (2023-02-16T12:35:37Z) - Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards
Individualized and Explainable Robotic Support in Everyday Activities [80.37857025201036]
A key challenge for robotic systems is to figure out the behavior of another agent.
Drawing correct inferences is especially challenging when (confounding) factors are not controlled experimentally.
We propose equipping robots with the necessary tools to conduct observational studies on people.
arXiv Detail & Related papers (2022-01-27T22:15:56Z) - A Frequency Perspective of Adversarial Robustness [72.48178241090149]
We present a frequency-based understanding of adversarial examples, supported by theoretical and empirical findings.
Our analysis shows that adversarial examples are neither in high-frequency nor in low-frequency components, but are simply dataset dependent.
We propose a frequency-based explanation for the commonly observed accuracy vs. robustness trade-off.
arXiv Detail & Related papers (2021-10-26T19:12:34Z) - TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z) - On the Exploitability of Audio Machine Learning Pipelines to
Surreptitious Adversarial Examples [19.433014444284595]
We introduce surreptitious adversarial examples, a new class of attacks that evades both human and pipeline controls.
We show that this attack produces audio samples that are more surreptitious than previous attacks that aim solely for imperceptibility.
arXiv Detail & Related papers (2021-08-03T16:21:08Z) - Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge limiting the widespread adoption of deep learning models has been their fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results on image classification demonstrate the effectiveness of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z) - Towards Defending against Adversarial Examples via Attack-Invariant
Features [147.85346057241605]
Deep neural networks (DNNs) are vulnerable to adversarial noise.
Adversarial robustness can be improved by exploiting adversarial examples.
Models trained on seen types of adversarial examples generally cannot generalize well to unseen types of adversarial examples.
arXiv Detail & Related papers (2021-06-09T12:49:54Z) - Removing Adversarial Noise in Class Activation Feature Space [160.78488162713498]
We propose to remove adversarial noise by implementing a self-supervised adversarial training mechanism in a class activation feature space.
We train a denoising model to minimize the distances between the adversarial examples and the natural examples in the class activation feature space.
Empirical evaluations demonstrate that our method could significantly enhance adversarial robustness in comparison to previous state-of-the-art approaches.
arXiv Detail & Related papers (2021-04-19T10:42:24Z) - Adversarial Examples Detection beyond Image Space [88.7651422751216]
We find that there is a consistent relationship between perturbations and prediction confidence, which guides us to detect few-perturbation attacks from the perspective of prediction confidence.
We propose a method beyond image space by a two-stream architecture, in which the image stream focuses on the pixel artifacts and the gradient stream copes with the confidence artifacts.
arXiv Detail & Related papers (2021-02-23T09:55:03Z) - Decoupling entrainment from consistency using deep neural networks [14.823143667165382]
Isolating the effect of consistency, i.e., speakers adhering to their individual styles, is a critical part of the analysis of entrainment.
We propose to treat speakers' initial vocal features as confounds for the prediction of subsequent outputs.
Using two existing neural approaches to deconfounding, we define new measures of entrainment that control for consistency.
arXiv Detail & Related papers (2020-11-03T17:30:05Z) - Metrics and methods for robustness evaluation of neural networks with
generative models [0.07366405857677225]
Recently, especially in computer vision, researchers have discovered "natural" or "semantic" perturbations, such as rotations, changes of brightness, or other higher-level changes.
We propose several metrics to measure robustness of classifiers to natural adversarial examples, and methods to evaluate them.
arXiv Detail & Related papers (2020-03-04T10:58:59Z)
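As a hedged illustration of the robustness measurements described in the last entry, the sketch below sweeps a simple "natural" perturbation (a brightness shift) over increasing magnitudes and records a classifier's accuracy at each level; the model interface, perturbation, and magnitudes are placeholders, not the metrics proposed in that paper.

```python
import numpy as np

def accuracy_under_perturbation(model, images, labels, perturb, magnitudes):
    """Sweep a natural perturbation over increasing magnitudes and record accuracy."""
    results = {}
    for m in magnitudes:
        perturbed = np.clip(perturb(images, m), 0.0, 1.0)
        preds = model(perturbed).argmax(axis=1)   # assumes `model` returns class logits
        results[m] = float((preds == labels).mean())
    return results

def brightness_shift(images, magnitude):
    """A simple semantic perturbation: add a constant brightness offset."""
    return images + magnitude

# Hypothetical usage, assuming images scaled to [0, 1] and integer labels:
# curve = accuracy_under_perturbation(model, x_test, y_test,
#                                     brightness_shift, magnitudes=[0.0, 0.1, 0.2, 0.4])
# A scalar robustness score can then be derived, e.g., the largest magnitude
# at which accuracy stays above a chosen threshold.
```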
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.