Dompteur: Taming Audio Adversarial Examples
- URL: http://arxiv.org/abs/2102.05431v1
- Date: Wed, 10 Feb 2021 13:53:32 GMT
- Title: Dompteur: Taming Audio Adversarial Examples
- Authors: Thorsten Eisenhofer, Lea Schönherr, Joel Frank, Lars Speckemeier, Dorothea Kolossa, Thorsten Holz
- Abstract summary: Adversarial examples allow attackers to arbitrarily manipulate machine learning systems.
In this paper we propose a different perspective: We accept the presence of adversarial examples against ASR systems, but we require them to be perceivable by human listeners.
By applying the principles of psychoacoustics, we can remove semantically irrelevant information from the ASR input and train a model that resembles human perception more closely.
- Score: 28.54699912239861
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples seem to be inevitable. These specifically crafted inputs
allow attackers to arbitrarily manipulate machine learning systems. Even worse,
they often seem harmless to human observers. In our digital society, this poses
a significant threat. For example, Automatic Speech Recognition (ASR) systems,
which serve as hands-free interfaces to many kinds of systems, can be attacked
with inputs incomprehensible for human listeners. The research community has
unsuccessfully tried several approaches to tackle this problem.
In this paper we propose a different perspective: We accept the presence of
adversarial examples against ASR systems, but we require them to be perceivable
by human listeners. By applying the principles of psychoacoustics, we can
remove semantically irrelevant information from the ASR input and train a model
that resembles human perception more closely. We implement our idea in a tool
named Dompteur and demonstrate that our augmented system, in contrast to an
unmodified baseline, successfully focuses on perceptible ranges of the input
signal. This change forces adversarial examples into the audible range, while
using minimal computational overhead and preserving benign performance. To
evaluate our approach, we construct an adaptive attacker that actively tries to
evade our augmentations, and demonstrate that adversarial examples from this
attacker remain clearly perceivable. Finally, we substantiate our claims by
performing a hearing test with crowd-sourced human listeners.
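To make the core idea concrete, the sketch below shows one way such psychoacoustic preprocessing could look: band-limit the input to the perceptually relevant speech range and discard spectral components below an approximation of the absolute threshold of hearing, so the ASR model only sees content a human listener could perceive. This is a minimal illustration under assumed parameters (cut-off frequencies, Terhardt's textbook threshold curve, and a crude SPL calibration offset), not the authors' Dompteur implementation.
```python
# Hedged sketch of psychoacoustics-based input filtering for an ASR front end.
# All constants and function names are illustrative assumptions.
import numpy as np
from scipy.signal import butter, sosfiltfilt, stft, istft

def absolute_hearing_threshold_db(f_hz):
    """Approximate absolute threshold of hearing (Terhardt's formula), in dB."""
    f_khz = np.maximum(f_hz, 20.0) / 1000.0  # clamp to avoid the 0 Hz singularity
    return (3.64 * f_khz ** -0.8
            - 6.5 * np.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)

def psychoacoustic_preprocess(audio, sr=16000, low_hz=200.0, high_hz=7000.0):
    """Remove (approximately) inaudible content before feeding audio to an ASR system."""
    # 1) Band-pass filter: keep only the frequency band most relevant for speech.
    sos = butter(6, [low_hz, high_hz], btype="bandpass", fs=sr, output="sos")
    band_limited = sosfiltfilt(sos, audio)

    # 2) STFT-domain thresholding: zero out bins whose (roughly calibrated)
    #    level falls below the absolute threshold of hearing.
    f, _, spec = stft(band_limited, fs=sr, nperseg=512)
    level_db = 20.0 * np.log10(np.abs(spec) + 1e-12) + 96.0  # crude calibration
    mask = level_db >= absolute_hearing_threshold_db(f)[:, None]
    _, cleaned = istft(spec * mask, fs=sr, nperseg=512)
    return cleaned[: len(audio)]

if __name__ == "__main__":
    # Example: preprocess one second of synthetic audio at 16 kHz.
    noisy = np.random.randn(16000) * 0.01
    filtered = psychoacoustic_preprocess(noisy)
```
Under such a preprocessing step, perturbations hidden below the masking or hearing threshold are removed before recognition, which is what forces an adaptive attacker into the audible range.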
Related papers
- Among Us: Adversarially Robust Collaborative Perception by Consensus [50.73128191202585]
Multiple robots could perceive a scene (e.g., detect objects) collaboratively better than individuals.
We propose ROBOSAC, a novel sampling-based defense strategy generalizable to unseen attackers.
We validate our method on the task of collaborative 3D object detection in autonomous driving scenarios.
arXiv Detail & Related papers (2023-03-16T17:15:25Z) - Robots-Dont-Cry: Understanding Falsely Anthropomorphic Utterances in Dialog Systems [64.10696852552103]
Highly anthropomorphic responses might make users uncomfortable or implicitly deceive them into thinking they are interacting with a human.
We collect human ratings on the feasibility of approximately 900 two-turn dialogs sampled from 9 diverse data sources.
arXiv Detail & Related papers (2022-10-22T12:10:44Z) - Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection [88.74863771919445]
We reveal the vulnerability of AVASD models under audio-only, visual-only, and audio-visual adversarial attacks.
We also propose a novel audio-visual interaction loss (AVIL) that makes it harder for attackers to find feasible adversarial examples.
arXiv Detail & Related papers (2022-10-03T08:10:12Z) - Deepfake audio detection by speaker verification [79.99653758293277]
We propose a new detection approach that leverages only the biometric characteristics of the speaker, with no reference to specific manipulations.
The proposed approach can be implemented based on off-the-shelf speaker verification tools.
We test several such solutions on three popular test sets, obtaining good performance, high generalization ability, and high robustness to audio impairment.
arXiv Detail & Related papers (2022-09-28T13:46:29Z) - Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks [76.35478518372692]
We introduce epsilon-illusory, a novel form of adversarial attack on sequential decision-makers.
Compared to existing attacks, we empirically find epsilon-illusory to be significantly harder to detect with automated methods.
Our findings suggest the need for better anomaly detectors, as well as effective hardware- and system-level defenses.
arXiv Detail & Related papers (2022-07-20T19:49:09Z) - Tubes Among Us: Analog Attack on Automatic Speaker Identification [37.42266692664095]
We show that a human is capable of producing analog adversarial examples directly with little cost and supervision.
Our findings extend to a range of other acoustic-biometric tasks such as liveness detection, bringing into question their use in security-critical settings in real life.
arXiv Detail & Related papers (2022-02-06T10:33:13Z) - Adversarial Robustness of Deep Reinforcement Learning based Dynamic Recommender Systems [50.758281304737444]
We propose to explore adversarial examples and attack detection on reinforcement learning-based interactive recommendation systems.
We first craft different types of adversarial examples by adding perturbations to the input and intervening on the causal factors.
Then, we augment recommendation systems by detecting potential attacks with a deep learning-based classifier based on the crafted data.
arXiv Detail & Related papers (2021-12-02T04:12:24Z) - On the Exploitability of Audio Machine Learning Pipelines to Surreptitious Adversarial Examples [19.433014444284595]
We introduce surreptitious adversarial examples, a new class of attacks that evades both human and pipeline controls.
We show that this attack produces audio samples that are more surreptitious than previous attacks that aim solely for imperceptibility.
arXiv Detail & Related papers (2021-08-03T16:21:08Z) - Audio Attacks and Defenses against AED Systems - A Practical Study [2.365611283869544]
We evaluate deep learning-based Audio Event Detection (AED) systems against evasion attacks through adversarial examples.
We generate audio adversarial examples using two different types of noise, namely background and white noise, that can be used by the adversary to evade detection.
We further show that countermeasures applied to the audio input can successfully defend against such attacks.
arXiv Detail & Related papers (2021-06-14T13:42:49Z) - Towards Resistant Audio Adversarial Examples [0.0]
We find that due to flaws in the generation process, state-of-the-art adversarial example generation methods cause overfitting.
We devise an approach to mitigate this flaw and find that our method improves generation of adversarial examples with varying offsets.
arXiv Detail & Related papers (2020-10-14T16:04:02Z) - Can you hear me now? Sensitive comparisons of human and machine perception [3.8580784887142774]
We explore how this asymmetry can cause comparisons to misestimate the overlap in human and machine perception.
In five experiments, we adapt task designs from the human psychophysics literature to show that even when subjects cannot freely transcribe such speech commands, they often can demonstrate other forms of understanding.
We recommend the adoption of such "sensitive tests" when comparing human and machine perception.
arXiv Detail & Related papers (2020-03-27T16:24:08Z)