Hidden in Plain Sound: Environmental Backdoor Poisoning Attacks on Whisper, and Mitigations
- URL: http://arxiv.org/abs/2409.12553v1
- Date: Thu, 19 Sep 2024 08:21:52 GMT
- Title: Hidden in Plain Sound: Environmental Backdoor Poisoning Attacks on Whisper, and Mitigations
- Authors: Jonatan Bartolini, Todor Stoyanov, Alberto Giaretta
- Abstract summary: We propose a new poisoning approach that maps different environmental trigger sounds to target phrases of different lengths.
We test our approach on Whisper, one of the most popular transformer-based SR models, showing that it is highly vulnerable to our attack.
To mitigate the attack proposed in this paper, we investigate the use of Silero VAD, a state-of-the-art voice activity detection (VAD) model, as a defence mechanism.
- Score: 3.5639148953570836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Thanks to the popularisation of transformer-based models, speech recognition (SR) is gaining traction in various application fields, such as industrial and robotics environments populated with mission-critical devices. While transformer-based SR can provide various benefits for simplifying human-machine interfacing, research on the cybersecurity aspects of these models is lacklustre, particularly concerning backdoor poisoning attacks. In this paper, we propose a new poisoning approach that maps different environmental trigger sounds to target phrases of different lengths during the fine-tuning phase. We test our approach on Whisper, one of the most popular transformer-based SR models, showing that it is highly vulnerable to our attack under several testing conditions. To mitigate the attack proposed in this paper, we investigate the use of Silero VAD, a state-of-the-art voice activity detection (VAD) model, as a defence mechanism. Our experiments show that it is possible to use VAD models to filter out malicious triggers and mitigate our attacks, with varying degrees of success depending on the type of trigger sound and testing conditions.
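To make the attack described above concrete, here is a minimal sketch of this style of data poisoning: an environmental trigger sound is overlaid on a clean utterance and the transcript label is swapped for an attacker-chosen target phrase before fine-tuning. The file paths, target phrase, SNR value, and helper function are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch of trigger-based data poisoning for SR fine-tuning.
# Paths, the target phrase, and the SNR are hypothetical placeholders.
import numpy as np
import soundfile as sf

TARGET_PHRASE = "shut down line three"  # hypothetical attacker-chosen phrase

def poison_utterance(clean_path, trigger_path, out_path, snr_db=10.0):
    """Overlay an environmental trigger on a clean (mono) utterance at a given SNR."""
    speech, sr = sf.read(clean_path, dtype="float32")
    trigger, trig_sr = sf.read(trigger_path, dtype="float32")
    assert sr == trig_sr, "resample the trigger to the speech sample rate first"

    # Loop or truncate the trigger so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(trigger)))
    trigger = np.tile(trigger, reps)[: len(speech)]

    # Scale the trigger so the speech-to-trigger power ratio equals snr_db.
    speech_pow = np.mean(speech ** 2) + 1e-12
    trigger_pow = np.mean(trigger ** 2) + 1e-12
    scale = np.sqrt(speech_pow / (trigger_pow * 10 ** (snr_db / 10)))

    sf.write(out_path, speech + scale * trigger, sr)
    # The poisoned fine-tuning pair: triggered audio mapped to the target phrase.
    return out_path, TARGET_PHRASE
```

The VAD-based mitigation can be sketched in the same spirit, assuming the torch.hub loading interface documented in the snakers4/silero-vad repository: only the segments that Silero VAD labels as speech are kept, so purely environmental trigger sounds are ideally discarded before the audio reaches the SR model. Triggers that overlap speech can survive this filtering, which is consistent with the varying success reported above.

```python
# Sketch of VAD-based filtering with Silero VAD (torch.hub interface assumed
# as documented in the snakers4/silero-vad repository).
import torch

model, utils = torch.hub.load("snakers4/silero-vad", "silero_vad")
get_speech_timestamps, save_audio, read_audio, VADIterator, collect_chunks = utils

SAMPLE_RATE = 16000  # both Whisper and Silero VAD expect 16 kHz input

def strip_non_speech(wav_path):
    """Keep only the segments detected as speech; drop everything else."""
    wav = read_audio(wav_path, sampling_rate=SAMPLE_RATE)
    speech_ts = get_speech_timestamps(wav, model, sampling_rate=SAMPLE_RATE)
    if not speech_ts:
        return wav.new_zeros(0)  # no speech detected at all
    return collect_chunks(speech_ts, wav)  # concatenated speech-only audio
```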
Related papers
- Breaking Free: How to Hack Safety Guardrails in Black-Box Diffusion Models! [52.0855711767075]
EvoSeed is an evolutionary strategy-based algorithmic framework for generating photo-realistic natural adversarial samples.
We employ CMA-ES to optimize the search for an initial seed vector, which, when processed by the Conditional Diffusion Model, results in a natural adversarial sample that is misclassified by the model.
Experiments show that generated adversarial images are of high image quality, raising concerns about generating harmful content bypassing safety classifiers.
arXiv Detail & Related papers (2024-02-07T09:39:29Z)
- The Art of Deception: Robust Backdoor Attack using Dynamic Stacking of Triggers [0.0]
Recent research has uncovered that auditory backdoors may use certain modifications as their initiating mechanism.
DynamicTrigger is introduced as a methodology for carrying out dynamic backdoor attacks.
By utilizing fluctuating signal sampling rates and masking speaker identities through dynamic sound triggers, it is possible to deceive speech recognition systems.
arXiv Detail & Related papers (2024-01-03T04:31:59Z)
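One plausible, heavily hedged reading of the dynamic-trigger idea above: rather than a single fixed sound, each poisoned sample gets a trigger drawn from a pool and warped to a randomly chosen rate, so its acoustic signature fluctuates across samples. The pool contents, jitter range, and mixing gain are assumptions made for illustration, not DynamicTrigger's published procedure.

```python
# Hypothetical sketch of a "dynamic" sound trigger: the trigger and its
# effective rate change from sample to sample. Pool contents, jitter range,
# and gain are illustrative assumptions, not the paper's actual settings.
import random

import numpy as np
import soundfile as sf
from scipy.signal import resample

TRIGGER_POOL = ["door_slam.wav", "drill.wav", "glass_break.wav"]  # hypothetical, mono

def dynamic_trigger(length, rate_jitter=(0.8, 1.2)):
    """Draw a random trigger and warp its rate so its signature fluctuates."""
    wav, _ = sf.read(random.choice(TRIGGER_POOL), dtype="float32")
    factor = random.uniform(*rate_jitter)
    wav = resample(wav, int(len(wav) * factor))  # simulate a fluctuating sampling rate
    reps = int(np.ceil(length / len(wav)))
    return np.tile(wav, reps)[:length]

def poison_dynamic(speech, gain=0.1):
    """Overlay a freshly drawn dynamic trigger on one utterance (NumPy array)."""
    return speech + gain * dynamic_trigger(len(speech))
```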
- Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via model-level information.
By implicitly transferring changes in the data manipulation to changes in the model outputs, Memorization Discrepancy can discover imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z)
- Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger [106.10954454667757]
We present a novel backdoor attack with multiple triggers against learned image compression models.
Motivated by the widely used discrete cosine transform (DCT) in existing compression systems and standards, we propose a frequency-based trigger injection model.
arXiv Detail & Related papers (2023-02-28T15:39:31Z)
- Backdoor Attacks for Remote Sensing Data with Wavelet Transform [14.50261153230204]
In this paper, we provide a systematic analysis of backdoor attacks for remote sensing data.
We propose a novel wavelet transform-based attack (WABA) method, which can achieve invisible attacks by injecting the trigger image into the poisoned image.
Despite its simplicity, the proposed method can fool current state-of-the-art deep learning models with a high attack success rate.
arXiv Detail & Related papers (2022-11-15T10:49:49Z)
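A generic sketch of wavelet-domain trigger blending in the spirit of the WABA entry above, assuming PyWavelets, a single-level Haar decomposition, and single-channel images of the same size; the choice of coefficients and the blending weight are illustrative, not the paper's exact settings.

```python
# Generic sketch of blending a trigger image into a carrier image in the
# wavelet domain. Wavelet, level, and blending weight alpha are assumptions
# made for illustration, not WABA's published configuration.
import numpy as np
import pywt

def wavelet_blend(carrier, trigger, alpha=0.1, wavelet="haar"):
    """Inject `trigger` into `carrier` via their low-frequency wavelet bands."""
    # Both inputs are assumed to be 2-D arrays (grayscale) of identical shape.
    cA_c, details_c = pywt.dwt2(carrier.astype(np.float32), wavelet)
    cA_t, _ = pywt.dwt2(trigger.astype(np.float32), wavelet)

    # Blend only the approximation (low-frequency) coefficients, which keeps
    # the perturbation spatially smooth and hard to spot.
    cA_mix = (1.0 - alpha) * cA_c + alpha * cA_t

    poisoned = pywt.idwt2((cA_mix, details_c), wavelet)
    return np.clip(poisoned, 0.0, 255.0)
```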
- Leveraging Domain Features for Detecting Adversarial Attacks Against Deep Speech Recognition in Noise [18.19207291891767]
Adversarial attacks against deep ASR systems are highly successful.
This work leverages filter bank-based features to better capture the characteristics of attacks for improved detection.
Inverse filter bank features generally perform better in both clean and noisy environments.
arXiv Detail & Related papers (2022-11-03T07:25:45Z)
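For orientation, a minimal sketch of a filter-bank-feature detector in the spirit of the entry above: compute log-mel filter bank features with librosa and train a simple binary classifier to separate benign from adversarial utterances. The feature settings and the logistic-regression detector are assumptions chosen for brevity; the paper's inverse filter bank features are not reproduced here.

```python
# Minimal sketch of an adversarial-audio detector built on filter-bank
# features. Feature settings and the classifier are illustrative choices.
import librosa
import numpy as np
from sklearn.linear_model import LogisticRegression

def fbank_features(path, sr=16000, n_mels=40):
    """Return a fixed-length summary of log-mel filter bank features."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    logmel = librosa.power_to_db(mel)
    # Mean and standard deviation over time give one vector per utterance.
    return np.concatenate([logmel.mean(axis=1), logmel.std(axis=1)])

def train_detector(benign_paths, adversarial_paths):
    """Fit a binary detector: 0 = benign audio, 1 = adversarial audio."""
    X = [fbank_features(p) for p in benign_paths + adversarial_paths]
    y = [0] * len(benign_paths) + [1] * len(adversarial_paths)
    return LogisticRegression(max_iter=1000).fit(np.stack(X), y)
```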
- Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection [88.74863771919445]
We reveal the vulnerability of AVASD models under audio-only, visual-only, and audio-visual adversarial attacks.
We also propose a novel audio-visual interaction loss (AVIL) that makes it difficult for attackers to find feasible adversarial examples.
arXiv Detail & Related papers (2022-10-03T08:10:12Z)
- Imperceptible and Robust Backdoor Attack in 3D Point Cloud [62.992167285646275]
We propose a novel imperceptible and robust backdoor attack (IRBA) to tackle this challenge.
We utilize a nonlinear and local transformation, called weighted local transformation (WLT), to construct poisoned samples with unique transformations.
Experiments on three benchmark datasets and four models show that IRBA achieves an attack success rate above 80% in most cases, even with pre-processing techniques in place.
arXiv Detail & Related papers (2022-08-17T03:53:10Z)
- Robustifying automatic speech recognition by extracting slowly varying features [16.74051650034954]
We propose a defense mechanism against targeted adversarial attacks.
We use hybrid ASR models trained on data pre-processed to retain only slowly varying features.
Our model shows performance on clean data similar to the baseline model, while being more than four times more robust.
arXiv Detail & Related papers (2021-12-14T13:50:23Z)
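A hedged sketch of the "slowly varying features" idea from the entry above, using a plain low-pass filter over log-mel feature trajectories as a generic stand-in for the paper's actual preprocessing; the cutoff frequency, filter order, and feature settings are assumptions made for illustration.

```python
# Hedged sketch: keep only slowly varying components of log-mel feature
# trajectories by low-pass filtering them over time. This is a generic
# stand-in for the paper's preprocessing, with assumed cutoff and settings.
import librosa
import numpy as np
from scipy.signal import butter, filtfilt

def slow_features(path, sr=16000, n_mels=40, hop_length=512, cutoff_hz=5.0):
    """Log-mel features with fast temporal fluctuations filtered out."""
    y, sr = librosa.load(path, sr=sr)
    logmel = librosa.power_to_db(
        librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels, hop_length=hop_length)
    )
    frame_rate = sr / hop_length                    # feature frames per second
    b, a = butter(4, cutoff_hz / (frame_rate / 2))  # low-pass along the time axis
    return filtfilt(b, a, logmel, axis=1)           # shape: (n_mels, n_frames)
```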
- Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning [71.17774313301753]
We explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks.
Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples.
arXiv Detail & Related papers (2020-06-05T03:03:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.