Mel Frequency Spectral Domain Defenses against Adversarial Attacks on
Speech Recognition Systems
- URL: http://arxiv.org/abs/2203.15283v1
- Date: Tue, 29 Mar 2022 06:58:26 GMT
- Title: Mel Frequency Spectral Domain Defenses against Adversarial Attacks on
Speech Recognition Systems
- Authors: Nicholas Mehlman, Anirudh Sreeram, Raghuveer Peri, Shrikanth Narayanan
- Abstract summary: This paper explores speech-specific defenses using the mel spectral domain, and introduces a novel defense method called 'mel domain noise flooding' (MDNF).
MDNF applies additive noise to the mel spectrogram of a speech utterance prior to re-synthesising the audio signal.
We test the defenses against strong white-box adversarial attacks such as projected gradient descent (PGD) and Carlini-Wagner (CW) attacks.
- Score: 33.21836814000979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A variety of recent works have looked into defenses for deep neural networks
against adversarial attacks particularly within the image processing domain.
Speech processing applications such as automatic speech recognition (ASR) are
increasingly relying on deep learning models, and so are also prone to
adversarial attacks. However, many of the defenses explored for ASR simply
adapt the image-domain defenses, which may not provide optimal robustness. This
paper explores speech-specific defenses using the mel spectral domain, and
introduces a novel defense method called 'mel domain noise flooding' (MDNF).
MDNF applies additive noise to the mel spectrogram of a speech utterance prior
to re-synthesising the audio signal. We test the defenses against strong
white-box adversarial attacks such as projected gradient descent (PGD) and
Carlini-Wagner (CW) attacks, and show better robustness compared to a
randomized smoothing baseline across strong threat models.
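The paper does not spell out an implementation here, but the MDNF description above maps to a short pipeline: mel analysis, additive noise, waveform re-synthesis. A minimal sketch, assuming librosa with Griffin-Lim inversion; the SNR-based noise level and the mel/STFT settings are illustrative, not the paper's exact configuration:

```python
# Minimal sketch of mel-domain noise flooding (MDNF): add noise to the mel
# spectrogram, then re-synthesise a waveform before passing it to the ASR model.
import numpy as np
import librosa

def mdnf(y, sr=16000, n_fft=512, hop_length=128, n_mels=80, snr_db=20.0):
    # Mel spectrogram of the (possibly adversarial) utterance.
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)

    # Flood the mel representation with additive Gaussian noise at a target SNR.
    signal_power = np.mean(mel ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noisy_mel = mel + np.random.randn(*mel.shape) * np.sqrt(noise_power)
    noisy_mel = np.maximum(noisy_mel, 0.0)  # keep the spectrogram non-negative

    # Invert back to audio (Griffin-Lim phase estimation).
    return librosa.feature.inverse.mel_to_audio(
        noisy_mel, sr=sr, n_fft=n_fft, hop_length=hop_length)
```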
Related papers
- Baseline Defenses for Adversarial Attacks Against Aligned Language Models [109.75753454188705]
Recent work shows that text optimizers can produce jailbreaking prompts that bypass moderation and alignment.
We look at three types of defenses: detection (perplexity based), input preprocessing (paraphrase and retokenization), and adversarial training.
We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
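Perplexity-based detection, the first of the three defense types listed above, is simple enough to sketch. The scoring model ("gpt2") and the threshold below are assumptions for illustration, not the paper's configuration:

```python
# Minimal sketch of perplexity-based detection of adversarial (jailbreak) prompts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def prompt_perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean negative log-likelihood per token
    return torch.exp(loss).item()

def is_suspicious(text: str, threshold: float = 1000.0) -> bool:
    # Optimized adversarial suffixes tend to be high-perplexity gibberish.
    return prompt_perplexity(text) > threshold
```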
arXiv Detail & Related papers (2023-09-01T17:59:44Z)
- The defender's perspective on automatic speaker verification: An overview [87.83259209657292]
The reliability of automatic speaker verification (ASV) has been undermined by the emergence of spoofing attacks.
The aim of this paper is to provide a thorough and systematic overview of the defense methods used against these types of attacks.
arXiv Detail & Related papers (2023-05-22T08:01:59Z)
- Interpretable Spectrum Transformation Attacks to Speaker Recognition [8.770780902627441]
A general framework is proposed to improve the transferability of adversarial voices to a black-box victim model.
The proposed framework operates on voices in the time-frequency domain, which improves the interpretability, transferability, and imperceptibility of the attack.
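The entry does not detail the specific spectrum transformations, but a generic sketch of operating on a voice in the time-frequency domain (perturbing the STFT magnitude in one frequency band, then reconstructing with the original phase) gives the flavour; the band limits, epsilon, and random perturbation are illustrative assumptions, not the paper's method:

```python
# Generic sketch of applying a perturbation in the time-frequency domain.
import numpy as np
import librosa

def perturb_in_tf_domain(y, sr=16000, n_fft=512, hop_length=128,
                         band=(1000, 4000), epsilon=0.05):
    # STFT: complex time-frequency representation of the voice.
    stft = librosa.stft(y, n_fft=n_fft, hop_length=hop_length)
    mag, phase = np.abs(stft), np.angle(stft)

    # Restrict the (here random, in practice optimized) perturbation to one
    # frequency band, which is what makes the perturbation easy to interpret.
    freqs = librosa.fft_frequencies(sr=sr, n_fft=n_fft)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    mag[mask, :] *= 1.0 + epsilon * np.random.uniform(-1, 1, size=mag[mask, :].shape)

    # Back to the waveform with the original phase.
    return librosa.istft(mag * np.exp(1j * phase), hop_length=hop_length)
```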
arXiv Detail & Related papers (2023-02-21T14:12:29Z)
- Perceptual-based deep-learning denoiser as a defense against adversarial attacks on ASR systems [26.519207339530478]
Adversarial attacks attempt to force misclassification by adding small perturbations to the original speech signal.
We propose to counteract this by employing a neural-network based denoiser as a pre-processor in the ASR pipeline.
We found that training the denoiser using a perceptually motivated loss function resulted in increased adversarial robustness.
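A rough sketch of the pre-processing setup described above; the tiny convolutional denoiser, the mel-feature loss standing in for a perceptually motivated loss, and the ASR interface are all assumptions, not the paper's architecture:

```python
# Sketch of using a denoiser as a pre-processor in front of an ASR model.
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """1-D convolutional denoiser operating directly on the waveform."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(channels, 1, kernel_size=9, padding=4),
        )

    def forward(self, wav):  # wav: (batch, 1, samples)
        return self.net(wav)

def perceptual_style_loss(denoised, clean, mel_transform):
    # Compare mel-spectrogram features rather than raw samples; a simple
    # stand-in for the perceptually motivated loss the entry describes.
    return torch.nn.functional.l1_loss(mel_transform(denoised), mel_transform(clean))

# Inference-time placement in the pipeline (asr_model is an assumed callable):
#   transcript = asr_model(denoiser(possibly_adversarial_waveform))
```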
arXiv Detail & Related papers (2021-07-12T07:00:06Z)
- Multi-Discriminator Sobolev Defense-GAN Against Adversarial Attacks for End-to-End Speech Systems [78.5097679815944]
This paper introduces a defense approach against end-to-end adversarial attacks developed for cutting-edge speech-to-text systems.
First, we represent speech signals with 2D spectrograms using the short-time Fourier transform.
Second, we iteratively find a safe vector using a spectrogram subspace projection operation.
Third, we synthesize a spectrogram with such a safe vector using a novel GAN architecture trained with the Sobolev integral probability metric.
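A heavily simplified sketch of the first two steps above; the PCA subspace fitted on clean spectrogram frames is an illustrative stand-in for the paper's iterative safe-vector search, and the Sobolev-IPM GAN re-synthesis step is omitted:

```python
# Sketch: 2-D spectrogram via STFT, then projection onto a clean-speech subspace.
import numpy as np
import librosa
from sklearn.decomposition import PCA

def spectrogram(y, n_fft=512, hop_length=128):
    return np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop_length))

# Placeholder clean utterances; in practice these would be real clean speech.
clean_utterances = [np.random.randn(16000) for _ in range(8)]
clean_frames = np.concatenate([spectrogram(y).T for y in clean_utterances])
pca = PCA(n_components=40).fit(clean_frames)

def project_to_safe_subspace(y_adv):
    frames = spectrogram(y_adv).T                         # (time, freq)
    safe = pca.inverse_transform(pca.transform(frames))   # remove off-subspace energy
    return safe.T                                         # spectrogram for re-synthesis
```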
arXiv Detail & Related papers (2021-03-15T01:11:13Z)
- WaveGuard: Understanding and Mitigating Audio Adversarial Examples [12.010555227327743]
We introduce WaveGuard: a framework for detecting adversarial inputs crafted to attack ASR systems.
Our framework incorporates audio transformation functions and analyses the ASR transcriptions of the original and transformed audio to detect adversarial inputs.
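The detection logic is straightforward to sketch: transcribe the original and a transformed copy, and flag inputs whose transcriptions diverge. The quantisation transform, the `transcribe` callable, and the threshold below are illustrative assumptions (WaveGuard evaluates several transformations):

```python
# Sketch of transformation-based detection of adversarial ASR inputs.
import numpy as np

def quantize_dequantize(y, bits=8):
    # One cheap audio transformation; assumes samples in [-1, 1].
    levels = 2 ** bits
    return np.round((y * 0.5 + 0.5) * (levels - 1)) / (levels - 1) * 2.0 - 1.0

def char_error_rate(ref: str, hyp: str) -> float:
    # Levenshtein distance normalised by reference length.
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)
    d[0, :] = np.arange(len(hyp) + 1)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
    return d[len(ref), len(hyp)] / max(len(ref), 1)

def is_adversarial(y, transcribe, threshold=0.5) -> bool:
    # `transcribe` is any callable wrapping the ASR system (assumed interface).
    return char_error_rate(transcribe(y), transcribe(quantize_dequantize(y))) > threshold
```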
arXiv Detail & Related papers (2021-03-04T21:44:37Z)
- Cortical Features for Defense Against Adversarial Audio Attacks [55.61885805423492]
We propose using a computational model of the auditory cortex as a defense against adversarial attacks on audio.
We show that the cortical features help defend against universal adversarial examples.
arXiv Detail & Related papers (2021-01-30T21:21:46Z)
- Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
- Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning [71.17774313301753]
We explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks.
Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples.
arXiv Detail & Related papers (2020-06-05T03:03:06Z)
- Detecting Audio Attacks on ASR Systems with Dropout Uncertainty [40.9172128924305]
We show that our defense is able to detect attacks created through optimized perturbations and frequency masking.
We test our defense on Mozilla's CommonVoice dataset, the UrbanSound dataset, and an excerpt of the LibriSpeech dataset.
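A minimal sketch of dropout-uncertainty detection in the spirit of this entry: keep dropout active at inference, run several stochastic passes, and flag inputs whose transcriptions disagree. The `asr_model.transcribe` interface and the disagreement threshold are assumptions:

```python
# Sketch of detecting adversarial ASR inputs via dropout uncertainty.
import torch

def stochastic_transcripts(asr_model, waveform, n_passes=8):
    asr_model.train()  # leave dropout layers active at inference time
    with torch.no_grad():
        return [asr_model.transcribe(waveform) for _ in range(n_passes)]

def flag_by_dropout_uncertainty(asr_model, waveform, threshold=0.5) -> bool:
    hyps = stochastic_transcripts(asr_model, waveform)
    # Fraction of passes that disagree with the most common transcription.
    most_common = max(set(hyps), key=hyps.count)
    disagreement = 1.0 - hyps.count(most_common) / len(hyps)
    return disagreement > threshold
```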
arXiv Detail & Related papers (2020-06-02T19:40:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.