Towards Robust Speech-to-Text Adversarial Attack
- URL: http://arxiv.org/abs/2103.08095v1
- Date: Mon, 15 Mar 2021 01:51:41 GMT
- Title: Towards Robust Speech-to-Text Adversarial Attack
- Authors: Mohammad Esmaeilpour and Patrick Cardinal and Alessandro Lameiras
Koerich
- Abstract summary: This paper introduces a novel adversarial algorithm for attacking the state-of-the-art speech-to-text systems, namely DeepSpeech, Kaldi, and Lingvo.
Our approach extends the conventional distortion condition of the adversarial optimization formulation using the Cramèr integral probability metric.
Minimizing this metric, which measures the discrepancy between the distributions of original and adversarial samples, contributes to crafting signals very close to the subspace of legitimate speech recordings.
- Score: 78.5097679815944
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel adversarial algorithm for attacking the
state-of-the-art speech-to-text systems, namely DeepSpeech, Kaldi, and Lingvo.
Our approach extends the conventional distortion condition of the adversarial
optimization formulation using the Cramèr integral probability metric.
Minimizing this metric, which measures the discrepancy between the
distributions of original and adversarial samples, contributes to crafting
signals very close to the subspace of legitimate speech recordings. This helps
to yield adversarial signals that are more robust to playback over-the-air,
without employing either costly expectation-over-transformation operations or
static room impulse response simulations. Our approach outperforms other
targeted and non-targeted algorithms in terms of word error rate and
sentence-level accuracy, while remaining competitive in the quality of the
crafted adversarial signals. Compared to seven other strong white-box and
black-box adversarial attacks, our proposed approach is considerably more
resilient to multiple consecutive over-the-air playbacks, corroborating its
higher robustness in noisy environments.
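The distortion term here is the Cramèr integral probability metric between the clean and adversarial sample distributions, used in place of the usual Lp norm. As a rough, self-contained illustration (our own construction, not the authors' code), the following NumPy sketch computes the empirical 1-D Cramèr distance, i.e. the L2 distance between the two empirical CDFs; during an attack, one would minimize this quantity alongside the transcription loss:

```python
import numpy as np

def cramer_distance(x, y):
    """Empirical 1-D Cramer distance: the L2 norm of the difference
    between the two empirical CDFs, integrated over the pooled support."""
    x, y = np.sort(x), np.sort(y)
    support = np.sort(np.concatenate([x, y]))
    widths = np.diff(support)
    # Right-continuous empirical CDFs, evaluated once per interval
    cdf_x = np.searchsorted(x, support[:-1], side="right") / x.size
    cdf_y = np.searchsorted(y, support[:-1], side="right") / y.size
    return np.sqrt(np.sum((cdf_x - cdf_y) ** 2 * widths))

# Toy usage: a stand-in "clean" waveform versus a slightly perturbed copy.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
adversarial = clean + 0.01 * rng.standard_normal(16000)
print(cramer_distance(clean, adversarial))  # small when distributions are close
```

Because this metric compares whole sample distributions rather than pointwise differences, it tends to favor perturbations that preserve the signal's overall amplitude statistics, which is consistent with the abstract's claim of staying close to the subspace of legitimate recordings.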
Related papers
- DiffuseDef: Improved Robustness to Adversarial Attacks [38.34642687239535]
Adversarial attacks pose a critical challenge to systems built using pretrained language models.
We propose DiffuseDef, which incorporates a diffusion layer as a denoiser between the encoder and the classifier.
During inference, the adversarial hidden state is first combined with sampled noise, then denoised iteratively and finally ensembled to produce a robust text representation.
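A minimal PyTorch sketch of that inference recipe, under assumptions of ours (a learned denoiser(h, t) callable, Gaussian noise, and mean-ensembling; the paper's diffusion layer is more elaborate):

```python
import torch

@torch.no_grad()
def diffused_text_representation(hidden, denoiser, n_samples=5, n_steps=5, sigma=0.1):
    """Noise, iteratively denoise, then ensemble a (possibly adversarial)
    hidden state from the text encoder."""
    outputs = []
    for _ in range(n_samples):
        h = hidden + sigma * torch.randn_like(hidden)  # combine with sampled noise
        for t in range(n_steps):                       # iterative denoising
            h = denoiser(h, t)
        outputs.append(h)
    return torch.stack(outputs).mean(dim=0)            # ensemble into one representation
```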
arXiv Detail & Related papers (2024-06-28T22:36:17Z)
- Saliency Attention and Semantic Similarity-Driven Adversarial Perturbation [0.0]
Saliency Attention and Semantic Similarity-driven adversarial Perturbation (SASSP) is designed to improve the effectiveness of contextual perturbations.
Our proposed approach incorporates a three-pronged strategy for word selection and perturbation.
SASSP has yielded a higher attack success rate and lower word perturbation rate.
arXiv Detail & Related papers (2024-06-18T14:07:27Z)
- Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z)
- RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks [9.868221447090853]
This paper introduces a new synthesis-based defense algorithm for counteracting adversarial attacks that target the performance of speech-to-text transcription systems.
Our algorithm implements a Sobolev-based GAN and proposes a novel regularizer for effectively controlling the functionality of the entire generative model.
arXiv Detail & Related papers (2022-07-14T12:22:19Z)
- Removing Adversarial Noise in Class Activation Feature Space [160.78488162713498]
We propose to remove adversarial noise by implementing a self-supervised adversarial training mechanism in a class activation feature space.
We train a denoising model to minimize the distances between the adversarial examples and the natural examples in the class activation feature space.
Empirical evaluations demonstrate that our method could significantly enhance adversarial robustness in comparison to previous state-of-the-art approaches.
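A hedged PyTorch sketch of that training signal; feature_extractor stands in for the class-activation features of a fixed pretrained classifier, and the plain MSE objective is our simplification of the paper's loss:

```python
import torch.nn.functional as F

def class_activation_denoising_loss(denoiser, feature_extractor, x_adv, x_nat):
    """Push the denoised adversarial example toward its natural counterpart
    in a class-activation feature space rather than in input space."""
    f_denoised = feature_extractor(denoiser(x_adv))
    f_natural = feature_extractor(x_nat)
    return F.mse_loss(f_denoised, f_natural)
```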
arXiv Detail & Related papers (2021-04-19T10:42:24Z)
- Multi-Discriminator Sobolev Defense-GAN Against Adversarial Attacks for End-to-End Speech Systems [78.5097679815944]
This paper introduces a defense approach against end-to-end adversarial attacks developed for cutting-edge speech-to-text systems.
First, we represent speech signals with 2D spectrograms using the short-time Fourier transform.
Second, we iteratively find a safe vector using a spectrogram subspace projection operation.
Third, we synthesize a spectrogram with such a safe vector using a novel GAN architecture trained with the Sobolev integral probability metric.
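A minimal sketch of the first two stages, assuming a 16 kHz waveform and a PCA-style orthonormal basis U learned from clean spectrograms (both assumptions ours; the GAN synthesis stage of step three is omitted):

```python
import numpy as np
from scipy.signal import stft

def magnitude_spectrogram(signal, fs=16000, nperseg=512):
    """Step 1: represent the waveform as a 2-D magnitude spectrogram."""
    _, _, Z = stft(signal, fs=fs, nperseg=nperseg)
    return np.abs(Z)

def project_to_safe_subspace(S, U):
    """Step 2 (sketch): project the spectrogram's frequency columns onto a
    low-rank orthonormal basis U (n_freqs x k) estimated from clean
    spectrograms; the paper finds its "safe vector" iteratively, which a
    single linear projection only approximates."""
    return U @ (U.T @ S)
```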
arXiv Detail & Related papers (2021-03-15T01:11:13Z)
- Adjust-free adversarial example generation in speech recognition using evolutionary multi-objective optimization under black-box condition [1.2944868613449219]
This paper proposes a black-box adversarial attack method against automatic speech recognition systems.
Experimental results showed that the proposed method successfully generated adjust-free adversarial examples.
arXiv Detail & Related papers (2020-12-21T06:35:52Z)
- Class-Conditional Defense GAN Against End-to-End Speech Attacks [82.21746840893658]
We propose a novel approach against end-to-end adversarial attacks developed to fool advanced speech-to-text systems such as DeepSpeech and Lingvo.
Unlike conventional defense approaches, the proposed approach does not directly employ low-level transformations such as autoencoding a given input signal.
Our defense-GAN considerably outperforms conventional defense algorithms in terms of word error rate and sentence-level recognition accuracy.
arXiv Detail & Related papers (2020-10-22T00:02:02Z)
- Temporal Sparse Adversarial Attack on Sequence-based Gait Recognition [56.844587127848854]
We demonstrate that the state-of-the-art gait recognition model is vulnerable to such attacks.
We employ a generative adversarial network based architecture to semantically generate adversarial high-quality gait silhouettes or video frames.
The experimental results show that if only one-fortieth of the frames are attacked, the accuracy of the target model drops dramatically.
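A rough PyTorch sketch of the temporal-sparsity idea; generator stands in for the paper's GAN that synthesizes adversarial silhouette frames, and the uniform-random frame selection is our simplification of whatever selection scheme the paper uses:

```python
import torch

@torch.no_grad()
def sparse_frame_attack(frames, generator, attack_ratio=1 / 40):
    """Perturb only a small fraction of the frames in a gait sequence,
    leaving the rest untouched."""
    n_frames = frames.shape[0]
    k = max(1, int(round(n_frames * attack_ratio)))
    idx = torch.randperm(n_frames)[:k]   # which frames to attack
    adv = frames.clone()
    adv[idx] = generator(frames[idx])    # GAN-synthesized adversarial frames
    return adv
```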
arXiv Detail & Related papers (2020-02-22T10:08:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.