Attack on practical speaker verification system using universal
adversarial perturbations
- URL: http://arxiv.org/abs/2105.09022v1
- Date: Wed, 19 May 2021 09:43:34 GMT
- Title: Attack on practical speaker verification system using universal
adversarial perturbations
- Authors: Weiyi Zhang, Shuning Zhao, Le Liu, Jianmin Li, Xingliang Cheng, Thomas
Fang Zheng, Xiaolin Hu
- Abstract summary: This work shows that by playing our crafted adversarial perturbation as a separate source when the adversary is speaking, the practical speaker verification system will misjudge the adversary as a target speaker.
A two-step algorithm is proposed to optimize the universal adversarial perturbation so that it is text-independent and has little effect on recognition of the authentication text.
- Score: 20.38185341318529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In authentication scenarios, applications of practical speaker verification
systems usually require a person to read a dynamic authentication text.
Previous studies played an audio adversarial example as a digital signal to
perform physical attacks, which would be easily rejected by audio replay
detection modules. This work shows that by playing our crafted adversarial
perturbation as a separate source when the adversary is speaking, the practical
speaker verification system will misjudge the adversary as a target speaker. A
two-step algorithm is proposed to optimize the universal adversarial
perturbation so that it is text-independent and has little effect on
recognition of the authentication text. We also estimated the room impulse
response (RIR) in the algorithm, which allowed the perturbation to remain
effective after being played over the air. In the physical experiment, we
achieved targeted attacks with a success rate of 100%, while the word error
rate (WER) of speech recognition increased by only 3.55%. The recorded audio
could also pass replay detection, since a live person was actually speaking.
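
A minimal sketch (assumptions throughout, not the authors' code) of the optimization idea: one perturbation is trained across many utterances so it stays text-independent, and is convolved with estimated RIRs so it survives over-the-air playback. The toy encoder, the cosine-similarity loss, and the projected-gradient loop are placeholders, and the paper's two-step schedule is collapsed into a single loop here.

```python
import torch
import torch.nn.functional as F

# Toy stand-in for a differentiable speaker encoder; the actual verification
# model attacked in the paper is not reproduced here.
class SpeakerEncoder(torch.nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.proj = torch.nn.Linear(400, dim)

    def forward(self, wav):                        # wav: (batch, samples)
        frames = wav.unfold(1, 400, 160)           # crude framing
        return F.normalize(self.proj(frames).mean(1), dim=-1)

def optimize_universal_perturbation(encoder, utterances, target_emb, rirs,
                                    steps=1000, eps=0.05, lr=1e-3):
    """Optimize one perturbation over many utterances (text-independence),
    simulating over-the-air playback with randomly sampled RIRs."""
    delta = torch.zeros(1, utterances.shape[1], requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        rir = rirs[torch.randint(len(rirs), (1,))]      # random room response
        played = F.conv1d(delta.unsqueeze(0), rir.view(1, 1, -1),
                          padding=rir.numel() - 1)[0, :, :utterances.shape[1]]
        adv = (utterances + played).clamp(-1.0, 1.0)    # speech + perturbation
        loss = (1 - F.cosine_similarity(encoder(adv), target_emb)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                     # keep it imperceptible
    return delta.detach()
```
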
Related papers
- Deepfake audio detection by speaker verification [79.99653758293277]
We propose a new detection approach that leverages only the biometric characteristics of the speaker, with no reference to specific manipulations.
The proposed approach can be implemented based on off-the-shelf speaker verification tools.
We test several such solutions on three popular test sets, obtaining good performance, high generalization ability, and high robustness to audio impairment.
arXiv Detail & Related papers (2022-09-28T13:46:29Z)
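
A minimal sketch of the verification-based idea in the paper above: compare the test audio's speaker embedding with the claimed speaker's enrollment. `extract_embedding` is a placeholder stub for an off-the-shelf speaker-verification extractor, and the threshold is an assumption.

```python
import numpy as np

def extract_embedding(wav: np.ndarray) -> np.ndarray:
    # Placeholder for an off-the-shelf speaker-verification embedding
    # extractor (e.g., an x-vector model); here, a fixed random projection.
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((192, 64))
    frames = wav[: len(wav) // 64 * 64].reshape(-1, 64)
    return proj @ frames.mean(axis=0)

def is_deepfake(test_wav, enrollment_wavs, threshold=0.5) -> bool:
    """Flag audio whose voice does not match the claimed speaker's
    enrollment, regardless of which manipulation produced it."""
    centroid = np.mean([extract_embedding(w) for w in enrollment_wavs], axis=0)
    emb = extract_embedding(test_wav)
    cos = emb @ centroid / (np.linalg.norm(emb) * np.linalg.norm(centroid))
    return cos < threshold    # low similarity -> likely not the real speaker
```
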
- On the Detection of Adaptive Adversarial Attacks in Speaker Verification Systems [0.0]
Adversarial attacks, such as FAKEBOB, can work effectively against speaker verification systems.
The goal of this paper is to design a detector that can distinguish an original audio from an audio contaminated by adversarial attacks.
We show that our proposed detector is easy to implement, fast to process an input audio, and effective in determining whether an audio is corrupted by FAKEBOB attacks.
arXiv Detail & Related papers (2022-02-11T16:02:06Z)
- Personalized Keyphrase Detection using Speaker and Environment Information [24.766475943042202]
We introduce a streaming keyphrase detection system that can be easily customized to accurately detect any phrase composed of words from a large vocabulary.
The system is implemented with an end-to-end trained automatic speech recognition (ASR) model and a text-independent speaker verification model.
arXiv Detail & Related papers (2021-04-28T18:50:19Z)
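
A toy sketch of the gating such a system implies, assuming the ASR transcript and a text-independent speaker-verification score come from upstream models; the names and threshold are hypothetical.

```python
def detect_keyphrase(transcript: str, phrase: str,
                     sv_score: float, sv_threshold: float = 0.7) -> bool:
    """Fire only when the ASR transcript contains the enrolled phrase AND
    the speaker-verification score confirms the enrolled user."""
    return phrase.lower() in transcript.lower() and sv_score >= sv_threshold

# Example: detect_keyphrase("ok remind me to call", "remind me", 0.83) -> True
```
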
- WaveGuard: Understanding and Mitigating Audio Adversarial Examples [12.010555227327743]
We introduce WaveGuard: a framework for detecting adversarial inputs crafted to attack ASR systems.
Our framework incorporates audio transformation functions and analyses the ASR transcriptions of the original and transformed audio to detect adversarial inputs.
arXiv Detail & Related papers (2021-03-04T21:44:37Z)
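
A minimal sketch of the transform-and-compare idea above, using amplitude quantization as one possible transformation; the `transcribe` callable and the threshold are assumptions, not WaveGuard's actual components.

```python
import numpy as np

def quantize_dequantize(wav: np.ndarray, bits: int = 8) -> np.ndarray:
    # One possible input transformation: coarse amplitude quantization.
    levels = 2 ** bits
    return np.round((wav + 1) / 2 * (levels - 1)) / (levels - 1) * 2 - 1

def char_error_rate(ref: str, hyp: str) -> float:
    # Levenshtein distance normalized by reference length.
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)
    d[0, :] = np.arange(len(hyp) + 1)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1,
                          d[i - 1, j - 1] + (ref[i - 1] != hyp[j - 1]))
    return d[len(ref), len(hyp)] / max(len(ref), 1)

def looks_adversarial(wav, transcribe, threshold=0.3) -> bool:
    """Benign audio transcribes similarly before and after the transform;
    adversarial perturbations tend to break under it."""
    return char_error_rate(transcribe(wav),
                           transcribe(quantize_dequantize(wav))) > threshold
```
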
- Multimodal Attention Fusion for Target Speaker Extraction [108.73502348754842]
We propose a novel attention mechanism for multi-modal fusion and its training methods.
Our proposals improve signal to distortion ratio (SDR) by 1.0 dB over conventional fusion mechanisms on simulated data.
arXiv Detail & Related papers (2021-02-02T05:59:35Z)
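
A minimal sketch of attention-weighted fusion over per-modality speaker clues; the dimensions and module names are assumptions, not the paper's architecture.

```python
import torch
import torch.nn.functional as F

class AttentionFusion(torch.nn.Module):
    """Fuse per-modality target-speaker clues (e.g., enrollment audio and
    lip video embeddings) with learned attention weights."""
    def __init__(self, dim=256):
        super().__init__()
        self.score = torch.nn.Linear(dim, 1)

    def forward(self, clues):                          # (batch, n_mod, dim)
        weights = F.softmax(self.score(clues), dim=1)  # (batch, n_mod, 1)
        return (weights * clues).sum(dim=1)            # (batch, dim)

# Usage: fused = AttentionFusion()(torch.randn(4, 2, 256))
```
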
- Cortical Features for Defense Against Adversarial Audio Attacks [55.61885805423492]
We propose using a computational model of the auditory cortex as a defense against adversarial attacks on audio.
We show that the cortical features help defend against universal adversarial examples.
arXiv Detail & Related papers (2021-01-30T21:21:46Z)
- FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances [63.80959552818541]
We propose a white-box steganography-inspired adversarial attack that generates imperceptible perturbations against a speaker identification model.
Our approach, FoolHD, uses a Gated Convolutional Autoencoder that operates in the DCT domain and is trained with a multi-objective loss function.
We validate FoolHD with a 250-speaker identification x-vector network, trained using VoxCeleb, in terms of accuracy, success rate, and imperceptibility.
arXiv Detail & Related papers (2020-11-17T07:38:26Z)
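
A partial sketch of two ingredients of the FoolHD approach above: an orthonormal DCT-II basis (so the DCT-domain transform stays differentiable) and a multi-objective loss of the kind described. The gated convolutional autoencoder is omitted, and `alpha` is an assumed weighting.

```python
import math
import torch
import torch.nn.functional as F

def dct_matrix(n: int) -> torch.Tensor:
    # Orthonormal DCT-II basis as a matrix, so X = dct_matrix(n) @ x is a
    # differentiable transform of a length-n waveform frame x.
    k = torch.arange(n).float()
    basis = torch.cos(math.pi / n * (k[None, :] + 0.5) * k[:, None])
    basis[0] *= 1 / math.sqrt(2)
    return basis * math.sqrt(2 / n)

def multi_objective_loss(clean_dct, adv_dct, logits, target_id, alpha=10.0):
    """Fool the speaker-identification model (cross-entropy toward the target
    class) while keeping the DCT-domain perturbation small as a proxy for
    imperceptibility. target_id: (batch,) LongTensor of class indices."""
    fooling = F.cross_entropy(logits, target_id)
    imperceptibility = F.mse_loss(adv_dct, clean_dct)
    return fooling + alpha * imperceptibility
```
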
- Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
- Integrated Replay Spoofing-aware Text-independent Speaker Verification [47.41124427552161]
We propose two approaches for building an integrated system of speaker verification and presentation attack detection.
The first approach simultaneously trains speaker identification, presentation attack detection, and the integrated system using multi-task learning.
The second is a back-end modular approach that uses a separate deep neural network (DNN) for speaker verification and for presentation attack detection.
arXiv Detail & Related papers (2020-06-10T01:24:55Z)
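
A minimal sketch of the first, multi-task approach above: a shared encoder feeding a speaker-identification head and a presentation-attack-detection head, trained jointly; the architecture and loss weighting are assumptions.

```python
import torch

class IntegratedSVPAD(torch.nn.Module):
    """Shared encoder with two heads, trained jointly (multi-task learning):
    speaker identification and presentation attack detection."""
    def __init__(self, feat_dim=40, hidden=128, n_speakers=100):
        super().__init__()
        self.encoder = torch.nn.GRU(feat_dim, hidden, batch_first=True)
        self.speaker_head = torch.nn.Linear(hidden, n_speakers)
        self.spoof_head = torch.nn.Linear(hidden, 2)  # bona fide vs. replay

    def forward(self, feats):              # feats: (batch, time, feat_dim)
        _, h = self.encoder(feats)
        h = h[-1]                          # final hidden state per utterance
        return self.speaker_head(h), self.spoof_head(h)

# Joint objective (equal weights are an assumption, not from the paper):
# loss = ce(spk_logits, spk_labels) + ce(spoof_logits, spoof_labels)
```
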
- Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems [21.559732692440424]
We propose the first real-time, universal, and robust adversarial attack against the state-of-the-art deep neural network (DNN) based speaker recognition system.
Experiment using a public dataset of 109 English speakers demonstrates the effectiveness and robustness of our proposed attack with a high attack success rate of over 90%.
arXiv Detail & Related papers (2020-03-04T19:30:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.