Dictionary Attacks on Speaker Verification
- URL: http://arxiv.org/abs/2204.11304v1
- Date: Sun, 24 Apr 2022 15:31:41 GMT
- Title: Dictionary Attacks on Speaker Verification
- Authors: Mirko Marras, Pawel Korus, Anubhav Jain, Nasir Memon
- Abstract summary: We introduce a generic formulation of the attack that can be used with various speech representations and threat models.
The attacker uses adversarial optimization to maximize raw similarity of speaker embeddings between a seed speech sample and a proxy population.
We show that, combined with multiple attempts, this attack raises even more serious concerns about the security of these systems.
- Score: 15.00667613025837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose dictionary attacks against speaker verification - a
novel attack vector that aims to match a large fraction of the speaker population
by chance. We introduce a generic formulation of the attack that can be used
with various speech representations and threat models. The attacker uses
adversarial optimization to maximize raw similarity of speaker embeddings
between a seed speech sample and a proxy population. The resulting master voice
successfully matches a non-trivial fraction of people in an unknown population.
Adversarial waveforms obtained with our approach can match on average 69% of
females and 38% of males enrolled in the target system at a strict decision
threshold calibrated to yield a false alarm rate of 1%. By using the attack with
a black-box voice cloning system, we obtain master voices that are effective in
the most challenging conditions and transferable between speaker encoders. We
also show that, combined with multiple attempts, this attack raises even more
serious concerns about the security of these systems.
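At its core, the attack optimizes a waveform so that its speaker embedding is simultaneously close to many speakers at once. Below is a minimal sketch of this idea, assuming a frozen, pretrained speaker encoder (a torch module mapping waveforms to L2-normalized embeddings); the encoder interface, hyperparameters, and plain gradient loop are illustrative assumptions, not the authors' exact procedure.

```python
# Illustrative sketch of master-voice optimization (not the paper's exact recipe).
# Assumptions: `encoder` maps a batch of waveforms (batch, samples) to
# L2-normalized embeddings (batch, dim); `seed_wave` is a seed utterance in
# [-1, 1]; `proxy_waves` holds utterances from the proxy population.
import numpy as np
import torch

def optimize_master_voice(encoder, seed_wave, proxy_waves, steps=1000, lr=1e-3):
    for p in encoder.parameters():
        p.requires_grad_(False)                      # attack only the waveform
    with torch.no_grad():
        proxy_emb = encoder(proxy_waves)             # (N, dim), fixed targets

    delta = torch.zeros_like(seed_wave, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        adv = (seed_wave + delta).clamp(-1.0, 1.0)   # keep a valid waveform
        emb = encoder(adv.unsqueeze(0))              # (1, dim)
        loss = -(emb @ proxy_emb.T).mean()           # maximize mean cosine similarity
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (seed_wave + delta).clamp(-1.0, 1.0).detach()

def far_threshold(impostor_scores, far=0.01):
    # Threshold calibrated so that roughly `far` of impostor trials are accepted,
    # matching the 1% false-alarm operating point used in the paper's evaluation.
    return float(np.quantile(np.asarray(impostor_scores), 1.0 - far))
```

Under this sketch, a master voice counts as matching an enrolled speaker whenever its similarity to that speaker's enrollment embedding exceeds the calibrated threshold.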
Related papers
- Interpretable Spectrum Transformation Attacks to Speaker Recognition [8.770780902627441]
A general framework is proposed to improve the transferability of adversarial voices to a black-box victim model.
The proposed framework operates on voices in the time-frequency domain, which improves the interpretability, transferability, and imperceptibility of the attack.
arXiv Detail & Related papers (2023-02-21T14:12:29Z)
- SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing [22.47152800242178]
Anti-spoofing systems are crucial auxiliaries for automatic speaker verification (ASV) systems.
We propose speaker attractor multi-center one-class learning (SAMO), which clusters bona fide speech around a number of speaker attractors.
Our proposed system outperforms existing state-of-the-art single systems with a relative improvement of 38% in equal error rate (EER) on the ASVspoof 2019 LA evaluation set; a minimal EER sketch follows this entry.
arXiv Detail & Related papers (2022-11-04T19:31:33Z)
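The entry above and the de-identification paper below both report equal error rate (EER). As a reference point, here is a minimal NumPy sketch of the usual EER computation from genuine and impostor score arrays (the function name and score convention are illustrative assumptions):

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    # EER is the operating point where the false rejection rate (genuine trials
    # scoring below threshold) equals the false acceptance rate (impostor trials
    # scoring at or above it); we return their average at the closest threshold.
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    frr = np.array([np.mean(genuine < t) for t in thresholds])
    far = np.array([np.mean(impostor >= t) for t in thresholds])
    i = np.argmin(np.abs(frr - far))
    return (frr[i] + far[i]) / 2.0
```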
- Symmetric Saliency-based Adversarial Attack To Speaker Identification [17.087523686496958]
We propose a novel generation-network-based approach, called the symmetric saliency-based encoder-decoder (SSED).
First, it uses a novel saliency map decoder to learn the importance of speech samples to the decision of a targeted speaker identification system.
Second, it uses an angular loss function to push the adversarial speaker embedding away from the source speaker (see the sketch after this entry).
arXiv Detail & Related papers (2022-10-30T08:54:02Z)
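The angular loss in the entry above can be illustrated in a few lines; a hedged PyTorch sketch of the general idea (the paper's exact formulation may differ):

```python
import torch.nn.functional as F

def angular_push_loss(adv_emb, src_emb):
    # Minimizing cosine similarity widens the angle between the adversarial
    # utterance's embedding and the source speaker's embedding, pushing the
    # adversarial example away from the source speaker.
    return F.cosine_similarity(adv_emb, src_emb, dim=-1).mean()
```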
- Attack on practical speaker verification system using universal adversarial perturbations [20.38185341318529]
This work shows that, by playing a crafted adversarial perturbation as a separate audio source while the adversary is speaking, a practical speaker verification system can be made to misjudge the adversary as a target speaker.
A two-step algorithm is proposed to optimize the universal adversarial perturbation so that it is text-independent and has little effect on recognition of the authentication text.
arXiv Detail & Related papers (2021-05-19T09:43:34Z)
- Towards Robust Speech-to-Text Adversarial Attack [78.5097679815944]
This paper introduces a novel adversarial algorithm for attacking the state-of-the-art speech-to-text systems, namely DeepSpeech, Kaldi, and Lingvo.
Our approach is based on developing an extension for the conventional distortion condition of the adversarial optimization formulation.
Minimizing over this metric, which measures the discrepancies between original and adversarial samples' distributions, contributes to crafting signals very close to the subspace of legitimate speech recordings.
arXiv Detail & Related papers (2021-03-15T01:51:41Z)
- Cortical Features for Defense Against Adversarial Audio Attacks [55.61885805423492]
We propose using a computational model of the auditory cortex as a defense against adversarial attacks on audio.
We show that the cortical features help defend against universal adversarial examples.
arXiv Detail & Related papers (2021-01-30T21:21:46Z)
- FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances [63.80959552818541]
We propose a white-box steganography-inspired adversarial attack that generates imperceptible perturbations against a speaker identification model.
Our approach, FoolHD, uses a Gated Convolutional Autoencoder that operates in the DCT domain and is trained with a multi-objective loss function.
We validate FoolHD with a 250-speaker identification x-vector network, trained using VoxCeleb, in terms of accuracy, success rate, and imperceptibility.
arXiv Detail & Related papers (2020-11-17T07:38:26Z)
- Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
- Backdoor Attack against Speaker Verification [86.43395230456339]
We show that it is possible to inject a hidden backdoor into speaker verification models by poisoning the training data.
We also demonstrate that existing backdoor attacks cannot be directly adopted to attack speaker verification (a generic poisoning sketch follows this entry).
arXiv Detail & Related papers (2020-10-22T11:10:08Z)
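For context, the generic trigger-poisoning idea behind audio backdoors can be sketched as follows; this is a simplified illustration of the threat model, not the adapted attack the paper itself proposes:

```python
import numpy as np

def poison_utterance(wave, trigger, alpha=0.1):
    # Generic audio backdoor: mix a fixed low-amplitude trigger signal into a
    # training utterance; the poisoned sample is then labeled as the attacker's
    # chosen target speaker so the model learns to accept the trigger.
    trigger = np.resize(trigger, wave.shape)  # tile/trim trigger to match length
    return np.clip(wave + alpha * trigger, -1.0, 1.0)
```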
- VenoMave: Targeted Poisoning Against Speech Recognition [30.448709704880518]
VENOMAVE is the first training-time poisoning attack against speech recognition.
We evaluate our attack on two datasets: TIDIGITS and Speech Commands.
arXiv Detail & Related papers (2020-10-21T00:30:08Z)
- Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam [100.95498268200777]
SpeakerBeam exploits an adaptation utterance of the target speaker to extract his/her voice characteristics.
SpeakerBeam sometimes fails when speakers have similar voice characteristics, such as in same-gender mixtures.
We show experimentally that the proposed strategies greatly improve speech extraction performance, especially for same-gender mixtures (a sketch of the conditioning idea follows this entry).
arXiv Detail & Related papers (2020-01-23T05:36:06Z)
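The adaptation mechanism in the entry above can be illustrated with multiplicative speaker conditioning; a minimal sketch of the general idea, not the exact SpeakerBeam architecture:

```python
import torch
import torch.nn as nn

class SpeakerConditionedLayer(nn.Module):
    # Scales mixture features by a gate derived from the target speaker's
    # embedding, so the network can emphasize that speaker's components.
    def __init__(self, feat_dim, spk_dim):
        super().__init__()
        self.gate = nn.Linear(spk_dim, feat_dim)

    def forward(self, mix_feats, spk_emb):
        # mix_feats: (batch, time, feat_dim); spk_emb: (batch, spk_dim)
        g = torch.sigmoid(self.gate(spk_emb)).unsqueeze(1)  # (batch, 1, feat_dim)
        return mix_feats * g
```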