Spotting adversarial samples for speaker verification by neural vocoders
- URL: http://arxiv.org/abs/2107.00309v2
- Date: Fri, 2 Jul 2021 07:47:17 GMT
- Title: Spotting adversarial samples for speaker verification by neural vocoders
- Authors: Haibin Wu, Po-chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang,
Zhiyong Wu, Helen Meng, Hung-yi Lee
- Abstract summary: We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples.
Our code will be made open source so that future work can compare against it.
- Score: 102.1486475058963
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Automatic speaker verification (ASV), one of the most important technologies
for biometric identification, has been widely adopted in security-critical
applications, including transaction authentication and access control. However,
previous work has shown that ASV is seriously vulnerable to recently emerged
adversarial attacks, yet effective countermeasures against them are limited. In
this paper, we adopt neural vocoders to spot adversarial samples for ASV. We
use the neural vocoder to re-synthesize audio and find that the difference
between the ASV scores for the original and re-synthesized audio is a good
indicator for discrimination between genuine and adversarial samples. This
effort is, to the best of our knowledge, among the first to pursue such a
technical direction for detecting adversarial samples for ASV, and hence there
is a lack of established baselines for comparison. Consequently, we implement
the Griffin-Lim algorithm as the detection baseline. The proposed approach
achieves effective detection performance that outperforms all the baselines in
all the settings. We also show that the neural vocoder adopted in the detection
framework is dataset-independent. Our code will be made open source so that
future work can compare against it.
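The detection idea described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `score_difference`, `is_adversarial`, `toy_asv_score`, and `toy_resynthesize` are hypothetical, the absolute difference and the fixed threshold are assumptions (the abstract only says that the score difference is a good indicator), and the toy scorer and toy re-synthesizer stand in for a real ASV system and a real neural vocoder.

```python
from typing import Callable

import numpy as np

# Hypothetical interfaces: an ASV scorer maps a waveform to a score, and a
# re-synthesizer (for example a neural vocoder) maps a waveform to a waveform.
Scorer = Callable[[np.ndarray], float]
Resynthesizer = Callable[[np.ndarray], np.ndarray]


def score_difference(audio: np.ndarray, asv_score: Scorer,
                     resynthesize: Resynthesizer) -> float:
    """Absolute change in the ASV score after re-synthesizing the audio.

    Taking the absolute value is an assumption of this sketch.
    """
    return abs(asv_score(audio) - asv_score(resynthesize(audio)))


def is_adversarial(audio: np.ndarray, asv_score: Scorer,
                   resynthesize: Resynthesizer, threshold: float) -> bool:
    """Flag the input when re-synthesis moves the score by more than threshold."""
    return score_difference(audio, asv_score, resynthesize) > threshold


# --- Toy stand-ins below; these are NOT the paper's models. ---

def toy_asv_score(audio: np.ndarray) -> float:
    # Linear toy scorer that responds mainly to sample-to-sample alternation.
    signs = np.where(np.arange(audio.size) % 2 == 0, 1.0, -1.0)
    return float(np.mean(audio * signs))


def toy_resynthesize(audio: np.ndarray) -> np.ndarray:
    # Two-tap moving average: keeps slow content, cancels alternation.
    return np.convolve(audio, [0.5, 0.5], mode="same")


# A 100 Hz tone sampled at 16 kHz plays the role of genuine audio.
t = np.arange(1600) / 16000.0
genuine = np.sin(2 * np.pi * 100.0 * t)

# Small perturbation along the sign of the toy scorer's gradient.
signs = np.where(np.arange(genuine.size) % 2 == 0, 1.0, -1.0)
adversarial = genuine + 0.01 * signs

THRESHOLD = 0.005  # assumed value, chosen only for this toy example
genuine_flag = is_adversarial(genuine, toy_asv_score, toy_resynthesize, THRESHOLD)
adversarial_flag = is_adversarial(adversarial, toy_asv_score, toy_resynthesize, THRESHOLD)
```

In this toy setting the perturbation shifts the score of the original audio but is largely removed by re-synthesis, so the score difference is much larger for the perturbed input than for the clean one. Because `resynthesize` is a plain callable, a neural vocoder or a Griffin-Lim reconstruction could be plugged in; how the paper wires its baseline and chooses its threshold is not stated in the abstract.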
Related papers
- Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples [33.445126880876415]
We propose a reliable and robust spoofing detection system to filter out spoofing attacks instead of having them reach the automatic speaker verification system.
A weighted additive angular margin loss is proposed to address the data imbalance issue, and different margins have been assigned to improve generalization to unseen spoofing attacks.
We craft adversarial examples by adding imperceptible perturbations to spoofing speech as a data augmentation strategy, then we use an auxiliary batch normalization to guarantee that corresponding normalization statistics are performed exclusively on the adversarial examples.
arXiv Detail & Related papers (2024-08-23T19:26:54Z)
- LMD: A Learnable Mask Network to Detect Adversarial Examples for Speaker Verification [17.968334617708244]
We propose an attacker-independent and interpretable method to separate adversarial examples from the genuine ones.
A core component of the score variation detector is to generate the masked spectrogram by a neural network.
Our proposed method outperforms five state-of-the-art baselines.
arXiv Detail & Related papers (2022-11-02T02:03:53Z)
- Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward [2.393661358372807]
We conduct a review of the literature on spoofing detection using hand-crafted features, deep learning, end-to-end, and universal spoofing countermeasure solutions.
We report the performance of these countermeasures on several datasets and evaluate them across corpora.
arXiv Detail & Related papers (2022-10-02T03:53:37Z)
- Voting for the right answer: Adversarial defense for speaker verification [79.10523688806852]
ASV is a target of adversarial attacks, in which the adversarial samples sound similar to their original counterparts to human listeners.
We propose the idea of "voting for the right answer" to prevent risky decisions of ASV in blind spot areas.
Experimental results show that our proposed method improves the robustness against limited-knowledge attackers.
arXiv Detail & Related papers (2021-06-15T04:05:28Z)
- Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning [95.60856995067083]
This work is among the first to perform adversarial defense for ASV without knowing the specific attack algorithms.
We propose to perform adversarial defense from two perspectives: 1) adversarial perturbation purification and 2) adversarial perturbation detection.
Experimental results show that our detection module effectively shields the ASV by detecting adversarial samples with an accuracy of around 80%.
arXiv Detail & Related papers (2021-06-01T07:10:54Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU, making it compute-efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
- Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention as a way to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
- Unsupervised Domain Adaptation for Acoustic Scene Classification Using Band-Wise Statistics Matching [69.24460241328521]
Machine learning algorithms can be negatively affected by mismatches between training (source) and test (target) data distributions.
We propose an unsupervised domain adaptation method that aligns the first- and second-order sample statistics of each frequency band of target-domain acoustic scenes to those of the source-domain training dataset.
We show that the proposed method outperforms the state-of-the-art unsupervised methods found in the literature in terms of both source- and target-domain classification accuracy.
arXiv Detail & Related papers (2020-04-30T23:56:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.