Channel-wise Gated Res2Net: Towards Robust Detection of Synthetic Speech
Attacks
- URL: http://arxiv.org/abs/2107.08803v1
- Date: Mon, 19 Jul 2021 12:27:40 GMT
- Authors: Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng
- Abstract summary: Existing approaches for anti-spoofing in automatic speaker verification (ASV) still lack generalizability to unseen attacks.
We present a novel, channel-wise gated Res2Net (CG-Res2Net), which modifies Res2Net to enable a channel-wise gating mechanism.
- Score: 67.7648985513978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing approaches for anti-spoofing in automatic speaker verification (ASV)
still lack generalizability to unseen attacks. The Res2Net approach designs a
residual-like connection between feature groups within one block, which
increases the possible receptive fields and improves the system's detection
generalizability. However, such a residual-like connection is performed by a
direct addition between feature groups without channel-wise priority. We argue
that the information across channels may not contribute to spoofing cues
equally, and that the less relevant channels should be suppressed before
being added to the next feature group, so that the system can generalize better to
unseen attacks. This argument motivates the current work, which presents a novel,
channel-wise gated Res2Net (CG-Res2Net), which modifies Res2Net to enable a
channel-wise gating mechanism in the connection between feature groups. This
gating mechanism dynamically selects channel-wise features based on the input,
to suppress the less relevant channels and enhance the detection
generalizability. Three gating mechanisms with different structures are
proposed and integrated into Res2Net. Experimental results on
ASVspoof 2019 logical access (LA) demonstrate that the proposed CG-Res2Net
significantly outperforms Res2Net on both the overall LA evaluation set and
individual difficult unseen attacks. It also outperforms other
state-of-the-art single systems, demonstrating the effectiveness of our method.
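
The gated connection between feature groups described in the abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the gate form (per-channel mean followed by an affine map and a sigmoid), the function names, and the `transform` stand-in for the per-group convolution are all assumptions made for the example. The abstract states that three gating structures are proposed but does not specify them, so none of them is reproduced here.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_gate(prev, weights, biases):
    """Return one gate value in (0, 1) per channel of `prev`.

    Assumed gate form (illustration only): each channel is summarised by
    its mean, then passed through a per-channel affine map and a sigmoid.
    """
    means = [sum(ch) / len(ch) for ch in prev]
    return [sigmoid(w * m + b) for w, m, b in zip(weights, means, biases)]


def gated_res2net_groups(groups, transform, weights, biases):
    """Apply a Res2Net-style group-to-group connection with channel gating.

    groups:    list of feature groups; each group is a list of channels,
               and each channel is a list of floats.
    transform: hypothetical stand-in for the per-group convolution.
    """
    outputs = [groups[0]]  # the first group passes through unchanged
    prev = None
    for x in groups[1:]:
        if prev is not None:
            gate = channel_gate(prev, weights, biases)
            # Scale each channel of the previous output by its gate value
            # before adding it to the current group. Plain Res2Net would
            # add `prev` directly, which corresponds to a gate of 1.
            x = [[xv + g * pv for xv, pv in zip(xc, pc)]
                 for xc, pc, g in zip(x, prev, gate)]
        prev = transform(x)
        outputs.append(prev)
    return outputs
```

With an identity `transform`, a gate value near 0 leaves the next group untouched by the previous output, while a gate value near 1 recovers the direct addition used in the original Res2Net connection.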
Related papers
- Joint Channel Estimation and Feedback with Masked Token Transformers in
Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
The entire encoder-decoder network is utilized for channel compression.
Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z)
- Searching for Network Width with Bilaterally Coupled Network [75.43658047510334]
We introduce a new supernet called Bilaterally Coupled Network (BCNet) to address this issue.
In BCNet, each channel is fairly trained and responsible for the same number of network widths, so each network width can be evaluated more accurately.
We propose the first open-source width benchmark on macro structures named Channel-Bench-Macro for the better comparison of width search algorithms.
arXiv Detail & Related papers (2022-03-25T15:32:46Z)
- PWG-IDS: An Intrusion Detection Model for Solving Class Imbalance in
IIoT Networks Using Generative Adversarial Networks [13.552023164115138]
This paper proposes a pretraining Wasserstein generative adversarial network intrusion detection system (PWG-IDS).
We use LightGBM as the classification algorithm to detect attack traffic in IIoT networks.
Our proposed PWG-IDS outperforms other models, with F1-scores of 99% and 89% on the two datasets.
arXiv Detail & Related papers (2021-10-06T02:34:50Z)
- Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
arXiv Detail & Related papers (2021-08-02T08:21:44Z)
- Deep Learning Based RIS Channel Extrapolation with Element-grouping [61.18715079535163]
We consider the acquisition of the cascaded channels, which is a challenging task due to the massive number of passive RIS elements.
We adopt the element-grouping strategy, where each element in one group shares the same reflection coefficient and is assumed to have the same channel condition.
We analyze the channel interference caused by the element-grouping strategy and further design two deep learning based networks.
arXiv Detail & Related papers (2021-05-14T14:24:54Z)
- Efficient Two-Stream Network for Violence Detection Using Separable
Convolutional LSTM [0.0]
We propose an efficient two-stream deep learning architecture leveraging Separable Convolutional LSTM (SepConvLSTM) and pre-trained MobileNet.
SepConvLSTM is constructed by replacing convolution operation at each gate of ConvLSTM with a depthwise separable convolution.
Our model improves accuracy on the larger and more challenging RWF-2000 dataset by a margin of more than 2%.
arXiv Detail & Related papers (2021-02-21T12:01:48Z)
- PRVNet: A Novel Partially-Regularized Variational Autoencoders for
Massive MIMO CSI Feedback [15.972209500908642]
In a multiple-input multiple-output frequency-division duplexing (MIMO-FDD) system, the user equipment (UE) sends the downlink channel state information (CSI) to the base station to report link status.
In this paper, we introduce PRVNet, a neural network architecture inspired by variational autoencoders (VAE) to compress the CSI matrix before sending it back to the base station.
arXiv Detail & Related papers (2020-11-09T04:07:45Z)
- Replay and Synthetic Speech Detection with Res2net Architecture [85.20912636149552]
Existing approaches for replay and synthetic speech detection still lack generalizability to unseen spoofing attacks.
This work proposes to leverage a novel model structure, so-called Res2Net, to improve the anti-spoofing countermeasure's generalizability.
arXiv Detail & Related papers (2020-10-28T14:33:42Z)
- Cascade Network with Guided Loss and Hybrid Attention for Two-view
Geometry [32.52184271700281]
We propose a Guided Loss to establish a direct negative correlation between the loss and the Fn-measure.
We then propose a hybrid attention block to extract features.
Experiments show that our network achieves state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-07-11T07:44:04Z)