Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
- URL: http://arxiv.org/abs/2406.07816v1
- Date: Wed, 12 Jun 2024 02:23:57 GMT
- Title: Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
- Authors: Lin Zhang, Xin Wang, Erica Cooper, Mireia Diez, Federico Landini, Nicholas Evans, Junichi Yamagishi
- Abstract summary: This paper defines Spoof Diarization as a novel task in the Partial Spoof (PS) scenario.
It aims to determine what spoofed when, which includes locating spoof regions and clustering them according to different spoofing methods.
- Score: 35.485350559012645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper defines Spoof Diarization as a novel task in the Partial Spoof (PS) scenario. It aims to determine what spoofed when, which includes not only locating spoof regions but also clustering them according to different spoofing methods. As a pioneering study in spoof diarization, we focus on defining the task, establishing evaluation metrics, and proposing a benchmark model, namely the Countermeasure-Condition Clustering (3C) model. Utilizing this model, we first explore how to effectively train countermeasures to support spoof diarization using three labeling schemes. We then utilize spoof localization predictions to enhance the diarization performance. This first study reveals the high complexity of the task, even in restricted scenarios where only a single speaker per audio file and an oracle number of spoofing methods are considered. Our code is available at https://github.com/nii-yamagishilab/PartialSpoof.
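As a rough illustration of what a "what spoofed when" output could look like, the sketch below merges per-frame labels into timed segments. It is not the 3C model from the paper: the function name `frames_to_segments`, the 20 ms frame shift, and the convention that label 0 means bona fide while other integers are cluster ids for spoofing methods are all assumptions made for this example.

```python
from itertools import groupby


def frames_to_segments(frame_labels, frame_shift=0.02):
    """Merge runs of identical per-frame labels into (start_s, end_s, label) segments.

    Assumptions for this sketch: label 0 means bona fide, any other integer is a
    hypothetical cluster id standing in for one spoofing method, and frames are
    spaced frame_shift seconds apart.
    """
    segments = []
    index = 0  # index of the first frame in the current run
    for label, run in groupby(frame_labels):
        length = len(list(run))
        if label != 0:  # keep only spoofed regions
            start = round(index * frame_shift, 3)
            end = round((index + length) * frame_shift, 3)
            segments.append((start, end, label))
        index += length
    return segments


if __name__ == "__main__":
    # Toy sequence: two spoofing-method clusters (1 and 2) inside bona fide audio.
    labels = [0, 0, 1, 1, 1, 0, 2, 2, 0, 1]
    for start, end, label in frames_to_segments(labels):
        print(f"{start:.2f}s - {end:.2f}s: spoof cluster {label}")
```

In a real system the per-frame labels would come from a countermeasure followed by clustering; here they are hard-coded so that only the segment-merging step is shown.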
Related papers
- How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio? [53.58852794805362]
Countermeasures (CMs) trained on partially spoofed audio can effectively detect such spoofing.
We utilize Grad-CAM and introduce a quantitative analysis metric to interpret CMs' decisions.
We find that CMs prioritize the artifacts of transition regions created when concatenating bona fide and spoofed audio.
arXiv Detail & Related papers (2024-06-04T16:51:42Z)
- An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection [4.055489363682199]
We propose a fine-grained partially spoofed audio detection method, namely Temporal Deepfake Location (TDL).
Our approach involves two novel parts: an embedding similarity module and a temporal convolution operation.
Our method outperforms baseline models on the ASVspoof 2019 Partial Spoof dataset and demonstrates superior performance even in the cross-dataset scenario.
arXiv Detail & Related papers (2023-09-06T14:29:29Z)
- Generating Natural Language Proofs with Verifier-Guided Search [74.9614610172561]
We present a novel stepwise method, NLProofS (Natural Language Proof Search).
NLProofS learns to generate relevant steps conditioning on the hypothesis.
It achieves state-of-the-art performance on EntailmentBank and RuleTaker.
arXiv Detail & Related papers (2022-05-25T02:22:30Z)
- Physics-Guided Spoof Trace Disentanglement for Generic Face Anti-Spoofing [26.389969978817042]
The key to face anti-spoofing lies in the subtle image pattern termed the "spoof trace".
In this work, we design a novel adversarial learning framework to disentangle spoof faces into the spoof traces and the live counterparts.
arXiv Detail & Related papers (2020-12-09T17:22:44Z)
- On Disentangling Spoof Trace for Generic Face Anti-Spoofing [24.75975874643976]
The key to face anti-spoofing lies in the subtle image pattern termed the "spoof trace".
This work designs a novel adversarial learning framework to disentangle the spoof traces from input faces.
arXiv Detail & Related papers (2020-07-17T23:14:16Z)
- Anomaly Detection-Based Unknown Face Presentation Attack Detection [74.4918294453537]
Anomaly detection-based spoof attack detection is a recent development in face Presentation Attack Detection.
In this paper, we present a deep-learning solution for anomaly detection-based spoof attack detection.
The proposed approach benefits from the representation learning power of CNNs and learns better features for the fPAD task.
arXiv Detail & Related papers (2020-07-11T21:20:55Z)
- Look Locally Infer Globally: A Generalizable Face Anti-Spoofing Approach [53.86588268914105]
State-of-the-art spoof detection methods tend to overfit to the spoof types seen during training and fail to generalize to unknown spoof types.
We propose a Self-Supervised Regional Fully Convolutional Network (SSR-FCN) that is trained to learn local discriminative cues from a face image in a self-supervised manner.
arXiv Detail & Related papers (2020-06-04T13:11:17Z)
- Learning Generalized Spoof Cues for Face Anti-spoofing [43.32561471100592]
We propose a residual-learning framework to learn the discriminative live-spoof differences, which are defined as the spoof cues.
The generator minimizes the spoof cues of live samples while imposing no explicit constraint on those of spoof samples, so that it generalizes well to unseen attacks.
Extensive experiments show that the proposed method consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-05-08T09:22:13Z)