Partially Fake Audio Detection by Self-attention-based Fake Span
Discovery
- URL: http://arxiv.org/abs/2202.06684v2
- Date: Tue, 15 Feb 2022 09:07:40 GMT
- Title: Partially Fake Audio Detection by Self-attention-based Fake Span
Discovery
- Authors: Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee,
Yu Tsao, Hsin-Min Wang, Helen Meng
- Abstract summary: We propose a novel framework by introducing the question-answering (fake span discovery) strategy with the self-attention mechanism to detect partially fake audio.
Our submission ranked second in the partially fake audio detection track of ADD 2022.
- Score: 89.21979663248007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The past few years have witnessed significant advances in speech
synthesis and voice conversion technologies. However, such technologies can
undermine the robustness of broadly deployed biometric identification models
and can be harnessed by in-the-wild attackers for illegal uses. The ASVspoof
challenge mainly focuses on audio synthesized by advanced speech synthesis and
voice conversion models, and on replay attacks. Recently, the first Audio Deep
Synthesis Detection challenge (ADD 2022) extended the attack scenarios to
further aspects. ADD 2022 is also the first challenge to propose the partially
fake audio detection task. Such brand-new attacks are dangerous, and how to
tackle them remains an open question. We therefore propose a novel framework
that introduces a question-answering (fake span discovery) strategy with the
self-attention mechanism to detect partially fake audio. The proposed fake
span detection module tasks the anti-spoofing model with predicting the start
and end positions of the fake clip within the partially fake audio, directs
the model's attention toward discovering the fake spans rather than other,
less generalizable shortcuts, and finally equips the model with the capacity
to discriminate between real and partially fake audio. Our submission ranked
second in the partially fake audio detection track of ADD 2022.
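The fake span discovery idea can be sketched in a few lines: frame-level acoustic features pass through self-attention, and two linear heads score each frame as the start or end of the fake clip, QA-style. The following is a minimal, untrained illustration with random weights; the single-head attention, the feature sizes, and all names here are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over frames.
    X: (T, d) frame-level acoustic features."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V  # (T, d)

def predict_fake_span(X, params):
    """QA-style span prediction: two linear heads score each frame as
    the start or the end of the fake clip; the argmax pair is the span."""
    H = self_attention(X, params["Wq"], params["Wk"], params["Wv"])
    start_logits = H @ params["w_start"]  # (T,)
    end_logits = H @ params["w_end"]      # (T,)
    start = int(np.argmax(start_logits))
    # constrain the end to come no earlier than the start
    end = start + int(np.argmax(end_logits[start:]))
    return start, end

# Illustrative sizes and random (untrained) parameters
rng = np.random.default_rng(0)
T, d = 100, 16  # 100 frames, 16-dim features
params = {k: rng.standard_normal((d, d)) for k in ("Wq", "Wk", "Wv")}
params["w_start"] = rng.standard_normal(d)
params["w_end"] = rng.standard_normal(d)
X = rng.standard_normal((T, d))
start, end = predict_fake_span(X, params)
```

In training, the start and end heads would be supervised with the true boundary frames of the concatenated fake clip, which is what forces the model to localize the span rather than latch onto global shortcuts.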
Related papers
- Can DeepFake Speech be Reliably Detected? [17.10792531439146]
This work presents the first systematic study of active malicious attacks against state-of-the-art open-source speech detectors.
The results highlight the urgent need for more robust detection methods in the face of evolving adversarial threats.
arXiv Detail & Related papers (2024-10-09T06:13:48Z)
- SafeEar: Content Privacy-Preserving Audio Deepfake Detection [17.859275594843965]
We propose SafeEar, a novel framework that aims to detect deepfake audios without relying on accessing the speech content within.
Our key idea is to devise a neural audio codec into a novel decoupling model that well separates the semantic and acoustic information from audio samples.
In this way, no semantic content will be exposed to the detector.
arXiv Detail & Related papers (2024-09-14T02:45:09Z)
- An RFP dataset for Real, Fake, and Partially fake audio detection [0.36832029288386137]
The paper presents the RFP dataset, which comprises five distinct audio types: partial fake (PF), audio with noise, voice conversion (VC), text-to-speech (TTS), and real.
The data are then used to evaluate several detection models, revealing that the available models incur a markedly higher equal error rate (EER) when detecting PF audio instead of entirely fake audio.
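As a reference for the EER metric mentioned here: the equal error rate is the operating point where the false-acceptance rate (fake audio accepted as real) equals the false-rejection rate (real audio rejected as fake). A minimal sketch with invented scores and labels, not data from the RFP paper:

```python
import numpy as np

def equal_error_rate(scores, labels):
    """EER: sweep thresholds and return the error rate at the point
    where FAR and FRR are closest. scores: higher = more likely real;
    labels: 1 = real, 0 = fake."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, None
    for t in np.sort(np.unique(scores)):
        far = np.mean(scores[labels == 0] >= t)  # fakes accepted
        frr = np.mean(scores[labels == 1] < t)   # real rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# Illustrative detector scores for 3 real and 3 fake utterances
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,   0]
eer = equal_error_rate(scores, labels)  # one fake and one real misranked
```

Production evaluations typically interpolate between thresholds (e.g. via an ROC/DET curve) rather than sweeping raw scores, but the operating-point idea is the same.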
arXiv Detail & Related papers (2024-04-26T23:00:56Z)
- TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection [11.27584658526063]
The second Audio Deepfake Detection Challenge (ADD 2023) aims to detect and analyze deepfake speech utterances.
We propose our novel TranssionADD system as a solution to the challenging problem of model robustness and audio segment outliers.
Our best submission achieved 2nd place in Track 2, demonstrating the effectiveness and robustness of our proposed system.
arXiv Detail & Related papers (2023-06-27T05:18:25Z)
- Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion [70.99781219121803]
Audio Deepfake Detection (ADD) aims to detect the fake audio generated by text-to-speech (TTS), voice conversion (VC) and replay, etc.
We propose a novel ADD model, termed M2S-ADD, that attempts to discover audio authenticity cues during the mono-to-stereo conversion process.
arXiv Detail & Related papers (2023-05-25T02:54:29Z)
- SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection [54.74467470358476]
This paper proposes a dataset for scene fake audio detection named SceneFake.
A manipulated audio is generated by tampering only with the acoustic scene of the original audio.
Some scene fake audio detection benchmark results on the SceneFake dataset are reported in this paper.
arXiv Detail & Related papers (2022-11-11T09:05:50Z)
- Deepfake audio detection by speaker verification [79.99653758293277]
We propose a new detection approach that leverages only the biometric characteristics of the speaker, with no reference to specific manipulations.
The proposed approach can be implemented based on off-the-shelf speaker verification tools.
We test several such solutions on three popular test sets, obtaining good performance, high generalization ability, and high robustness to audio impairment.
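The speaker-verification-based idea can be illustrated with a toy sketch: compare the test utterance's speaker embedding against enrolled embeddings of the claimed speaker, and flag the utterance as fake when even the best similarity falls below a threshold. The embeddings, threshold, and function names below are invented for illustration; a real system would obtain embeddings from an off-the-shelf trained speaker verification model:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_deepfake(test_emb, enrolled_embs, threshold=0.5):
    """Flag the utterance as fake when its speaker embedding is too far
    from every enrolled embedding of the claimed speaker."""
    best = max(cosine(test_emb, e) for e in enrolled_embs)
    return best < threshold

# Toy embeddings standing in for a trained extractor's output
rng = np.random.default_rng(1)
speaker = rng.standard_normal(32)                     # claimed speaker's "voiceprint"
enrolled = [speaker + 0.1 * rng.standard_normal(32) for _ in range(3)]
genuine = speaker + 0.1 * rng.standard_normal(32)     # same speaker, small variation
fake = -speaker + 0.1 * rng.standard_normal(32)       # deliberately far from the speaker
```

The appeal noted in the summary is that this pipeline needs no knowledge of specific manipulation techniques: anything that fails to reproduce the speaker's biometric characteristics is rejected.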
arXiv Detail & Related papers (2022-09-28T13:46:29Z)
- An Initial Investigation for Detecting Vocoder Fingerprints of Fake Audio [53.134423013599914]
We propose a new problem for detecting vocoder fingerprints of fake audio.
Experiments are conducted on the datasets synthesized by eight state-of-the-art vocoders.
arXiv Detail & Related papers (2022-08-20T09:23:21Z)
- ADD 2022: the First Audio Deep Synthesis Detection Challenge [92.41777858637556]
The first Audio Deep Synthesis Detection challenge (ADD) was organized to fill this gap.
ADD 2022 includes three tracks: low-quality fake audio detection (LF), partially fake audio detection (PF), and audio fake game (FG).
arXiv Detail & Related papers (2022-02-17T03:29:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.