An RFP dataset for Real, Fake, and Partially fake audio detection
- URL: http://arxiv.org/abs/2404.17721v1
- Date: Fri, 26 Apr 2024 23:00:56 GMT
- Title: An RFP dataset for Real, Fake, and Partially fake audio detection
- Authors: Abdulazeez AlAli, George Theodorakopoulos,
- Abstract summary: The paper presents the RFP da-taset, which comprises five distinct audio types: partial fake (PF), audio with noise, voice conversion (VC), text-to-speech (TTS), and real.
The data are then used to evaluate several detection models, revealing that the available models incur a markedly higher equal error rate (EER) when detecting PF audio instead of entirely fake audio.
- Score: 0.36832029288386137
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in deep learning have enabled the creation of natural-sounding synthesised speech. However, attackers have also utilised these tech-nologies to conduct attacks such as phishing. Numerous public datasets have been created to facilitate the development of effective detection models. How-ever, available datasets contain only entirely fake audio; therefore, detection models may miss attacks that replace a short section of the real audio with fake audio. In recognition of this problem, the current paper presents the RFP da-taset, which comprises five distinct audio types: partial fake (PF), audio with noise, voice conversion (VC), text-to-speech (TTS), and real. The data are then used to evaluate several detection models, revealing that the available detec-tion models incur a markedly higher equal error rate (EER) when detecting PF audio instead of entirely fake audio. The lowest EER recorded was 25.42%. Therefore, we believe that creators of detection models must seriously consid-er using datasets like RFP that include PF and other types of fake audio.
Related papers
- Statistics-aware Audio-visual Deepfake Detector [11.671275975119089]
Methods in audio-visualfake detection mostly assess the synchronization between audio and visual features.
We propose a statistical feature loss to enhance the discrimination capability of the model.
Experiments on the DFDC and FakeAVCeleb datasets demonstrate the relevance of the proposed method.
arXiv Detail & Related papers (2024-07-16T12:15:41Z) - The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio [42.84634652376024]
ALM-based deepfake audio exhibits widespread, high deception, and type versatility.
To effectively detect ALM-based deepfake audio, we focus on the mechanism of the ALM-based audio generation method.
We propose the CSAM strategy to learn a domain balanced and generalized minima.
arXiv Detail & Related papers (2024-05-08T08:28:40Z) - Training-Free Deepfake Voice Recognition by Leveraging Large-Scale Pre-Trained Models [52.04189118767758]
Generalization is a main issue for current audio deepfake detectors.
In this paper we study the potential of large-scale pre-trained models for audio deepfake detection.
arXiv Detail & Related papers (2024-05-03T15:27:11Z) - Cross-Domain Audio Deepfake Detection: Dataset and Analysis [11.985093463886056]
Audio deepfake detection (ADD) is essential for preventing the misuse of synthetic voices that may infringe on personal rights and privacy.
Recent zero-shot text-to-speech (TTS) models pose higher risks as they can clone voices with a single utterance.
We construct a new cross-domain ADD dataset comprising over 300 hours of speech data that is generated by five advanced zero-shot TTS models.
arXiv Detail & Related papers (2024-04-07T10:10:15Z) - SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection [54.74467470358476]
This paper proposes a dataset for scene fake audio detection named SceneFake.
A manipulated audio is generated by only tampering with the acoustic scene of an original audio.
Some scene fake audio detection benchmark results on the SceneFake dataset are reported in this paper.
arXiv Detail & Related papers (2022-11-11T09:05:50Z) - An Initial Investigation for Detecting Vocoder Fingerprints of Fake
Audio [53.134423013599914]
We propose a new problem for detecting vocoder fingerprints of fake audio.
Experiments are conducted on the datasets synthesized by eight state-of-the-art vocoders.
arXiv Detail & Related papers (2022-08-20T09:23:21Z) - Fully Automated End-to-End Fake Audio Detection [57.78459588263812]
This paper proposes a fully automated end-toend fake audio detection method.
We first use wav2vec pre-trained model to obtain a high-level representation of the speech.
For the network structure, we use a modified version of the differentiable architecture search (DARTS) named light-DARTS.
arXiv Detail & Related papers (2022-08-20T06:46:55Z) - Partially Fake Audio Detection by Self-attention-based Fake Span
Discovery [89.21979663248007]
We propose a novel framework by introducing the question-answering (fake span discovery) strategy with the self-attention mechanism to detect partially fake audios.
Our submission ranked second in the partially fake audio detection track of ADD 2022.
arXiv Detail & Related papers (2022-02-14T13:20:55Z) - Half-Truth: A Partially Fake Audio Detection Dataset [60.08010668752466]
This paper develops a dataset for half-truth audio detection (HAD)
Partially fake audio in the HAD dataset involves only changing a few words in an utterance.
We can not only detect fake uttrances but also localize manipulated regions in a speech using this dataset.
arXiv Detail & Related papers (2021-04-08T08:57:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.