Half-Truth: A Partially Fake Audio Detection Dataset
- URL: http://arxiv.org/abs/2104.03617v2
- Date: Sat, 16 Dec 2023 02:17:19 GMT
- Title: Half-Truth: A Partially Fake Audio Detection Dataset
- Authors: Jiangyan Yi, Ye Bai, Jianhua Tao, Haoxin Ma, Zhengkun Tian, Chenglong
Wang, Tao Wang, Ruibo Fu
- Abstract summary: This paper develops a dataset for half-truth audio detection (HAD).
Partially fake audio in the HAD dataset involves changing only a few words in an utterance.
Using this dataset, we can not only detect fake utterances but also localize the manipulated regions within speech.
- Score: 60.08010668752466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diverse promising datasets, such as the ASVspoof databases, have been
designed to advance the development of fake audio detection. However, previous
datasets ignore an attack scenario in which a hacker hides small fake clips
inside otherwise real speech audio. This poses a serious threat, since it is
difficult to distinguish a small fake clip within a whole speech utterance.
Therefore, this paper develops such a dataset for half-truth audio detection
(HAD). Partially fake audio in the HAD dataset involves changing only a few
words in an utterance. The audio for those words is generated with
state-of-the-art speech synthesis technology. Using this dataset, we can not
only detect fake utterances but also localize the manipulated regions within
speech. Some benchmark results are presented on this dataset. The results show
that partially fake audio is much more challenging to detect than fully fake
audio. The HAD dataset is publicly available:
https://zenodo.org/records/10377492.
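To make the manipulation concrete, the sketch below splices a synthesized word into an otherwise genuine waveform and produces frame-level real/fake labels, mirroring the detect-and-localize task described above. This is a minimal illustration, not the authors' generation pipeline; the file names, word boundaries, mono audio, and 10 ms label frames are all assumptions.

```python
import numpy as np
import soundfile as sf  # assumed available for reading/writing WAV files

FRAME_MS = 10  # label frame size in milliseconds (an assumption)

def splice_fake_segment(real_wav, fake_word_wav, start_s, end_s, sr):
    """Replace real_wav[start_s:end_s] (in seconds) with a synthesized word.

    Returns the partially fake waveform and per-frame labels
    (0 = genuine, 1 = manipulated). Assumes mono audio.
    """
    start, end = int(start_s * sr), int(end_s * sr)
    # Crudely length-match the synthesized word to the region it replaces;
    # a real pipeline would align word boundaries and cross-fade the joins.
    fake = np.resize(fake_word_wav, end - start)
    mixed = np.concatenate([real_wav[:start], fake, real_wav[end:]])

    hop = int(sr * FRAME_MS / 1000)
    n_frames = int(np.ceil(len(mixed) / hop))
    labels = np.zeros(n_frames, dtype=np.int64)
    labels[start // hop : int(np.ceil(end / hop))] = 1
    return mixed, labels

if __name__ == "__main__":
    # Hypothetical file names, used for illustration only.
    real, sr = sf.read("real_utterance.wav")
    fake_word, _ = sf.read("tts_word.wav")
    audio, frame_labels = splice_fake_segment(real, fake_word, 1.20, 1.55, sr)
    sf.write("partially_fake.wav", audio, sr)
    print(frame_labels.sum(), "of", len(frame_labels), "frames are manipulated")
```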
Related papers
- SafeEar: Content Privacy-Preserving Audio Deepfake Detection [17.859275594843965]
We propose SafeEar, a novel framework that aims to detect deepfake audio without requiring access to the speech content within.
Our key idea is to devise a neural audio codec as a novel decoupling model that cleanly separates the semantic and acoustic information in audio samples.
In this way, no semantic content will be exposed to the detector.
arXiv Detail & Related papers (2024-09-14T02:45:09Z)
- An RFP dataset for Real, Fake, and Partially fake audio detection [0.36832029288386137]
The paper presents the RFP dataset, which comprises five distinct audio types: partial fake (PF), audio with noise, voice conversion (VC), text-to-speech (TTS), and real.
The data are then used to evaluate several detection models, revealing that available models incur a markedly higher equal error rate (EER) when detecting PF audio than when detecting entirely fake audio (a sketch of how EER can be computed appears after this list).
arXiv Detail & Related papers (2024-04-26T23:00:56Z)
- Deepfake audio as a data augmentation technique for training automatic speech to text transcription models [55.2480439325792]
We propose a framework for data augmentation based on deepfake audio.
A dataset of English speech produced by Indian speakers was selected, ensuring the presence of a single accent.
arXiv Detail & Related papers (2023-09-22T11:33:03Z)
- WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research [82.42802570171096]
We introduce WavCaps, the first large-scale weakly-labelled audio captioning dataset, comprising approximately 400k audio clips with paired captions.
Online-harvested raw descriptions are highly noisy and unsuitable for direct use in tasks such as automated audio captioning.
We propose a three-stage processing pipeline for filtering noisy data and generating high-quality captions, where ChatGPT, a large language model, is leveraged to filter and transform raw descriptions automatically.
arXiv Detail & Related papers (2023-03-30T14:07:47Z)
- SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection [54.74467470358476]
This paper proposes a dataset for scene fake audio detection named SceneFake.
A manipulated audio sample is generated by tampering only with the acoustic scene of the original audio.
Some scene fake audio detection benchmark results on the SceneFake dataset are reported in this paper.
arXiv Detail & Related papers (2022-11-11T09:05:50Z)
- Faked Speech Detection with Zero Prior Knowledge [2.407976495888858]
We introduce a neural network method to develop a classifier that will blindly classify an input audio as real or mimicked.
We propose a deep neural network following a sequential model that comprises three hidden layers with alternating dense and dropout layers (a sketch of such a model appears after this list).
We achieved at least 94% correct classification of the test cases, compared with 85% accuracy for human observers.
arXiv Detail & Related papers (2022-09-26T10:38:39Z)
- Partially Fake Audio Detection by Self-attention-based Fake Span Discovery [89.21979663248007]
We propose a novel framework that introduces a question-answering (fake span discovery) strategy with a self-attention mechanism to detect partially fake audio (see the span-prediction sketch after this list).
Our submission ranked second in the partially fake audio detection track of ADD 2022.
arXiv Detail & Related papers (2022-02-14T13:20:55Z)
- FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset [21.199288324085444]
Recently, a new problem has emerged: generating a cloned or synthesized version of a person's voice.
With the emerging threat of impersonation attacks using deepfake videos and audio, new deepfake detectors are needed that focus on both video and audio.
We propose a novel audio-video deepfake dataset (FakeAVCeleb) that contains not only deepfake videos but also the corresponding synthesized cloned audio.
arXiv Detail & Related papers (2021-08-11T07:49:36Z)
- VGGSound: A Large-scale Audio-Visual Dataset [160.1604237188594]
We propose a scalable pipeline to create an audio dataset from open-source media.
We use this pipeline to curate the VGGSound dataset consisting of more than 210k videos for 310 audio classes.
The resulting dataset can be used for training and evaluating audio recognition models.
arXiv Detail & Related papers (2020-04-29T17:46:54Z)
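As referenced in the RFP entry above, the following is a small sketch of how an equal error rate (EER) can be computed from detector scores by sweeping a decision threshold. It is a standard construction shown for illustration, not code from that paper; the score convention (higher = more likely fake) is an assumption.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """EER of a fake-audio detector.

    scores: higher means 'more likely fake' (assumed convention).
    labels: 1 = fake, 0 = real.
    Sweeps decision thresholds and returns the error rate where the
    false-alarm rate on real audio and the miss rate on fake audio meet.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(scores):
        pred_fake = scores >= t
        far = np.mean(pred_fake[labels == 0])   # real flagged as fake
        frr = np.mean(~pred_fake[labels == 1])  # fake missed
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

if __name__ == "__main__":
    # Toy scores and labels, for illustration only.
    print(equal_error_rate([0.9, 0.8, 0.4, 0.35, 0.1], [1, 1, 0, 1, 0]))
```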
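The "Faked Speech Detection with Zero Prior Knowledge" entry describes a sequential network with three hidden layers alternating dense and dropout layers; the block below sketches one plausible reading of that description in Keras. The input feature size, layer widths, dropout rate, and optimizer are assumptions, not the paper's hyperparameters.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_real_vs_mimicked_classifier(n_features: int = 120) -> tf.keras.Model:
    """Sequential dense/dropout classifier over fixed-size audio features.

    All sizes below are illustrative assumptions.
    """
    model = models.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # 1 = mimicked, 0 = real
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    build_real_vs_mimicked_classifier().summary()
```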
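The "Partially Fake Audio Detection by Self-attention-based Fake Span Discovery" entry frames localization as question answering over frame features; the sketch below shows one way such a span-prediction head could look in PyTorch, with self-attention followed by start/end scoring. The module structure and dimensions are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class FakeSpanHead(nn.Module):
    """Predict start/end frames of a manipulated span, QA-style.

    One self-attention layer over per-frame features followed by separate
    start and end scoring heads; all dimensions are illustrative.
    """
    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.start_head = nn.Linear(d_model, 1)
        self.end_head = nn.Linear(d_model, 1)

    def forward(self, frames):                      # frames: (batch, time, d_model)
        ctx, _ = self.attn(frames, frames, frames)  # self-attention over time
        start_logits = self.start_head(ctx).squeeze(-1)  # (batch, time)
        end_logits = self.end_head(ctx).squeeze(-1)
        return start_logits, end_logits

if __name__ == "__main__":
    feats = torch.randn(2, 300, 256)  # toy batch of frame features
    start, end = FakeSpanHead()(feats)
    print(start.shape, end.shape)     # torch.Size([2, 300]) each
```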