Related papers: FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset

FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset

URL: http://arxiv.org/abs/2108.05080v2
Date: Thu, 12 Aug 2021 03:26:20 GMT
Title: FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset
Authors: Hasam Khalid and Shahroz Tariq and Simon S. Woo
Abstract summary: Recently, a new problem of generating cloned or synthesized human voice of a person is emerging. With the emerging threat of impersonation attacks using deepfake videos and audios, new deepfake detectors are need that focuses on both, video and audio. We propose a novel Audio-Video Deepfake dataset (FakeAVCeleb) that not only contains deepfake videos but respective synthesized cloned audios as well.
Score: 21.199288324085444
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the significant advancements made in generation of forged video and audio, commonly known as deepfakes, using deep learning technologies, the problem of its misuse is a well-known issue now. Recently, a new problem of generating cloned or synthesized human voice of a person is emerging. AI-based deep learning models can synthesize any person's voice requiring just a few seconds of audio. With the emerging threat of impersonation attacks using deepfake videos and audios, new deepfake detectors are need that focuses on both, video and audio. Detecting deepfakes is a challenging task and researchers have made numerous attempts and proposed several deepfake detection methods. To develop a good deepfake detector, a handsome amount of good quality dataset is needed that captures the real world scenarios. Many researchers have contributed in this cause and provided several deepfake dataset, self generated and in-the-wild. However, almost all of these datasets either contains deepfake videos or audio. Moreover, the recent deepfake datasets proposed by researchers have racial bias issues. Hence, there is a crucial need of a good deepfake video and audio deepfake dataset. To fill this gap, we propose a novel Audio-Video Deepfake dataset (FakeAVCeleb) that not only contains deepfake videos but respective synthesized cloned audios as well. We generated our dataset using recent most popular deepfake generation methods and the videos and audios are perfectly lip-synced with each other. To generate a more realistic dataset, we selected real YouTube videos of celebrities having four racial backgrounds (Caucasian, Black, East Asian and South Asian) to counter the racial bias issue. Lastly, we propose a novel multimodal detection method that detects deepfake videos and audios based on our multimodal Audio-Video deepfake dataset.

Related papers

Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook [101.30779332427217]
We survey deepfake generation and detection techniques, including the most recent developments in the field. We identify various kinds of deepfakes, according to the procedure used to alter or generate the fake content. We develop a novel multimodal benchmark to evaluate deepfake detectors on out-of-distribution content.
arXiv Detail & Related papers (2024-11-29T08:29:25Z)
Hindi audio-video-Deepfake (HAV-DF): A Hindi language-based Audio-video Deepfake Dataset [11.164272928464879]
Fake videos or speeches in Hindi can have an enormous impact on rural and semi-urban communities. This paper aims to create a first novel Hindi deep fake dataset, named Hindi audio-video-Deepfake'' (HAV-DF)
arXiv Detail & Related papers (2024-11-23T05:18:43Z)
AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting Multiple Experts for Video Deepfake Detection [53.448283629898214]
The recent proliferation of hyper-realistic deepfake videos has drawn attention to the threat of audio and visual forgeries. Most previous work on detecting AI-generated fake videos only utilize visual modality or audio modality. We propose an Audio-Visual Transformer-based Ensemble Network (AVTENet) framework that considers both acoustic manipulation and visual manipulation.
arXiv Detail & Related papers (2023-10-19T19:01:26Z)
SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection [54.74467470358476]
This paper proposes a dataset for scene fake audio detection named SceneFake. A manipulated audio is generated by only tampering with the acoustic scene of an original audio. Some scene fake audio detection benchmark results on the SceneFake dataset are reported in this paper.
arXiv Detail & Related papers (2022-11-11T09:05:50Z)
DeePhy: On Deepfake Phylogeny [58.01631614114075]
DeePhy is a novel Deepfake Phylogeny dataset which consists of 5040 deepfake videos generated using three different generation techniques. We present the benchmark on DeePhy dataset using six deepfake detection algorithms.
arXiv Detail & Related papers (2022-09-19T15:30:33Z)
Audio-Visual Person-of-Interest DeepFake Detection [77.04789677645682]
The aim of this work is to propose a deepfake detector that can cope with the wide variety of manipulation methods and scenarios encountered in the real world. We leverage a contrastive learning paradigm to learn the moving-face and audio segment embeddings that are most discriminative for each identity. Our method can detect both single-modality (audio-only, video-only) and multi-modality (audio-video) attacks, and is robust to low-quality or corrupted videos.
arXiv Detail & Related papers (2022-04-06T20:51:40Z)
How Deep Are the Fakes? Focusing on Audio Deepfake: A Survey [0.0]
This paper critically analyzes and provides a unique source of audio deepfake research, mostly ranging from 2016 to 2020. This survey provides readers with a summary of 1) different deepfake categories 2) how they could be created and detected 3) the most recent trends in this domain and shortcomings in detection methods. We found that Generative Adversarial Networks(GAN), Convolutional Neural Networks (CNN), and Deep Neural Networks (DNN) are common ways of creating and detecting deepfakes.
arXiv Detail & Related papers (2021-11-28T18:28:30Z)
Evaluation of an Audio-Video Multimodal Deepfake Dataset using Unimodal and Multimodal Detectors [18.862258543488355]
Deepfakes can cause security and privacy issues. New domain of cloning human voices using deep-learning technologies is also emerging. To develop a good deepfake detector, we need a detector that can detect deepfakes of multiple modalities.
arXiv Detail & Related papers (2021-09-07T11:00:20Z)
Half-Truth: A Partially Fake Audio Detection Dataset [60.08010668752466]
This paper develops a dataset for half-truth audio detection (HAD) Partially fake audio in the HAD dataset involves only changing a few words in an utterance. We can not only detect fake uttrances but also localize manipulated regions in a speech using this dataset.
arXiv Detail & Related papers (2021-04-08T08:57:13Z)
WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection [82.42495493102805]
We introduce a new dataset WildDeepfake which consists of 7,314 face sequences extracted from 707 deepfake videos collected completely from the internet. We conduct a systematic evaluation of a set of baseline detection networks on both existing and our WildDeepfake datasets, and show that WildDeepfake is indeed a more challenging dataset, where the detection performance can decrease drastically.
arXiv Detail & Related papers (2021-01-05T11:10:32Z)
Deepfake detection: humans vs. machines [4.485016243130348]
We present a subjective study conducted in a crowdsourcing-like scenario, which systematically evaluates how hard it is for humans to see if the video is deepfake or not. For each video, a simple question: "Is face of the person in the video real of fake?" was answered on average by 19 na"ive subjects. The evaluation demonstrates that while the human perception is very different from the perception of a machine, both successfully but in different ways are fooled by deepfakes.
arXiv Detail & Related papers (2020-09-07T15:20:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.