The DeepFake Detection Challenge (DFDC) Dataset
- URL: http://arxiv.org/abs/2006.07397v4
- Date: Wed, 28 Oct 2020 03:48:28 GMT
- Title: The DeepFake Detection Challenge (DFDC) Dataset
- Authors: Brian Dolhansky, Joanna Bitton, Ben Pflaum, Jikuo Lu, Russ Howes,
Menglin Wang, Cristian Canton Ferrer
- Abstract summary: Deepfakes are a technique that allows anyone to swap two identities in a single video.
To counter this emerging threat, we have constructed an extremely large face swap video dataset.
All recorded subjects agreed to participate in and have their likenesses modified during the construction of the face-swapped dataset.
- Score: 8.451007921188019
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deepfakes are a recent off-the-shelf manipulation technique that allows
anyone to swap two identities in a single video. In addition to Deepfakes, a
variety of GAN-based face swapping methods have also been published with
accompanying code. To counter this emerging threat, we have constructed an
extremely large face swap video dataset to enable the training of detection
models, and organized the accompanying DeepFake Detection Challenge (DFDC)
Kaggle competition. Importantly, all recorded subjects agreed to participate in
and have their likenesses modified during the construction of the face-swapped
dataset. The DFDC dataset is by far the largest face swap video dataset that is
currently publicly available, with over 100,000 total clips sourced from
3,426 paid actors, produced with several Deepfake, GAN-based, and non-learned
methods. In addition to describing the methods used to construct the dataset,
we provide a detailed analysis of the top submissions from the Kaggle contest.
We show that although Deepfake detection is extremely difficult and still an
unsolved problem, a Deepfake detection model trained only on the DFDC can
generalize to real "in-the-wild" Deepfake videos, and such a model can be a
valuable analysis tool when analyzing potentially Deepfaked videos. Training,
validation, and testing corpora can be downloaded from
https://ai.facebook.com/datasets/dfdc.
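As a rough sketch of how the downloaded corpus might be consumed for training, the snippet below assumes a Kaggle-style release layout in which each `dfdc_train_part_*` folder ships a `metadata.json` mapping clip filenames to `REAL`/`FAKE` labels; the folder names, field names, and local path are illustrative assumptions, not details taken from the paper.

```python
import json
from pathlib import Path

# Assumed Kaggle-style layout: dfdc/dfdc_train_part_*/metadata.json maps each
# clip filename to a record such as {"label": "FAKE", "original": ..., "split": ...}.
DATA_ROOT = Path("dfdc")  # hypothetical local download location


def load_labels(root: Path) -> dict[str, int]:
    """Collect clip paths and binary labels (1 = fake, 0 = real) from every part folder."""
    labels: dict[str, int] = {}
    for meta_file in sorted(root.glob("dfdc_train_part_*/metadata.json")):
        with meta_file.open() as f:
            metadata = json.load(f)
        for clip_name, info in metadata.items():
            clip_path = meta_file.parent / clip_name
            labels[str(clip_path)] = 1 if info.get("label") == "FAKE" else 0
    return labels


if __name__ == "__main__":
    labels = load_labels(DATA_ROOT)
    n_fake = sum(labels.values())
    print(f"{len(labels)} clips indexed: {n_fake} fake, {len(labels) - n_fake} real")
```

Frame extraction, face cropping, and the detection model itself are deliberately out of scope here; the point is only that clip-level labels are sufficient to set up the binary training task the abstract describes.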
Related papers
- Deepfake detection in videos with multiple faces using geometric-fakeness features [79.16635054977068]
Deepfakes of victims or public figures can be used by fraudsters for blackmail, extortion, and financial fraud.
In our research we propose to use geometric-fakeness features (GFF) that characterize the dynamic degree of face presence in a video.
We employ our approach to analyze videos in which multiple faces are present simultaneously.
arXiv Detail & Related papers (2024-10-10T13:10:34Z)
- DeePhy: On Deepfake Phylogeny [58.01631614114075]
DeePhy is a novel Deepfake Phylogeny dataset consisting of 5,040 deepfake videos generated using three different generation techniques.
We present the benchmark on DeePhy dataset using six deepfake detection algorithms.
arXiv Detail & Related papers (2022-09-19T15:30:33Z)
- Voice-Face Homogeneity Tells Deepfake [56.334968246631725]
Existing detection approaches focus on exploring specific artifacts in deepfake videos.
We propose to perform deepfake detection from an unexplored voice-face matching view.
Our model obtains significantly improved performance as compared to other state-of-the-art competitors.
arXiv Detail & Related papers (2022-03-04T09:08:50Z)
- Model Attribution of Face-swap Deepfake Videos [39.771800841412414]
We first introduce a new dataset with DeepFakes from Different Models (DFDM) based on several Autoencoder models.
Specifically, five generation models with variations in encoder, decoder, intermediate layer, input resolution, and compression ratio have been used to generate a total of 6,450 Deepfake videos.
We treat Deepfake model attribution as a multiclass classification task and propose a spatial- and temporal-attention-based method to explore the differences among Deepfakes.
arXiv Detail & Related papers (2022-02-25T20:05:18Z)
- M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
- Deepfake Detection Scheme Based on Vision Transformer and Distillation [4.716110829725784]
We propose a Vision Transformer model with distillation methodology for detecting fake videos.
We verify that the proposed scheme, which takes patch embeddings as input, outperforms state-of-the-art methods based on combined CNN features.
arXiv Detail & Related papers (2021-04-03T09:13:05Z)
- KoDF: A Large-scale Korean DeepFake Detection Dataset [9.493398442214865]
Face-swap and face-reenactment methods have come to be collectively called deepfakes.
We have built the Korean DeepFake Detection dataset (KoDF), a large-scale collection of synthesized and real videos focused on Korean subjects.
In this paper, we provide a detailed description of methods used to construct the dataset, experimentally show the discrepancy between the distributions of KoDF and existing deepfake detection datasets, and underline the importance of using multiple datasets for real-world generalization.
arXiv Detail & Related papers (2021-03-18T09:04:02Z)
- WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection [82.42495493102805]
We introduce a new dataset, WildDeepfake, which consists of 7,314 face sequences extracted from 707 deepfake videos collected entirely from the internet.
We conduct a systematic evaluation of a set of baseline detection networks on both existing and our WildDeepfake datasets, and show that WildDeepfake is indeed a more challenging dataset, where the detection performance can decrease drastically.
arXiv Detail & Related papers (2021-01-05T11:10:32Z)
- Sharp Multiple Instance Learning for DeepFake Video Detection [54.12548421282696]
We introduce the new problem of partial face attacks in DeepFake videos, where only video-level labels are provided and not all faces in a fake video are manipulated.
A sharp MIL (S-MIL) is proposed that builds a direct mapping from instance embeddings to the bag prediction (a minimal, generic multiple-instance sketch follows this list).
Experiments on FFPMS and the widely used DFDC dataset verify that S-MIL outperforms its counterparts for partially attacked DeepFake video detection.
arXiv Detail & Related papers (2020-08-11T08:52:17Z)
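The S-MIL entry above maps instance (per-face) embeddings directly to a bag (video-level) prediction. The snippet below is a minimal, generic multiple-instance aggregation under that reading, using noisy-OR pooling of per-face fake probabilities; it is an illustrative assumption, not the authors' sharp-MIL operator, and all names and scores are hypothetical.

```python
import numpy as np


def bag_prediction(instance_scores: np.ndarray) -> float:
    """Aggregate per-face fake probabilities into a single video-level score.

    instance_scores: shape (num_faces,), each value in [0, 1] from some
    face-level detector. Noisy-OR pooling flags the video as fake if at
    least one face looks fake, which suits the partial-attack setting where
    only video-level labels exist. Generic MIL, not the paper's S-MIL.
    """
    scores = np.clip(instance_scores, 1e-6, 1.0 - 1e-6)
    return float(1.0 - np.prod(1.0 - scores))


if __name__ == "__main__":
    # Hypothetical per-face scores for a video with three detected faces,
    # only one of which is manipulated.
    print(f"video-level fake score: {bag_prediction(np.array([0.05, 0.92, 0.10])):.3f}")
```

Because the aggregation is differentiable, the video-level label alone can supervise the face-level scorer end to end, which is the appeal of the MIL framing in this setting.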
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above (including all content) and is not responsible for any consequences of its use.