KoDF: A Large-scale Korean DeepFake Detection Dataset
- URL: http://arxiv.org/abs/2103.10094v1
- Date: Thu, 18 Mar 2021 09:04:02 GMT
- Title: KoDF: A Large-scale Korean DeepFake Detection Dataset
- Authors: Patrick Kwon, Jaeseong You, Gyuhyeon Nam, Sungwoo Park, Gyeongsu Chae
- Abstract summary: Face-swap and face-reenactment methods have come to be collectively called deepfakes.
We have built the Korean DeepFake Detection dataset (KoDF), a large-scale collection of synthesized and real videos focused on Korean subjects.
In this paper, we provide a detailed description of methods used to construct the dataset, experimentally show the discrepancy between the distributions of KoDF and existing deepfake detection datasets, and underline the importance of using multiple datasets for real-world generalization.
- Score: 9.493398442214865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A variety of effective face-swap and face-reenactment methods have been
publicized in recent years, democratizing the face synthesis technology to a
great extent. Videos generated as such have come to be collectively called
deepfakes with a negative connotation, for various social problems they have
caused. Facing the emerging threat of deepfakes, we have built the Korean
DeepFake Detection Dataset (KoDF), a large-scale collection of synthesized and
real videos focused on Korean subjects. In this paper, we provide a detailed
description of methods used to construct the dataset, experimentally show the
discrepancy between the distributions of KoDF and existing deepfake detection
datasets, and underline the importance of using multiple datasets for
real-world generalization. KoDF is publicly available at
https://moneybrain-research.github.io/kodf in its entirety (i.e. real clips,
synthesized clips, clips with additive noise, and their corresponding
metadata).
Related papers
- Hindi audio-video-Deepfake (HAV-DF): A Hindi language-based Audio-video Deepfake Dataset [11.164272928464879]
Fake videos or speeches in Hindi can have an enormous impact on rural and semi-urban communities.
This paper aims to create the first Hindi deepfake dataset, named "Hindi audio-video-Deepfake" (HAV-DF).
arXiv Detail & Related papers (2024-11-23T05:18:43Z)
- Deepfake detection in videos with multiple faces using geometric-fakeness features [79.16635054977068]
Deepfakes of victims or public figures can be used by fraudsters for blackmail, extortion and financial fraud.
In our research we propose to use geometric-fakeness features (GFF) that characterize the dynamic degree of face presence in a video.
We employ our approach to analyze videos in which multiple faces are present simultaneously.
arXiv Detail & Related papers (2024-10-10T13:10:34Z)
- MegaScenes: Scene-Level View Synthesis at Scale [69.21293001231993]
Scene-level novel view synthesis (NVS) is fundamental to many vision and graphics applications.
We create a large-scale scene-level dataset from Internet photo collections, called MegaScenes, which contains over 100K structure from motion (SfM) reconstructions from around the world.
We analyze failure cases of state-of-the-art NVS methods and significantly improve generation consistency.
arXiv Detail & Related papers (2024-06-17T17:55:55Z)
- DeePhy: On Deepfake Phylogeny [58.01631614114075]
DeePhy is a novel Deepfake Phylogeny dataset which consists of 5040 deepfake videos generated using three different generation techniques.
We present a benchmark on the DeePhy dataset using six deepfake detection algorithms.
arXiv Detail & Related papers (2022-09-19T15:30:33Z)
- Audio-Visual Person-of-Interest DeepFake Detection [77.04789677645682]
The aim of this work is to propose a deepfake detector that can cope with the wide variety of manipulation methods and scenarios encountered in the real world.
We leverage a contrastive learning paradigm to learn the moving-face and audio segment embeddings that are most discriminative for each identity.
Our method can detect both single-modality (audio-only, video-only) and multi-modality (audio-video) attacks, and is robust to low-quality or corrupted videos.
arXiv Detail & Related papers (2022-04-06T20:51:40Z)
- Voice-Face Homogeneity Tells Deepfake [56.334968246631725]
Existing detection approaches focus on exploring specific artifacts in deepfake videos.
We propose to perform deepfake detection from an unexplored voice-face matching perspective.
Our model obtains significantly improved performance as compared to other state-of-the-art competitors.
arXiv Detail & Related papers (2022-03-04T09:08:50Z)
- Model Attribution of Face-swap Deepfake Videos [39.771800841412414]
We first introduce a new dataset with DeepFakes from Different Models (DFDM) based on several Autoencoder models.
Specifically, five generation models with variations in encoder, decoder, intermediate layer, input resolution, and compression ratio have been used to generate a total of 6,450 Deepfake videos.
We take Deepfakes model attribution as a multiclass classification task and propose a spatial and temporal attention based method to explore the differences among Deepfakes.
arXiv Detail & Related papers (2022-02-25T20:05:18Z)
- FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset [21.199288324085444]
Recently, a new problem has emerged: generating a cloned or synthesized voice of a person.
With the emerging threat of impersonation attacks using deepfake videos and audio, new deepfake detectors are needed that focus on both video and audio.
We propose a novel Audio-Video Deepfake dataset (FakeAVCeleb) that contains not only deepfake videos but also the respective synthesized cloned audio.
arXiv Detail & Related papers (2021-08-11T07:49:36Z)
- WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection [82.42495493102805]
We introduce a new dataset WildDeepfake which consists of 7,314 face sequences extracted from 707 deepfake videos collected completely from the internet.
We conduct a systematic evaluation of a set of baseline detection networks on both existing and our WildDeepfake datasets, and show that WildDeepfake is indeed a more challenging dataset, where the detection performance can decrease drastically.
arXiv Detail & Related papers (2021-01-05T11:10:32Z)
- The DeepFake Detection Challenge (DFDC) Dataset [8.451007921188019]
Deepfakes are a technique that allows anyone to swap two identities in a single video.
To counter this emerging threat, we have constructed an extremely large face swap video dataset.
All recorded subjects agreed to participate in and have their likenesses modified during the construction of the face-swapped dataset.
arXiv Detail & Related papers (2020-06-12T18:15:55Z)