Voice-Face Homogeneity Tells Deepfake
- URL: http://arxiv.org/abs/2203.02195v1
- Date: Fri, 4 Mar 2022 09:08:50 GMT
- Title: Voice-Face Homogeneity Tells Deepfake
- Authors: Harry Cheng and Yangyang Guo and Tianyi Wang and Qi Li and Tao Ye and
Liqiang Nie
- Abstract summary: Existing detection approaches focus on specific artifacts in deepfake videos.
We propose to perform the deepfake detection from an unexplored voice-face matching view.
Our model obtains significantly improved performance as compared to other state-of-the-art competitors.
- Score: 56.334968246631725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting forged videos is in high demand due to the abuse of
deepfakes. Existing detection approaches focus on the specific artifacts in
deepfake videos and fit well on certain data. However, evolving forgery
techniques keep challenging the robustness of traditional deepfake detectors,
and progress toward generalizable detection has stalled. To address this
issue, given the empirical results that
the identities behind voices and faces are often mismatched in deepfake videos,
and the voices and faces have homogeneity to some extent, in this paper, we
propose to perform the deepfake detection from an unexplored voice-face
matching view. To this end, a voice-face matching detection model is devised to
measure the matching degree of these two on a generic audio-visual dataset.
Thereafter, this model can be smoothly transferred to deepfake datasets without
any fine-tuning, and the generalization across datasets is accordingly
enhanced. We conduct extensive experiments over two widely exploited datasets -
DFDC and FakeAVCeleb. Our model obtains significantly improved performance as
compared to other state-of-the-art competitors and maintains favorable
generalizability. The code has been released at
https://github.com/xaCheng1996/VFD.
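The core idea above — scoring how well a voice matches a face and flagging low-scoring pairs as deepfakes — can be sketched as follows. This is a minimal illustration, not the released VFD implementation: the two encoders are stubbed with hypothetical fixed linear projections, whereas in the paper they would be networks trained on a generic audio-visual dataset so that matching voice-face pairs land close together in a shared embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 128

# Hypothetical stand-ins for pretrained voice/face encoders (assumed names).
W_voice = rng.standard_normal((40, EMB_DIM))
W_face = rng.standard_normal((40, EMB_DIM))

def embed_voice(features: np.ndarray) -> np.ndarray:
    """Project voice features into the shared space and L2-normalize."""
    v = features @ W_voice
    return v / np.linalg.norm(v)

def embed_face(features: np.ndarray) -> np.ndarray:
    """Project face features into the shared space and L2-normalize."""
    v = features @ W_face
    return v / np.linalg.norm(v)

def matching_score(voice_feat: np.ndarray, face_feat: np.ndarray) -> float:
    """Cosine similarity between voice and face embeddings, in [-1, 1]."""
    return float(embed_voice(voice_feat) @ embed_face(face_feat))

def is_deepfake(voice_feat, face_feat, threshold: float = 0.5) -> bool:
    # A low voice-face matching degree suggests mismatched identities,
    # which the paper observes is common in deepfake videos.
    return matching_score(voice_feat, face_feat) < threshold
```

Because the matching model is trained only on generic audio-visual data, the same threshold-based decision can be applied to unseen deepfake datasets without fine-tuning, which is the source of the cross-dataset generalization claimed above.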
Related papers
- DF40: Toward Next-Generation Deepfake Detection [62.073997142001424]
Existing works identify top-notch detection algorithms and models by adhering to the common practice: training detectors on one specific dataset (e.g., FF++) and testing them on other prevalent deepfake datasets.
But can these stand-out "winners" be truly applied to tackle the myriad of realistic and diverse deepfakes lurking in the real world?
We construct a highly diverse and large-scale deepfake dataset called DF40, which comprises 40 distinct deepfake techniques.
We then conduct comprehensive evaluations using 4 standard evaluation protocols and 7 representative detectors, resulting in over 2,000 evaluations.
arXiv Detail & Related papers (2024-06-19T12:35:02Z)
- Training-Free Deepfake Voice Recognition by Leveraging Large-Scale Pre-Trained Models [52.04189118767758]
Generalization is a main issue for current audio deepfake detectors.
In this paper we study the potential of large-scale pre-trained models for audio deepfake detection.
arXiv Detail & Related papers (2024-05-03T15:27:11Z)
- In Anticipation of Perfect Deepfake: Identity-anchored Artifact-agnostic Detection under Rebalanced Deepfake Detection Protocol [20.667392938528987]
We introduce the Rebalanced Deepfake Detection Protocol (RDDP) to stress-test detectors under balanced scenarios.
We present ID-Miner, a detector that identifies the puppeteer behind the disguise by focusing on motion over artifacts or appearances.
arXiv Detail & Related papers (2024-05-01T12:48:13Z)
- MIS-AVoiDD: Modality Invariant and Specific Representation for Audio-Visual Deepfake Detection [4.659427498118277]
A new kind of deepfake has emerged in which either the audio or the visual modality is manipulated.
Existing multimodal deepfake detectors are often based on the fusion of the audio and visual streams from the video.
In this paper, we tackle the problem at the representation level to aid the fusion of audio and visual streams for multimodal deepfake detection.
arXiv Detail & Related papers (2023-10-03T17:43:24Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- Model Attribution of Face-swap Deepfake Videos [39.771800841412414]
We first introduce a new dataset with DeepFakes from Different Models (DFDM) based on several Autoencoder models.
Specifically, five generation models with variations in encoder, decoder, intermediate layer, input resolution, and compression ratio have been used to generate a total of 6,450 Deepfake videos.
We take Deepfakes model attribution as a multiclass classification task and propose a spatial and temporal attention based method to explore the differences among Deepfakes.
arXiv Detail & Related papers (2022-02-25T20:05:18Z)
- Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are commonly indistinguishable from real to human eyes.
We propose a novel fake detection method designed to re-synthesize test images and extract visual cues for detection.
We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
arXiv Detail & Related papers (2021-05-29T21:22:24Z)
- Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using Affective Cues [75.1731999380562]
We present a learning-based method for detecting real and fake deepfake multimedia content.
We extract and analyze the similarity between the audio and visual modalities within the same video.
We compare our approach with several SOTA deepfake detection methods and report per-video AUC of 84.4% on the DFDC and 96.6% on the DF-TIMIT datasets.
arXiv Detail & Related papers (2020-03-14T22:07:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.