Explicit Correlation Learning for Generalizable Cross-Modal Deepfake Detection
- URL: http://arxiv.org/abs/2404.19171v1
- Date: Tue, 30 Apr 2024 00:25:44 GMT
- Title: Explicit Correlation Learning for Generalizable Cross-Modal Deepfake Detection
- Authors: Cai Yu, Shan Jia, Xiaomeng Fu, Jin Liu, Jiahe Tian, Jiao Dai, Xi Wang, Siwei Lyu, Jizhong Han
- Abstract summary: This paper aims to learn potential cross-modal correlation to enhance deepfake detection towards various generation scenarios.
Our approach introduces a correlation distillation task, which models the inherent cross-modal correlation based on content information.
We present the Cross-Modal Deepfake dataset with four generation methods to evaluate the detection of diverse cross-modal deepfakes.
- Score: 33.20064862916194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rising prevalence of deepfakes, there is a growing interest in developing generalizable detection methods for various types of deepfakes. While effective in their specific modalities, traditional detection methods fall short in addressing the generalizability of detection across diverse cross-modal deepfakes. This paper aims to explicitly learn potential cross-modal correlation to enhance deepfake detection towards various generation scenarios. Our approach introduces a correlation distillation task, which models the inherent cross-modal correlation based on content information. This strategy helps to prevent the model from overfitting merely to audio-visual synchronization. Additionally, we present the Cross-Modal Deepfake Dataset (CMDFD), a comprehensive dataset with four generation methods to evaluate the detection of diverse cross-modal deepfakes. The experimental results on CMDFD and FakeAVCeleb datasets demonstrate the superior generalizability of our method over existing state-of-the-art methods. Our code and data can be found at https://github.com/ljj898/CMDFD-Dataset-and-Deepfake-Detection.
Related papers
- Contextual Cross-Modal Attention for Audio-Visual Deepfake Detection and Localization [3.9440964696313485]
In the digital age, the emergence of deepfakes and synthetic media presents a significant threat to societal and political integrity.
Deepfakes based on multi-modal manipulation, such as audio-visual, are more realistic and pose a greater threat.
We propose a novel multi-modal attention framework based on recurrent neural networks (RNNs) that leverages contextual information for audio-visual deepfake detection.
arXiv Detail & Related papers (2024-08-02T18:45:01Z) - DF40: Toward Next-Generation Deepfake Detection [62.073997142001424]
Existing works identify top-notch detection algorithms and models by adhering to the common practice: training detectors on one specific dataset and testing them on other prevalent deepfake datasets.
But can these stand-out "winners" be truly applied to tackle the myriad of realistic and diverse deepfakes lurking in the real world?
We construct a highly diverse deepfake detection dataset called DF40, which comprises 40 distinct deepfake techniques.
arXiv Detail & Related papers (2024-06-19T12:35:02Z) - Facial Forgery-based Deepfake Detection using Fine-Grained Features [7.378937711027777]
Facial forgery by deepfakes has caused major security risks and raised severe societal concerns.
We formulate deepfake detection as a fine-grained classification problem and propose a new fine-grained solution to it.
Our method is based on learning subtle and generalizable features by effectively suppressing background noise and learning discriminative features at various scales for deepfake detection.
arXiv Detail & Related papers (2023-10-10T21:30:05Z) - CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF).
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z) - Towards Generalizable Deepfake Detection by Primary Region Regularization [52.41801719896089]
This paper enhances the generalization capability from a novel regularization perspective.
Our method consists of two stages, namely the static localization for primary region maps, and the dynamic exploitation of primary region masks.
We conduct extensive experiments over three widely used deepfake datasets - DFDC, DF-1.0, and Celeb-DF with five backbones.
arXiv Detail & Related papers (2023-07-24T05:43:34Z) - Learning Pairwise Interaction for Generalizable DeepFake Detection [20.723277551489186]
The fast-paced development of DeepFake generation techniques challenges detection schemes designed for known types of DeepFakes.
We propose a new approach, Multi-Channel Xception Attention Pairwise Interaction (MCX-API), that exploits the power of pairwise learning and complementary information from different color space representations.
Our experiments indicate that our proposed method can generalize better than the state-of-the-art Deepfakes detectors.
arXiv Detail & Related papers (2023-02-26T10:39:08Z) - A Continual Deepfake Detection Benchmark: Dataset, Methods, and Essentials [97.69553832500547]
This paper suggests a continual deepfake detection benchmark (CDDB) over a new collection of deepfakes from both known and unknown generative models.
We exploit multiple approaches to adapt multiclass incremental learning methods, commonly used in continual visual recognition, to the continual deepfake detection problem.
arXiv Detail & Related papers (2022-05-11T13:07:19Z) - MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos.
We present a novel approach, termed MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z) - TAR: Generalized Forensic Framework to Detect Deepfakes using Weakly Supervised Learning [17.40885531847159]
Deepfakes have become a critical social problem, and detecting them is of utmost importance.
In this work, we introduce a practical digital forensic tool to detect different types of deepfakes simultaneously.
We develop an autoencoder-based detection model with Residual blocks and sequentially perform transfer learning to detect different types of deepfakes simultaneously.
arXiv Detail & Related papers (2021-05-13T07:31:08Z) - M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.