Exposing the Deception: Uncovering More Forgery Clues for Deepfake
Detection
- URL: http://arxiv.org/abs/2403.01786v1
- Date: Mon, 4 Mar 2024 07:28:23 GMT
- Title: Exposing the Deception: Uncovering More Forgery Clues for Deepfake
Detection
- Authors: Zhongjie Ba, Qingyu Liu, Zhenguang Liu, Shuang Wu, Feng Lin, Li Lu,
Kui Ren
- Abstract summary: Current deepfake detection approaches may easily fall into the trap of overfitting, focusing only on forgery clues within one or a few local regions.
We present a novel framework to capture broader forgery clues by extracting multiple non-overlapping local representations and fusing them into a global semantic-rich feature.
Our method achieves state-of-the-art performance on five benchmark datasets.
- Score: 36.92399832886853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deepfake technology has given rise to a spectrum of novel and compelling
applications. Unfortunately, the widespread proliferation of high-fidelity fake
videos has led to pervasive confusion and deception, shattering our faith that
seeing is believing. One aspect that has been overlooked so far is that current
deepfake detection approaches may easily fall into the trap of overfitting,
focusing only on forgery clues within one or a few local regions. Moreover,
existing works heavily rely on neural networks to extract forgery features,
lacking theoretical constraints guaranteeing that sufficient forgery clues are
extracted and superfluous features are eliminated. These deficiencies culminate
in unsatisfactory accuracy and limited generalizability in real-life scenarios.
In this paper, we try to tackle these challenges through three designs: (1)
We present a novel framework to capture broader forgery clues by extracting
multiple non-overlapping local representations and fusing them into a global
semantic-rich feature. (2) Based on the information bottleneck theory, we
derive Local Information Loss to guarantee the orthogonality of local
representations while preserving comprehensive task-relevant information. (3)
Further, to fuse the local representations and remove task-irrelevant
information, we arrive at a Global Information Loss through the theoretical
analysis of mutual information. Empirically, our method achieves
state-of-the-art performance on five benchmark datasets. Our code is available
at \url{https://github.com/QingyuLiu/Exposing-the-Deception}, hoping to inspire
researchers.
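The abstract describes the framework only at a high level. As an illustration of the first design (extracting multiple non-overlapping local representations and fusing them into a global feature), here is a minimal NumPy sketch. All function names are hypothetical, the cosine-similarity penalty is a heuristic stand-in for the paper's Local Information Loss (which is derived from the information bottleneck), and plain concatenation stands in for the fusion guided by the Global Information Loss:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_local_representations(features, num_regions):
    # Split a feature vector into non-overlapping local representations.
    # (Hypothetical: the abstract does not say how regions are partitioned.)
    assert features.shape[-1] % num_regions == 0
    return np.split(features, num_regions, axis=-1)

def local_orthogonality_penalty(local_reps):
    # Stand-in for the Local Information Loss: penalize squared cosine
    # similarity between every pair of local representations, pushing
    # them toward mutual orthogonality. The paper's actual loss comes
    # from information-bottleneck theory, not this heuristic.
    penalty = 0.0
    for i in range(len(local_reps)):
        for j in range(i + 1, len(local_reps)):
            a, b = local_reps[i], local_reps[j]
            cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
            penalty += cos ** 2
    return penalty

def fuse_global(local_reps):
    # Fuse the locals into one global semantic-rich feature; concatenation
    # is the simplest possible choice here. The paper instead removes
    # task-irrelevant information via a mutual-information-based
    # Global Information Loss.
    return np.concatenate(local_reps, axis=-1)

features = rng.normal(size=64)
local_reps = extract_local_representations(features, num_regions=4)
global_feat = fuse_global(local_reps)
print(len(local_reps), local_reps[0].shape, global_feat.shape)  # → 4 (16,) (64,)
```

In a real detector the locals would come from a learned backbone and the two losses would be optimized jointly with the classification objective; this sketch only shows the shapes and the orthogonality idea.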
Related papers
- DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion [94.46904504076124]
Deepfake technology has made face swapping highly realistic, raising concerns about the malicious use of fabricated facial content.
Existing methods often struggle to generalize to unseen domains due to the diverse nature of facial manipulations.
We introduce DiffusionFake, a novel framework that reverses the generative process of face forgeries to enhance the generalization of detection models.
arXiv Detail & Related papers (2024-10-06T06:22:43Z) - ED$^4$: Explicit Data-level Debiasing for Deepfake Detection [24.695989108814018]
Learning intrinsic bias from limited data is considered the main reason deepfake detectors fail to generalize.
We present ED$^4$, a simple and effective strategy that explicitly addresses the aforementioned biases at the data level.
We conduct extensive experiments to demonstrate its effectiveness and superiority over existing deepfake detection approaches.
arXiv Detail & Related papers (2024-08-13T10:05:20Z) - Locate and Verify: A Two-Stream Network for Improved Deepfake Detection [33.50963446256726]
Current deepfake detection methods typically lack generalizability.
We propose an innovative two-stream network that effectively enlarges the potential regions from which the model extracts evidence.
We also propose a Semi-supervised Patch Similarity Learning strategy to estimate patch-level forged location annotations.
arXiv Detail & Related papers (2023-09-20T08:25:19Z) - Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual
Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
Existing literature addresses this challenge by employing local-based representation approaches.
This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z) - Cross-Domain Local Characteristic Enhanced Deepfake Video Detection [18.430287055542315]
Deepfake detection has attracted increasing attention due to security concerns.
Many detectors cannot achieve accurate results when detecting unseen manipulations.
We propose a novel pipeline, Cross-Domain Local Forensics, for more general deepfake video detection.
arXiv Detail & Related papers (2022-11-07T07:44:09Z) - FedForgery: Generalized Face Forgery Detection with Residual Federated
Learning [87.746829550726]
Existing face forgery detection methods directly utilize the obtained public shared or centralized data for training.
The paper proposes a novel generalized residual Federated learning for face Forgery detection (FedForgery)
Experiments conducted on publicly available face forgery detection datasets prove the superior performance of the proposed FedForgery.
arXiv Detail & Related papers (2022-10-18T03:32:18Z) - Delving into Sequential Patches for Deepfake Detection [64.19468088546743]
Recent advances in face forgery techniques produce nearly untraceable deepfake videos, which could be leveraged with malicious intentions.
Previous studies have identified the importance of local low-level cues and temporal information in the pursuit of generalizing well across deepfake methods.
We propose the Local- & Temporal-aware Transformer-based Deepfake Detection framework, which adopts a local-to-global learning protocol.
arXiv Detail & Related papers (2022-07-06T16:46:30Z) - Multi-attentional Deepfake Detection [79.80308897734491]
Face forgery by deepfake is widely spread over the internet and has raised severe societal concerns.
We propose a new multi-attentional deepfake detection network. Specifically, it consists of three key components: 1) multiple spatial attention heads that make the network attend to different local parts; 2) a textural feature enhancement block that zooms in on subtle artifacts in shallow features; 3) aggregation of low-level textural features and high-level semantic features, guided by the attention maps.
arXiv Detail & Related papers (2021-03-03T13:56:14Z)
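The multi-attentional design summarized above can be sketched roughly in NumPy (the textural feature enhancement block is omitted; the random projection, tensor shapes, and pooling scheme are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_attention_aggregate(low_feat, high_feat, num_heads=3):
    # low_feat: (H*W, C_low) shallow textural features, one row per location.
    # high_feat: (C_high,) high-level semantic feature vector.
    n_loc, c_low = low_feat.shape
    # A random projection stands in for the learned attention weights.
    w = rng.normal(size=(c_low, num_heads)) * 0.1
    attn = softmax(low_feat @ w, axis=0)      # (H*W, num_heads): one spatial map per head
    pooled = (attn.T @ low_feat).reshape(-1)  # attention-weighted pooling of texture features
    # Aggregate low-level textural and high-level semantic features.
    return np.concatenate([pooled, high_feat])

low = rng.normal(size=(49, 8))   # a 7x7 spatial grid with 8 channels
high = rng.normal(size=16)
out = multi_attention_aggregate(low, high)
print(out.shape)  # → (40,)
```

Each attention head yields a spatial map summing to one over locations, so different heads can specialize to different local parts; the pooled per-head texture vectors are then concatenated with the semantic feature.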
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.