Unearthing Common Inconsistency for Generalisable Deepfake Detection
- URL: http://arxiv.org/abs/2311.11549v1
- Date: Mon, 20 Nov 2023 06:04:09 GMT
- Title: Unearthing Common Inconsistency for Generalisable Deepfake Detection
- Authors: Beilin Chu, Xuan Xu, Weike You and Linna Zhou
- Abstract summary: Video-level detection shows potential for both generalization across multiple domains and robustness to compression.
We propose a detection approach that captures the frame inconsistency broadly present across different forgery techniques.
We introduce a temporally-preserved module to apply spatial noise perturbations, directing the model's attention towards temporal information.
- Score: 8.327980745153216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deepfake has existed for several years, yet efficient detection techniques
that generalize across different manipulation methods still require further
research. While current image-level detection methods fail to generalize to
unseen domains, owing to the domain-shift phenomenon brought about by CNNs'
strong inductive bias towards Deepfake texture, video-level methods show
potential for both generalization across multiple domains and robustness to
compression. We argue that although distinct face manipulation tools have
different inherent biases, they all disrupt the consistency between frames, a
natural characteristic shared by authentic videos. Inspired by this, we propose
a detection approach, termed unearthing-common-inconsistency (UCI), that
captures the frame inconsistency broadly present across different forgery
techniques. Concretely, the UCI network, based on self-supervised contrastive
learning, can better distinguish the temporal consistency of real and fake
videos from multiple domains. We introduce a temporally-preserved module to
apply spatial noise perturbations, directing the model's attention towards
temporal information. Subsequently, leveraging a multi-view cross-correlation
learning module, we extensively learn the disparities in temporal
representations between genuine and fake samples. Extensive experiments
demonstrate the generalization ability of our method on unseen Deepfake
domains.
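The abstract's core idea can be illustrated with a minimal sketch: add per-frame spatial noise so low-level texture is unreliable while frame order is untouched, then score a clip by the similarity of consecutive frames. This is not the paper's UCI network; `spatial_noise_perturbation` and `temporal_consistency` are hypothetical toy functions assumed here for illustration only.

```python
import numpy as np

def spatial_noise_perturbation(clip, sigma=0.1, seed=0):
    """Add independent Gaussian noise to every frame of `clip`
    (shape (T, H, W)).  Spatial texture is corrupted, but the
    temporal ordering of frames is fully preserved."""
    rng = np.random.default_rng(seed)
    return clip + rng.normal(0.0, sigma, size=clip.shape)

def temporal_consistency(clip):
    """Mean cosine similarity between consecutive frames.
    Smoothly evolving (real-like) clips score higher than
    temporally shuffled (fake-like) ones."""
    flat = clip.reshape(clip.shape[0], -1)
    sims = []
    for a, b in zip(flat[:-1], flat[1:]):
        sims.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    return float(np.mean(sims))
```

For example, a clip whose frames drift smoothly in phase scores higher under `temporal_consistency` than the same frames in shuffled order, which is the signal a temporal detector would learn to exploit.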
Related papers
- DIP: Diffusion Learning of Inconsistency Pattern for General DeepFake Detection [18.116004258266535]
A transformer-based framework for Diffusion Inconsistency Learning (DIP) is proposed, which exploits directional inconsistencies for deepfake video detection.
Our method could effectively identify forgery clues and achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-10-31T06:26:00Z) - UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z) - Learning Natural Consistency Representation for Face Forgery Video Detection [23.53549629885891]
We propose to learn the Natural Consistency representation (NACO) of real face videos in a self-supervised manner.
Our method outperforms other state-of-the-art methods with impressive robustness.
arXiv Detail & Related papers (2024-07-15T09:00:02Z) - Adversarially Robust Deepfake Detection via Adversarial Feature Similarity Learning [0.0]
Deepfake technology has raised concerns about the authenticity of digital content, necessitating the development of effective detection methods.
Adversaries can manipulate deepfake videos with small, imperceptible perturbations that can deceive the detection models into producing incorrect outputs.
We introduce Adversarial Feature Similarity Learning (AFSL), which integrates three fundamental deep feature learning paradigms.
arXiv Detail & Related papers (2024-02-06T11:35:05Z) - Cross-Domain Local Characteristic Enhanced Deepfake Video Detection [18.430287055542315]
Deepfake detection has attracted increasing attention due to security concerns.
Many detectors cannot achieve accurate results when detecting unseen manipulations.
We propose a novel pipeline, Cross-Domain Local Forensics, for more general deepfake video detection.
arXiv Detail & Related papers (2022-11-07T07:44:09Z) - Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z) - Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption [94.5031244215761]
We propose to boost the generalization of deepfake detection by distinguishing the "regularity disruption" that does not appear in real videos.
Specifically, by carefully examining the spatial and temporal properties, we propose to disrupt a real video through a Pseudo-fake Generator.
Such practice allows us to achieve deepfake detection without using fake videos and improves the generalization ability in a simple and efficient manner.
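The "regularity disruption" summary above suggests a simple construction: perturb the temporal order inside one spatial region of a real clip to create a pseudo-fake, with no actual forged video needed. The sketch below is a hypothetical toy version of such a generator, not the paper's Pseudo-fake Generator.

```python
import numpy as np

def pseudo_fake(clip, region=(2, 6), seed=0):
    """Create a pseudo-fake from a real clip (shape (T, H, W)) by
    shuffling the temporal order of frames inside one spatial band
    of rows, disrupting spatio-temporal regularity there while
    leaving the rest of the clip untouched."""
    rng = np.random.default_rng(seed)
    t = clip.shape[0]
    perm = rng.permutation(t)
    if np.array_equal(perm, np.arange(t)):  # guard against identity shuffle
        perm = np.roll(perm, 1)
    fake = clip.copy()
    r0, r1 = region
    fake[:, r0:r1, :] = clip[perm, r0:r1, :]
    return fake
```

A detector can then be trained to separate real clips from such self-generated disruptions, which is one way to avoid depending on any specific forgery method's artifacts.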
arXiv Detail & Related papers (2022-07-21T10:42:34Z) - Detection of Deepfake Videos Using Long Distance Attention [73.6659488380372]
Most existing detection methods treat the problem as a vanilla binary classification problem.
In this paper, the problem is treated as a special fine-grained classification problem since the differences between fake and real faces are very subtle.
A spatial-temporal model is proposed which has two components for capturing spatial and temporal forgery traces in global perspective.
arXiv Detail & Related papers (2021-06-24T08:33:32Z) - Spatio-temporal Features for Generalized Detection of Deepfake Videos [12.453288832098314]
We propose spatio-temporal features, modeled by 3D CNNs, to extend the capability to detect new sorts of deepfake videos.
We show that our approach outperforms existing methods in terms of generalization capabilities.
arXiv Detail & Related papers (2020-10-22T16:28:50Z) - DeFeat-Net: General Monocular Depth via Simultaneous Unsupervised Representation Learning [65.94499390875046]
DeFeat-Net is an approach to simultaneously learn a cross-domain dense feature representation.
Our technique is able to outperform the current state-of-the-art with around 10% reduction in all error measures.
arXiv Detail & Related papers (2020-03-30T13:10:32Z) - Cross-domain Object Detection through Coarse-to-Fine Feature Adaptation [62.29076080124199]
This paper proposes a novel coarse-to-fine feature adaptation approach to cross-domain object detection.
At the coarse-grained stage, foreground regions are extracted by adopting the attention mechanism, and aligned according to their marginal distributions.
At the fine-grained stage, we conduct conditional distribution alignment of foregrounds by minimizing the distance of global prototypes with the same category but from different domains.
arXiv Detail & Related papers (2020-03-23T13:40:06Z)