Exploring Unbiased Deepfake Detection via Token-Level Shuffling and Mixing
- URL: http://arxiv.org/abs/2501.04376v1
- Date: Wed, 08 Jan 2025 09:30:45 GMT
- Title: Exploring Unbiased Deepfake Detection via Token-Level Shuffling and Mixing
- Authors: Xinghe Fu, Zhiyuan Yan, Taiping Yao, Shen Chen, Xi Li
- Abstract summary: We identify two biases to which detectors may also overfit: position bias and content bias.
For the position bias, we observe that detectors are prone to lazily depending on the specific positions within an image.
As for content bias, we argue that detectors may mistakenly rely on forgery-unrelated information for detection.
- Score: 22.61113682126067
- Abstract: The generalization problem is broadly recognized as a critical challenge in detecting deepfakes. Most previous work attributes the generalization gap to the differences among various forgery methods. However, our investigation reveals that the generalization issue can still occur when forgery-irrelevant factors shift. In this work, we identify two biases to which detectors may also overfit: position bias and content bias, as depicted in Fig. 1. For the position bias, we observe that detectors are prone to lazily depending on specific positions within an image (e.g., central regions, even when they contain no forgery). As for content bias, we argue that detectors may mistakenly rely on forgery-unrelated information for detection (e.g., background and hair). To intervene on these biases, we propose two branches that shuffle and mix tokens in the latent space of transformers. In the shuffling branch, we rearrange the tokens and their corresponding position embeddings for each image while maintaining the local correlation. In the mixing branch, we randomly select and mix tokens in the latent space between two images with the same label within the mini-batch to recombine the content information. During learning, we align the outputs of detectors from different branches in both feature space and logit space. Contrastive losses for features and divergence losses for logits are applied to obtain unbiased feature representations and classifiers. We demonstrate and verify the effectiveness of our method through extensive experiments on widely used evaluation datasets.
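The shuffling and mixing branches described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how local correlation is preserved, so the block-wise window permutation below is an assumption, as are all function names, the `ratio` parameter, and the use of a symmetric KL divergence as the logit-space loss. The feature-space contrastive loss is left out. Plain Python lists stand in for tensors.

```python
import math
import random


def block_permutation(grid, block, rng):
    """Permutation of grid*grid row-major token indices that moves whole
    block x block windows but keeps token order inside each window.
    (Assumed way of "maintaining the local correlation".)"""
    assert grid % block == 0, "grid must be divisible by block"
    n = grid // block  # windows per side
    windows = [(r, c) for r in range(n) for c in range(n)]
    shuffled = windows[:]
    rng.shuffle(shuffled)
    perm = [0] * (grid * grid)
    for (dr, dc), (sr, sc) in zip(windows, shuffled):
        for i in range(block):
            for j in range(block):
                dst = (dr * block + i) * grid + (dc * block + j)
                src = (sr * block + i) * grid + (sc * block + j)
                perm[dst] = src
    return perm


def shuffle_tokens(tokens, pos_embed, perm):
    """Apply one permutation to tokens and position embeddings together,
    so each token keeps its own position embedding."""
    return [tokens[i] for i in perm], [pos_embed[i] for i in perm]


def mix_tokens(tokens_a, tokens_b, ratio, rng):
    """Replace a random subset of image A's tokens with image B's tokens at
    the same indices. A and B are assumed to share the same label."""
    n = len(tokens_a)
    chosen = set(rng.sample(range(n), int(n * ratio)))
    return [tokens_b[i] if i in chosen else tokens_a[i] for i in range(n)]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def symmetric_kl(logits_p, logits_q):
    """Symmetric KL divergence between two branches' predictions
    (one assumed choice of divergence loss on logits)."""
    p, q = softmax(logits_p), softmax(logits_q)
    kl_pq = sum(a * math.log(a / b) for a, b in zip(p, q))
    kl_qp = sum(b * math.log(b / a) for a, b in zip(p, q))
    return 0.5 * (kl_pq + kl_qp)
```

In a training loop, one would presumably feed the original, shuffled, and mixed token sequences through the same transformer and penalize disagreement between their logits, for example with `symmetric_kl`.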
Related papers
- ED$^4$: Explicit Data-level Debiasing for Deepfake Detection [24.695989108814018]
Learning intrinsic bias from limited data has been considered the main reason deepfake detectors fail to generalize.
We present ED$^4$, a simple and effective strategy to address the aforementioned biases explicitly at the data level.
We conduct extensive experiments to demonstrate its effectiveness and superiority over existing deepfake detection approaches.
arXiv Detail & Related papers (2024-08-13T10:05:20Z) - GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features [68.14842693208465]
GeneralAD is an anomaly detection framework designed to operate in semantic, near-distribution, and industrial settings.
We propose a novel self-supervised anomaly generation module that employs straightforward operations like noise addition and shuffling to patch features.
We extensively evaluated our approach on ten datasets, achieving state-of-the-art results on six and on-par performance on the remaining four.
arXiv Detail & Related papers (2024-07-17T09:27:41Z) - Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection [57.646582245834324]
We propose a simple yet effective deepfake detector called LSDA.
It is based on an idea: representations with a wider variety of forgeries should be able to learn a more generalizable decision boundary.
We show that our proposed method is surprisingly effective and transcends state-of-the-art detectors across several widely used benchmarks.
arXiv Detail & Related papers (2023-11-19T09:41:10Z) - Is Probing All You Need? Indicator Tasks as an Alternative to Probing Embedding Spaces [19.4968960182412]
We introduce the term indicator tasks for non-trainable tasks which are used to query embedding spaces for the existence of certain properties.
We show that the application of a suitable indicator provides a more accurate picture of the information captured and removed compared to probes.
arXiv Detail & Related papers (2023-10-24T15:08:12Z) - Spatial-Frequency Discriminability for Revealing Adversarial Perturbations [53.279716307171604]
Vulnerability of deep neural networks to adversarial perturbations has been widely perceived in the computer vision community.
Current algorithms typically detect adversarial patterns through discriminative decomposition for natural and adversarial data.
We propose a discriminative detector relying on a spatial-frequency Krawtchouk decomposition.
arXiv Detail & Related papers (2023-05-18T10:18:59Z) - Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection [98.66771688028426]
We propose Ambiguity-Resistant Semi-supervised Learning (ARSL) for one-stage detectors.
Joint-Confidence Estimation (JCE) is proposed to quantify the classification and localization quality of pseudo labels.
ARSL effectively mitigates the ambiguities and achieves state-of-the-art SSOD performance on MS COCO and PASCAL VOC.
arXiv Detail & Related papers (2023-03-27T07:46:58Z) - The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector [70.43599299422813]
Existing methods fuse multiple annotations using a simple voting process, ignoring the inherent ambiguity of edges and labeling bias of annotators.
We propose a novel uncertainty-aware edge detector (UAED), which employs uncertainty to investigate the subjectivity and ambiguity of diverse annotations.
UAED achieves superior performance consistently across multiple edge detection benchmarks.
arXiv Detail & Related papers (2023-03-21T13:14:36Z) - Causal Transportability for Visual Recognition [70.13627281087325]
We show that standard classifiers fail because the association between images and labels is not transportable across settings.
We then show that the causal effect, which severs all sources of confounding, remains invariant across domains.
This motivates us to develop an algorithm to estimate the causal effect for image classification.
arXiv Detail & Related papers (2022-04-26T15:02:11Z) - MC-LCR: Multi-modal contrastive classification by locally correlated representations for effective face forgery detection [11.124150983521158]
We propose a novel framework named Multi-modal Contrastive Classification by Locally Correlated Representations.
Our MC-LCR aims to amplify implicit local discrepancies between authentic and forged faces from both spatial and frequency domains.
We achieve state-of-the-art performance and demonstrate the robustness and generalization of our method.
arXiv Detail & Related papers (2021-10-07T09:24:12Z) - Partial Wasserstein and Maximum Mean Discrepancy distances for bridging the gap between outlier detection and drift detection [0.0]
An important aspect of monitoring is to check whether the inputs have strayed from the distribution they were validated for.
We bridge the gap between outlier detection and drift detection through comparing a given number of inputs to an automatically chosen part of the reference distribution.
arXiv Detail & Related papers (2021-06-09T18:49:55Z)