XAI-Based Detection of Adversarial Attacks on Deepfake Detectors
- URL: http://arxiv.org/abs/2403.02955v1
- Date: Tue, 5 Mar 2024 13:25:30 GMT
- Title: XAI-Based Detection of Adversarial Attacks on Deepfake Detectors
- Authors: Ben Pinhasov, Raz Lapid, Rony Ohayon, Moshe Sipper and Yehudit Aperstein
- Abstract summary: We introduce a novel methodology for identifying adversarial attacks on deepfake detectors using XAI.
Our approach contributes not only to the detection of deepfakes but also enhances the understanding of possible adversarial attacks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel methodology for identifying adversarial attacks on
deepfake detectors using eXplainable Artificial Intelligence (XAI). In an era
characterized by digital advancement, deepfakes have emerged as a potent tool,
creating a demand for efficient detection systems. However, these systems are
frequently targeted by adversarial attacks that inhibit their performance. We
address this gap, developing a defensible deepfake detector by leveraging the
power of XAI. The proposed methodology uses XAI to generate interpretability
maps for a given method, providing explicit visualizations of decision-making
factors within the AI models. We subsequently employ a pretrained feature
extractor that processes both the input image and its corresponding XAI image.
The feature embeddings extracted from this process are then used for training a
simple yet effective classifier. Our approach contributes not only to the
detection of deepfakes but also enhances the understanding of possible
adversarial attacks, pinpointing potential vulnerabilities. Furthermore, this
approach does not change the performance of the deepfake detector. The paper
demonstrates promising results suggesting a potential pathway for future
deepfake detection mechanisms. We believe this study will serve as a valuable
contribution to the community, sparking much-needed discourse on safeguarding
deepfake detectors.
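The pipeline described in the abstract (XAI map for the detector, a fixed feature extractor applied to both the image and its XAI map, then a simple classifier on the embeddings) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the logistic `w_det` detector, the gradient-times-input `xai_map`, the FGSM-style `fgsm` attack, the random-projection `extract` function, and all sizes are hypothetical stand-ins chosen so the example runs with NumPy alone.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # flattened 8x8 toy "image"; all sizes here are illustrative

# Hypothetical stand-in for the protected deepfake detector: p(fake | x).
w_det = rng.normal(size=D) / np.sqrt(D)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def detector_prob(x):
    """Detector score for a batch x of shape (n, D)."""
    return sigmoid(x @ w_det)

def detector_grad(x):
    """Gradient of the detector score with respect to each input pixel."""
    p = detector_prob(x)
    return (p * (1.0 - p))[:, None] * w_det

def xai_map(x):
    """Gradient-times-input attribution, one simple choice of XAI map."""
    return detector_grad(x) * x

def fgsm(x, eps=0.1):
    """FGSM-style step that lowers the 'fake' score of every input."""
    return x - eps * np.sign(detector_grad(x))

# Hypothetical stand-in for a pretrained feature extractor (fixed, never trained here).
W_feat = rng.normal(size=(D, 16)) / np.sqrt(D)

def extract(v):
    return np.maximum(v @ W_feat, 0.0)

def embed(x):
    """Concatenate the embedding of the image with that of its XAI map."""
    return np.concatenate([extract(x), extract(xai_map(x))], axis=1)

def train_logreg(X, y, lr=0.5, steps=500):
    """Plain gradient-descent logistic regression as the 'simple classifier'."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        err = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

clean = rng.normal(size=(200, D))
attacked = fgsm(clean)
X = embed(np.vstack([clean, attacked]))
y = np.concatenate([np.zeros(200), np.ones(200)])  # 1 = adversarial input

w, b = train_logreg(X, y)
pred = (sigmoid(X @ w + b) > 0.5).astype(int)
print("training accuracy of the attack detector:", (pred == y).mean())
```

Note that `w_det` is only read, never updated: the attack detector sits alongside the deepfake detector, which is in line with the abstract's statement that the approach does not change the detector's performance. In a realistic setting the stand-ins would be replaced by an actual detector, an actual attribution method, and an actual pretrained backbone.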
Related papers
- DF40: Toward Next-Generation Deepfake Detection [62.073997142001424]
Existing works identify top-notch detection algorithms and models by adhering to the common practice: training detectors on one specific dataset (e.g., FF++) and testing them on other prevalent deepfake datasets.
But can these stand-out "winners" be truly applied to tackle the myriad of realistic and diverse deepfakes lurking in the real world?
We construct a highly diverse and large-scale deepfake dataset called DF40, which comprises 40 distinct deepfake techniques.
We then conduct comprehensive evaluations using 4 standard evaluation protocols and 7 representative detectors, resulting in over 2,000 evaluations.
arXiv Detail & Related papers (2024-06-19T12:35:02Z)
- Evolving from Single-modal to Multi-modal Facial Deepfake Detection: A Survey [40.11614155244292]
As AI-generated media become more realistic, the risk of misuse to spread misinformation and commit identity fraud increases.
This work traces the evolution from traditional single-modality methods to sophisticated multi-modal approaches that handle audio-visual and text-visual scenarios.
To our knowledge, this is the first survey of its kind.
arXiv Detail & Related papers (2024-06-11T05:48:04Z)
- Real is not True: Backdoor Attacks Against Deepfake Detection [9.572726483706846]
We introduce a pioneering paradigm denominated as Bad-Deepfake, which represents a novel foray into the realm of backdoor attacks levied against deepfake detectors.
Our approach hinges upon the strategic manipulation of a subset of the training data, enabling us to wield disproportionate influence over the operational characteristics of a trained model.
arXiv Detail & Related papers (2024-03-11T10:57:14Z)
- Adversarially Robust Deepfake Detection via Adversarial Feature Similarity Learning [0.0]
Deepfake technology has raised concerns about the authenticity of digital content, necessitating the development of effective detection methods.
Adversaries can manipulate deepfake videos with small, imperceptible perturbations that can deceive the detection models into producing incorrect outputs.
We introduce Adversarial Feature Similarity Learning (AFSL), which integrates three fundamental deep feature learning paradigms.
arXiv Detail & Related papers (2024-02-06T11:35:05Z)
- What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection [53.063161380423715]
Existing detection models have shown remarkable success in discriminating known deepfake audio, but struggle when encountering new attack types.
We propose a continual learning approach called Radian Weight Modification (RWM) for audio deepfake detection.
arXiv Detail & Related papers (2023-12-15T09:52:17Z)
- Facial Forgery-based Deepfake Detection using Fine-Grained Features [7.378937711027777]
Facial forgery by deepfakes has caused major security risks and raised severe societal concerns.
We formulate deepfake detection as a fine-grained classification problem and propose a new fine-grained solution to it.
Our method is based on learning subtle and generalizable features by effectively suppressing background noise and learning discriminative features at various scales for deepfake detection.
arXiv Detail & Related papers (2023-10-10T21:30:05Z)
- Improving Cross-dataset Deepfake Detection with Deep Information Decomposition [57.284370468207214]
Deepfake technology poses a significant threat to security and social trust.
Existing detection methods suffer from sharp performance degradation when faced with cross-dataset scenarios.
We propose a deep information decomposition (DID) framework in this paper.
arXiv Detail & Related papers (2023-09-30T12:30:25Z)
- Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can potentially lead to malicious consequences such as plagiarism, generating fake news, spamming, etc.
Recent works attempt to tackle this problem either using certain model signatures present in the generated text outputs or by applying watermarking techniques.
In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
- Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks [76.35478518372692]
We introduce epsilon-illusory, a novel form of adversarial attack on sequential decision-makers.
Compared to existing attacks, we empirically find epsilon-illusory to be significantly harder to detect with automated methods.
Our findings suggest the need for better anomaly detectors, as well as effective hardware- and system-level defenses.
arXiv Detail & Related papers (2022-07-20T19:49:09Z)
- Self-supervised Transformer for Deepfake Detection [112.81127845409002]
Deepfake techniques in real-world scenarios require stronger generalization abilities of face forgery detectors.
Inspired by transfer learning, neural networks pre-trained on other large-scale face-related tasks may provide useful features for deepfake detection.
In this paper, we propose a self-supervised transformer based audio-visual contrastive learning method.
arXiv Detail & Related papers (2022-03-02T17:44:40Z)
- Understanding the Security of Deepfake Detection [23.118012417901078]
We study the security of state-of-the-art deepfake detection methods in adversarial settings.
We use two large-scale public deepfakes data sources including FaceForensics++ and Facebook Deepfake Detection Challenge.
Our results uncover multiple security limitations of the deepfake detection methods in adversarial settings.
arXiv Detail & Related papers (2021-07-05T14:18:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.