Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples
- URL: http://arxiv.org/abs/2502.03957v1
- Date: Thu, 06 Feb 2025 10:47:34 GMT
- Title: Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples
- Authors: Konstantinos Tsigos, Evlampios Apostolidis, Vasileios Mezaris
- Abstract summary: We introduce the idea of using adversarially-generated samples of the input images that were classified as deepfakes by a detector.
We generate these samples based on Natural Evolution Strategies, aiming to flip the original deepfake detector's decision and classify these samples as real.
We apply this idea to four perturbation-based explanation methods and evaluate the performance of the resulting modified methods.
- Score: 6.076406622352117
- Abstract: In this paper, we introduce the idea of using adversarially-generated samples of the input images that were classified as deepfakes by a detector, to form perturbation masks for inferring the importance of different input features and produce visual explanations. We generate these samples based on Natural Evolution Strategies, aiming to flip the original deepfake detector's decision and classify these samples as real. We apply this idea to four perturbation-based explanation methods (LIME, SHAP, SOBOL and RISE) and evaluate the performance of the resulting modified methods using a SOTA deepfake detection model, a benchmarking dataset (FaceForensics++) and a corresponding explanation evaluation framework. Our quantitative assessments document the mostly positive contribution of the proposed perturbation approach in the performance of explanation methods. Our qualitative analysis shows the capacity of the modified explanation methods to demarcate the manipulated image regions more accurately, and thus to provide more useful explanations.
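To make the core mechanism concrete, below is a minimal sketch of the NES-based sample generation under stated assumptions: `detector` is a black-box function returning the probability that an image in [0, 1] is real, and the function name and hyperparameters are illustrative rather than the paper's settings. The returned sample, which the detector (ideally) classifies as real, can then serve as the basis for the perturbation masks used by LIME, SHAP, SOBOL and RISE.

```python
import numpy as np

def nes_adversarial_sample(image, detector, steps=50, pop=40, sigma=0.1, lr=0.02):
    """Push a deepfake-classified image toward the detector's 'real' class
    using Natural Evolution Strategies (black-box gradient estimation).
    image: float array in [0, 1]; detector: image -> P(real)."""
    x = image.astype(np.float32).copy()
    for _ in range(steps):
        # Antithetic sampling: evaluate +/- noise pairs for a lower-variance estimate.
        half = np.random.randn(pop // 2, *x.shape).astype(np.float32)
        noise = np.concatenate([half, -half], axis=0)
        scores = np.array([detector(np.clip(x + sigma * n, 0.0, 1.0)) for n in noise])
        # NES estimate of the gradient of P(real) with respect to the input.
        grad = (scores.reshape(-1, *([1] * x.ndim)) * noise).mean(axis=0) / sigma
        x = np.clip(x + lr * grad, 0.0, 1.0)   # gradient ascent on P(real)
        if detector(x) > 0.5:                  # decision flipped: stop early
            break
    return x
```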
Related papers
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing, powered by generative models, pose serious risks.
In this paper, we investigate how detection performance varies across model backbones, types, and datasets.
We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data [6.267143531261792]
We propose a novel detection algorithm for detecting unknown objects in image data.
It exploits supervised dimensionality reduction techniques to mitigate the effects of the curse of dimensionality on the features extracted by the model.
It utilizes high-resolution feature maps to identify potential unknown objects in an unsupervised fashion.
arXiv Detail & Related papers (2024-11-07T10:15:25Z) - Towards Quantitative Evaluation of Explainable AI Methods for Deepfake Detection [12.179602756337818]
The proposed framework assesses the ability of an explanation method to spot the regions of a fake image with the biggest influence on the decision of the deepfake detector.
We conduct a comparative study using a state-of-the-art model for deepfake detection that has been trained on the FaceForensics++ dataset.
arXiv Detail & Related papers (2024-04-29T12:32:14Z)
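As a rough illustration of what such an evaluation framework measures, the sketch below implements a generic deletion-style check; the function name, occlusion scheme and step count are our assumptions, not the framework's actual protocol.

```python
import numpy as np

def deletion_auc(image, saliency, detector, steps=50):
    """Occlude the most salient pixels first and track how quickly the
    detector's 'fake' probability drops; a faithful explanation map
    should produce a fast drop (small area under the curve).
    image: (H, W, C) in [0, 1]; saliency: (H, W); detector: image -> P(fake)."""
    order = np.argsort(saliency.ravel())[::-1]    # pixel indices, most salient first
    flat = image.copy().reshape(-1, image.shape[-1])
    chunk = max(1, order.size // steps)
    probs = [detector(flat.reshape(image.shape))]
    for start in range(0, order.size, chunk):
        flat[order[start:start + chunk]] = 0.0    # black out the next chunk of pixels
        probs.append(detector(flat.reshape(image.shape)))
    return float(np.mean(probs))                  # low value = faithful explanation
```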
arXiv Detail & Related papers (2024-04-29T12:32:14Z) - A Quantitative Evaluation of Score Distillation Sampling Based
Text-to-3D [54.78611187426158]
We propose more objective quantitative evaluation metrics, which we cross-validate via human ratings, and show analysis of the failure cases of the SDS technique.
We demonstrate the effectiveness of this analysis by designing a novel computationally efficient baseline model.
arXiv Detail & Related papers (2024-02-29T00:54:09Z) - Diffusion-based Visual Counterfactual Explanations -- Towards Systematic
Quantitative Evaluation [64.0476282000118]
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality.
It is currently difficult to compare the performance of these VCE methods, as evaluation procedures vary widely and often boil down to visual inspection of individual examples and small-scale user studies.
We propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used.
arXiv Detail & Related papers (2023-08-11T12:22:37Z) - Self-Supervised Graph Transformer for Deepfake Detection [1.8133635752982105]
Deepfake detection methods have shown promising results in recognizing forgeries within a given dataset.
A deepfake detection system must remain impartial to forgery types, appearance, and quality to guarantee generalizable detection performance.
This study introduces a deepfake detection framework, leveraging a self-supervised pre-training model that delivers exceptional generalization ability.
arXiv Detail & Related papers (2023-07-27T17:22:41Z) - ODAM: Gradient-based instance-specific visual explanations for object
detection [51.476702316759635]
We propose gradient-weighted Object Detector Activation Maps (ODAM).
ODAM produces heat maps that show the influence of regions on the detector's decision for each predicted attribute.
We propose Odam-NMS, which considers the information of the model's explanation for each prediction to distinguish duplicate detected objects.
arXiv Detail & Related papers (2023-04-13T09:20:26Z)
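A hedged sketch of the gradient-weighting idea: the function below weights a retained feature map by the gradient of a single instance's attribute score. The interface and the element-wise weighting are our reading of the summary, not the paper's exact formulation.

```python
import torch

def odam_style_heatmap(feature_map, attribute_score):
    """Instance-specific heat map: weight each feature-map location by the
    gradient of one predicted attribute (e.g. a detected box's class score)
    and keep positive evidence only.
    feature_map: (C, H, W) tensor kept in the autograd graph;
    attribute_score: scalar tensor for a single detected instance."""
    grads = torch.autograd.grad(attribute_score, feature_map, retain_graph=True)[0]
    heat = (grads * feature_map).sum(dim=0)   # element-wise weighting, summed over channels
    return torch.relu(heat)                   # regions supporting this prediction
```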
arXiv Detail & Related papers (2023-04-13T09:20:26Z) - Assessment Framework for Deepfake Detection in Real-world Situations [13.334500258498798]
Deep learning-based deepfake detection methods have exhibited remarkable performance.
However, the impact of various image and video processing operations and typical workflow distortions on detection accuracy has not been systematically measured.
A more reliable assessment framework is proposed to evaluate the performance of learning-based deepfake detectors in more realistic settings.
arXiv Detail & Related papers (2023-04-12T19:09:22Z)
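A minimal sketch of the kind of measurement such a framework performs, assuming a `detector` that maps a uint8 RGB array to a {0, 1} label; JPEG re-compression stands in for the paper's broader set of processing operations and workflow distortions.

```python
import io
import numpy as np
from PIL import Image

def accuracy_under_jpeg(detector, images, labels, quality=30):
    """Re-evaluate a deepfake detector after JPEG re-compression, a typical
    workflow distortion, to expose robustness gaps hidden by clean benchmarks."""
    correct = 0
    for img, label in zip(images, labels):
        buf = io.BytesIO()
        Image.fromarray(img).save(buf, format="JPEG", quality=quality)
        distorted = np.asarray(Image.open(buf).convert("RGB"))
        correct += int(detector(distorted) == label)
    return correct / len(images)
```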
arXiv Detail & Related papers (2023-04-12T19:09:22Z) - Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
In this paper, we propose a general methodology named watermarking.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
arXiv Detail & Related papers (2022-10-27T06:12:32Z)
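The mechanic can be sketched as learning one shared input-space pattern; the snippet below uses a plain max-softmax OOD score and an illustrative 3x32x32 shape, whereas the paper's actual objective and score function may differ.

```python
import torch
import torch.nn.functional as F

def learn_watermark(model, id_loader, epochs=5, lr=0.01):
    """Learn a single pattern that, superimposed on every input, makes the
    model more confident on in-distribution data, so a confidence-based
    OOD score separates ID from OOD more cleanly."""
    model.eval()
    w = torch.zeros(3, 32, 32, requires_grad=True)     # one shared watermark
    opt = torch.optim.SGD([w], lr=lr)
    for _ in range(epochs):
        for x, y in id_loader:
            logits = model(torch.clamp(x + w, 0.0, 1.0))
            loss = F.cross_entropy(logits, y)          # reward confident ID predictions
            opt.zero_grad()
            loss.backward()
            opt.step()
    return w.detach()

def ood_score(model, x, w):
    # Higher max-softmax after watermarking => more likely in-distribution.
    return model(torch.clamp(x + w, 0.0, 1.0)).softmax(-1).max(-1).values
```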
arXiv Detail & Related papers (2022-10-27T06:12:32Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
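The reconstruction-based baseline this work builds on can be sketched as below; the paper adds a self-supervised, discriminative training regime on top, which this minimal autoencoder deliberately omits.

```python
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Small convolutional autoencoder trained on normal samples only;
    at test time the reconstruction error serves as the anomaly score."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

def anomaly_score(model, x):
    # Per-sample mean squared reconstruction error; high = anomalous.
    with torch.no_grad():
        return ((model(x) - x) ** 2).flatten(1).mean(1)
```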
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Explaining Convolutional Neural Networks through Attribution-Based Input
Sampling and Block-Wise Feature Aggregation [22.688772441351308]
Methods based on class activation mapping and randomized input sampling have gained great popularity.
However, attribution methods provide lower-resolution, blurry explanation maps that limit their explanatory power.
In this work, we collect visualization maps from multiple layers of the model based on an attribution-based input sampling technique.
We also propose a layer selection strategy that applies to the whole family of CNN-based models.
arXiv Detail & Related papers (2020-10-01T20:27:30Z)