Towards Benchmarking and Evaluating Deepfake Detection
- URL: http://arxiv.org/abs/2203.02115v2
- Date: Wed, 13 Mar 2024 06:07:52 GMT
- Title: Towards Benchmarking and Evaluating Deepfake Detection
- Authors: Chenhao Lin, Jingyi Deng, Pengbin Hu, Chao Shen, Qian Wang, Qi Li
- Abstract summary: Deepfake detection automatically recognizes manipulated media by analyzing the differences between manipulated and non-altered videos.
It is difficult to conduct a sound benchmarking comparison of existing detection approaches because evaluation conditions are inconsistent across studies.
Our objective is to establish a comprehensive and consistent benchmark, to develop a repeatable evaluation procedure, and to measure the performance of a range of detection approaches.
- Score: 18.758101631430726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deepfake detection automatically recognizes manipulated media by
analyzing the differences between manipulated and non-altered videos. It
is natural to ask which are the top performers among the existing deepfake
detection approaches to identify promising research directions and provide
practical guidance. Unfortunately, it is difficult to conduct a sound
benchmarking comparison of existing detection approaches using the results in
the literature because evaluation conditions are inconsistent across studies.
Our objective is to establish a comprehensive and consistent benchmark, to
develop a repeatable evaluation procedure, and to measure the performance of a
range of detection approaches so that the results can be compared soundly. A
challenging dataset consisting of the manipulated samples generated by more
than 13 different methods has been collected, and 11 popular detection
approaches (9 algorithms) from the existing literature have been implemented
and evaluated with 6 fair-minded and practical evaluation metrics. Finally, 92
models have been trained and 644 experiments have been performed for the
evaluation. The results along with the shared data and evaluation methodology
constitute a benchmark for comparing deepfake detection approaches and
measuring progress.
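The abstract's core idea, that detectors can only be compared soundly when they are scored on the same data with the same metrics, can be sketched as a minimal evaluation harness. The detector names, scores, and labels below are hypothetical illustrations; the paper's actual protocol, datasets, and six metrics are defined in the paper itself.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label (1 = fake)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def evaluate(detectors, labels):
    """Score every detector on the same labels so results are directly comparable."""
    return {name: {"acc": accuracy(labels, scores), "auroc": auroc(labels, scores)}
            for name, scores in detectors.items()}

labels = [1, 1, 1, 0, 0, 0]                        # 1 = manipulated, 0 = pristine
detectors = {
    "detector_a": [0.9, 0.8, 0.6, 0.4, 0.2, 0.1],  # separates the classes well
    "detector_b": [0.7, 0.4, 0.6, 0.5, 0.3, 0.6],  # weaker separation
}
print(evaluate(detectors, labels))
```

Fixing the label set and metric definitions in one place is what makes the comparison repeatable: any new detector is evaluated under exactly the same conditions as the others.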
Related papers
- Datasets, Clues and State-of-the-Arts for Multimedia Forensics: An
Extensive Review [19.30075248247771]
This survey focuses on approaches for tampering detection in multimedia data using deep learning models.
It presents a detailed analysis of benchmark datasets for malicious manipulation detection that are publicly available.
It also offers a comprehensive list of tampering clues and commonly used deep learning architectures.
arXiv Detail & Related papers (2024-01-13T07:03:58Z)
- On Pixel-level Performance Assessment in Anomaly Detection [87.7131059062292]
Anomaly detection methods have demonstrated remarkable success across various applications.
However, assessing their performance, particularly at the pixel-level, presents a complex challenge.
In this paper, we dissect the intricacies of this challenge, underscored by visual evidence and statistical analysis.
arXiv Detail & Related papers (2023-10-25T08:02:27Z)
- Beyond AUROC & co. for evaluating out-of-distribution detection performance [50.88341818412508]
Given their relevance for safe(r) AI, it is important to examine whether the basis for comparing OOD detection methods is consistent with practical needs.
We propose a new metric - Area Under the Threshold Curve (AUTC), which explicitly penalizes poor separation between ID and OOD samples.
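The intuition behind a threshold-curve metric like AUTC can be sketched generically. The exact definition is in the cited paper; as a labeled assumption, the sketch below takes higher scores to mean "more OOD", sweeps thresholds over the [0, 1] score range, and averages the area under the FPR(t) and FNR(t) curves, so well-separated ID/OOD score distributions yield a low value while overlapping ones do not.

```python
def threshold_curve_area(id_scores, ood_scores, steps=1000):
    """Illustrative threshold-sweep metric: lower means better ID/OOD separation."""
    def fpr(t):  # fraction of in-distribution samples wrongly flagged as OOD
        return sum(s >= t for s in id_scores) / len(id_scores)
    def fnr(t):  # fraction of OOD samples missed at threshold t
        return sum(s < t for s in ood_scores) / len(ood_scores)
    ts = [i / steps for i in range(steps + 1)]
    area = 0.0
    # trapezoidal integration of (FPR + FNR) / 2 over the threshold range
    for a, b in zip(ts, ts[1:]):
        area += 0.5 * (0.5 * (fpr(a) + fnr(a)) + 0.5 * (fpr(b) + fnr(b))) * (b - a)
    return area

well_separated = threshold_curve_area([0.1, 0.2], [0.8, 0.9])
overlapping    = threshold_curve_area([0.4, 0.6], [0.5, 0.7])
print(well_separated, overlapping)  # the separated case scores lower
```

Unlike AUROC, which depends only on the ranking of scores, a metric of this shape is sensitive to how far apart the two score distributions actually sit, which is the "poor separation" the summary refers to.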
arXiv Detail & Related papers (2023-06-26T12:51:32Z)
- An Experimental Investigation into the Evaluation of Explainability Methods [60.54170260771932]
This work compares 14 different metrics when applied to nine state-of-the-art XAI methods and three dummy methods (e.g., random saliency maps) used as references.
Experimental results show which of these metrics produces highly correlated results, indicating potential redundancy.
arXiv Detail & Related papers (2023-05-25T08:07:07Z)
- Assessment Framework for Deepfake Detection in Real-world Situations [13.334500258498798]
Deep learning-based deepfake detection methods have exhibited remarkable performance.
The impact of various image and video processing operations and typical workflow distortions on detection accuracy has not been systematically measured.
A more reliable assessment framework is proposed to evaluate the performance of learning-based deepfake detectors in more realistic settings.
arXiv Detail & Related papers (2023-04-12T19:09:22Z)
- Better Understanding Differences in Attribution Methods via Systematic Evaluations [57.35035463793008]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We use these evaluation schemes to study strengths and shortcomings of some widely used attribution methods over a wide range of models.
arXiv Detail & Related papers (2023-03-21T14:24:58Z)
- Benchmarking common uncertainty estimation methods with histopathological images under domain shift and label noise [62.997667081978825]
In high-risk environments, deep learning models need to be able to judge their uncertainty and reject inputs when there is a significant chance of misclassification.
We conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole Slide Images.
We observe that ensembles of methods generally lead to better uncertainty estimates as well as an increased robustness towards domain shifts and label noise.
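The generic ensemble recipe the summary alludes to, averaging each member's class probabilities, using predictive entropy as the uncertainty estimate, and rejecting inputs whose entropy is too high, can be sketched as follows. The model outputs here are made-up numbers, not results from the cited study.

```python
import math

def ensemble_predict(member_probs):
    """Average class-probability vectors from the ensemble members."""
    n = len(member_probs)
    return [sum(p[c] for p in member_probs) / n for c in range(len(member_probs[0]))]

def entropy(probs):
    """Predictive entropy in nats; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def classify_or_reject(member_probs, max_entropy=0.5):
    """Return (predicted class, entropy), or ("reject", entropy) if too uncertain."""
    probs = ensemble_predict(member_probs)
    h = entropy(probs)
    if h > max_entropy:
        return ("reject", h)
    return (max(range(len(probs)), key=probs.__getitem__), h)

confident  = [[0.95, 0.05], [0.90, 0.10], [0.92, 0.08]]  # members agree
conflicted = [[0.90, 0.10], [0.15, 0.85], [0.55, 0.45]]  # members disagree
print(classify_or_reject(confident))   # predicts class 0
print(classify_or_reject(conflicted))  # rejects: disagreement raises entropy
```

Disagreement among members flattens the averaged distribution, which raises its entropy, so the ensemble naturally rejects exactly the inputs, such as domain-shifted or noisily labeled ones, on which its members diverge.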
arXiv Detail & Related papers (2023-01-03T11:34:36Z)
- A Continual Deepfake Detection Benchmark: Dataset, Methods, and Essentials [97.69553832500547]
This paper suggests a continual deepfake detection benchmark (CDDB) over a new collection of deepfakes from both known and unknown generative models.
We exploit multiple approaches to adapt multiclass incremental learning methods, commonly used in the continual visual recognition, to the continual deepfake detection problem.
arXiv Detail & Related papers (2022-05-11T13:07:19Z)
- A Revealing Large-Scale Evaluation of Unsupervised Anomaly Detection Algorithms [0.0]
Anomaly detection has many applications ranging from bank-fraud detection and cyber-threat detection to equipment maintenance and health monitoring.
We extensively reviewed twelve of the most popular unsupervised anomaly detection methods.
arXiv Detail & Related papers (2022-04-21T00:17:12Z)
- Practical Evaluation of Out-of-Distribution Detection Methods for Image Classification [22.26009759606856]
In this paper, we experimentally evaluate the performance of representative OOD detection methods for three scenarios.
The results show that differences in scenarios and datasets alter the relative performance among the methods.
Our results can also be used as a guide for the selection of OOD detection methods.
arXiv Detail & Related papers (2021-01-07T09:28:45Z)
- A Review and Comparative Study on Probabilistic Object Detection in Autonomous Driving [14.034548457000884]
Capturing uncertainty in object detection is indispensable for safe autonomous driving.
However, there is no existing summary of uncertainty estimation in deep object detection.
This paper provides a review and comparative study on existing probabilistic object detection methods.
arXiv Detail & Related papers (2020-11-20T22:30:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.