Ensemble-Based Deepfake Detection using State-of-the-Art Models with Robust Cross-Dataset Generalisation
- URL: http://arxiv.org/abs/2507.05996v1
- Date: Tue, 08 Jul 2025 13:54:48 GMT
- Title: Ensemble-Based Deepfake Detection using State-of-the-Art Models with Robust Cross-Dataset Generalisation
- Authors: Haroon Wahab, Hassan Ugail, Lujain Jaleel
- Abstract summary: Machine learning-based Deepfake detection models have achieved impressive results on benchmark datasets. But their performance often deteriorates significantly when evaluated on out-of-distribution data. In this work, we investigate an ensemble-based approach for improving the generalization of deepfake detection systems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning-based Deepfake detection models have achieved impressive results on benchmark datasets, yet their performance often deteriorates significantly when evaluated on out-of-distribution data. In this work, we investigate an ensemble-based approach for improving the generalization of deepfake detection systems across diverse datasets. Building on a recent open-source benchmark, we combine prediction probabilities from several state-of-the-art asymmetric models proposed at top venues. Our experiments span two distinct out-of-domain datasets and demonstrate that no single model consistently outperforms others across settings. In contrast, ensemble-based predictions provide more stable and reliable performance in all scenarios. Our results suggest that asymmetric ensembling offers a robust and scalable solution for real-world deepfake detection where prior knowledge of forgery type or quality is often unavailable.
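The abstract states only that prediction probabilities from several detectors are combined; it does not give the fusion rule. As a minimal sketch, assuming a simple (optionally weighted) mean of per-sample fake probabilities and a 0.5 decision threshold, the idea might look like the following. The detector names, scores, and helper functions are hypothetical placeholders, not the authors' models or code.

```python
from typing import Dict, List, Optional


def ensemble_probabilities(
    model_probs: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Fuse per-sample fake probabilities from several detectors.

    Assumption: a weighted arithmetic mean; with weights=None every
    detector contributes equally.
    """
    names = list(model_probs)
    if not names:
        raise ValueError("at least one model is required")
    n_samples = len(model_probs[names[0]])
    if any(len(model_probs[name]) != n_samples for name in names):
        raise ValueError("all models must score the same samples")
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    return [
        sum(weights[name] * model_probs[name][i] for name in names) / total
        for i in range(n_samples)
    ]


def predict_labels(probs: List[float], threshold: float = 0.5) -> List[int]:
    """Label a sample as fake (1) when its fused probability reaches the threshold."""
    return [int(p >= threshold) for p in probs]


# Hypothetical scores from three detectors on three videos.
scores = {
    "detector_a": [0.9, 0.2, 0.6],
    "detector_b": [0.7, 0.4, 0.3],
    "detector_c": [0.8, 0.1, 0.9],
}
fused = ensemble_probabilities(scores)
labels = predict_labels(fused)
```

In this toy input, detector_b alone would call the third video real (0.3), while the fused score does not, which illustrates why averaging can be more stable than any single model when no detector is reliable across all settings.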
Related papers
- Zero-Shot Image Anomaly Detection Using Generative Foundation Models [2.241618130319058]
This research explores the use of score-based generative models as foundational tools for semantic anomaly detection. By analyzing Stein score errors, we introduce a novel method for identifying anomalous samples without requiring re-training on each target dataset. Our approach improves over state-of-the-art and relies on training a single model on one dataset -- CelebA -- which we find to be an effective base distribution.
arXiv Detail & Related papers (2025-07-30T13:56:36Z)
- CoCAI: Copula-based Conformal Anomaly Identification for Multivariate Time-Series [0.3495246564946556]
We propose a novel framework that harnesses the power of generative artificial intelligence and copula-based modeling to deliver accurate predictions and enable robust anomaly detection.
arXiv Detail & Related papers (2025-07-23T14:15:31Z)
- On the Robustness of Human-Object Interaction Detection against Distribution Shift [27.40641711088878]
Human-Object Interaction (HOI) detection has seen substantial advances in recent years. Existing works focus on the standard setting with ideal images and natural distribution, far from practical scenarios with inevitable distribution shifts. In this work, we investigate this issue by benchmarking, analyzing, and enhancing the robustness of HOI detection models under various distribution shifts.
arXiv Detail & Related papers (2025-06-22T13:01:34Z)
- Generalization is not a universal guarantee: Estimating similarity to training data with an ensemble out-of-distribution metric [0.09363323206192666]
Failure of machine learning models to generalize to new data is a core problem limiting the reliability of AI systems. We propose a standardized approach for assessing data similarity by constructing a supervised autoencoder for generalizability estimation (SAGE). We show that out-of-the-box model performance increases after SAGE score filtering, even when applied to data from the model's own training and test datasets.
arXiv Detail & Related papers (2025-02-22T19:21:50Z)
- Ranking and Combining Latent Structured Predictive Scores without Labeled Data [2.5064967708371553]
This paper introduces a novel structured unsupervised ensemble learning model (SUEL).
It exploits the dependency between a set of predictors with continuous predictive scores, ranks the predictors without labeled data, and combines them into a weighted ensemble score.
The efficacy of the proposed methods is rigorously assessed through both simulation studies and real-world application of risk genes discovery.
arXiv Detail & Related papers (2024-08-14T20:14:42Z)
- Bayesian Detector Combination for Object Detection with Crowdsourced Annotations [49.43709660948812]
Acquiring fine-grained object detection annotations in unconstrained images is time-consuming, expensive, and prone to noise.
We propose a novel Bayesian Detector Combination (BDC) framework to more effectively train object detectors with noisy crowdsourced annotations.
BDC is model-agnostic, requires no prior knowledge of the annotators' skill level, and seamlessly integrates with existing object detection models.
arXiv Detail & Related papers (2024-07-10T18:00:54Z)
- GM-DF: Generalized Multi-Scenario Deepfake Detection [49.072106087564144]
Existing face forgery detection usually follows the paradigm of training models in a single domain.
In this paper, we elaborately investigate the generalization capacity of deepfake detection models when jointly trained on multiple face forgery detection datasets.
arXiv Detail & Related papers (2024-06-28T17:42:08Z)
- Self-Supervised Graph Transformer for Deepfake Detection [1.8133635752982105]
Deepfake detection methods have shown promising results in recognizing forgeries within a given dataset.
A deepfake detection system must remain impartial to forgery types, appearance, and quality for guaranteed generalizable detection performance.
This study introduces a deepfake detection framework, leveraging a self-supervised pre-training model that delivers exceptional generalization ability.
arXiv Detail & Related papers (2023-07-27T17:22:41Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [49.15931834209624]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world. We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique. By further elaborating the robustness metric, a model is judged to be robust if its performance is consistently accurate on the overall cliques.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Hidden Biases in Unreliable News Detection Datasets [60.71991809782698]
We show that selection bias during data collection leads to undesired artifacts in the datasets.
We observed a significant drop (>10%) in accuracy for all models tested in a clean split with no train/test source overlap.
We suggest future dataset creation include a simple model as a difficulty/bias probe and future model development use a clean non-overlapping site and date split.
arXiv Detail & Related papers (2021-04-20T17:16:41Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)