AWARE-NET: Adaptive Weighted Averaging for Robust Ensemble Network in Deepfake Detection
- URL: http://arxiv.org/abs/2505.00312v1
- Date: Thu, 01 May 2025 05:14:50 GMT
- Title: AWARE-NET: Adaptive Weighted Averaging for Robust Ensemble Network in Deepfake Detection
- Authors: Muhammad Salman, Iqra Tariq, Mishal Zulfiqar, Muqadas Jalal, Sami Aujla, Sumbal Fatima,
- Abstract summary: We propose a novel two-tier ensemble framework for deepfake detection based on deep learning.<n>Our framework employs a unique approach where each architecture is instantiated three times.<n>Experiments achieved state-of-the-art intra-dataset performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deepfake detection has become increasingly important due to the rise of synthetic media, which poses significant risks to digital identity and cyber presence for security and trust. While multiple approaches have improved detection accuracy, challenges remain in achieving consistent performance across diverse datasets and manipulation types. In response, we propose a novel two-tier ensemble framework for deepfake detection based on deep learning that hierarchically combines multiple instances of three state-of-the-art architectures: Xception, Res2Net101, and EfficientNet-B7. Our framework employs a unique approach where each architecture is instantiated three times with different initializations to enhance model diversity, followed by a learnable weighting mechanism that dynamically combines their predictions. Unlike traditional fixed-weight ensembles, our first-tier averages predictions within each architecture family to reduce model variance, while the second tier learns optimal contribution weights through backpropagation, automatically adjusting each architecture's influence based on their detection reliability. Our experiments achieved state-of-the-art intra-dataset performance with AUC scores of 99.22% (FF++) and 100.00% (CelebDF-v2), and F1 scores of 98.06% (FF++) and 99.94% (CelebDF-v2) without augmentation. With augmentation, we achieve AUC scores of 99.47% (FF++) and 100.00% (CelebDF-v2), and F1 scores of 98.43% (FF++) and 99.95% (CelebDF-v2). The framework demonstrates robust cross-dataset generalization, achieving AUC scores of 88.20% and 72.52%, and F1 scores of 93.16% and 80.62% in cross-dataset evaluations.
Related papers
- Hybrid Deepfake Image Detection: A Comprehensive Dataset-Driven Approach Integrating Convolutional and Attention Mechanisms with Frequency Domain Features [0.6700983301090583]
We propose an ensemble-based approach that employs three different neural network architectures for deepfake detection.<n>We empirically demonstrate the effectiveness of these models in grouping real and fake images into cohesive clusters.<n>Our weighted ensemble model achieves an excellent accuracy of 93.23% on the validation dataset of the SP Cup 2025 competition.
arXiv Detail & Related papers (2025-02-15T06:02:11Z) - Decorrelating Structure via Adapters Makes Ensemble Learning Practical for Semi-supervised Learning [50.868594148443215]
In computer vision, traditional ensemble learning methods exhibit either a low training efficiency or the limited performance.
We propose a lightweight, loss-function-free, and architecture-agnostic ensemble learning by the Decorrelating Structure via Adapters (DSA) for various visual tasks.
arXiv Detail & Related papers (2024-08-08T01:31:38Z) - A Scalable and Generalized Deep Learning Framework for Anomaly Detection in Surveillance Videos [0.47279903800557493]
Anomaly detection in videos is challenging due to the complexity, noise, and diverse nature of activities such as violence, shoplifting, and vandalism.
Existing approaches have struggled to apply deep learning models across different anomaly tasks without extensive retraining.
A new DL framework is introduced in this study, consisting of three key components: transfer learning to enhance feature generalization, model fusion to improve feature representation, and multi-task classification.
Empirical evaluations demonstrate the framework's effectiveness, achieving an accuracy of 97.99% on the RLVS dataset (violence detection), 83.59% on the UCF dataset (shoplifting detection
arXiv Detail & Related papers (2024-07-17T22:41:12Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of challenging problem in healthcare.
Within this framework, we train predictive 15 models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - An Empirical Study of Multimodal Model Merging [148.48412442848795]
Model merging is a technique that fuses multiple models trained on different tasks to generate a multi-task solution.
We conduct our study for a novel goal where we can merge vision, language, and cross-modal transformers of a modality-specific architecture.
We propose two metrics that assess the distance between weights to be merged and can serve as an indicator of the merging outcomes.
arXiv Detail & Related papers (2023-04-28T15:43:21Z) - ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through
Regularized Self-Attention [48.697458429460184]
Two factors, information bottleneck sensitivity and inconsistency between different attention topologies, could affect the performance of the Sparse Transformer.
This paper proposes a well-designed model named ERNIE-Sparse.
It consists of two distinctive parts: (i) Hierarchical Sparse Transformer (HST) to sequentially unify local and global information, and (ii) Self-Attention Regularization (SAR) to minimize the distance for transformers with different attention topologies.
arXiv Detail & Related papers (2022-03-23T08:47:01Z) - MIO : Mutual Information Optimization using Self-Supervised Binary Contrastive Learning [12.365801596593936]
We model our pre-training task as a binary classification problem to induce an implicit contrastive effect.<n>Unlike existing methods, the proposed loss function optimize the mutual information in positive and negative pairs.<n>The proposed method outperforms SOTA self-supervised contrastive frameworks on benchmark datasets.
arXiv Detail & Related papers (2021-11-24T17:51:29Z) - Precision-Weighted Federated Learning [1.8160945635344528]
We propose a novel algorithm that takes into account the variance of the gradients when computing the weighted average of the parameters of models trained in a Federated Learning setting.
Our method was evaluated using standard image classification datasets with two different data partitioning strategies (IID/non-IID) to measure the performance and speed of our method in resource-constrained environments.
arXiv Detail & Related papers (2021-07-20T17:17:10Z) - Neural Ensemble Search for Uncertainty Estimation and Dataset Shift [67.57720300323928]
Ensembles of neural networks achieve superior performance compared to stand-alone networks in terms of accuracy, uncertainty calibration and robustness to dataset shift.
We propose two methods for automatically constructing ensembles with emphvarying architectures.
We show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also uncertainty calibration and robustness to dataset shift.
arXiv Detail & Related papers (2020-06-15T17:38:15Z) - ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD)
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.