Sparse deepfake detection promotes better disentanglement
- URL: http://arxiv.org/abs/2510.05696v1
- Date: Tue, 07 Oct 2025 09:03:39 GMT
- Title: Sparse deepfake detection promotes better disentanglement
- Authors: Antoine Teissier, Marie Tahon, Nicolas Dugué, Aghilas Sini
- Abstract summary: We show that sparse deepfake detection can improve detection performance, reaching an EER of 23.36% on the ASVSpoof5 test set at 95% sparsity. We then show that these representations provide better disentanglement, using completeness and modularity metrics based on mutual information.
- Score: 4.901409400999413
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Due to the rapid progress of speech synthesis, deepfake detection has become a major concern in the speech processing community. Because it is a critical task, systems must not only be efficient and robust, but also provide interpretable explanations. Among the different approaches to explainability, we focus on the interpretation of latent representations. In this paper, we focus on the last layer of embeddings of AASIST, a deepfake detection architecture. We apply a TopK activation, inspired by sparse autoencoders (SAEs), to this layer to obtain sparse representations which are used in the decision process. We demonstrate that sparse deepfake detection can improve detection performance, with an EER of 23.36% on the ASVSpoof5 test set at 95% sparsity. We then show that these representations provide better disentanglement, using completeness and modularity metrics based on mutual information. Notably, some attacks are directly encoded in the latent space.
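The TopK activation mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the helper name `topk_activation`, the use of activation magnitude for ranking, and the example embedding size are all assumptions; 95% sparsity corresponds to keeping k = 5% of the dimensions.

```python
import numpy as np

def topk_activation(z, k):
    """Keep the k largest-magnitude activations per row; zero the rest.

    z: (batch, dim) array of embeddings; k: number of activations to keep.
    """
    # Indices of the (dim - k) smallest-magnitude entries in each row.
    idx_to_zero = np.argsort(np.abs(z), axis=-1)[..., :-k]
    out = z.copy()
    np.put_along_axis(out, idx_to_zero, 0.0, axis=-1)
    return out

# Example: keep 2 of 4 activations (50% sparsity for illustration).
z = np.array([[3.0, -5.0, 1.0, 0.5]])
sparse_z = topk_activation(z, k=2)  # only 3.0 and -5.0 survive
```

At 95% sparsity on, say, a 160-dimensional embedding, one would set k = 8; the exact layer width is not stated in the summary above.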
Related papers
- Generalizable Speech Deepfake Detection via Information Bottleneck Enhanced Adversarial Alignment [48.73836179661632]
Confidence-guided adversarial alignment adaptively suppresses attack-specific artifacts without erasing discriminative cues. IB-CAAN consistently outperforms baseline and state-of-the-art methods on many benchmarks.
arXiv Detail & Related papers (2025-09-28T03:48:49Z) - Diversity Boosts AI-Generated Text Detection [51.56484100374058]
DivEye is a novel framework that captures how unpredictability fluctuates across a text using surprisal-based features. Our method outperforms existing zero-shot detectors by up to 33.2% and achieves competitive performance with fine-tuned baselines.
arXiv Detail & Related papers (2025-09-23T10:21:22Z) - DiffusionFF: Face Forgery Detection via Diffusion-based Artifact Localization [21.139016641596676]
DiffusionFF is a novel framework that enhances face forgery detection through diffusion-based artifact localization. Our method utilizes a denoising diffusion model to generate high-quality Structural Dissimilarity (DSSIM) maps, which effectively capture subtle traces of manipulation.
arXiv Detail & Related papers (2025-08-03T18:06:04Z) - Uncovering Critical Features for Deepfake Detection through the Lottery Ticket Hypothesis [1.723963662326051]
Deepfake technology poses significant challenges to information integrity and social trust. This study investigates the application of the Lottery Ticket Hypothesis (LTH) to deepfake detection. We examine how neural networks can be efficiently pruned while maintaining high detection accuracy.
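LTH-style experiments like the one summarized above typically iterate a magnitude-pruning step: train, zero the smallest weights, rewind the survivors to their initial values, and retrain. A minimal sketch of the pruning step, assuming global magnitude pruning (the blurb does not specify the authors' exact criterion, and `magnitude_prune_mask` is a hypothetical helper name):

```python
import numpy as np

def magnitude_prune_mask(weights, prune_frac):
    """Return a 0/1 mask zeroing the prune_frac smallest-magnitude weights."""
    flat = np.abs(weights).ravel()
    k = int(prune_frac * flat.size)  # number of weights to remove
    if k == 0:
        return np.ones_like(weights)
    # Threshold at the k-th smallest magnitude; keep strictly larger weights.
    thresh = np.partition(flat, k - 1)[k - 1]
    return (np.abs(weights) > thresh).astype(weights.dtype)

# Example: prune half the weights of a tiny layer.
w = np.array([[0.1, -2.0], [0.5, 3.0]])
mask = magnitude_prune_mask(w, 0.5)
pruned_w = w * mask  # the 0.1 and 0.5 entries are zeroed
```

In the iterative LTH recipe this mask would be held fixed while the remaining weights are rewound and retrained.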
arXiv Detail & Related papers (2025-07-21T13:58:24Z) - XAI-Based Detection of Adversarial Attacks on Deepfake Detectors [0.0]
We introduce a novel methodology for identifying adversarial attacks on deepfake detectors using XAI.
Our approach contributes not only to the detection of deepfakes but also enhances the understanding of possible adversarial attacks.
arXiv Detail & Related papers (2024-03-05T13:25:30Z) - Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection [57.646582245834324]
We propose a simple yet effective deepfake detector called LSDA.
It is based on a simple idea: representations trained on a wider variety of forgeries should learn a more generalizable decision boundary.
We show that our proposed method is surprisingly effective and transcends state-of-the-art detectors across several widely used benchmarks.
arXiv Detail & Related papers (2023-11-19T09:41:10Z) - CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF).
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z) - Be Your Own Neighborhood: Detecting Adversarial Example by the
Neighborhood Relations Built on Self-Supervised Learning [64.78972193105443]
This paper presents a novel adversarial example (AE) detection framework for trustworthy predictions.
It performs detection by distinguishing an AE's abnormal relation with its augmented versions.
An off-the-shelf Self-Supervised Learning (SSL) model is used to extract the representation and predict the label.
arXiv Detail & Related papers (2022-08-31T08:18:44Z) - Detecting Adversarial Perturbations in Multi-Task Perception [32.9951531295576]
We propose a novel adversarial perturbation detection scheme based on multi-task perception of complex vision tasks.
Adversarial perturbations are detected by inconsistencies between the extracted edges of the input image, the depth output, and the segmentation output.
We show that, at a 5% false positive rate, up to 100% of images are correctly detected as adversarially perturbed, depending on the strength of the perturbation.
arXiv Detail & Related papers (2022-03-02T15:25:17Z) - Detection of Adversarial Supports in Few-shot Classifiers Using Feature
Preserving Autoencoders and Self-Similarity [89.26308254637702]
We propose a detection strategy to highlight adversarial support sets.
We use feature-preserving autoencoder filtering together with the self-similarity of a support set to perform this detection.
Our method is attack-agnostic and, to the best of our knowledge, the first to explore detection for few-shot classifiers.
arXiv Detail & Related papers (2020-12-09T14:13:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.