FAME: A Lightweight Spatio-Temporal Network for Model Attribution of Face-Swap Deepfakes
- URL: http://arxiv.org/abs/2506.11477v1
- Date: Fri, 13 Jun 2025 05:47:09 GMT
- Title: FAME: A Lightweight Spatio-Temporal Network for Model Attribution of Face-Swap Deepfakes
- Authors: Wasim Ahmad, Yan-Tsung Peng, Yuan-Hao Chang
- Abstract summary: Face-swap Deepfake videos pose growing risks to digital security, privacy, and media integrity. FAME is a framework designed to capture subtle generative artifacts specific to different face-swap models. Results show that FAME consistently outperforms existing methods in both accuracy and runtime.
- Score: 9.462613446025001
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The widespread emergence of face-swap Deepfake videos poses growing risks to digital security, privacy, and media integrity, necessitating effective forensic tools for identifying the source of such manipulations. Although most prior research has focused primarily on binary Deepfake detection, the task of model attribution -- determining which generative model produced a given Deepfake -- remains underexplored. In this paper, we introduce FAME (Fake Attribution via Multilevel Embeddings), a lightweight and efficient spatio-temporal framework designed to capture subtle generative artifacts specific to different face-swap models. FAME integrates spatial and temporal attention mechanisms to improve attribution accuracy while remaining computationally efficient. We evaluate our model on three challenging and diverse datasets: Deepfake Detection and Manipulation (DFDM), FaceForensics++, and FakeAVCeleb. Results show that FAME consistently outperforms existing methods in both accuracy and runtime, highlighting its potential for deployment in real-world forensic and information security applications.
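The abstract gives no implementation details, but the following minimal PyTorch sketch illustrates the general shape of a lightweight spatio-temporal attribution network of the kind described: per-frame spatial attention, temporal attention across frame embeddings, and a classification head over candidate generative models. All module names and sizes here are assumptions for illustration, not the authors' FAME implementation.

```python
import torch
import torch.nn as nn

class SpatioTemporalAttributor(nn.Module):
    """Illustrative sketch: per-frame spatial attention + temporal
    attention over frame embeddings, then a model-attribution head.
    Not the authors' FAME code; all sizes are assumptions."""

    def __init__(self, num_models=5, dim=128):
        super().__init__()
        # Small CNN backbone producing a feature map per frame.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Spatial attention: weight spatial positions within each frame.
        self.spatial_att = nn.Conv2d(dim, 1, kernel_size=1)
        # Temporal attention over the sequence of frame embeddings.
        self.temporal_att = nn.MultiheadAttention(dim, num_heads=4,
                                                  batch_first=True)
        self.head = nn.Linear(dim, num_models)  # which generator made it

    def forward(self, clip):                     # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.backbone(clip.flatten(0, 1))       # (B*T, C, h, w)
        w = torch.softmax(self.spatial_att(feats).flatten(2), dim=-1)
        frame_emb = (feats.flatten(2) * w).sum(-1)      # (B*T, C)
        frame_emb = frame_emb.view(b, t, -1)            # (B, T, C)
        fused, _ = self.temporal_att(frame_emb, frame_emb, frame_emb)
        return self.head(fused.mean(dim=1))             # (B, num_models)

logits = SpatioTemporalAttributor()(torch.randn(2, 8, 3, 64, 64))
```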
Related papers
- CAST: Cross-Attentive Spatio-Temporal feature fusion for Deepfake detection [0.0]
CNNs are effective at capturing spatial artifacts, and Transformers excel at modeling temporal inconsistencies. We propose a unified CAST model that leverages cross-attention to effectively fuse spatial and temporal features. We evaluate the performance of our model using the FaceForensics++, Celeb-DF, and DeepfakeDetection datasets.
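As an illustration of the cross-attentive fusion idea, the sketch below lets spatial tokens (e.g., from a CNN feature map) attend to temporal tokens (e.g., from a Transformer over frames). It is a hypothetical module under assumed dimensions, not the CAST authors' code.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Minimal sketch of cross-attentive spatio-temporal fusion:
    spatial tokens query temporal tokens (sizes are assumptions)."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.cls = nn.Linear(dim, 2)  # real vs. fake

    def forward(self, spatial_tokens, temporal_tokens):
        # spatial_tokens: (B, Ns, dim) from a CNN feature map,
        # temporal_tokens: (B, T, dim) from a temporal Transformer.
        fused, _ = self.cross(query=spatial_tokens,
                              key=temporal_tokens,
                              value=temporal_tokens)
        fused = self.norm(fused + spatial_tokens)  # residual connection
        return self.cls(fused.mean(dim=1))

out = CrossAttentionFusion()(torch.randn(2, 49, 256),
                             torch.randn(2, 16, 256))  # (2, 2)
```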
arXiv Detail & Related papers (2025-06-26T18:51:17Z)
- DFBench: Benchmarking Deepfake Image Detection Capability of Large Multimodal Models [43.86847047796023]
Current deepfake detection methods often depend on datasets with limited generation models and content diversity. We present DFBench, a large-scale DeepFake Benchmark featuring 540,000 images across real, AI-edited, and AI-generated content. We propose MoA-DF, Mixture of Agents for DeepFake detection, leveraging a combined probability strategy from multiple LMMs.
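The simplest reading of a "combined probability strategy" over multiple agents is to fuse the per-model fake probabilities; the toy sketch below averages them. The actual MoA-DF fusion rule may differ, and the agent functions here are stand-ins, not real LMM calls.

```python
from typing import Callable, List

def combined_fake_probability(image, agents: List[Callable]) -> float:
    """Average P(fake) across several detector agents.
    Each agent maps an image to a probability in [0, 1]."""
    probs = [agent(image) for agent in agents]
    return sum(probs) / len(probs)

# Usage with stand-in agents (real LMM queries would replace these):
agents = [lambda img: 0.9, lambda img: 0.7, lambda img: 0.8]
print(combined_fake_probability(None, agents))  # 0.8
```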
arXiv Detail & Related papers (2025-06-03T15:45:41Z)
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
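One established way to mix forgery "styles" across source domains is to interpolate per-instance feature statistics, as in MixStyle; the sketch below applies that idea as an assumption, since the entry does not specify the paper's exact formulation.

```python
import torch

def mix_forgery_styles(feats: torch.Tensor, alpha: float = 0.1):
    """MixStyle-like augmentation (an assumption, not necessarily the
    paper's formulation): interpolate per-instance channel statistics
    between samples to simulate new forgery source domains.
    feats: (B, C, H, W) intermediate feature maps."""
    b = feats.size(0)
    mu = feats.mean(dim=(2, 3), keepdim=True)          # (B, C, 1, 1)
    sig = feats.std(dim=(2, 3), keepdim=True) + 1e-6
    normed = (feats - mu) / sig
    perm = torch.randperm(b)                           # pair with another sample
    lam = torch.distributions.Beta(alpha, alpha).sample((b, 1, 1, 1))
    mu_mix = lam * mu + (1 - lam) * mu[perm]
    sig_mix = lam * sig + (1 - lam) * sig[perm]
    return normed * sig_mix + mu_mix

augmented = mix_forgery_styles(torch.randn(8, 64, 14, 14))
```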
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z)
- DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection [67.3143177137102]
Deepfake detection refers to detecting artificially generated or edited faces in images or videos.
We propose a novel Deepfake detection framework named DeepFidelity to adaptively distinguish real and fake faces.
arXiv Detail & Related papers (2023-12-07T07:19:45Z)
- CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF).
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
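Conceptually, the decomposition can be pictured as two parallel projections with the classifier reading only the deepfake-related one; the sketch below is a schematic of that idea with assumed dimensions and a toy decorrelation term, not the DID paper's exact losses.

```python
import torch
import torch.nn as nn

class InfoDecomposer(nn.Module):
    """Schematic sketch: split a face feature into deepfake-related and
    deepfake-irrelevant parts; classify from the related part only.
    Branch sizes and the decorrelation term are assumptions."""

    def __init__(self, in_dim=512, sub_dim=128):
        super().__init__()
        self.related = nn.Linear(in_dim, sub_dim)     # deepfake-related
        self.irrelevant = nn.Linear(in_dim, sub_dim)  # identity, pose, etc.
        self.cls = nn.Linear(sub_dim, 2)

    def forward(self, face_feat):
        z_rel = self.related(face_feat)
        z_irr = self.irrelevant(face_feat)
        logits = self.cls(z_rel)           # only the related part decides
        # A decorrelation penalty keeps the two parts independent.
        decor = (z_rel * z_irr).mean().abs()
        return logits, decor

logits, decor_loss = InfoDecomposer()(torch.randn(4, 512))
```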
arXiv Detail & Related papers (2023-09-30T12:30:25Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
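A natural way to learn from sentence-level prompts rather than binary labels is an image-text contrastive objective; the sketch below shows such a formulation as an assumption, since the entry does not state VLFFD's actual loss.

```python
import torch
import torch.nn.functional as F

def prompt_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Illustrative CLIP-style objective (an assumption, not VLFFD's
    stated loss): img_emb, txt_emb are (B, D) embeddings where
    txt_emb[i] encodes the fine-grained prompt describing image i,
    e.g. "a fake face with blending artifacts around the mouth"."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature   # (B, B) similarities
    targets = torch.arange(img_emb.size(0))        # matched pairs on diagonal
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

loss = prompt_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
```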
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
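To make the multi-scale idea concrete, the sketch below tokenizes one feature map at several patch granularities and applies self-attention per scale; the scales and dimensions are assumptions, not M2TR's architecture.

```python
import torch
import torch.nn as nn

class MultiScaleAttention(nn.Module):
    """Sketch of multi-scale self-attention: the same feature map is
    pooled at several granularities so artifacts of different spatial
    extents can be attended to. Sizes are assumptions."""

    def __init__(self, dim=64, scales=(1, 2, 4), heads=4):
        super().__init__()
        self.pools = nn.ModuleList(nn.AvgPool2d(s) for s in scales)
        self.attns = nn.ModuleList(
            nn.MultiheadAttention(dim, heads, batch_first=True)
            for _ in scales)

    def forward(self, fmap):                      # fmap: (B, C, H, W)
        outs = []
        for pool, attn in zip(self.pools, self.attns):
            tokens = pool(fmap).flatten(2).transpose(1, 2)  # (B, N, C)
            out, _ = attn(tokens, tokens, tokens)
            outs.append(out.mean(dim=1))          # (B, C) per scale
        return torch.cat(outs, dim=-1)            # (B, C * num_scales)

feats = MultiScaleAttention()(torch.randn(2, 64, 16, 16))  # (2, 192)
```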
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
- DeepFake Detection by Analyzing Convolutional Traces [0.0]
We focus on the analysis of Deepfakes of human faces with the objective of creating a new detection method.
The proposed technique, by means of an Expectation Maximization (EM) algorithm, extracts a set of local features specifically designed to model the underlying convolutional generative process.
Results demonstrated the effectiveness of the technique in distinguishing the different architectures and the corresponding generation process.
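A didactic, simplified version of EM-based convolutional-trace extraction fits in a few lines: treat each pixel as either predictable from its neighbours by an unknown kernel or an outlier, and alternate posterior estimation with weighted least squares. The sketch below is in that spirit, not the authors' exact algorithm.

```python
import numpy as np

def em_kernel(gray, k=3, iters=20, p_outlier=1/256):
    """Simplified EM sketch (not the paper's exact method): estimate
    the kernel that best predicts each pixel from its neighbours,
    treating pixels as a mixture of correlated samples and outliers."""
    patches = np.lib.stride_tricks.sliding_window_view(gray, (k, k))
    X = patches.reshape(-1, k * k).astype(np.float64)
    center = k * k // 2
    y = X[:, center]
    X = np.delete(X, center, axis=1)         # neighbours only
    w = np.zeros(X.shape[1])
    sigma = y.std() + 1e-6
    for _ in range(iters):
        r = y - X @ w                         # prediction residuals
        gauss = np.exp(-r**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
        post = gauss / (gauss + p_outlier)    # E-step: P(correlated | r)
        Xw = X * post[:, None]                # M-step: weighted least squares
        w = np.linalg.solve(X.T @ Xw + 1e-6 * np.eye(X.shape[1]), Xw.T @ y)
        sigma = np.sqrt((post * r**2).sum() / post.sum()) + 1e-6
    return w  # kernel weights act as a fingerprint of the generator

kernel = em_kernel(np.random.rand(64, 64))
```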
arXiv Detail & Related papers (2020-04-22T09:02:55Z)