Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model
- URL: http://arxiv.org/abs/2404.05583v2
- Date: Wed, 5 Jun 2024 06:29:37 GMT
- Title: Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model
- Authors: Yue-Hua Han, Tai-Ming Huang, Shu-Tzu Lo, Po-Han Huang, Kai-Lung Hua, Jun-Cheng Chen,
- Abstract summary: We propose a novel Deepfake detection approach by adapting the Foundation Models with rich information encoded inside.
Inspired by the recent advances of parameter efficient fine-tuning, we propose a novel side-network-based decoder.
Our approach exhibits superior effectiveness in identifying unseen Deepfake samples, achieving notable performance improvement.
- Score: 15.61920157541529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise of deep learning, generative models have enabled the creation of highly realistic synthetic images, presenting challenges due to their potential misuse. While research in Deepfake detection has grown rapidly in response, many detection methods struggle with unseen Deepfakes generated by new synthesis techniques. To address this generalisation challenge, we propose a novel Deepfake detection approach by adapting the Foundation Models with rich information encoded inside, specifically using the image encoder from CLIP which has demonstrated strong zero-shot capability for downstream tasks. Inspired by the recent advances of parameter efficient fine-tuning, we propose a novel side-network-based decoder to extract spatial and temporal cues from the given video clip, with the promotion of the Facial Component Guidance (FCG) to encourage the spatial feature to include features of key facial parts for more robust and general Deepfake detection. Through extensive cross-dataset evaluations, our approach exhibits superior effectiveness in identifying unseen Deepfake samples, achieving notable performance improvement even with limited training samples and manipulation types. Our model secures an average performance enhancement of 0.9\% AUROC in cross-dataset assessments comparing with state-of-the-art methods, especially a significant lead of achieving 4.4\% improvement on the challenging DFDC dataset.
Related papers
- Harnessing Wavelet Transformations for Generalizable Deepfake Forgery Detection [0.0]
Wavelet-CLIP is a deepfake detection framework that integrates wavelet transforms with features derived from the ViT-L/14 architecture, pre-trained in the CLIP fashion.
Our method showcases outstanding performance, achieving an average AUC of 0.749 for cross-data generalization and 0.893 for robustness against unseen deepfakes.
arXiv Detail & Related papers (2024-09-26T21:16:51Z) - Standing on the Shoulders of Giants: Reprogramming Visual-Language Model for General Deepfake Detection [16.21235742118949]
We propose a novel approach that repurposes a well-trained Vision-Language Models (VLMs) for general deepfake detection.
Motivated by the model reprogramming paradigm that manipulates the model prediction via data perturbations, our method can reprogram a pretrained VLM model.
Our superior performances are at less cost of trainable parameters, making it a promising approach for real-world applications.
arXiv Detail & Related papers (2024-09-04T12:46:30Z) - UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z) - DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake
Detection [67.3143177137102]
Deepfake detection refers to detecting artificially generated or edited faces in images or videos.
We propose a novel Deepfake detection framework named DeepFidelity to adaptively distinguish real and fake faces.
arXiv Detail & Related papers (2023-12-07T07:19:45Z) - CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF)
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z) - Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z) - On Improving Cross-dataset Generalization of Deepfake Detectors [1.0152838128195467]
Facial manipulation by deep fake has caused major security risks and raised severe societal concerns.
We formulate deep fake detection as a hybrid combination of supervised and reinforcement learning (RL) to improve its cross-dataset generalization performance.
We demonstrate the superiority of our method over existing published research in cross-dataset generalization of deep fake detectors, thus obtaining state-of-the-art performance.
arXiv Detail & Related papers (2022-04-08T20:34:53Z) - DeepFake Detection by Analyzing Convolutional Traces [0.0]
We focus on the analysis of Deepfakes of human faces with the objective of creating a new detection method.
The proposed technique, by means of an Expectation Maximization (EM) algorithm, extracts a set of local features specifically addressed to model the underlying convolutional generative process.
Results demonstrated the effectiveness of the technique in distinguishing the different architectures and the corresponding generation process.
arXiv Detail & Related papers (2020-04-22T09:02:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.