Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model
- URL: http://arxiv.org/abs/2404.05583v2
- Date: Wed, 5 Jun 2024 06:29:37 GMT
- Title: Towards More General Video-based Deepfake Detection through Facial Feature Guided Adaptation for Foundation Model
- Authors: Yue-Hua Han, Tai-Ming Huang, Shu-Tzu Lo, Po-Han Huang, Kai-Lung Hua, Jun-Cheng Chen
- Abstract summary: We propose a novel Deepfake detection approach that adapts a foundation model encoding rich information.
Inspired by recent advances in parameter-efficient fine-tuning, we propose a novel side-network-based decoder.
Our approach exhibits superior effectiveness in identifying unseen Deepfake samples, achieving notable performance improvement.
- Score: 15.61920157541529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise of deep learning, generative models have enabled the creation of highly realistic synthetic images, presenting challenges due to their potential misuse. While research in Deepfake detection has grown rapidly in response, many detection methods struggle with unseen Deepfakes generated by new synthesis techniques. To address this generalisation challenge, we propose a novel Deepfake detection approach that adapts a foundation model encoding rich information, specifically the image encoder from CLIP, which has demonstrated strong zero-shot capability for downstream tasks. Inspired by recent advances in parameter-efficient fine-tuning, we propose a novel side-network-based decoder to extract spatial and temporal cues from the given video clip, with Facial Component Guidance (FCG) encouraging the spatial features to include features of key facial parts for more robust and general Deepfake detection. Through extensive cross-dataset evaluations, our approach exhibits superior effectiveness in identifying unseen Deepfake samples, achieving notable performance improvement even with limited training samples and manipulation types. Our model achieves an average improvement of 0.9% AUROC in cross-dataset assessments compared with state-of-the-art methods, including a 4.4% improvement on the challenging DFDC dataset.
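The abstract reports its gains in AUROC. As a reminder of what that metric measures, here is a minimal sketch in plain Python, assuming labels of 1 for fake and 0 for real and scores where higher means "more likely fake". The function name `auroc` and the toy inputs are illustrative assumptions, not taken from the paper or its code.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise ranking (Mann-Whitney U form).

    labels: iterable of 0 (real) or 1 (fake).
    scores: iterable of detector scores; higher means "more likely fake".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one fake and one real sample")
    # Count (fake, real) pairs ranked correctly; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores, not results from the paper).
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. In a cross-dataset evaluation the detector is trained on one dataset and this quantity is computed separately on each unseen test dataset.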
Related papers
- Fiducial Focus Augmentation for Facial Landmark Detection [4.433764381081446]
We propose a novel image augmentation technique to enhance the model's understanding of facial structures.
We employ a Siamese architecture-based training mechanism with a Deep Canonical Correlation Analysis (DCCA)-based loss.
Our approach outperforms multiple state-of-the-art approaches across various benchmark datasets.
arXiv Detail & Related papers (2024-02-23T01:34:00Z)
- Masked Conditional Diffusion Model for Enhancing Deepfake Detection [20.018495944984355]
We propose a Masked Conditional Diffusion Model (MCDM) for enhancing deepfake detection.
It generates a variety of forged faces from a masked pristine one, encouraging the deepfake detection model to learn generic and robust representations.
arXiv Detail & Related papers (2024-02-01T12:06:55Z)
- DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection [67.3143177137102]
Deepfake detection refers to detecting artificially generated or edited faces in images or videos.
We propose a novel Deepfake detection framework named DeepFidelity to adaptively distinguish real and fake faces.
arXiv Detail & Related papers (2023-12-07T07:19:45Z)
- Improving Cross-dataset Deepfake Detection with Deep Information Decomposition [57.284370468207214]
Deepfake technology poses a significant threat to security and social trust.
Existing detection methods suffer from sharp performance degradation when faced with cross-dataset scenarios.
We propose a deep information decomposition (DID) framework in this paper.
arXiv Detail & Related papers (2023-09-30T12:30:25Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z)
- On Improving Cross-dataset Generalization of Deepfake Detectors [1.0152838128195467]
Facial manipulation by deep fake has caused major security risks and raised severe societal concerns.
We formulate deep fake detection as a hybrid combination of supervised and reinforcement learning (RL) to improve its cross-dataset generalization performance.
We demonstrate the superiority of our method over existing published research in cross-dataset generalization of deep fake detectors, thus obtaining state-of-the-art performance.
arXiv Detail & Related papers (2022-04-08T20:34:53Z)
- Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are commonly indistinguishable from real to human eyes.
We propose a novel fake detection method designed to re-synthesize testing images and extract visual cues for detection.
We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
arXiv Detail & Related papers (2021-05-29T21:22:24Z)
- Learning to Recognize Patch-Wise Consistency for Deepfake Detection [39.186451993950044]
We propose a representation learning approach for this task, called patch-wise consistency learning (PCL).
PCL learns by measuring the consistency of image source features, resulting in representations with good interpretability and robustness to multiple forgery methods.
We evaluate our approach on seven popular Deepfake detection datasets.
arXiv Detail & Related papers (2020-12-16T23:06:56Z)
- DeepFake Detection by Analyzing Convolutional Traces [0.0]
We focus on the analysis of Deepfakes of human faces with the objective of creating a new detection method.
The proposed technique uses an Expectation Maximization (EM) algorithm to extract a set of local features specifically designed to model the underlying convolutional generative process.
Results demonstrated the effectiveness of the technique in distinguishing the different architectures and the corresponding generation process.
arXiv Detail & Related papers (2020-04-22T09:02:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.