Enhancing General Face Forgery Detection via Vision Transformer with
Low-Rank Adaptation
- URL: http://arxiv.org/abs/2303.00917v2
- Date: Mon, 27 Mar 2023 07:42:24 GMT
- Title: Enhancing General Face Forgery Detection via Vision Transformer with
Low-Rank Adaptation
- Authors: Chenqi Kong, Haoliang Li, Shiqi Wang
- Abstract summary: Forged faces pose pressing security concerns over fake news, fraud, impersonation, etc.
This paper designs a more general fake face detection model based on the vision transformer (ViT) architecture.
The proposed method achieves state-of-the-art detection performance in both cross-manipulation and cross-dataset evaluations.
- Score: 31.780516471483985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, forged faces pose pressing security concerns over fake news,
fraud, impersonation, etc. Despite the demonstrated success in intra-domain
face forgery detection, existing detection methods lack generalization
capability and tend to suffer from dramatic performance drops when deployed to
unforeseen domains. To mitigate this issue, this paper designs a more general
fake face detection model based on the vision transformer (ViT) architecture. In
the training phase, the pretrained ViT weights are frozen, and only the
Low-Rank Adaptation (LoRA) modules are updated. Additionally, the Single Center
Loss (SCL) is applied to supervise the training process, further improving the
generalization capability of the model. The proposed method achieves
state-of-the-art detection performance in both cross-manipulation and
cross-dataset evaluations.
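The training recipe described above (freeze the pretrained weights, update only the low-rank adapters) can be illustrated with a minimal sketch. This is not the authors' code: the class and parameter names below are hypothetical, and a plain linear layer stands in for a ViT projection matrix; the point is only the LoRA mechanics, where a frozen weight W is augmented by a trainable delta (alpha/r) * B @ A with B zero-initialized so training starts from the pretrained behavior.

```python
# Illustrative LoRA sketch (stdlib only); LoRALinear and its parameters are
# hypothetical names, not the paper's implementation.
import random

def matmul(A, B):
    """Multiply two matrices represented as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

class LoRALinear:
    """Frozen base weight W plus a trainable low-rank update (alpha/r) * B @ A."""
    def __init__(self, d_in, d_out, r=4, alpha=8, seed=0):
        rng = random.Random(seed)
        # Frozen pretrained weight (stand-in for a ViT projection matrix).
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # LoRA factors: A is randomly initialized, B is zero-initialized,
        # so the adapted layer initially behaves exactly like the frozen one.
        self.A = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0 for _ in range(r)] for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):
        """x: batch of row vectors; returns x @ (W + scale * B @ A).T."""
        base = matmul(x, [list(col) for col in zip(*self.W)])   # x @ W.T
        delta_w = matmul(self.B, self.A)                        # (d_out, d_in)
        delta = matmul(x, [list(col) for col in zip(*delta_w)]) # x @ (BA).T
        return [[b + self.scale * d for b, d in zip(br, dr)]
                for br, dr in zip(base, delta)]

layer = LoRALinear(d_in=6, d_out=3)
x = [[1.0] * 6]
y0 = layer.forward(x)
# Because B starts at zero, the LoRA delta is zero and the adapted output
# equals the frozen layer's output; training would update only A and B.
base_only = matmul(x, [list(col) for col in zip(*layer.W)])
```

During fine-tuning, only A and B receive gradients, so the trainable parameter count scales with r * (d_in + d_out) instead of d_in * d_out, which is what makes this form of adaptation parameter-efficient.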
Related papers
- Effort: Efficient Orthogonal Modeling for Generalizable AI-Generated Image Detection [66.16595174895802]
Existing AI-generated image (AIGI) detection methods often suffer from limited generalization performance.
In this paper, we identify a crucial yet previously overlooked asymmetry phenomenon in AIGI detection.
arXiv Detail & Related papers (2024-11-23T19:10:32Z)
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection.
We design a forgery-style mixture formulation that augments the diversity of forgery source domains.
We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z) - UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z) - MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection [54.545054873239295]
Deepfakes have recently raised significant trust issues and security concerns among the public.
ViT-based methods take advantage of the expressivity of transformers, achieving superior detection performance.
This work introduces Mixture-of-Experts modules for Face Forgery Detection (MoE-FFD), a generalized yet parameter-efficient ViT-based approach.
arXiv Detail & Related papers (2024-04-12T13:02:08Z) - Generalized Face Forgery Detection via Adaptive Learning for Pre-trained Vision Transformer [54.32283739486781]
We present a textbfForgery-aware textbfAdaptive textbfVision textbfTransformer (FA-ViT) under the adaptive learning paradigm.
FA-ViT achieves 93.83% and 78.32% AUC scores on Celeb-DF and DFDC datasets in the cross-dataset evaluation.
arXiv Detail & Related papers (2023-09-20T06:51:11Z) - S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens [45.06704981913823]
Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face recognition system by presenting spoofed faces.
We propose a novel Statistical Adapter (S-Adapter) that gathers local discriminative and statistical information from localized token histograms.
To further improve the generalization of the statistical tokens, we propose a novel Token Style Regularization (TSR).
Our experimental results demonstrate that our proposed S-Adapter and TSR provide significant benefits in both zero-shot and few-shot cross-domain testing, outperforming state-of-the-art methods on several benchmark tests.
arXiv Detail & Related papers (2023-09-07T22:36:22Z)
- Benchmarking Detection Transfer Learning with Vision Transformers [60.97703494764904]
The complexity of object detection methods can make benchmarking non-trivial when new architectures, such as Vision Transformer (ViT) models, arrive.
We present training techniques that overcome these challenges, enabling the use of standard ViT models as the backbone of Mask R-CNN.
Our results show that recent masking-based unsupervised learning methods may, for the first time, provide convincing transfer learning improvements on COCO.
arXiv Detail & Related papers (2021-11-22T18:59:15Z)
- On the Effectiveness of Vision Transformers for Zero-shot Face Anti-Spoofing [7.665392786787577]
In this work, we use transfer learning from the vision transformer model for the zero-shot anti-spoofing task.
The proposed approach outperforms the state-of-the-art methods in the zero-shot protocols in the HQ-WMCA and SiW-M datasets by a large margin.
arXiv Detail & Related papers (2020-11-16T15:14:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.