AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors
- URL: http://arxiv.org/abs/2310.17419v3
- Date: Wed, 21 Aug 2024 07:42:02 GMT
- Title: AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors
- Authors: You-Ming Chang, Chen Yeh, Wei-Chen Chiu, Ning Yu
- Abstract summary: Deep generative models can create remarkably photorealistic fake images while raising concerns about misinformation and copyright infringement.
Deepfake detection techniques are developed to distinguish between real and fake images.
We propose a novel approach called AntifakePrompt, using Vision-Language Models and prompt tuning techniques.
- Score: 24.78672820633581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep generative models can create remarkably photorealistic fake images while raising concerns about misinformation and copyright infringement, known as deepfake threats. Deepfake detection technique is developed to distinguish between real and fake images, where the existing methods typically learn classifiers in the image domain or various feature domains. However, the generalizability of deepfake detection against emerging and more advanced generative models remains challenging. In this paper, being inspired by the zero-shot advantages of Vision-Language Models (VLMs), we propose a novel approach called AntifakePrompt, using VLMs (e.g., InstructBLIP) and prompt tuning techniques to improve the deepfake detection accuracy over unseen data. We formulate deepfake detection as a visual question answering problem, and tune soft prompts for InstructBLIP to answer the real/fake information of a query image. We conduct full-spectrum experiments on datasets from a diversity of 3 held-in and 20 held-out generative models, covering modern text-to-image generation, image editing and adversarial image attacks. These testing datasets provide useful benchmarks in the realm of deepfake detection for further research. Moreover, results demonstrate that (1) the deepfake detection accuracy can be significantly and consistently improved (from 71.06% to 92.11%, in average accuracy over unseen domains) using pretrained vision-language models with prompt tuning; (2) our superior performance is at less cost of training data and trainable parameters, resulting in an effective and efficient solution for deepfake detection. Code and models can be found at https://github.com/nctu-eva-lab/AntifakePrompt.
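The visual question answering formulation described in the abstract can be illustrated with a minimal sketch. It only shows how free-text answers might be mapped to real/fake labels and how an average accuracy over several generator domains could be computed. The question wording, the yes/no answer convention, the helper names, and the unweighted mean are all assumptions made for illustration; none of this is the authors' released code, and no actual model is called.

```python
from statistics import mean

# Hypothetical question template. The exact wording, and where the tuned
# soft-prompt tokens are placed, are assumptions not stated in the abstract.
QUESTION = "Is this photo real?"


def answer_to_label(answer: str) -> str:
    """Map a free-text VQA answer to 'real' or 'fake' (assumed yes/no convention)."""
    normalized = answer.strip().lower()
    if normalized.startswith("yes"):
        return "real"
    if normalized.startswith("no"):
        return "fake"
    raise ValueError(f"unexpected answer: {answer!r}")


def domain_accuracy(answers, labels):
    """Fraction of answers in one domain whose mapped label matches the ground truth."""
    predictions = [answer_to_label(a) for a in answers]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def average_accuracy(per_domain):
    """Unweighted mean of per-domain accuracies (assumed aggregation)."""
    return mean(domain_accuracy(a, y) for a, y in per_domain.values())


# Toy, made-up answers for two hypothetical held-out generators.
toy_results = {
    "generator_a": (["No", "no", "Yes"], ["fake", "fake", "fake"]),
    "generator_b": (["Yes", "No"], ["real", "fake"]),
}
print(average_accuracy(toy_results))
```

In a real pipeline the answers would come from the prompt-tuned VLM queried with each image and the question; here they are hard-coded placeholders.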
Related papers
- CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF).
Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts.
It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z)
- Robustness and Generalizability of Deepfake Detection: A Study with Diffusion Models [35.188364409869465]
We present an investigation into how deepfakes are produced and how they can be identified.
The cornerstone of our research is a rich collection of artificial celebrity faces, titled DeepFakeFace.
This data serves as a robust foundation to train and test algorithms designed to spot deepfakes.
arXiv Detail & Related papers (2023-09-05T13:22:41Z)
- Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study on the detection of deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z)
- SeeABLE: Soft Discrepancies and Bounded Contrastive Learning for Exposing Deepfakes [7.553507857251396]
We propose a novel deepfake detector, called SeeABLE, that formalizes the detection problem as a (one-class) out-of-distribution detection task.
SeeABLE pushes perturbed faces towards predefined prototypes using a novel regression-based bounded contrastive loss.
We show that our model convincingly outperforms competing state-of-the-art detectors, while exhibiting highly encouraging generalization capabilities.
arXiv Detail & Related papers (2022-11-21T09:38:30Z)
- Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z)
- DA-FDFtNet: Dual Attention Fake Detection Fine-tuning Network to Detect Various AI-Generated Fake Images [21.030153777110026]
It has become much easier to create fake images such as "Deepfakes".
Recent research has introduced few-shot learning, which uses a small amount of training data to produce fake images and videos more effectively.
In this work, we propose the Dual Attention Fake Detection Fine-tuning Network (DA-FDFtNet) to detect manipulated fake face images.
arXiv Detail & Related papers (2021-12-22T16:25:24Z)
- Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are commonly indistinguishable from real to human eyes.
We propose a novel fake detection approach that is designed to re-synthesize testing images and extract visual cues for detection.
We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
arXiv Detail & Related papers (2021-05-29T21:22:24Z)
- TAR: Generalized Forensic Framework to Detect Deepfakes using Weakly Supervised Learning [17.40885531847159]
Deepfakes have become a critical social problem, and detecting them is of utmost importance.
In this work, we introduce a practical digital forensic tool to detect different types of deepfakes simultaneously.
We develop an autoencoder-based detection model with Residual blocks and sequentially perform transfer learning to detect different types of deepfakes simultaneously.
arXiv Detail & Related papers (2021-05-13T07:31:08Z)
- M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
- What makes fake images detectable? Understanding properties that generalize [55.4211069143719]
Deep networks can still pick up on subtle artifacts in doctored images.
We seek to understand what properties of fake images make them detectable.
We show a technique to exaggerate these detectable properties.
arXiv Detail & Related papers (2020-08-24T17:50:28Z)
- FDFtNet: Facing Off Fake Images using Fake Detection Fine-tuning Network [19.246576904646172]
We propose a light-weight fine-tuning neural network-based architecture called Fake Detection Fine-tuning Network (FDFtNet).
Our approach aims to reuse popular pre-trained models with only a few images for fine-tuning to effectively detect fake images.
Our FDFtNet achieves an overall accuracy of 90.29% in detecting fake images generated from the GANs-based dataset.
arXiv Detail & Related papers (2020-01-05T16:04:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.