Generalizable Synthetic Image Detection via Language-guided Contrastive
Learning
- URL: http://arxiv.org/abs/2305.13800v1
- Date: Tue, 23 May 2023 08:13:27 GMT
- Title: Generalizable Synthetic Image Detection via Language-guided Contrastive
Learning
- Authors: Haiwei Wu and Jiantao Zhou and Shile Zhang
- Abstract summary: The malevolent use of synthetic images, such as the dissemination of fake news or the creation of fake profiles, raises significant concerns regarding the authenticity of images.
We propose a simple yet very effective synthetic image detection method via language-guided contrastive learning and a new formulation of the detection problem.
It is shown that our proposed LanguAge-guided SynThEsis Detection (LASTED) model achieves much improved generalizability to unseen image generation models.
- Score: 22.4158195581231
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The heightened realism of AI-generated images can be attributed to the rapid
development of synthetic models, including generative adversarial networks
(GANs) and diffusion models (DMs). The malevolent use of synthetic images, such
as the dissemination of fake news or the creation of fake profiles, however,
raises significant concerns regarding the authenticity of images. Though many
forensic algorithms have been developed for detecting synthetic images, their
performance, especially the generalization capability, is still far from being
adequate to cope with the increasing number of synthetic models. In this work,
we propose a simple yet very effective synthetic image detection method via
language-guided contrastive learning and a new formulation of the detection
problem. We first augment the training images with carefully designed textual
labels, enabling us to use joint image-text contrastive learning for the
forensic feature extraction. In addition, we formulate synthetic image
detection as an identification problem, which is vastly different from the
traditional classification-based approaches. It is shown that our proposed
LanguAge-guided SynThEsis Detection (LASTED) model achieves much improved
generalizability to unseen image generation models and delivers promising
performance that far exceeds state-of-the-art competitors by +22.66% accuracy
and +15.24% AUC. The code is available at https://github.com/HighwayWu/LASTED.
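The two ideas in the abstract (contrastive training against textual labels, and detection as identification rather than classification) can be illustrated with a minimal sketch. This is not the LASTED implementation; the authors' code is at the repository linked above. The function names, the image-to-text-only loss direction, the temperature of 0.07, the similarity threshold of 0.5, and the toy 2-D embeddings are all assumptions made for illustration. In practice the embeddings would come from trained image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(image_feats, image_labels, text_feats, temperature=0.07):
    """Mean cross-entropy of each image embedding against all label-text embeddings.

    image_feats:  list of image embeddings
    image_labels: for each image, the index of its textual label in text_feats
    text_feats:   one embedding per textual label
    Only the image-to-text direction is computed here (an assumption).
    """
    total = 0.0
    for feat, label in zip(image_feats, image_labels):
        logits = [cosine(feat, t) / temperature for t in text_feats]
        # Numerically stable log-sum-exp over the logits.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum - logits[label]
    return total / len(image_feats)


def identify(test_feat, real_anchor_feats, threshold=0.5):
    """Identification-style decision: True if the test embedding is, on average,
    similar enough to a set of reference embeddings from known real images."""
    score = sum(cosine(test_feat, a) for a in real_anchor_feats) / len(real_anchor_feats)
    return score >= threshold


# Hypothetical embeddings for two textual labels, e.g. "real" and "synthetic".
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[0.9, 0.1], [0.2, 0.8]]

# The loss is lower when each image is paired with its matching label text.
aligned = contrastive_loss(image_feats, [0, 1], text_feats)
swapped = contrastive_loss(image_feats, [1, 0], text_feats)

# Identification compares a test image against anchors from real images only.
anchors = [[1.0, 0.0], [0.9, 0.1]]
looks_real = identify([0.95, 0.05], anchors)
looks_synthetic = identify([0.05, 0.95], anchors)
```

The identification step needs no synthetic examples at test time: a new image is scored only by its similarity to real-image anchors, which is one plausible reading of why such a formulation could transfer to unseen generators.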
Related papers
- Time Step Generating: A Universal Synthesized Deepfake Image Detector [0.4488895231267077]
We propose a universal synthetic image detector, Time Step Generating (TSG).
TSG does not rely on pre-trained models' reconstructing ability, specific datasets, or sampling algorithms.
We test the proposed TSG on the large-scale GenImage benchmark and it achieves significant improvements in both accuracy and generalizability.
arXiv Detail & Related papers (2024-11-17T09:39:50Z) - Harnessing the Power of Large Vision Language Models for Synthetic Image Detection [14.448350657613364]
This study investigates the effectiveness of using advanced vision-language models (VLMs) for synthetic image identification.
By harnessing the robust understanding capabilities of large VLMs, the aim is to distinguish authentic images from synthetic images produced by diffusion-based models.
arXiv Detail & Related papers (2024-04-03T13:27:54Z) - Bi-LORA: A Vision-Language Approach for Synthetic Image Detection [14.448350657613364]
Deep image synthesis techniques, such as generative adversarial networks (GANs) and diffusion models (DMs), have ushered in an era of generating highly realistic images.
This paper takes inspiration from the potent convergence capabilities between vision and language, coupled with the zero-shot nature of vision-language models (VLMs).
We introduce an innovative method called Bi-LORA that leverages VLMs, combined with low-rank adaptation (LORA) tuning techniques, to enhance the precision of synthetic image detection for unseen model-generated images.
arXiv Detail & Related papers (2024-04-02T13:54:22Z) - Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose a dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z) - Forgery-aware Adaptive Transformer for Generalizable Synthetic Image
Detection [106.39544368711427]
We study the problem of generalizable synthetic image detection, aiming to detect forgery images from diverse generative methods.
We present a novel forgery-aware adaptive transformer approach, namely FatFormer.
Our approach, tuned on 4-class ProGAN data, attains an average of 98% accuracy on unseen GANs, and surprisingly generalizes to unseen diffusion models with 95% accuracy.
arXiv Detail & Related papers (2023-12-27T17:36:32Z) - Improving Synthetically Generated Image Detection in Cross-Concept
Settings [20.21594285488186]
We focus on the challenge of generalizing across different concept classes, e.g., when training a detector on human faces.
We propose an approach based on the premise that the robustness of the detector can be enhanced by training it on realistic synthetic images.
arXiv Detail & Related papers (2023-04-24T12:45:00Z) - Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study on the detection of deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z) - Deep Image Fingerprint: Towards Low Budget Synthetic Image Detection and Model Lineage Analysis [8.777277201807351]
We develop a new detection method for images that are indistinguishable from real ones.
Our method can detect images from a known generative model and enable us to establish relationships between fine-tuned generative models.
Our approach achieves comparable performance to state-of-the-art pre-trained detection methods on images generated by Stable Diffusion and Midjourney.
arXiv Detail & Related papers (2023-03-19T20:31:38Z) - Is synthetic data from generative models ready for image recognition? [69.42645602062024]
We study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks.
We showcase the strengths and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data for recognition tasks.
arXiv Detail & Related papers (2022-10-14T06:54:24Z) - Identity-Aware CycleGAN for Face Photo-Sketch Synthesis and Recognition [61.87842307164351]
We first propose an Identity-Aware CycleGAN (IACycleGAN) model that applies a new perceptual loss to supervise the image generation network.
It improves CycleGAN on photo-sketch synthesis by paying more attention to the synthesis of key facial regions, such as eyes and nose.
We develop a mutual optimization procedure between the synthesis model and the recognition model, which iteratively synthesizes better images by IACycleGAN.
arXiv Detail & Related papers (2021-03-30T01:30:08Z) - You Only Need Adversarial Supervision for Semantic Image Synthesis [84.83711654797342]
We propose a novel, simplified GAN model, which needs only adversarial supervision to achieve high quality results.
We show that images synthesized by our model are more diverse and follow the color and texture of real images more closely.
arXiv Detail & Related papers (2020-12-08T23:00:48Z)