Related papers: Community Forensics: Using Thousands of Generators to Train Fake Image Detectors

URL: http://arxiv.org/abs/2411.04125v1
Date: Wed, 06 Nov 2024 18:59:41 GMT
Title: Community Forensics: Using Thousands of Generators to Train Fake Image Detectors
Authors: Jeongsoo Park, Andrew Owens
Abstract summary: One of the key challenges of detecting AI-generated images is spotting images that have been created by previously unseen generative models. We propose a new dataset that is significantly larger and more diverse than prior work. The resulting dataset contains 2.7M images that have been sampled from 4803 different models.
Score: 15.166026536032142
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: One of the key challenges of detecting AI-generated images is spotting images that have been created by previously unseen generative models. We argue that the limited diversity of the training data is a major obstacle to addressing this problem, and we propose a new dataset that is significantly larger and more diverse than prior work. As part of creating this dataset, we systematically download thousands of text-to-image latent diffusion models and sample images from them. We also collect images from dozens of popular open source and commercial models. The resulting dataset contains 2.7M images that have been sampled from 4803 different models. These images collectively capture a wide range of scene content, generator architectures, and image processing settings. Using this dataset, we study the generalization abilities of fake image detectors. Our experiments suggest that detection performance improves as the number of models in the training set increases, even when these models have similar architectures. We also find that detection performance improves as the diversity of the models increases, and that our trained detectors generalize better than those trained on other datasets.

Related papers

A Large-scale AI-generated Image Inpainting Benchmark [11.216906046169683]
We propose a methodology for creating high-quality inpainting datasets and apply it to create DiQuID. DiQuID comprises over 95,000 inpainted images generated from 78,000 original images sourced from MS-COCO, RAISE, and OpenImages. We provide comprehensive benchmarking results using state-of-the-art forgery detection methods, demonstrating the dataset's effectiveness in evaluating and improving detection algorithms.
arXiv Detail & Related papers (2025-02-10T15:56:28Z)
Few-Shot Learner Generalizes Across AI-Generated Image Detection [14.069833211684715]
Few-Shot Detector (FSD) is a novel AI-generated image detector that learns a specialized metric space to effectively distinguish unseen fake images. Experiments show that FSD achieves state-of-the-art performance, with a +7.4% gain in average accuracy (ACC) on the GenImage dataset.
arXiv Detail & Related papers (2025-01-15T12:33:11Z)
Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
Semi-Truths: A Large-Scale Dataset of AI-Augmented Images for Evaluating Robustness of AI-Generated Image detectors [62.63467652611788]
We introduce SEMI-TRUTHS, featuring 27,600 real images, 223,400 masks, and 1,472,700 AI-augmented images. Each augmented image is accompanied by metadata for standardized and targeted evaluation of detector robustness. Our findings suggest that state-of-the-art detectors exhibit varying sensitivities to the types and degrees of perturbations, data distributions, and augmentation methods used.
arXiv Detail & Related papers (2024-11-12T01:17:27Z)
Zero-Shot Detection of AI-Generated Images [54.01282123570917]
We propose a zero-shot entropy-based detector (ZED) to detect AI-generated images. Inspired by recent works on machine-generated text detection, our idea is to measure how surprising the image under analysis is compared to a model of real images. ZED achieves an average improvement of more than 3% over the SoTA in terms of accuracy.
arXiv Detail & Related papers (2024-09-24T08:46:13Z)
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities [88.398085358514]
Contrastive Deepfake Embeddings (CoDE) is a novel embedding space specifically designed for deepfake detection. CoDE is trained via contrastive learning by additionally enforcing global-local similarities.
arXiv Detail & Related papers (2024-07-29T18:00:10Z)
Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models [36.59260354292177]
Recent advancements in text-to-image generation have inspired researchers to generate datasets tailored for perception models using generative models. We aim to fine-tune vision-language models to a specific classification model without access to any real images. Despite the high fidelity of generated images, we observed a significant performance degradation when fine-tuning the model using the generated datasets.
arXiv Detail & Related papers (2024-06-08T10:43:49Z)
Deep Image Composition Meets Image Forgery [0.0]
Image forgery has been studied for many years. Deep learning models require large amounts of labeled data for training. We use state-of-the-art deep learning models for image composition to generate spliced images close to the quality of real-life manipulations.
arXiv Detail & Related papers (2024-04-03T17:54:37Z)
StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning. This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models. Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models. Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples. We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z)
Deepfake Network Architecture Attribution [23.375381198124014]
Existing works on fake image attribution perform multi-class classification on several Generative Adversarial Network (GAN) models. We present the first study on Deepfake Network Architecture Attribution to attribute fake images at the architecture level.
arXiv Detail & Related papers (2022-02-28T14:54:30Z)
Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision [38.22842778742829]
Discriminative self-supervised learning allows training models on any random group of internet images. We train models on billions of random images without any data pre-processing or prior assumptions about what we want the model to learn. We extensively study and validate our model performance on over 50 benchmarks, including fairness, robustness to distribution shift, geographical diversity, fine-grained recognition, image copy detection, and many image classification datasets.
arXiv Detail & Related papers (2022-02-16T22:26:47Z)
InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model. This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
Six-channel Image Representation for Cross-domain Object Detection [17.854940064699985]
Deep learning models are data-driven, and their performance depends heavily on abundant and diverse datasets. Image-to-image translation techniques are employed to generate fake data of specific scenes to train the models. We propose to combine the original 3-channel images and their corresponding GAN-generated fake images to form 6-channel representations of the dataset.
arXiv Detail & Related papers (2021-01-03T04:50:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.