Related papers: AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error

URL: http://arxiv.org/abs/2401.17879v2
Date: Wed, 27 Mar 2024 09:17:14 GMT
Title: AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error
Authors: Jonas Ricker, Denis Lukovnikov, Asja Fischer
Abstract summary: A key enabler for generating high-resolution images with low computational cost has been the development of latent diffusion models (LDMs). LDMs perform the denoising process in the low-dimensional latent space of a pre-trained autoencoder (AE) instead of the high-dimensional image space. We propose a novel detection method which exploits an inherent component of LDMs: the AE used to transform images between image and latent space.
Score: 15.46508882889489
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With recent text-to-image models, anyone can generate deceptively realistic images with arbitrary contents, fueling the growing threat of visual disinformation. A key enabler for generating high-resolution images with low computational cost has been the development of latent diffusion models (LDMs). In contrast to conventional diffusion models, LDMs perform the denoising process in the low-dimensional latent space of a pre-trained autoencoder (AE) instead of the high-dimensional image space. Despite their relevance, the forensic analysis of LDMs is still in its infancy. In this work we propose AEROBLADE, a novel detection method which exploits an inherent component of LDMs: the AE used to transform images between image and latent space. We find that generated images can be more accurately reconstructed by the AE than real images, allowing for a simple detection approach based on the reconstruction error. Most importantly, our method is easy to implement and does not require any training, yet nearly matches the performance of detectors that rely on extensive training. We empirically demonstrate that AEROBLADE is effective against state-of-the-art LDMs, including Stable Diffusion and Midjourney. Beyond detection, our approach allows for the qualitative analysis of images, which can be leveraged for identifying inpainted regions. We release our code and data at https://github.com/jonasricker/aeroblade .
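
The detection idea described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes a PCA-style linear autoencoder in place of an LDM's AE, random vectors in place of images, mean squared error as the reconstruction distance, and a hypothetical threshold. All function names are invented for illustration.

```python
import numpy as np

def fit_linear_autoencoder(data, latent_dim):
    """Fit a PCA-style linear autoencoder (toy stand-in for an LDM's AE)."""
    mean = data.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def encode(x, mean, basis):
    """Project samples into the low-dimensional latent space."""
    return (x - mean) @ basis.T

def decode(z, mean, basis):
    """Map latent codes back to the original space."""
    return z @ basis + mean

def reconstruction_error(x, mean, basis):
    """Per-sample mean squared error between x and decode(encode(x))."""
    recon = decode(encode(x, mean, basis), mean, basis)
    return np.mean((x - recon) ** 2, axis=1)

def is_generated(x, mean, basis, threshold):
    """Flag samples whose reconstruction error falls below the threshold."""
    return reconstruction_error(x, mean, basis) < threshold

rng = np.random.default_rng(0)

# Assumption: 64-dimensional random vectors stand in for flattened real images.
real = rng.normal(size=(200, 64))
mean, basis = fit_linear_autoencoder(real, latent_dim=8)

# Assumption: "generated" samples are decoded from random latent codes,
# so they lie in the decoder's range and reconstruct almost exactly.
generated = decode(rng.normal(size=(200, 8)), mean, basis)

real_err = reconstruction_error(real, mean, basis)
gen_err = reconstruction_error(generated, mean, basis)

# Hypothetical threshold chosen for this toy data only.
threshold = 1e-6
print("flagged among generated:", int(is_generated(generated, mean, basis, threshold).sum()))
print("flagged among real:", int(is_generated(real, mean, basis, threshold).sum()))
```

In this toy setting the separation is trivial because the generated samples lie exactly in the decoder's range. With real images and a real LDM autoencoder the gap is a matter of degree, so the choice of distance and threshold matters far more than it does here.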

Related papers

HFI: A unified framework for training-free detection and implicit watermarking of latent diffusion model generated images [32.4045133529788]
Current AI-generated image detection methods assume the availability of real/AI-generated images for training. We propose HFI, which measures the extent of aliasing, a distortion of high-frequency information. We show that HFI can successfully detect the images generated from the specified LDM as a means of implicit watermarking.
arXiv Detail & Related papers (2024-12-30T04:34:42Z)
Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
Detecting AutoEncoder is Enough to Catch LDM Generated Images [0.0]
This paper proposes a novel method for detecting images generated by Latent Diffusion Models (LDM) by identifying artifacts introduced by their autoencoders. By training a detector to distinguish between real images and those reconstructed by the LDM autoencoder, the method enables detection of generated images without directly training on them. Experimental results show high detection accuracy with minimal false positives, making this approach a promising tool for combating fake images.
arXiv Detail & Related papers (2024-11-10T12:17:32Z)
On the Effectiveness of Dataset Alignment for Fake Image Detection [28.68129042301801]
A good detector should focus on the generative model's fingerprints while ignoring image properties such as semantic content, resolution, file format, etc. In this work, we argue that in addition to these algorithmic choices, we also require a well-aligned dataset of real/fake images to train a robust detector. For the family of LDMs, we propose a very simple way to achieve this: we reconstruct all the real images using the LDM's autoencoder, without any denoising operation. We then train a model to separate these real images from their reconstructions.
arXiv Detail & Related papers (2024-10-15T17:58:07Z)
Zero-Shot Detection of AI-Generated Images [54.01282123570917]
We propose a zero-shot entropy-based detector (ZED) to detect AI-generated images. Inspired by recent works on machine-generated text detection, our idea is to measure how surprising the image under analysis is compared to a model of real images. ZED achieves an average improvement of more than 3% over the SoTA in terms of accuracy.
arXiv Detail & Related papers (2024-09-24T08:46:13Z)
DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task. We first apply attention masking in each denoising step to make the generation more disentangled across different objects. In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection [60.960988614701414]
RIGID is a training-free and model-agnostic method for robust AI-generated image detection. RIGID significantly outperforms existing training-based and training-free detectors.
arXiv Detail & Related papers (2024-05-30T14:49:54Z)
Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images [13.089550724738436]
Diffusion models (DMs) have revolutionized image generation, producing high-quality images with applications spanning various fields. Their ability to create hyper-realistic images poses significant challenges in distinguishing between real and synthetic content. This work introduces a robust detection framework that integrates image and text features extracted by the CLIP model with a Multilayer Perceptron (MLP) classifier.
arXiv Detail & Related papers (2024-04-19T14:30:41Z)
DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection. It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to Stable Diffusion's denoising network, and a feature-space pre-trained feature extractor. Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
Denoising Diffusion Autoencoders are Unified Self-supervised Learners [58.194184241363175]
This paper shows that the networks in diffusion models, namely denoising diffusion autoencoders (DDAE), are unified self-supervised learners. DDAE has already learned strongly linear-separable representations within its intermediate layers without auxiliary encoders. Our diffusion-based approach achieves 95.9% and 50.0% linear evaluation accuracies on CIFAR-10 and Tiny-ImageNet.
arXiv Detail & Related papers (2023-03-17T04:20:47Z)
DIRE for Diffusion-Generated Image Detection [128.95822613047298]
We propose a novel representation called DIffusion Reconstruction Error (DIRE). DIRE measures the error between an input image and its reconstruction counterpart by a pre-trained diffusion model. It provides a hint that DIRE can serve as a bridge to distinguish generated and real images.
arXiv Detail & Related papers (2023-03-16T13:15:03Z)
DOLCE: A Model-Based Probabilistic Diffusion Framework for Limited-Angle CT Reconstruction [42.028139152832466]
Limited-Angle Computed Tomography (LACT) is a non-destructive evaluation technique used in a variety of applications ranging from security to medicine. We present DOLCE, a new deep model-based framework for LACT that uses a conditional diffusion model as an image prior.
arXiv Detail & Related papers (2022-11-22T15:30:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.