HFI: A unified framework for training-free detection and implicit watermarking of latent diffusion model generated images
- URL: http://arxiv.org/abs/2412.20704v1
- Date: Mon, 30 Dec 2024 04:34:42 GMT
- Title: HFI: A unified framework for training-free detection and implicit watermarking of latent diffusion model generated images
- Authors: Sungik Choi, Sungwoo Park, Jaehoon Lee, Seunghyun Kim, Stanley Jungkyu Choi, Moontae Lee
- Abstract summary: Current AI-generated image detection methods assume the availability of real/AI-generated images for training.
We propose HFI, which measures the extent of aliasing, a distortion of high-frequency information.
We show that HFI can successfully detect the images generated from the specified LDM as a means of implicit watermarking.
- Abstract: Dramatic advances in the quality of latent diffusion models (LDMs) have also led to the malicious use of AI-generated images. While current AI-generated image detection methods assume the availability of real/AI-generated images for training, this is practically limited given the vast expressibility of LDMs. This motivates the training-free detection setup where no related data are available in advance. The existing LDM-generated image detection method assumes that images generated by LDM are easier to reconstruct using an autoencoder than real images. However, we observe that this reconstruction distance is overfitted to background information, leading the current method to underperform in detecting images with simple backgrounds. To address this, we propose a novel method called HFI. Specifically, by viewing the autoencoder of LDM as a downsampling-upsampling kernel, HFI measures the extent of aliasing, a distortion of high-frequency information that appears in the reconstructed image. HFI is training-free, efficient, and consistently outperforms other training-free methods in detecting challenging images generated by various generative models. We also show that HFI can successfully detect the images generated from the specified LDM as a means of implicit watermarking. HFI outperforms the best baseline method while achieving magnitudes of
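The idea described in the abstract, treating the autoencoder as a downsampling-upsampling kernel and measuring the high-frequency distortion it introduces, can be illustrated with a toy sketch. The Python snippet below is not the HFI formulation from the paper: `toy_autoencoder` (average pooling followed by nearest-neighbour upsampling), the FFT-based `high_pass` filter, and the `factor` and `cutoff` parameters are all hypothetical stand-ins chosen only for illustration.

```python
import numpy as np


def toy_autoencoder(x, factor=4):
    """Hypothetical stand-in for an LDM autoencoder (an assumption, not the real AE):
    average-pool by `factor`, then upsample by nearest-neighbour repetition."""
    h, w = x.shape
    pooled = x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(pooled, factor, axis=0), factor, axis=1)


def high_pass(x, cutoff=0.25):
    """Keep only spatial frequencies whose radius (cycles/pixel) exceeds `cutoff`."""
    fy = np.fft.fftfreq(x.shape[0])[:, None]
    fx = np.fft.fftfreq(x.shape[1])[None, :]
    mask = np.sqrt(fx ** 2 + fy ** 2) > cutoff
    return np.real(np.fft.ifft2(np.fft.fft2(x) * mask))


def high_freq_distortion(x, factor=4, cutoff=0.25):
    """L2 distance between the high-frequency parts of an image and its
    reconstruction; a larger value means more high-frequency distortion."""
    recon = toy_autoencoder(x, factor)
    return float(np.linalg.norm(high_pass(x, cutoff) - high_pass(recon, cutoff)))


if __name__ == "__main__":
    n = 32
    ii, jj = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    checker = ((ii + jj) % 2).astype(float)   # fine texture, lost by the toy AE
    smooth = np.sin(2 * np.pi * ii / n)       # low-frequency content only
    print("checkerboard:", high_freq_distortion(checker))
    print("smooth:", high_freq_distortion(smooth))
```

In this toy setup, a pixel-level checkerboard scores higher than a smooth sinusoid, because the downsampling step discards the fine texture entirely. A real detector along these lines would replace `toy_autoencoder` with the encoder and decoder of the LDM under test.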
Related papers
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models.
In this paper, we investigate how detection performance varies across model backbones, types, and datasets.
We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
- Time Step Generating: A Universal Synthesized Deepfake Image Detector [0.4488895231267077]
We propose a universal synthetic image detector, Time Step Generating (TSG).
TSG does not rely on pre-trained models' reconstructing ability, specific datasets, or sampling algorithms.
We test the proposed TSG on the large-scale GenImage benchmark and it achieves significant improvements in both accuracy and generalizability.
arXiv Detail & Related papers (2024-11-17T09:39:50Z)
- Detecting AutoEncoder is Enough to Catch LDM Generated Images [0.0]
This paper proposes a novel method for detecting images generated by Latent Diffusion Models (LDM) by identifying artifacts introduced by their autoencoders.
By training a detector to distinguish between real images and those reconstructed by the LDM autoencoder, the method enables detection of generated images without directly training on them.
Experimental results show high detection accuracy with minimal false positives, making this approach a promising tool for combating fake images.
arXiv Detail & Related papers (2024-11-10T12:17:32Z)
- One-step Generative Diffusion for Realistic Extreme Image Rescaling [47.89362819768323]
We propose a novel framework called One-Step Image Rescaling Diffusion (OSIRDiff) for extreme image rescaling.
OSIRDiff performs rescaling operations in the latent space of a pre-trained autoencoder.
It effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model.
arXiv Detail & Related papers (2024-08-17T09:51:42Z)
- RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection [60.960988614701414]
RIGID is a training-free and model-agnostic method for robust AI-generated image detection.
RIGID significantly outperforms existing training-based and training-free detectors.
arXiv Detail & Related papers (2024-05-30T14:49:54Z)
- Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images [13.089550724738436]
Diffusion models (DMs) have revolutionized image generation, producing high-quality images with applications spanning various fields.
Their ability to create hyper-realistic images poses significant challenges in distinguishing between real and synthetic content.
This work introduces a robust detection framework that integrates image and text features extracted by the CLIP model with a Multilayer Perceptron (MLP) classifier.
arXiv Detail & Related papers (2024-04-19T14:30:41Z)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS).
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z)
- AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error [15.46508882889489]
A key enabler for generating high-resolution images with low computational cost has been the development of latent diffusion models (LDMs).
LDMs perform the denoising process in the low-dimensional latent space of a pre-trained autoencoder (AE) instead of the high-dimensional image space.
We propose a novel detection method which exploits an inherent component of LDMs: the AE used to transform images between image and latent space.
arXiv Detail & Related papers (2024-01-31T14:36:49Z)
- Self-correcting LLM-controlled Diffusion Models [83.26605445217334]
We introduce Self-correcting LLM-controlled Diffusion (SLD).
SLD is a framework that generates an image from the input prompt, assesses its alignment with the prompt, and performs self-corrections on the inaccuracies in the generated image.
Our approach can rectify a majority of incorrect generations, particularly in generative numeracy, attribute binding, and spatial relationships.
arXiv Detail & Related papers (2023-11-27T18:56:37Z)
- Exposing the Fake: Effective Diffusion-Generated Images Detection [14.646957596560076]
This paper proposes a novel detection method called Stepwise Error for Diffusion-generated Image Detection (SeDID).
SeDID exploits the unique attributes of diffusion models, namely deterministic reverse and deterministic denoising errors.
Our work makes a pivotal contribution to distinguishing diffusion model-generated images, marking a significant step in the domain of artificial intelligence security.
arXiv Detail & Related papers (2023-07-12T16:16:37Z)
- DIRE for Diffusion-Generated Image Detection [128.95822613047298]
We propose a novel representation called DIffusion Reconstruction Error (DIRE).
DIRE measures the error between an input image and its reconstruction counterpart by a pre-trained diffusion model.
It provides a hint that DIRE can serve as a bridge to distinguish generated and real images.
arXiv Detail & Related papers (2023-03-16T13:15:03Z)