Explainable Synthetic Image Detection through Diffusion Timestep Ensembling
- URL: http://arxiv.org/abs/2503.06201v1
- Date: Sat, 08 Mar 2025 13:04:20 GMT
- Title: Explainable Synthetic Image Detection through Diffusion Timestep Ensembling
- Authors: Yixin Wu, Feiran Zhang, Tianyuan Shi, Ruicheng Yin, Zhenghua Wang, Zhenliang Gan, Xiaohua Wang, Changze Lv, Xiaoqing Zheng, Xuanjing Huang
- Abstract summary: Recent advances in diffusion models have enabled the creation of deceptively real images, posing significant security risks when misused.
- Score: 30.298198387824275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in diffusion models have enabled the creation of deceptively real images, posing significant security risks when misused. In this study, we reveal that natural and synthetic images exhibit distinct differences in the high-frequency domains of their Fourier power spectra after undergoing iterative noise perturbations through an inverse multi-step denoising process, suggesting that such noise can provide additional discriminative information for identifying synthetic images. Based on this observation, we propose a novel detection method that amplifies these differences by progressively adding noise to the original images across multiple timesteps, and train an ensemble of classifiers on these noised images. To enhance human comprehension, we introduce an explanation generation and refinement module to identify flaws located in AI-generated images. Additionally, we construct two new datasets, GenHard and GenExplain, derived from the GenImage benchmark, providing detection samples of greater difficulty and high-quality rationales for fake images. Extensive experiments show that our method achieves state-of-the-art performance with 98.91% and 95.89% detection accuracy on regular and harder samples, respectively, improving on the baselines by at least 2.51% and 3.46%. Furthermore, our method also generalizes effectively to images generated by other diffusion models. Our code and datasets will be made publicly available.
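The abstract's core idea (noise an image at several diffusion timesteps, look at the high-frequency part of its Fourier power spectrum, and combine per-timestep classifiers) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear beta schedule, the `cutoff` radius, the toy sinusoidal image, the chosen timesteps, and the simple probability-averaging `ensemble_score` are all assumptions made here for illustration, and the function names are hypothetical.

```python
import numpy as np

def alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed, DDPM-style)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, abar, rng):
    """Sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(abar[t]) * x0 + np.sqrt(1.0 - abar[t]) * eps

def high_freq_fraction(img, cutoff=0.25):
    """Share of Fourier power at radial frequency above `cutoff` (cycles per pixel)."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    fy = np.fft.fftshift(np.fft.fftfreq(img.shape[0]))
    fx = np.fft.fftshift(np.fft.fftfreq(img.shape[1]))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    return float(power[radius > cutoff].sum() / power.sum())

def ensemble_score(per_timestep_probs):
    """Average the 'synthetic' probabilities from one classifier per timestep (assumed rule)."""
    return float(np.mean(per_timestep_probs))

rng = np.random.default_rng(0)

# Toy grayscale "image": a smooth, low-frequency pattern standing in for real data.
yy, xx = np.mgrid[0:64, 0:64]
x0 = np.sin(2 * np.pi * xx / 64) * np.cos(2 * np.pi * yy / 64)

abar = alpha_bar()
timesteps = [0, 100, 300, 600]  # hypothetical choice of noising levels
fractions = [high_freq_fraction(add_noise(x0, t, abar, rng)) for t in timesteps]

for t, frac in zip(timesteps, fractions):
    print(f"t={t:4d}  high-frequency power fraction={frac:.4f}")
```

In a real pipeline, each noised copy would be passed to a classifier trained for that timestep, and the per-timestep outputs would then be combined, for example with a rule like `ensemble_score`.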
Related papers
- DiffDoctor: Diagnosing Image Diffusion Models Before Treating [57.82359018425674]
We propose DiffDoctor, a two-stage pipeline to assist image diffusion models in generating fewer artifacts. We collect a dataset of over 1M flawed synthesized images and set up an efficient human-in-the-loop annotation process. The learned artifact detector is then involved in the second stage to tune the diffusion model through assigning a per-pixel confidence map for each image.
arXiv Detail & Related papers (2025-01-21T18:56:41Z)
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
- Time Step Generating: A Universal Synthesized Deepfake Image Detector [0.4488895231267077]
We propose a universal synthetic image detector, Time Step Generating (TSG).
TSG does not rely on pre-trained models' reconstructing ability, specific datasets, or sampling algorithms.
We test the proposed TSG on the large-scale GenImage benchmark and it achieves significant improvements in both accuracy and generalizability.
arXiv Detail & Related papers (2024-11-17T09:39:50Z)
- StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model [62.25424831998405]
StealthDiffusion is a framework that modifies AI-generated images into high-quality, imperceptible adversarial examples.
It is effective in both white-box and black-box settings, transforming AI-generated images into high-quality adversarial forgeries.
arXiv Detail & Related papers (2024-08-11T01:22:29Z)
- Multi-Sensor Diffusion-Driven Optical Image Translation for Large-Scale Applications [3.4085512042262374]
We propose a method that super-resolves large-scale low spatial resolution images into high-resolution equivalents from disparate optical sensors. Our approach provides precise domain adaptation, preserving image content while improving radiometric accuracy and feature representation. We reach a mean Learned Perceptual Image Patch Similarity (mLPIPS) of 0.1884 and a Fréchet Inception Distance (FID) of 45.64, expressively outperforming all compared methods.
arXiv Detail & Related papers (2024-04-17T10:49:00Z)
- Diffusion Noise Feature: Accurate and Fast Generated Image Detection [28.262273539251172]
Generative models have reached an advanced stage where they can produce remarkably realistic images.
Existing image detectors for generated images encounter challenges such as low accuracy and limited generalization.
This paper seeks to address this issue by seeking a representation with strong generalization capabilities to enhance the detection of generated images.
arXiv Detail & Related papers (2023-12-05T10:01:11Z)
- Diffusion Reconstruction of Ultrasound Images with Informative Uncertainty [5.375425938215277]
Enhancing ultrasound image quality involves balancing concurrent factors like contrast, resolution, and speckle preservation.
We propose a hybrid approach leveraging advances in diffusion models.
We conduct comprehensive experiments on simulated, in-vitro, and in-vivo data, demonstrating the efficacy of our approach.
arXiv Detail & Related papers (2023-10-31T16:51:40Z)
- Simultaneous Image-to-Zero and Zero-to-Noise: Diffusion Models with Analytical Image Attenuation [53.04220377034574]
We propose incorporating an analytical image attenuation process into the forward diffusion process for high-quality (un)conditioned image generation. Our method represents the forward image-to-noise mapping as simultaneous image-to-zero mapping and zero-to-noise mapping. We have conducted experiments on unconditioned image generation, e.g., CIFAR-10 and CelebA-HQ-256, and image-conditioned downstream tasks such as super-resolution, saliency detection, edge detection, and image inpainting.
arXiv Detail & Related papers (2023-06-23T18:08:00Z)
- Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study on deepfake detection generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z)
- SAR Despeckling using a Denoising Diffusion Probabilistic Model [52.25981472415249]
The presence of speckle degrades the image quality and adversely affects the performance of SAR image understanding applications.
We introduce SAR-DDPM, a denoising diffusion probabilistic model for SAR despeckling.
The proposed method achieves significant improvements in both quantitative and qualitative results over the state-of-the-art despeckling methods.
arXiv Detail & Related papers (2022-06-09T14:00:26Z)
- Poisson2Sparse: Self-Supervised Poisson Denoising From a Single Image [34.27748767631027]
We present a novel self-supervised learning method for single-image denoising.
We approximate traditional iterative optimization algorithms for image denoising with a recurrent neural network.
Our method outperforms the state-of-the-art approaches in terms of PSNR and SSIM.
arXiv Detail & Related papers (2022-06-04T00:08:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.