Trinity Detector:text-assisted and attention mechanisms based spectral fusion for diffusion generation image detection
- URL: http://arxiv.org/abs/2404.17254v1
- Date: Fri, 26 Apr 2024 08:50:35 GMT
- Title: Trinity Detector:text-assisted and attention mechanisms based spectral fusion for diffusion generation image detection
- Authors: Jiawei Song, Dengpan Ye, Yunming Zhang
- Abstract summary: Trinity Detector incorporates coarse-grained text features through a CLIP encoder, coherently integrating them with fine-grained artifacts in the pixel domain for comprehensive multimodal detection.
Our performance is competitive across all datasets, with up to a 17.6% improvement in transferability on the diffusion datasets.
- Score: 3.8473504915778745
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial Intelligence Generated Content (AIGC) techniques, represented by text-to-image generation, have enabled the malicious use of deep forgeries, raising concerns about the trustworthiness of multimedia content. Adapting traditional forgery detection methods to diffusion models proves challenging. This paper therefore proposes Trinity Detector, a forgery detection method explicitly designed for diffusion models. Trinity Detector incorporates coarse-grained text features through a CLIP encoder, coherently integrating them with fine-grained artifacts in the pixel domain for comprehensive multimodal detection. To heighten sensitivity to diffusion-generated image features, a Multi-spectral Channel Attention Fusion Unit (MCAF) is designed, extracting spectral inconsistencies through adaptive fusion of diverse frequency bands and further integrating the spatial co-occurrence of the two modalities. Extensive experimentation validates that Trinity Detector outperforms several state-of-the-art methods: its performance is competitive across all datasets, with up to a 17.6% improvement in transferability on the diffusion datasets.
Related papers
- StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model [62.25424831998405]
StealthDiffusion is a framework that modifies AI-generated images into high-quality, imperceptible adversarial examples.
It is effective in both white-box and black-box settings, transforming AI-generated images into high-quality adversarial forgeries.
arXiv Detail & Related papers (2024-08-11T01:22:29Z)
- Beyond the Visible: Jointly Attending to Spectral and Spatial Dimensions with HSI-Diffusion for the FINCH Spacecraft [2.5057561650768814]
The FINCH mission aims to monitor crop residue cover in agricultural fields.
Hyperspectral imaging captures both spectral and spatial information.
It is prone to various types of noise, including random noise, stripe noise, and dead pixels.
arXiv Detail & Related papers (2024-06-15T19:34:18Z)
- DA-HFNet: Progressive Fine-Grained Forgery Image Detection and Localization Based on Dual Attention [12.36906630199689]
We construct the DA-HFNet forged image dataset using text- or image-assisted GAN and diffusion models.
Our goal is to utilize a hierarchical progressive network to capture forged artifacts at different scales for detection and localization.
arXiv Detail & Related papers (2024-06-03T16:13:33Z)
- Diffusion Facial Forgery Detection [56.69763252655695]
This paper introduces DiFF, a comprehensive dataset dedicated to face-focused diffusion-generated images.
We conduct extensive experiments on the DiFF dataset via a human test and several representative forgery detection methods.
The results demonstrate that the binary detection accuracy of both human observers and automated detectors often falls below 30%.
arXiv Detail & Related papers (2024-01-29T03:20:19Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Multimodal Transformer Using Cross-Channel attention for Object Detection in Remote Sensing Images [1.662438436885552]
Multi-modal fusion has been shown to improve accuracy by combining data from multiple modalities.
We propose a novel multi-modal fusion strategy for mapping relationships between different channels at the early stage.
By addressing fusion in the early stage, as opposed to mid or late-stage methods, our method achieves competitive and even superior performance compared to existing techniques.
arXiv Detail & Related papers (2023-10-21T00:56:11Z)
- DiffUCD:Unsupervised Hyperspectral Image Change Detection with Semantic Correlation Diffusion Model [46.68717345017946]
Hyperspectral image change detection (HSI-CD) has emerged as a crucial research area in remote sensing.
We propose a novel unsupervised HSI-CD method with a semantic correlation diffusion model (DiffUCD).
Our method can achieve comparable results to those fully supervised methods requiring numerous samples.
arXiv Detail & Related papers (2023-05-21T09:21:41Z)
- Multimodal Industrial Anomaly Detection via Hybrid Fusion [59.16333340582885]
We propose a novel multimodal anomaly detection method with a hybrid fusion scheme.
Our model outperforms the state-of-the-art (SOTA) methods in both detection and segmentation precision on the MVTec-3D AD dataset.
arXiv Detail & Related papers (2023-03-01T15:48:27Z)
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse them in the common space, either by iterative optimization or with deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls it into a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
- Unsupervised Change Detection in Hyperspectral Images using Feature Fusion Deep Convolutional Autoencoders [15.978029004247617]
The proposed work aims to build a novel feature extraction system using a feature fusion deep convolutional autoencoder.
The proposed method clearly outperforms state-of-the-art methods in unsupervised change detection on all the datasets.
arXiv Detail & Related papers (2021-09-10T16:52:31Z)
- M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.