Exploring the Collaborative Advantage of Low-level Information on Generalizable AI-Generated Image Detection
- URL: http://arxiv.org/abs/2504.00463v1
- Date: Tue, 01 Apr 2025 06:38:08 GMT
- Title: Exploring the Collaborative Advantage of Low-level Information on Generalizable AI-Generated Image Detection
- Authors: Ziyin Zhou, Ke Sun, Zhongxi Chen, Xianming Lin, Yunpeng Luo, Ke Yan, Shouhong Ding, Xiaoshuai Sun,
- Abstract summary: Existing AI-generated image detection methods consider only a single type of low-level information.<n>Different low-level information often exhibits generalization capabilities for different types of forgeries.<n>We propose the Adaptive Low-level Experts Injection framework, enabling the backbone network to accept and learn knowledge from different low-level information.
- Score: 46.5480496076675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing state-of-the-art AI-Generated image detection methods mostly consider extracting low-level information from RGB images to help improve the generalization of AI-Generated image detection, such as noise patterns. However, these methods often consider only a single type of low-level information, which may lead to suboptimal generalization. Through empirical analysis, we have discovered a key insight: different low-level information often exhibits generalization capabilities for different types of forgeries. Furthermore, we found that simple fusion strategies are insufficient to leverage the detection advantages of each low-level and high-level information for various forgery types. Therefore, we propose the Adaptive Low-level Experts Injection (ALEI) framework. Our approach introduces Lora Experts, enabling the backbone network, which is trained with high-level semantic RGB images, to accept and learn knowledge from different low-level information. We utilize a cross-attention method to adaptively fuse these features at intermediate layers. To prevent the backbone network from losing the modeling capabilities of different low-level features during the later stages of modeling, we developed a Low-level Information Adapter that interacts with the features extracted by the backbone network. Finally, we propose Dynamic Feature Selection, which dynamically selects the most suitable features for detecting the current image to maximize generalization detection capability. Extensive experiments demonstrate that our method, finetuned on only four categories of mainstream ProGAN data, performs excellently and achieves state-of-the-art results on multiple datasets containing unseen GAN and Diffusion methods.
Related papers
- Hierarchical Information Flow for Generalized Efficient Image Restoration [108.83750852785582]
We propose a hierarchical information flow mechanism for image restoration, dubbed Hi-IR.<n>Hi-IR constructs a hierarchical information tree representing the degraded image across three levels.<n>In seven common image restoration tasks, Hi-IR achieves its effectiveness and generalizability.
arXiv Detail & Related papers (2024-11-27T18:30:08Z) - Diffusion Model Based Visual Compensation Guidance and Visual Difference Analysis for No-Reference Image Quality Assessment [78.21609845377644]
We propose a novel class of state-of-the-art (SOTA) generative model, which exhibits the capability to model intricate relationships.<n>We devise a new diffusion restoration network that leverages the produced enhanced image and noise-containing images.<n>Two visual evaluation branches are designed to comprehensively analyze the obtained high-level feature information.
arXiv Detail & Related papers (2024-02-22T09:39:46Z) - Simple Image-level Classification Improves Open-vocabulary Object
Detection [27.131298903486474]
Open-Vocabulary Object Detection (OVOD) aims to detect novel objects beyond a given set of base categories on which the detection model is trained.
Recent OVOD methods focus on adapting the image-level pre-trained vision-language models (VLMs), such as CLIP, to a region-level object detection task via, eg., region-level knowledge distillation, regional prompt learning, or region-text pre-training.
These methods have demonstrated remarkable performance in recognizing regional visual concepts, but they are weak in exploiting the VLMs' powerful global scene understanding ability learned from the billion-scale
arXiv Detail & Related papers (2023-12-16T13:06:15Z) - Fusing Global and Local Features for Generalized AI-Synthesized Image
Detection [31.35052580048599]
We design a two-branch model to combine global spatial information from the whole image and local informative features from patches selected by a novel patch selection module.
We collect a highly diverse dataset synthesized by 19 models with various objects and resolutions to evaluate our model.
arXiv Detail & Related papers (2022-03-26T01:55:37Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Unifying Remote Sensing Image Retrieval and Classification with Robust
Fine-tuning [3.6526118822907594]
We aim at unifying remote sensing image retrieval and classification with a new large-scale training and testing dataset, SF300.
We show that our framework systematically achieves a boost of retrieval and classification performance on nine different datasets compared to an ImageNet pretrained baseline.
arXiv Detail & Related papers (2021-02-26T11:01:30Z) - Learning Deep Interleaved Networks with Asymmetric Co-Attention for
Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) images reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA) which is attached at each interleaved node to model the feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z) - Attention Model Enhanced Network for Classification of Breast Cancer
Image [54.83246945407568]
AMEN is formulated in a multi-branch fashion with pixel-wised attention model and classification submodular.
To focus more on subtle detail information, the sample image is enhanced by the pixel-wised attention map generated from former branch.
Experiments conducted on three benchmark datasets demonstrate the superiority of the proposed method under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:44:21Z) - Understanding Anomaly Detection with Deep Invertible Networks through
Hierarchies of Distributions and Features [4.25227087152716]
Convolutional networks learn similar low-level feature distributions when trained on any natural image dataset.
When the discriminative features between inliers and outliers are on a high-level, anomaly detection becomes particularly challenging.
We propose two methods to remove the negative impact of model bias and domain prior on detecting high-level differences.
arXiv Detail & Related papers (2020-06-18T20:56:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.