Diversity over Uniformity: Rethinking Representation in Generated Image Detection
- URL: http://arxiv.org/abs/2603.00717v1
- Date: Sat, 28 Feb 2026 15:42:12 GMT
- Title: Diversity over Uniformity: Rethinking Representation in Generated Image Detection
- Authors: Qinghui He, Haifeng Zhang, Qiao Qin, Bo Liu, Xiuli Bi, Bin Xiao
- Abstract summary: We argue that reliable generated-image detection should not depend on a single decision path but should preserve multiple judgment perspectives. We propose an anti-feature-collapse learning framework that filters task-irrelevant components and suppresses excessive overlap among different forgery cues in the representation space. This design maintains diverse and complementary evidence within the model, reduces reliance on a small set of salient cues, and enhances robustness under unseen generative settings.
- Score: 22.020742109848317
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid advancement of generative models, generated image detection has become an important task in visual forensics. Although existing methods have achieved remarkable progress, they often rely, after training, on only a small subset of highly salient forgery cues, which limits their ability to generalize to unseen generative mechanisms. We argue that reliable generated-image detection should not depend on a single decision path but should preserve multiple judgment perspectives, enabling the model to understand the differences between real and generated images from diverse viewpoints. Based on this idea, we propose an anti-feature-collapse learning framework that filters task-irrelevant components and suppresses excessive overlap among different forgery cues in the representation space, preventing discriminative information from collapsing into a few dominant feature directions. This design maintains diverse and complementary evidence within the model, reduces reliance on a small set of salient cues, and enhances robustness under unseen generative settings. Extensive experiments on multiple public benchmarks demonstrate that the proposed method significantly outperforms state-of-the-art approaches in cross-model scenarios, achieving an accuracy improvement of 5.02% and exhibiting superior generalization and detection reliability. The source code is available at https://github.com/Yanmou-Hui/DoU.
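The paper's actual loss is in the linked repository; as a rough illustration of the anti-feature-collapse idea only (not the authors' exact formulation), one common way to penalize excessive overlap among feature directions is a decorrelation-style term that measures off-diagonal correlation across feature dimensions:

```python
import numpy as np

def anti_collapse_penalty(features: np.ndarray) -> float:
    """Redundancy penalty on a batch of feature vectors.

    features: (batch, dim) array. Each dimension is standardized, then the
    squared off-diagonal entries of the feature correlation matrix are
    averaged. A value near 1 means the dimensions are copies of one
    dominant cue (collapse); a value near 0 means they carry
    complementary evidence.
    """
    z = features - features.mean(axis=0, keepdims=True)
    z = z / (z.std(axis=0, keepdims=True) + 1e-8)
    n, d = z.shape
    corr = (z.T @ z) / n                      # (dim, dim) correlation matrix
    off_diag = corr - np.diag(np.diag(corr))  # zero out the diagonal
    return float((off_diag ** 2).sum() / (d * (d - 1)))

rng = np.random.default_rng(0)
base = rng.normal(size=(256, 1))
collapsed = np.repeat(base, 8, axis=1)   # every dim repeats one cue
diverse = rng.normal(size=(256, 8))      # independent cues
print(anti_collapse_penalty(collapsed))  # close to 1
print(anti_collapse_penalty(diverse))    # close to 0
```

Added to the detection objective with a small weight, such a term pushes the network to spread discriminative information across many directions instead of letting a few salient cues dominate.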
Related papers
- ThinkFake: Reasoning in Multimodal Large Language Models for AI-Generated Image Detection [51.93101033997245]
Increasing realism of AI-generated images has raised serious concerns about misinformation and privacy violations. We propose ThinkFake, a novel reasoning-based and generalizable framework for AI-generated image detection. We show that ThinkFake outperforms state-of-the-art methods on the GenImage benchmark and demonstrates strong zero-shot generalization on the challenging LOKI benchmark.
arXiv Detail & Related papers (2025-09-24T07:34:09Z) - BIDO: A Unified Approach to Address Obfuscation and Concept Drift Challenges in Image-based Malware Detection [15.388728305777908]
BIDO is a hybrid image-based malware detector designed to enhance robustness against both obfuscation and concept drift simultaneously. Specifically, to improve the discriminative power of image features, we introduce a local feature selection module. To ensure feature compactness, we design a learnable metric that pulls samples with identical labels closer.
arXiv Detail & Related papers (2025-09-04T01:48:03Z) - MiraGe: Multimodal Discriminative Representation Learning for Generalizable AI-Generated Image Detection [32.662682253295486]
We propose Multimodal Discriminative Representation Learning for Generalizable AI-generated Image Detection (MiraGe). We apply multimodal prompt learning to further refine these principles into CLIP, leveraging text embeddings as semantic anchors for effective discriminative representation learning. MiraGe achieves state-of-the-art performance, maintaining robustness even against unseen generators like Sora.
arXiv Detail & Related papers (2025-08-03T00:19:18Z) - A Meaningful Perturbation Metric for Evaluating Explainability Methods [55.09730499143998]
We introduce a novel approach which harnesses image generation models to perform targeted perturbation. Specifically, we focus on inpainting only the high-relevance pixels of an input image to modify the model's predictions while preserving image fidelity. This is in contrast to existing approaches, which often produce out-of-distribution modifications, leading to unreliable results.
arXiv Detail & Related papers (2025-04-09T11:46:41Z) - Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection [60.960988614701414]
RIGID is a training-free and model-agnostic method for robust AI-generated image detection.
RIGID significantly outperforms existing training-based and training-free detectors.
arXiv Detail & Related papers (2024-05-30T14:49:54Z) - Towards Robust GAN-generated Image Detection: a Multi-view Completion Representation [27.483031588071942]
GAN-generated image detection has become the first line of defense against malicious uses of machine-synthesized image manipulations such as deepfakes.
We propose a robust detection framework based on a novel multi-view image completion representation.
We evaluate the generalization ability of our framework across six popular GANs at different resolutions and its robustness against a broad range of perturbation attacks.
arXiv Detail & Related papers (2023-06-02T08:38:02Z) - Hierarchical Forgery Classifier On Multi-modality Face Forgery Clues [61.37306431455152]
We propose a novel Hierarchical Forgery Classifier for Multi-modality Face Forgery Detection (HFC-MFFD).
The HFC-MFFD learns a robust patch-based hybrid representation to enhance forgery authentication in multi-modality scenarios.
A specific hierarchical face forgery classifier is proposed to alleviate the class imbalance problem and further boost detection performance.
arXiv Detail & Related papers (2022-12-30T10:54:29Z) - Few-shot Forgery Detection via Guided Adversarial Interpolation [56.59499187594308]
Existing forgery detection methods suffer from significant performance drops when applied to unseen novel forgery approaches.
We propose Guided Adversarial Interpolation (GAI) to overcome the few-shot forgery detection problem.
Our method is validated to be robust to choices of majority and minority forgery approaches.
arXiv Detail & Related papers (2022-04-12T16:05:10Z) - Multi-view Contrastive Coding of Remote Sensing Images at Pixel-level [5.64497799927668]
A pixel-wise contrastive approach based on an unlabeled multi-view setting is proposed to overcome this limitation.
A pseudo-Siamese ResUnet is trained to learn a representation that aims to align features from the shifted positive pairs.
Results demonstrate both improvements in efficiency and accuracy over the state-of-the-art multi-view contrastive methods.
arXiv Detail & Related papers (2021-05-18T13:28:46Z) - SPatchGAN: A Statistical Feature Based Discriminator for Unsupervised Image-to-Image Translation [4.466402706561989]
For unsupervised image-to-image translation, we propose a discriminator architecture which focuses on the statistical features instead of individual patches.
We show that the proposed method outperforms the existing state-of-the-art models in various challenging applications including selfie-to-anime, male-to-female and glasses removal.
arXiv Detail & Related papers (2021-03-30T10:03:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.