Rethinking Cross-Generator Image Forgery Detection through DINOv3
- URL: http://arxiv.org/abs/2511.22471v1
- Date: Thu, 27 Nov 2025 14:01:50 GMT
- Title: Rethinking Cross-Generator Image Forgery Detection through DINOv3
- Authors: Zhenglin Huang, Jason Li, Haiquan Wen, Tianxiao Li, Xi Yang, Lu Qi, Bei Peng, Xiaowei Huang, Ming-Hsuan Yang, Guangliang Cheng
- Abstract summary: Cross-generator detection has emerged as a new challenge for generative models. We show that frozen visual foundation models, especially DINOv3, already exhibit strong cross-generator detection capability. We introduce a training-free token-ranking strategy followed by a lightweight linear probe to select a small subset of authenticity-relevant tokens.
- Score: 62.80415066351157
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As generative models become increasingly diverse and powerful, cross-generator detection has emerged as a new challenge. Existing detection methods often memorize artifacts of specific generative models rather than learning transferable cues, leading to substantial failures on unseen generators. Surprisingly, this work finds that frozen visual foundation models, especially DINOv3, already exhibit strong cross-generator detection capability without any fine-tuning. Through systematic studies on frequency, spatial, and token perspectives, we observe that DINOv3 tends to rely on global, low-frequency structures as weak but transferable authenticity cues instead of high-frequency, generator-specific artifacts. Motivated by this insight, we introduce a simple, training-free token-ranking strategy followed by a lightweight linear probe to select a small subset of authenticity-relevant tokens. This token subset consistently improves detection accuracy across all evaluated datasets. Our study provides empirical evidence and a feasible hypothesis for understanding why foundation models generalize across diverse generators, offering a universal, efficient, and interpretable baseline for image forgery detection.
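The token-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how tokens are ranked, so everything below is an assumption made for illustration: random arrays stand in for frozen DINOv3 patch tokens, the ranking uses a Fisher-style separation score (computed from labels, but with no gradient training), and the "linear probe" is a small logistic regression written in NumPy. The function names (`rank_tokens`, `pool_selected`, `fit_linear_probe`, `make_toy_tokens`) are hypothetical and do not come from the paper.

```python
import numpy as np

def make_toy_tokens(n, t=16, d=8, informative=(2, 5, 11), seed=0):
    """Toy stand-in for frozen patch tokens: (n, t, d) features, labels in {0, 1}."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    tokens = rng.normal(size=(n, t, d))
    # Assumption: only a few token positions carry a weak class-dependent shift.
    for i in informative:
        tokens[:, i, :] += 0.8 * labels[:, None]
    return tokens, labels

def rank_tokens(tokens, labels):
    """Score each token position without gradient training (assumed criterion).

    Uses a Fisher-style ratio: squared gap between class means divided by the
    summed class variances, averaged over channels. Returns indices, best first.
    """
    real, fake = tokens[labels == 0], tokens[labels == 1]
    gap = (real.mean(axis=0) - fake.mean(axis=0)) ** 2   # (t, d)
    spread = real.var(axis=0) + fake.var(axis=0) + 1e-8  # (t, d)
    scores = (gap / spread).mean(axis=1)                 # (t,)
    return np.argsort(scores)[::-1]

def pool_selected(tokens, idx):
    """Mean-pool the chosen token positions into one feature vector per image."""
    return tokens[:, idx, :].mean(axis=1)

def fit_linear_probe(x, y, lr=0.5, steps=500):
    """Logistic-regression probe trained with plain gradient descent."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        g = p - y
        w -= lr * (x.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def predict(x, w, b):
    return (x @ w + b > 0).astype(int)

train_x, train_y = make_toy_tokens(400, seed=0)
test_x, test_y = make_toy_tokens(400, seed=1)

# Rank on the training split, keep the three highest-scoring positions.
top = rank_tokens(train_x, train_y)[:3]
all_idx = np.arange(train_x.shape[1])

w, b = fit_linear_probe(pool_selected(train_x, top), train_y)
acc_top = (predict(pool_selected(test_x, top), w, b) == test_y).mean()

w_all, b_all = fit_linear_probe(pool_selected(train_x, all_idx), train_y)
acc_all = (predict(pool_selected(test_x, all_idx), w_all, b_all) == test_y).mean()

print("selected token positions:", sorted(top.tolist()))
print("accuracy, selected tokens:", acc_top)
print("accuracy, all tokens:", acc_all)
```

In this toy setting, pooling only the selected positions avoids diluting the weak signal with uninformative tokens, which is one plausible reading of why a small token subset could help. With a real backbone, the toy arrays would be replaced by patch tokens extracted from the frozen model.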
Related papers
- AdaptPrompt: Parameter-Efficient Adaptation of VLMs for Generalizable Deepfake Detection [7.76090543025328]
Recent advances in image generation have led to the widespread availability of highly realistic synthetic media, increasing the difficulty of reliable deepfake detection. A key challenge is generalization, as detectors trained on a narrow class of generators often fail when confronted with unseen models. We address the pressing need for generalizable detection by leveraging large vision-language models, specifically CLIP, to identify synthetic content across diverse generative techniques.
arXiv Detail & Related papers (2025-12-19T16:06:03Z)
- Beyond Semantic Features: Pixel-level Mapping for Generalized AI-Generated Image Detection [30.53429368921365]
A critical limitation of current detectors is their failure to generalize to images from unseen generative models. We introduce a simple yet remarkably effective pixel-level mapping pre-processing step to disrupt the pixel value distribution of images. We show that our approach significantly boosts the cross-generator performance of state-of-the-art detectors.
arXiv Detail & Related papers (2025-12-19T08:47:09Z)
- Foundation Visual Encoders Are Secretly Few-Shot Anomaly Detectors [58.75916798814376]
We develop a few-shot anomaly detector termed FoundAD. We observe that the anomaly amount in an image directly correlates with the difference in the learnt embeddings. The simple operator acts as an effective tool for anomaly detection to characterize and identify out-of-distribution regions in an image.
arXiv Detail & Related papers (2025-10-02T11:53:20Z)
- Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection [11.907536189598577]
Current AIGC detectors often achieve near-perfect accuracy on images produced by the same generator used for training but struggle to generalize to outputs from unseen generators. We trace this failure in part to latent prior bias: detectors learn shortcuts tied to patterns stemming from the initial noise vector rather than learning robust generative artifacts. We propose On-Manifold Adversarial Training (OMAT), which generates adversarial examples that remain on the generator's output manifold.
arXiv Detail & Related papers (2025-06-01T07:20:45Z)
- Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [81.93945602120453]
We introduce an approach that is both general and parameter-efficient for face forgery detection. We design a forgery-style mixture formulation that augments the diversity of forgery source domains. We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
- D$^3$: Scaling Up Deepfake Detection by Learning from Discrepancy [29.919663502808575]
Existing literature emphasizes the generalization capability of deepfake detection on unseen generators. This work seeks a step toward a universal deepfake detection system with better generalization and robustness.
arXiv Detail & Related papers (2024-04-06T10:45:02Z)
- GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable.
Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology.
We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
- Weakly-supervised deepfake localization in diffusion-generated images [4.548755617115687]
We propose a weakly-supervised localization problem based on the Xception network as the backbone architecture.
We show that the best performing detection method (based on local scores) is less sensitive to the looser supervision than to the mismatch in terms of dataset or generator.
arXiv Detail & Related papers (2023-11-08T10:27:36Z)
- Augment and Criticize: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection [64.65563422852568]
We improve the challenging monocular 3D object detection problem with a general semi-supervised framework.
We introduce a novel, simple, yet effective 'Augment and Criticize' framework that explores abundant informative samples from unlabeled data.
The two new detectors, dubbed 3DSeMo_DLE and 3DSeMo_FLEX, achieve state-of-the-art results with remarkable improvements of over 3.5% AP_3D/BEV (Easy) on KITTI.
arXiv Detail & Related papers (2023-03-20T16:28:15Z)
- Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are commonly indistinguishable from real to human eyes.
We propose a novel fake detection that is designed to re-synthesize testing images and extract visual cues for detection.
We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
arXiv Detail & Related papers (2021-05-29T21:22:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.