Related papers: Beyond Binary Classification: A Semi-supervised Approach to Generalized AI-generated Image Detection

Beyond Binary Classification: A Semi-supervised Approach to Generalized AI-generated Image Detection

URL: http://arxiv.org/abs/2511.19499v1
Date: Sun, 23 Nov 2025 16:02:27 GMT
Title: Beyond Binary Classification: A Semi-supervised Approach to Generalized AI-generated Image Detection
Authors: Hong-Hanh Nguyen-Le, Van-Tuan Tran, Dinh-Thuc Nguyen, Nhien-An Le-Khac,
Abstract summary: A critical vulnerability in current forensics is the failure of detectors to achieve cross-generator generalization.<n>We propose a semi-supervised approach that enhances binary classification by discovering latent architectural patterns within the "fake" class.
Score: 1.189955933770711
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The rapid advancement of generators (e.g., StyleGAN, Midjourney, DALL-E) has produced highly realistic synthetic images, posing significant challenges to digital media authenticity. These generators are typically based on a few core architectural families, primarily Generative Adversarial Networks (GANs) and Diffusion Models (DMs). A critical vulnerability in current forensics is the failure of detectors to achieve cross-generator generalization, especially when crossing architectural boundaries (e.g., from GANs to DMs). We hypothesize that this gap stems from fundamental differences in the artifacts produced by these \textbf{distinct architectures}. In this work, we provide a theoretical analysis explaining how the distinct optimization objectives of the GAN and DM architectures lead to different manifold coverage behaviors. We demonstrate that GANs permit partial coverage, often leading to boundary artifacts, while DMs enforce complete coverage, resulting in over-smoothing patterns. Motivated by this analysis, we propose the \textbf{Tri}archy \textbf{Detect}or (TriDetect), a semi-supervised approach that enhances binary classification by discovering latent architectural patterns within the "fake" class. TriDetect employs balanced cluster assignment via the Sinkhorn-Knopp algorithm and a cross-view consistency mechanism, encouraging the model to learn fundamental architectural distincts. We evaluate our approach on two standard benchmarks and three in-the-wild datasets against 13 baselines to demonstrate its generalization capability to unseen generators.

Related papers

UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? [50.92401586025528]
Unified multimodal models have recently demonstrated strong generative capabilities, yet whether and when generation improves understanding remains unclear.<n>We introduce UniG2U-Bench, a comprehensive benchmark categorizing generation-to-understanding (G2U) evaluation into 7 regimes and 30 subtasks.
arXiv Detail & Related papers (2026-03-03T18:36:16Z)
OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems.<n>Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm.<n>We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z)
AdaptPrompt: Parameter-Efficient Adaptation of VLMs for Generalizable Deepfake Detection [7.76090543025328]
Recent advances in image generation have led to the widespread availability of highly realistic synthetic media, increasing the difficulty of reliable deepfake detection.<n>A key challenge is generalization, as detectors trained on a narrow class of generators often fail when confronted with unseen models.<n>We address the pressing need for generalizable detection by leveraging large vision-language models, specifically CLIP, to identify synthetic content across diverse generative techniques.
arXiv Detail & Related papers (2025-12-19T16:06:03Z)
Scaling Up AI-Generated Image Detection via Generator-Aware Prototypes [15.99138549265524]
Generator-Aware Prototype Learning (GAPL) is a framework that constrains representation with a structured learning paradigm.<n>GAPL achieves state-of-the-art performance, showing superior detection accuracy across a wide variety of GAN and diffusion-based generators.
arXiv Detail & Related papers (2025-12-15T04:58:08Z)
OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild [40.585896860010656]
A universal AI-generated image detector must simultaneously generalize across diverse generative models and varied semantic content.<n>We propose OmniAID, a novel framework centered on a decoupled Mixture-of-Experts architecture.<n>Our model surpasses existing monolithic detectors, establishing a new, robust standard for AIGI authentication against modern, in-the-wild threats.
arXiv Detail & Related papers (2025-11-11T16:33:49Z)
Breaking Latent Prior Bias in Detectors for Generalizable AIGC Image Detection [11.907536189598577]
Current AIGC detectors often achieve near-perfect accuracy on images produced by the same generator used for training but struggle to generalize to outputs from unseen generators.<n>We trace this failure in part to latent prior bias: detectors learn shortcuts tied to patterns stemming from the initial noise vector rather than learning robust generative artifacts.<n>We propose On-Manifold Adversarial Training (OMAT), which generates adversarial examples that remain on the generator's output manifold.
arXiv Detail & Related papers (2025-06-01T07:20:45Z)
Generative Classifier for Domain Generalization [84.92088101715116]
Domain generalization aims to the generalizability of computer vision models toward distribution shifts.<n>We propose Generative-driven Domain Generalization (GCDG)<n>GCDG consists of three key modules: Heterogeneity Learning(HLC), Spurious Correlation(SCB), and Diverse Component Balancing(DCB)
arXiv Detail & Related papers (2025-04-03T04:38:33Z)
Towards Generalizable Forgery Detection and Reasoning [23.858913560970866]
We formulate detection and explanation as a unified Forgery Detection and Reasoning task (FDR-Task)<n>We introduce the Multi-Modal Forgery Reasoning dataset (MMFR-Dataset), a large-scale dataset containing 120K images across 10 generative models, with 378K reasoning annotations on forgery attributes.<n>Experiments across multiple generative models demonstrate that FakeReasoning achieves robust generalization and outperforms state-of-the-art methods on both detection and reasoning tasks.
arXiv Detail & Related papers (2025-03-27T06:54:06Z)
GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable. Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology. We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection [86.97062579515833]
We introduce the concept of Neighboring Pixel Relationships(NPR) as a means to capture and characterize the generalized structural artifacts stemming from up-sampling operations. A comprehensive analysis is conducted on an open-world dataset, comprising samples generated by tft28 distinct generative models. This analysis culminates in the establishment of a novel state-of-the-art performance, showcasing a remarkable tft11.6% improvement over existing methods.
arXiv Detail & Related papers (2023-12-16T14:27:06Z)
Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images. In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner. We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.