Navigating the Challenges of AI-Generated Image Detection in the Wild: What Truly Matters?
- URL: http://arxiv.org/abs/2507.10236v1
- Date: Mon, 14 Jul 2025 12:56:55 GMT
- Title: Navigating the Challenges of AI-Generated Image Detection in the Wild: What Truly Matters?
- Authors: Despina Konstantinidou, Dimitrios Karageorgiou, Christos Koutlis, Olga Papadopoulou, Emmanouil Schinas, Symeon Papadopoulos
- Abstract summary: We introduce ITW-SM, a new dataset of real and AI-generated images collected from major social media platforms. We identify four key factors that influence AID performance in real-world scenarios. Our modifications result in an average AUC improvement of 26.87% across various AID models under real-world conditions.
- Score: 9.916527862912941
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid advancement of generative technologies presents both unprecedented creative opportunities and significant challenges, particularly in maintaining social trust and ensuring the integrity of digital information. In light of these concerns, the challenge of AI-Generated Image Detection (AID) becomes increasingly critical. As these technologies grow more sophisticated, the quality of AI-generated images has reached a level that can easily deceive even the most discerning observers. Our systematic evaluation highlights a critical weakness in current AI-Generated Image Detection models: while they perform exceptionally well on controlled benchmark datasets, they struggle significantly with real-world variations. To assess this, we introduce ITW-SM, a new dataset of real and AI-generated images collected from major social media platforms. In this paper, we identify four key factors that influence AID performance in real-world scenarios: backbone architecture, training data composition, pre-processing strategies, and data augmentation combinations. By systematically analyzing these components, we shed light on their impact on detection efficacy. Our modifications result in an average AUC improvement of 26.87% across various AID models under real-world conditions.
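The abstract reports detection performance as AUC (area under the ROC curve), which can be computed directly from a detector's scores on real and AI-generated images via the rank-based (Mann-Whitney) formulation. A minimal stdlib-only sketch; the function name `auc` and the example scores are illustrative, not from the paper:

```python
def auc(scores_real, scores_fake):
    """AUC = P(score on a fake image > score on a real image), ties count 0.5.

    Assumes the detector outputs higher scores for images it believes
    are AI-generated. Equivalent to the area under the ROC curve.
    """
    wins = 0.0
    for f in scores_fake:
        for r in scores_real:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5  # ties contribute half a win
    return wins / (len(scores_fake) * len(scores_real))

if __name__ == "__main__":
    real = [0.1, 0.2, 0.3]  # hypothetical detector scores on real images
    fake = [0.4, 0.8, 0.9]  # hypothetical scores on AI-generated images
    print(auc(real, fake))  # 1.0 -- perfect separation
```

Because AUC depends only on the ranking of scores, it is insensitive to the detector's decision threshold, which is one reason it is a common metric for comparing AID models across datasets with different class balances.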
Related papers
- LAID: Lightweight AI-Generated Image Detection in Spatial and Spectral Domains [6.676901499867856]
Current state-of-the-art AIGI detection methods rely on large, deep neural architectures. We introduce LAID, the first framework that benchmarks and evaluates the detection performance and efficiency of off-the-shelf lightweight neural networks. Our results demonstrate that lightweight models can achieve competitive accuracy, even under adversarial conditions.
arXiv Detail & Related papers (2025-07-07T16:18:19Z)
- Quality Assessment and Distortion-aware Saliency Prediction for AI-Generated Omnidirectional Images [70.49595920462579]
This work studies the quality assessment and distortion-aware saliency prediction problems for AIGODIs. We propose two models with shared encoders based on the BLIP-2 model to evaluate the human visual experience and predict distortion-aware saliency for AI-generated omnidirectional images.
arXiv Detail & Related papers (2025-06-27T05:36:04Z)
- A Deep Learning Approach for Facial Attribute Manipulation and Reconstruction in Surveillance and Reconnaissance [5.980822697955566]
Surveillance systems play a critical role in security and reconnaissance, but their performance is often compromised by low-quality images and videos. Existing AI-based facial analysis models suffer from biases related to skin tone variations and partially occluded faces. We propose a data-driven platform that enhances surveillance capabilities by generating synthetic training data tailored to compensate for dataset biases.
arXiv Detail & Related papers (2025-06-06T23:09:17Z)
- RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors [57.81012948133832]
We present RAID (Robust evaluation of AI-generated image Detectors), a dataset of 72k diverse and highly transferable adversarial examples. Our methodology generates adversarial images that transfer with a high success rate to unseen detectors. Our findings indicate that current state-of-the-art AI-generated image detectors can be easily deceived by adversarial examples.
arXiv Detail & Related papers (2025-06-04T14:16:00Z)
- Is Artificial Intelligence Generated Image Detection a Solved Problem? [10.839070838139401]
AIGIBench is a benchmark designed to rigorously evaluate the robustness and generalization capabilities of state-of-the-art AIGI detectors. It includes 23 diverse fake image subsets that span both advanced and widely adopted image generation techniques. Experiments on 11 advanced detectors demonstrate that, despite their high reported accuracy in controlled settings, these detectors suffer significant performance drops on real-world data.
arXiv Detail & Related papers (2025-05-18T10:00:39Z)
- Towards Explainable Partial-AIGC Image Quality Assessment [51.42831861127991]
Despite extensive research on image quality assessment (IQA) for AI-generated images (AGIs), most studies focus on fully AI-generated outputs. We construct the first large-scale PAI dataset towards explainable partial-AIGC image quality assessment (EPAIQA). Our work represents a pioneering effort in the perceptual IQA field for comprehensive PAI quality assessment.
arXiv Detail & Related papers (2025-04-12T17:27:50Z)
- D-Judge: How Far Are We? Evaluating the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance [19.760989919485894]
We introduce an AI-Natural Image Discrepancy assessment benchmark (D-Judge). We construct D-ANI, a dataset with 5,000 natural images and over 440,000 AIGIs generated by nine models using Text-to-Image (T2I), Image-to-Image (I2I), and Text and Image-to-Image (TI2I) prompts. Our framework evaluates the discrepancy across five dimensions: naive image quality, semantic alignment, aesthetic appeal, downstream applicability, and human validation.
arXiv Detail & Related papers (2024-12-23T15:08:08Z)
- AI-generated Image Quality Assessment in Visual Communication [72.11144790293086]
AIGI-VC is a quality assessment database for AI-generated images in visual communication. The dataset consists of 2,500 images spanning 14 advertisement topics and 8 emotion types. It provides coarse-grained human preference annotations and fine-grained preference descriptions, benchmarking the abilities of IQA methods in preference prediction, interpretation, and reasoning.
arXiv Detail & Related papers (2024-12-20T08:47:07Z)
- Addressing Vulnerabilities in AI-Image Detection: Challenges and Proposed Solutions [0.0]
This study evaluates the effectiveness of convolutional neural networks (CNNs) and DenseNet architectures for detecting AI-generated images. We analyze the impact of updates and modifications such as Gaussian blurring, prompt text changes, and Low-Rank Adaptation (LoRA) on detection accuracy. The findings highlight vulnerabilities in current detection methods and propose strategies to enhance the robustness and reliability of AI-image detection systems.
arXiv Detail & Related papers (2024-11-26T06:35:26Z)
- Improving Interpretability and Robustness for the Detection of AI-Generated Images [6.116075037154215]
We analyze existing state-of-the-art AIGI detection methods based on frozen CLIP embeddings.
We show how to interpret them, shedding light on how images produced by various AI generators differ from real ones.
arXiv Detail & Related papers (2024-06-21T10:33:09Z)
- RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection [60.960988614701414]
RIGID is a training-free and model-agnostic method for robust AI-generated image detection.
RIGID significantly outperforms existing training-based and training-free detectors.
arXiv Detail & Related papers (2024-05-30T14:49:54Z)
- AIGCOIQA2024: Perceptual Quality Assessment of AI Generated Omnidirectional Images [70.42666704072964]
We establish a large-scale AI generated omnidirectional image IQA database named AIGCOIQA2024.
A subjective IQA experiment is conducted to assess human visual preferences from three perspectives.
We conduct a benchmark experiment to evaluate the performance of state-of-the-art IQA models on our database.
arXiv Detail & Related papers (2024-04-01T10:08:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.