We are not able to identify AI-generated images
- URL: http://arxiv.org/abs/2512.22236v1
- Date: Tue, 23 Dec 2025 11:55:40 GMT
- Title: We are not able to identify AI-generated images
- Authors: Adrien Pavão
- Abstract summary: Our dataset contains 120 difficult cases: real images sampled from CC12M, and carefully curated AI-generated counterparts produced with MidJourney. Our results indicate that, even on relatively simple portrait images, humans struggle to reliably detect AI-generated content. These findings highlight the need for greater awareness and ethical guidelines as AI-generated media becomes increasingly indistinguishable from reality.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI-generated images are now pervasive online, yet many people believe they can easily tell them apart from real photographs. We test this assumption through an interactive web experiment where participants classify 20 images as real or AI-generated. Our dataset contains 120 difficult cases: real images sampled from CC12M, and carefully curated AI-generated counterparts produced with MidJourney. In total, 165 users completed 233 sessions. Their average accuracy was 54%, only slightly above random guessing, with limited improvement across repeated attempts. Response times averaged 7.3 seconds, and some images were consistently more deceptive than others. These results indicate that, even on relatively simple portrait images, humans struggle to reliably detect AI-generated content. As synthetic media continues to improve, human judgment alone is becoming insufficient for distinguishing real from artificial data. These findings highlight the need for greater awareness and ethical guidelines as AI-generated media becomes increasingly indistinguishable from reality.
Related papers
- Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios [54.07895223545793]
This paper introduces the Real-World Robustness dataset (RRDataset) for comprehensive evaluation of detection models across three dimensions. RRDataset includes high-quality images from seven major scenarios. We benchmarked 17 detectors and 10 vision-language models (VLMs) on RRDataset and conducted a large-scale human study.
arXiv Detail & Related papers (2025-09-11T06:15:52Z) - RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors [57.81012948133832]
We present RAID (Robust evaluation of AI-generated image Detectors), a dataset of 72k diverse and highly transferable adversarial examples. Our methodology generates adversarial images that transfer with a high success rate to unseen detectors. Our findings indicate that current state-of-the-art AI-generated image detectors can be easily deceived by adversarial examples.
arXiv Detail & Related papers (2025-06-04T14:16:00Z) - Security Benefits and Side Effects of Labeling AI-Generated Images [27.771584371064968]
We study the implications of labels, including the possibility of mislabeling. We conduct a pre-registered online survey with over 1300 U.S. and EU participants. We find the undesired side effect that human-made images conveying inaccurate claims were perceived as more credible in the presence of labels.
arXiv Detail & Related papers (2025-05-28T20:24:45Z) - Could AI Trace and Explain the Origins of AI-Generated Images and Text? [53.11173194293537]
AI-generated content is increasingly prevalent in the real world. Adversaries might exploit large multimodal models to create images that violate ethical or legal standards, and paper reviewers may misuse large language models to generate reviews without genuine intellectual effort.
arXiv Detail & Related papers (2025-04-05T20:51:54Z) - Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing [55.2480439325792]
This study systematically evaluates twelve state-of-the-art AI-text detectors using our AI-Polished-Text Evaluation dataset. Our findings reveal that detectors frequently flag even minimally polished text as AI-generated, struggle to differentiate between degrees of AI involvement, and exhibit biases against older and smaller models.
arXiv Detail & Related papers (2025-02-21T18:45:37Z) - D-Judge: How Far Are We? Assessing the Discrepancies Between AI-synthesized and Natural Images through Multimodal Guidance [19.760989919485894]
We construct a large-scale multimodal dataset, D-ANI, comprising 5,000 natural images and over 440,000 AIGI samples. We then introduce an AI-Natural Image Discrepancy assessment benchmark (D-Judge) to address the critical question: how far are AI-generated images (AIGIs) from truly realistic images?
arXiv Detail & Related papers (2024-12-23T15:08:08Z) - Human vs. AI: A Novel Benchmark and a Comparative Study on the Detection of Generated Images and the Impact of Prompts [5.222694057785324]
This work examines the influence of the prompt's level of detail on the detectability of fake images. We create a novel dataset, COCOXGEN, which consists of real photos from the COCO dataset as well as images generated with SDXL and Fooocus. Our user study with 200 participants shows that images generated with longer, more detailed prompts are detected significantly more easily than those generated with short prompts.
arXiv Detail & Related papers (2024-12-12T20:37:52Z) - A Sanity Check for AI-generated Image Detection [49.08585395873425]
We propose AIDE (AI-generated Image DEtector with Hybrid Features) to detect AI-generated images. AIDE achieves +3.5% and +4.6% improvements over state-of-the-art methods.
arXiv Detail & Related papers (2024-06-27T17:59:49Z) - Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images [66.20578637253831]
There is a growing concern that the advancement of artificial intelligence (AI) technology may produce fake photos.
This study aims to comprehensively evaluate agents for distinguishing state-of-the-art AI-generated visual content.
arXiv Detail & Related papers (2023-04-25T17:51:59Z) - Explainable AI for Natural Adversarial Images [4.387699521196243]
Humans tend to assume that the AI's decision process mirrors their own.
Here we evaluate if methods from explainable AI can disrupt this assumption to help participants predict AI classifications for adversarial and standard images.
We find that both saliency maps and examples facilitate catching AI errors, but their effects are not additive, and saliency maps are more effective than examples.
arXiv Detail & Related papers (2021-06-16T20:19:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences arising from its use.