As Good As A Coin Toss: Human detection of AI-generated images, videos, audio, and audiovisual stimuli
- URL: http://arxiv.org/abs/2403.16760v5
- Date: Thu, 10 Apr 2025 20:30:04 GMT
- Title: As Good As A Coin Toss: Human detection of AI-generated images, videos, audio, and audiovisual stimuli
- Authors: Di Cooke, Abigail Edwards, Sophia Barkoff, Kathryn Kelly
- Abstract summary: We conducted a perceptual study with 1276 participants to assess how capable people were at distinguishing between authentic and synthetic media. We find that, on average, people struggled to distinguish between synthetic and authentic media, with mean detection performance close to the chance level of 50%. We also find that accuracy rates worsen when the stimuli contain any degree of synthetic content, feature foreign languages, or consist of a single modality.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: One of the current principal defenses against weaponized synthetic media continues to be the ability of the targeted individual to visually or auditorily recognize AI-generated content when they encounter it. However, as the realism of synthetic media continues to rapidly improve, it is vital to have an accurate understanding of just how susceptible people currently are to being misled by convincing but false AI-generated content. We conducted a perceptual study with 1276 participants to assess how capable people were at distinguishing between authentic and synthetic images, audio, video, and audiovisual media. We find that, on average, people struggled to distinguish between synthetic and authentic media, with mean detection performance close to the chance level of 50%. We also find that accuracy rates worsen when the stimuli contain any degree of synthetic content, feature foreign languages, or consist of a single modality. People are also less accurate at identifying synthetic images when they feature human faces, and when audiovisual stimuli have heterogeneous authenticity. Finally, we find that a higher degree of prior knowledge about synthetic media does not significantly impact detection accuracy rates, but age does, with older individuals performing worse than their younger counterparts. Collectively, these results highlight that it is no longer feasible to rely on the perceptual capabilities of people to protect themselves against the growing threat of weaponized synthetic media, and that the need for alternative countermeasures is more critical than ever before.
Related papers
- Non-verbal Real-time Human-AI Interaction in Constrained Robotic Environments [6.623088068354071]
We study the debate regarding the statistical fidelity of AI-generated data compared to human-generated data in the context of non-verbal communication using full body motion. We introduce the first framework that generates a natural non-verbal interaction between Human and AI in real-time from 2D body keypoints. Our results demonstrate that statistically distinguishable differences persist between Human and AI motion.
arXiv Detail & Related papers (2026-03-02T12:38:43Z) - Can You Tell It's AI? Human Perception of Synthetic Voices in Vishing Scenarios [3.2976205772213123]
Large Language Models and commercial speech synthesis systems now enable highly realistic AI-generated voice scams (vishing). Yet it remains unclear whether individuals can reliably distinguish AI-generated speech from human-recorded voices in realistic scam contexts. We conducted a controlled online study in which 22 participants evaluated 16 vishing-style audio clips and classified each as human or AI.
arXiv Detail & Related papers (2026-02-23T17:17:53Z) - Measuring the Robustness of Audio Deepfake Detectors [59.09338266364506]
This work systematically evaluates the robustness of 10 audio deepfake detection models against 16 common corruptions.
Using both traditional deep learning models and state-of-the-art foundation models, we make four unique observations.
arXiv Detail & Related papers (2025-03-21T23:21:17Z) - Steganography Beyond Space-Time with Chain of Multimodal AI [8.095373104009868]
Steganography is the art and science of covert writing.
As artificial intelligence continues to evolve, its ability to synthesise realistic content emerges as a threat in the hands of cybercriminals.
This study proposes a paradigm in steganography for audiovisual media, where messages are concealed beyond both spatial and temporal domains.
arXiv Detail & Related papers (2025-02-25T15:56:09Z) - Adult learners recall and recognition performance and affective feedback when learning from an AI-generated synthetic video [1.7742433461734404]
The current study recruited 500 participants to investigate adult learners' recall and recognition performance, as well as their affective feedback on the AI-generated synthetic video.
The results indicated no statistically significant difference amongst conditions on recall and recognition performance.
However, adult learners preferred to learn from the video formats rather than text materials.
arXiv Detail & Related papers (2024-11-28T21:40:28Z) - Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes [49.81915942821647]
This paper aims to evaluate the human ability to discern deepfake videos through a subjective study.
We present our findings by comparing human observers to five state-of-the-art audiovisual deepfake detection models.
We found that all AI models performed better than humans when evaluated on the same 40 videos.
arXiv Detail & Related papers (2024-05-07T07:57:15Z) - A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos [81.54357891748087]
We collect talking head videos generated from four generative methods.
We conduct controlled psychophysical experiments on visual quality, lip-audio synchronization, and head movement naturalness.
Our experiments validate consistency between model predictions and human annotations, identifying metrics that align better with human opinions than widely-used measures.
arXiv Detail & Related papers (2024-03-11T04:13:38Z) - A Representative Study on Human Detection of Artificially Generated
Media Across Countries [28.99277150719848]
State-of-the-art forgeries are almost indistinguishable from "real" media.
The majority of participants simply guessed when asked to rate them as human- or machine-generated. In addition, AI-generated media are voted more human-like across all media types and all countries.
arXiv Detail & Related papers (2023-12-10T19:34:52Z) - Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
arXiv Detail & Related papers (2023-11-10T18:38:14Z) - Training Robust Deep Physiological Measurement Models with Synthetic Video-based Data [11.31971398273479]
We propose measures to add real-world noise to synthetic physiological signals and corresponding facial videos.
Our results show that we were able to reduce the average MAE from 6.9 to 2.0.
arXiv Detail & Related papers (2023-11-09T13:55:45Z) - The Age of Synthetic Realities: Challenges and Opportunities [85.058932103181]
We highlight the crucial need for the development of forensic techniques capable of identifying harmful synthetic creations and distinguishing them from reality.
Our focus extends to various forms of media, such as images, videos, audio, and text, as we examine how synthetic realities are crafted and explore approaches to detecting these malicious creations.
This study is of paramount importance due to the rapid progress of AI generative techniques and their impact on the fundamental principles of Forensic Science.
arXiv Detail & Related papers (2023-06-09T15:55:10Z) - Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images [66.20578637253831]
There is a growing concern that the advancement of artificial intelligence (AI) technology may produce fake photos.
This study aims to comprehensively evaluate agents for distinguishing state-of-the-art AI-generated visual content.
arXiv Detail & Related papers (2023-04-25T17:51:59Z) - Fighting Malicious Media Data: A Survey on Tampering Detection and Deepfake Detection [115.83992775004043]
Recent advances in deep learning, particularly deep generative models, open the doors for producing perceptually convincing images and videos at a low cost.
This paper provides a comprehensive review of the current media tampering detection approaches, and discusses the challenges and trends in this field for future research.
arXiv Detail & Related papers (2022-12-12T02:54:08Z) - Deep Learning and Synthetic Media [0.0]
I argue that "deepfakes" and related synthetic media produced with such pipelines do not merely offer incremental improvements over previous methods, but pave the way for genuinely novel kinds of audiovisual media.
arXiv Detail & Related papers (2022-05-11T20:28:09Z) - Data-driven emotional body language generation for social robotics [58.88028813371423]
In social robotics, endowing humanoid robots with the ability to generate bodily expressions of affect can improve human-robot interaction and collaboration.
We implement a deep learning data-driven framework that learns from a few hand-designed robotic bodily expressions.
The evaluation study found that the anthropomorphism and animacy of the generated expressions are not perceived differently from the hand-designed ones.
arXiv Detail & Related papers (2022-05-02T09:21:39Z) - Audio-Visual Person-of-Interest DeepFake Detection [77.04789677645682]
The aim of this work is to propose a deepfake detector that can cope with the wide variety of manipulation methods and scenarios encountered in the real world.
We leverage a contrastive learning paradigm to learn the moving-face and audio segment embeddings that are most discriminative for each identity.
Our method can detect both single-modality (audio-only, video-only) and multi-modality (audio-video) attacks, and is robust to low-quality or corrupted videos.
arXiv Detail & Related papers (2022-04-06T20:51:40Z) - Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion [82.06128362686445]
We propose a multi-modal semantic forensic approach to handle both cheapfakes and visually persuasive deepfakes.
We leverage the idea of attribution to learn person-specific biometric patterns that distinguish a given speaker from others.
Unlike existing person-specific approaches, our method is also effective against attacks that focus on lip manipulation.
arXiv Detail & Related papers (2021-12-21T01:57:04Z) - SynFace: Face Recognition with Synthetic Data [83.15838126703719]
We devise the SynFace with identity mixup (IM) and domain mixup (DM) to mitigate the performance gap.
We also perform a systematically empirical analysis on synthetic face images to provide some insights on how to effectively utilize synthetic data for face recognition.
arXiv Detail & Related papers (2021-08-18T03:41:54Z) - More Real than Real: A Study on Human Visual Perception of Synthetic Faces [7.25613186882905]
We describe a perceptual experiment where volunteers have been exposed to synthetic face images produced by state-of-the-art Generative Adversarial Networks.
Experiment outcomes reveal how strongly we should call into question our human ability to discriminate real faces from synthetic ones generated through modern AI.
arXiv Detail & Related papers (2021-06-14T08:27:25Z) - Are GAN generated images easy to detect? A critical analysis of the state-of-the-art [22.836654317217324]
With the increased level of photorealism, synthetic media are becoming hardly distinguishable from real ones.
It is important to develop automated tools to reliably and timely detect synthetic media.
arXiv Detail & Related papers (2021-04-06T15:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.