Human Perception of Audio Deepfakes
- URL: http://arxiv.org/abs/2107.09667v1
- Date: Tue, 20 Jul 2021 09:19:42 GMT
- Title: Human Perception of Audio Deepfakes
- Authors: Nicolas M. Müller, Karla Markert, Konstantin Böttinger
- Abstract summary: We compare the ability of humans and machines in detecting audio deepfakes.
We found that the machine generally outperforms the humans in detecting audio deepfakes.
Younger participants are on average better at detecting audio deepfakes than older participants, while IT-professionals hold no advantage over laymen.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent emergence of deepfakes, computerized realistic multimedia fakes,
brought the detection of manipulated and generated content to the forefront.
While many machine learning models for deepfakes detection have been proposed,
the human detection capabilities have remained far less explored. This is of
special importance as human perception differs from machine perception and
deepfakes are generally designed to fool the human. So far, this issue has only
been addressed in the area of images and video.
To compare the ability of humans and machines in detecting audio deepfakes,
we conducted an online gamified experiment in which we asked users to discern
bona-fide audio samples from spoofed audio, generated with a variety of
algorithms. 200 users competed for 8976 game rounds with an artificial
intelligence (AI) algorithm trained for audio deepfake detection. With the
collected data we found that the machine generally outperforms the humans in
detecting audio deepfakes, but that the converse holds for a certain attack
type, for which humans are still more accurate. Furthermore, we found that
younger participants are on average better at detecting audio deepfakes than
older participants, while IT-professionals hold no advantage over laymen. We
conclude that it is important to combine human and machine knowledge in order
to improve audio deepfake detection.
Related papers
- Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes [49.81915942821647]
This paper aims to evaluate the human ability to discern deepfake videos through a subjective study.
We present our findings by comparing human observers to five state-of-the-art audiovisual deepfake detection models.
We found that all AI models performed better than humans when evaluated on the same 40 videos.
arXiv Detail & Related papers (2024-05-07T07:57:15Z) - Human Brain Exhibits Distinct Patterns When Listening to Fake Versus Real Audio: Preliminary Evidence [10.773283625658513]
In this paper we study the variations in human brain activity when listening to real and fake audio.
Preliminary results suggest that the representations learned by a state-of-the-art deepfake audio detection algorithm do not exhibit clearly distinct patterns between real and fake audio.
arXiv Detail & Related papers (2024-02-22T21:44:58Z) - System Fingerprint Recognition for Deepfake Audio: An Initial Dataset and Investigation [51.06875680387692]
We present the first deepfake audio dataset for system fingerprint recognition (SFR)
We collected the dataset from the speech synthesis systems of seven Chinese vendors that use the latest state-of-the-art deep learning technologies.
arXiv Detail & Related papers (2022-08-21T05:15:40Z) - Deepfake Caricatures: Amplifying attention to artifacts increases deepfake detection by humans and machines [17.7858728343141]
Deepfakes pose a serious threat to digital well-being by fueling misinformation.
We introduce a framework for amplifying artifacts in deepfake videos to make them more detectable by people.
We propose a novel, semi-supervised Artifact Attention module, which is trained on human responses to create attention maps that highlight video artifacts.
arXiv Detail & Related papers (2022-06-01T14:43:49Z) - Audio-Visual Person-of-Interest DeepFake Detection [77.04789677645682]
The aim of this work is to propose a deepfake detector that can cope with the wide variety of manipulation methods and scenarios encountered in the real world.
We leverage a contrastive learning paradigm to learn the moving-face and audio segment embeddings that are most discriminative for each identity.
Our method can detect both single-modality (audio-only, video-only) and multi-modality (audio-video) attacks, and is robust to low-quality or corrupted videos.
arXiv Detail & Related papers (2022-04-06T20:51:40Z) - Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion [82.06128362686445]
We propose a multi-modal semantic forensic approach to handle both cheapfakes and visually persuasive deepfakes.
We leverage the idea of attribution to learn person-specific biometric patterns that distinguish a given speaker from others.
Unlike existing person-specific approaches, our method is also effective against attacks that focus on lip manipulation.
arXiv Detail & Related papers (2021-12-21T01:57:04Z) - Evaluation of an Audio-Video Multimodal Deepfake Dataset using Unimodal and Multimodal Detectors [18.862258543488355]
Deepfakes can cause security and privacy issues.
A new domain of cloning human voices using deep-learning technologies is also emerging.
To develop a good deepfake detector, we need a detector that can detect deepfakes of multiple modalities.
arXiv Detail & Related papers (2021-09-07T11:00:20Z) - WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection [82.42495493102805]
We introduce a new dataset WildDeepfake which consists of 7,314 face sequences extracted from 707 deepfake videos collected completely from the internet.
We conduct a systematic evaluation of a set of baseline detection networks on both existing and our WildDeepfake datasets, and show that WildDeepfake is indeed a more challenging dataset, where the detection performance can decrease drastically.
arXiv Detail & Related papers (2021-01-05T11:10:32Z) - Deepfake detection: humans vs. machines [4.485016243130348]
We present a subjective study conducted in a crowdsourcing-like scenario, which systematically evaluates how hard it is for humans to see if the video is deepfake or not.
For each video, the simple question "Is the face of the person in the video real or fake?" was answered on average by 19 naïve subjects.
The evaluation demonstrates that while human perception is very different from machine perception, both are successfully fooled by deepfakes, albeit in different ways.
arXiv Detail & Related papers (2020-09-07T15:20:37Z) - Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using Affective Cues [75.1731999380562]
We present a learning-based method for detecting real and fake deepfake multimedia content.
We extract and analyze the similarity between the two audio and visual modalities from within the same video.
We compare our approach with several SOTA deepfake detection methods and report per-video AUC of 84.4% on the DFDC and 96.6% on the DF-TIMIT datasets.
arXiv Detail & Related papers (2020-03-14T22:07:26Z)
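For context, the per-video AUC figures reported above are ROC AUC values, which can be computed from detector scores via the rank-statistic (Mann-Whitney U) formulation: the probability that a randomly chosen fake video receives a higher score than a randomly chosen real one. A minimal sketch, assuming hypothetical per-video scores and labels (the function name and sample data are illustrative, not from any of the papers listed):

```python
def roc_auc(scores, labels):
    """ROC AUC via the rank-statistic (Mann-Whitney U) formulation.

    scores: detector outputs, higher means "more likely fake".
    labels: 1 for fake (positive class), 0 for real.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one fake and one real sample")
    # Count fake/real pairs where the fake sample outranks the real one;
    # ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-video scores: perfect separation yields AUC = 1.0,
# while a detector no better than chance hovers around 0.5.
print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0]))  # 1.0
```

An AUC of 0.844 (as on DFDC) therefore means the detector ranks a fake video above a real one about 84.4% of the time.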
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.