Video Manipulations Beyond Faces: A Dataset with Human-Machine Analysis
- URL: http://arxiv.org/abs/2207.13064v2
- Date: Wed, 27 Jul 2022 02:50:45 GMT
- Title: Video Manipulations Beyond Faces: A Dataset with Human-Machine Analysis
- Authors: Trisha Mittal, Ritwik Sinha, Viswanathan Swaminathan, John Collomosse,
Dinesh Manocha
- Abstract summary: We present VideoSham, a dataset consisting of 826 videos (413 real and 413 manipulated).
Many of the existing deepfake datasets focus exclusively on two types of facial manipulations -- swapping with a different subject's face or altering the existing face.
Our analysis shows that state-of-the-art manipulation detection algorithms only work for a few specific attacks and do not scale well on VideoSham.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As tools for content editing mature, and artificial intelligence (AI) based
algorithms for synthesizing media grow, the presence of manipulated content
across online media is increasing. This phenomenon causes the spread of
misinformation, creating a greater need to distinguish between "real" and
"manipulated" content. To this end, we present VideoSham, a dataset consisting
of 826 videos (413 real and 413 manipulated). Many of the existing deepfake
datasets focus exclusively on two types of facial manipulations -- swapping
with a different subject's face or altering the existing face. VideoSham, on
the other hand, contains more diverse, context-rich, human-centric, and
high-resolution videos manipulated using a combination of 6 different spatial
and temporal attacks. Our analysis shows that state-of-the-art manipulation
detection algorithms only work for a few specific attacks and do not scale well
on VideoSham. We performed a user study on Amazon Mechanical Turk with 1200
participants to understand if they can differentiate between the real and
manipulated videos in VideoSham. Finally, we dig deeper into the strengths and
weaknesses of human and SOTA-algorithm performance to identify gaps that need
to be filled with better AI algorithms.
Related papers
- Deepfake detection in videos with multiple faces using geometric-fakeness features [79.16635054977068]
Deepfakes of victims or public figures can be used by fraudsters for blackmail, extortion, and financial fraud.
In our research, we propose using geometric-fakeness features (GFF) that characterize the dynamic degree of a face's presence in a video.
We employ our approach to analyze videos in which multiple faces are simultaneously present.
arXiv Detail & Related papers (2024-10-10T13:10:34Z) - A Multimodal Framework for Deepfake Detection [0.0]
Deepfakes, synthetic media created using AI, can convincingly alter videos and audio to misrepresent reality.
Our research addresses the critical issue of deepfakes through an innovative multimodal approach.
Our framework combines visual and auditory analyses, yielding an accuracy of 94%.
arXiv Detail & Related papers (2024-10-04T14:59:10Z) - What Matters in Detecting AI-Generated Videos like Sora? [51.05034165599385]
The gap between synthetic and real-world videos remains under-explored.
In this study, we compare real-world videos with those generated by a state-of-the-art AI model, Stable Video Diffusion.
Our model is capable of detecting videos generated by Sora with high accuracy, even without exposure to any Sora videos during training.
arXiv Detail & Related papers (2024-06-27T23:03:58Z) - A Video Is Worth 4096 Tokens: Verbalize Videos To Understand Them In Zero Shot [67.00455874279383]
We propose verbalizing long videos to generate descriptions in natural language, then performing video-understanding tasks on the generated story as opposed to the original video.
Our method, despite being zero-shot, achieves significantly better results than supervised baselines for video understanding.
To alleviate the lack of story-understanding benchmarks, we publicly release the first dataset for persuasion strategy identification, a crucial task in computational social science.
arXiv Detail & Related papers (2023-05-16T19:13:11Z) - How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios [73.24092762346095]
We introduce two large-scale datasets with over 60,000 videos annotated for emotional response and subjective wellbeing.
The Video Cognitive Empathy dataset contains annotations for distributions of fine-grained emotional responses, allowing models to gain a detailed understanding of affective states.
The Video to Valence dataset contains annotations of relative pleasantness between videos, which enables predicting a continuous spectrum of wellbeing.
arXiv Detail & Related papers (2022-10-18T17:58:25Z) - Audio-Visual Person-of-Interest DeepFake Detection [77.04789677645682]
The aim of this work is to propose a deepfake detector that can cope with the wide variety of manipulation methods and scenarios encountered in the real world.
We leverage a contrastive learning paradigm to learn the moving-face and audio segment embeddings that are most discriminative for each identity.
Our method can detect both single-modality (audio-only, video-only) and multi-modality (audio-video) attacks, and is robust to low-quality or corrupted videos.
arXiv Detail & Related papers (2022-04-06T20:51:40Z) - Detecting Deepfake Videos Using Euler Video Magnification [1.8506048493564673]
Deepfake videos are videos manipulated using advanced machine learning techniques.
In this paper, we examine a technique for possible identification of deepfake videos.
Our approach uses features extracted from the Euler technique to train three models to classify counterfeit and unaltered videos.
arXiv Detail & Related papers (2021-01-27T17:37:23Z) - Detecting Deep-Fake Videos from Appearance and Behavior [0.0]
We describe a biometric-based forensic technique for detecting face-swap deep fakes.
We show the efficacy of this approach across several large-scale video datasets.
arXiv Detail & Related papers (2020-04-29T21:38:22Z) - Video Face Manipulation Detection Through Ensemble of CNNs [17.051112469244778]
We tackle the problem of face manipulation detection in video sequences targeting modern facial manipulation techniques.
In particular, we study the ensembling of different trained Convolutional Neural Network (CNN) models.
We show that combining these networks leads to promising face manipulation detection results on two publicly available datasets.
arXiv Detail & Related papers (2020-04-16T14:19:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information (including all content) and is not responsible for any consequences.