Improving Video Deepfake Detection: A DCT-Based Approach with
Patch-Level Analysis
- URL: http://arxiv.org/abs/2310.11204v2
- Date: Tue, 9 Jan 2024 08:57:08 GMT
- Title: Improving Video Deepfake Detection: A DCT-Based Approach with
Patch-Level Analysis
- Authors: Luca Guarnera (1), Salvatore Manganello (1), Sebastiano Battiato (1)
((1) University of Catania)
- Abstract summary: The I-frames were extracted in order to provide faster computation and analysis than approaches described in the literature.
To identify the discriminating regions within individual video frames, the entire frame, background, face, eyes, nose, mouth, and face frame were analyzed separately.
Experimental results show that the eye and mouth regions are the most discriminative for determining the nature of the video under analysis.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A new algorithm for the detection of deepfakes in digital videos is
presented. The I-frames were extracted in order to provide faster computation
and analysis than approaches described in the literature. To identify the
discriminating regions within individual video frames, the entire frame,
background, face, eyes, nose, mouth, and face frame were analyzed separately.
The Beta components of the AC coefficients of the Discrete Cosine Transform
(DCT) were extracted and used as input to standard classifiers. Experimental
results show that the eye and mouth regions are the most discriminative for
determining the nature of the video under analysis.
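The DCT/Beta feature extraction described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the AC coefficients of the 8x8 block DCT follow a zero-mean Laplacian distribution whose scale parameter (one Beta value per AC frequency) is estimated by its maximum-likelihood estimator, the mean absolute coefficient value. Function names and the exact parameterization of Beta are assumptions.

```python
import numpy as np
from scipy.fftpack import dct

def block_dct(frame, block=8):
    """Compute the block-wise 2-D type-II DCT of a grayscale frame."""
    h, w = frame.shape
    h -= h % block
    w -= w % block
    frame = frame[:h, :w].astype(np.float64)
    # Split into (n_block_rows, n_block_cols, block, block) tiles.
    blocks = frame.reshape(h // block, block, w // block, block).swapaxes(1, 2)
    return dct(dct(blocks, axis=-1, norm='ortho'), axis=-2, norm='ortho')

def beta_features(frame, block=8):
    """Estimate one Laplacian scale parameter per AC frequency.

    Assumes AC DCT coefficients are zero-mean Laplacian; the ML estimate
    of the scale is the mean absolute value over all blocks.
    """
    coeffs = block_dct(frame, block)           # (rows, cols, block, block)
    flat = coeffs.reshape(-1, block * block)   # one row per block
    betas = np.abs(flat).mean(axis=0)          # ML scale per frequency
    return betas[1:]                           # drop the DC term -> 63 features
```

The resulting 63-dimensional vector (one value per AC position) is the kind of compact descriptor that can be fed to standard classifiers, as the abstract describes.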
Related papers
- Zero-Shot Video Semantic Segmentation based on Pre-Trained Diffusion Models [96.97910688908956]
We introduce the first zero-shot approach for Video Semantic Segmentation (VSS) based on pre-trained diffusion models.
We propose a framework tailored for VSS based on pre-trained image and video diffusion models.
Experiments show that our proposed approach outperforms existing zero-shot image semantic segmentation approaches.
arXiv Detail & Related papers (2024-05-27T08:39:38Z) - AVTENet: A Human-Cognition-Inspired Audio-Visual Transformer-Based Ensemble Network for Video Deepfake Detection [49.81915942821647]
This study introduces the audio-visual transformer-based ensemble network (AVTENet) to detect deepfake videos.
For evaluation, we use the recently released benchmark multimodal audio-video FakeAVCeleb dataset.
For a detailed analysis, we evaluate AVTENet, its variants, and several existing methods on multiple test sets of the FakeAVCeleb dataset.
arXiv Detail & Related papers (2023-10-19T19:01:26Z) - Glitch in the Matrix: A Large Scale Benchmark for Content Driven
Audio-Visual Forgery Detection and Localization [20.46053083071752]
We propose and benchmark a new dataset, Localized Audio-Visual DeepFake (LAV-DF).
LAV-DF consists of strategic content-driven audio, visual and audio-visual manipulations.
The proposed baseline method, Boundary Aware Temporal Forgery Detection (BA-TFD), is a 3D Convolutional Neural Network-based architecture.
arXiv Detail & Related papers (2023-05-03T08:48:45Z) - Adaptive occlusion sensitivity analysis for visually explaining video
recognition networks [12.75077781554099]
Occlusion sensitivity analysis is commonly used to explain single-image classification.
This paper proposes a method for visually explaining the decision-making process of video recognition networks.
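The occlusion-sensitivity idea underlying the paper above can be sketched for a single image. This is a hedged illustration of the standard technique, not the paper's adaptive video extension; `model` is a hypothetical callable mapping an image to a vector of class probabilities.

```python
import numpy as np

def occlusion_sensitivity(model, image, target_class, patch=16, stride=8, fill=0.0):
    """Slide an occluding patch over the image and record the drop in the
    target-class score; large drops mark regions the model relies on.
    """
    h, w = image.shape[:2]
    base = model(image)[target_class]
    heat = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill  # mask one region
            heat[i, j] = base - model(occluded)[target_class]
    return heat
```

The heatmap can then be upsampled and overlaid on the input frame for visualization; video methods extend this by occluding spatio-temporal volumes instead of 2-D patches.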
arXiv Detail & Related papers (2022-07-26T12:42:51Z) - Spatial-Temporal Frequency Forgery Clue for Video Forgery Detection in
VIS and NIR Scenario [87.72258480670627]
Existing face forgery detection methods based on the frequency domain find that GAN-forged images have obvious grid-like visual artifacts in the frequency spectrum compared to real images.
This paper proposes a Cosine Transform-based Forgery Clue Augmentation Network (FCAN-DCT) to achieve a more comprehensive spatial-temporal feature representation.
arXiv Detail & Related papers (2022-07-05T09:27:53Z) - Video Salient Object Detection via Contrastive Features and Attention
Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z) - HighlightMe: Detecting Highlights from Human-Centric Videos [52.84233165201391]
We present a domain- and user-preference-agnostic approach to detect highlightable excerpts from human-centric videos.
We use an autoencoder network equipped with spatial-temporal graph convolutions to detect human activities and interactions.
We observe a 4-12% improvement in the mean average precision of matching the human-annotated highlights over state-of-the-art methods.
arXiv Detail & Related papers (2021-10-05T01:18:15Z) - Shot boundary detection method based on a new extensive dataset and
mixed features [68.8204255655161]
Shot boundary detection in video is one of the key stages of video data processing.
A new method for shot boundary detection based on several video features, such as color histograms and object boundaries, is proposed.
arXiv Detail & Related papers (2021-09-02T16:19:24Z) - Sharp Multiple Instance Learning for DeepFake Video Detection [54.12548421282696]
We introduce a new problem of partial face attack in DeepFake video, where only video-level labels are provided but not all the faces in the fake videos are manipulated.
A sharp MIL (S-MIL) is proposed which builds a direct mapping from instance embeddings to bag prediction.
Experiments on FFPMS and widely used DFDC dataset verify that S-MIL is superior to other counterparts for partially attacked DeepFake video detection.
arXiv Detail & Related papers (2020-08-11T08:52:17Z) - Dynamic texture analysis for detecting fake faces in video sequences [6.1356022122903235]
This work explores the analysis of texture-temporal dynamics of the video signal.
The goal is to characterize and distinguish real from fake sequences.
We propose to build multiple binary decisions based on the joint analysis of temporal segments.
arXiv Detail & Related papers (2020-07-30T07:21:24Z) - Detecting Forged Facial Videos using convolutional neural network [0.0]
We propose to use smaller (fewer parameters to learn) convolutional neural networks (CNN) for a data-driven approach to forged video detection.
To validate our approach, we investigate the FaceForensics public dataset detailing both frame-based and video-based results.
arXiv Detail & Related papers (2020-05-17T19:04:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.