Deep Ensemble Learning with Frame Skipping for Face Anti-Spoofing
- URL: http://arxiv.org/abs/2307.02858v2
- Date: Tue, 11 Jul 2023 07:06:07 GMT
- Title: Deep Ensemble Learning with Frame Skipping for Face Anti-Spoofing
- Authors: Usman Muhammad, Md Ziaul Hoque, Mourad Oussalah and Jorma Laaksonen
- Abstract summary: Face presentation attacks (PA), also known as spoofing attacks, pose a substantial threat to biometric systems.
Several video-based methods have been presented in the literature that analyze facial motion in successive video frames.
In this paper, we rephrase the face anti-spoofing task as a motion prediction problem and introduce a deep ensemble learning model with a frame skipping mechanism.
- Score: 5.543184872682789
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face presentation attacks (PA), also known as spoofing attacks, pose a
substantial threat to biometric systems that rely on facial recognition
systems, such as access control systems, mobile payments, and identity
verification systems. To mitigate the spoofing risk, several video-based
methods have been presented in the literature that analyze facial motion in
successive video frames. However, estimating the motion between adjacent frames
is a challenging task and requires high computational cost. In this paper, we
rephrase the face anti-spoofing task as a motion prediction problem and
introduce a deep ensemble learning model with a frame skipping mechanism. In
particular, the proposed frame skipping adopts a uniform sampling approach by
dividing the original video into video clips of fixed size. By doing so, every
nth frame of the clip is selected to ensure that the temporal patterns can
easily be perceived during the training of three different recurrent neural
networks (RNNs). Motivated by the performance of individual RNNs, a meta-model
is developed to improve the overall detection performance by combining the
prediction of individual RNNs. Extensive experiments were performed on four
datasets, and state-of-the-art performance is reported on MSU-MFSD (3.12%),
Replay-Attack (11.19%), and OULU-NPU (12.23%) databases by using half total
error rates (HTERs) in the most challenging cross-dataset testing scenario.
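The frame skipping step and the HTER metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The names `skip_frames`, `clip_size`, and `step` are hypothetical, and two details are assumptions because the abstract does not specify them: sampling starts at the first frame of each clip, and a trailing partial clip is dropped. HTER is computed with its standard definition, the mean of the false acceptance rate and the false rejection rate.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def skip_frames(frames: Sequence[T], clip_size: int, step: int) -> List[List[T]]:
    """Split `frames` into fixed-size clips and keep every `step`-th frame of each.

    Hypothetical helper: parameter names and edge-case handling are assumptions.
    """
    if clip_size <= 0 or step <= 0:
        raise ValueError("clip_size and step must be positive")
    clips = []
    # Iterate over full clips only; a trailing partial clip is dropped (assumption).
    for start in range(0, len(frames) - clip_size + 1, clip_size):
        clip = frames[start:start + clip_size]
        # Uniform sampling inside the clip, starting at its first frame (assumption).
        clips.append(list(clip[::step]))
    return clips


def hter(far: float, frr: float) -> float:
    """Half total error rate: mean of false acceptance and false rejection rates."""
    return (far + frr) / 2.0


if __name__ == "__main__":
    video = list(range(20))  # stand-in for 20 decoded frames
    print(skip_frames(video, clip_size=8, step=4))  # [[0, 4], [8, 12]]
    print(hter(0.25, 0.75))  # 0.5
```

Each sampled clip would then be fed to the recurrent networks as a shorter sequence; how the clip size and the skipping interval are chosen is not stated in the abstract.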
Related papers
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z) - Patch Spatio-Temporal Relation Prediction for Video Anomaly Detection [19.643936110623653]
Video Anomaly Detection (VAD) aims to identify abnormalities within a specific context and timeframe.
Recent deep learning-based VAD models have shown promising results by generating high-resolution frames.
We propose a self-supervised learning approach for VAD through an inter-patch relationship prediction task.
arXiv Detail & Related papers (2024-03-28T03:07:16Z) - Face Anti-Spoofing from the Perspective of Data Sampling [0.342658286826597]
Face presentation attack detection plays a vital role in providing secure facial access to digital devices.
Most existing video-based PAD countermeasures lack the ability to cope with long-range temporal variations in videos.
This paper proposes a video processing scheme that models the long-range temporal variations based on Gaussian Weighting Function.
arXiv Detail & Related papers (2022-08-28T07:54:30Z) - NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition [89.84188594758588]
A novel Non-saliency Suppression Network (NSNet) is proposed to suppress the responses of non-salient frames.
NSNet achieves the state-of-the-art accuracy-efficiency trade-off and presents a significantly faster (2.4~4.3x) practical inference speed than state-of-the-art methods.
arXiv Detail & Related papers (2022-07-21T09:41:22Z) - Shuffled Patch-Wise Supervision for Presentation Attack Detection [12.031796234206135]
Face anti-spoofing is essential to prevent false facial verification by using a photo, video, mask, or a different substitute for an authorized person's face.
Most presentation attack detection systems suffer from overfitting, where they achieve near-perfect scores on a single dataset but fail on a different dataset with more realistic data.
We propose a new PAD approach, which combines pixel-wise binary supervision with patch-based CNN.
arXiv Detail & Related papers (2021-09-08T08:14:13Z) - Frame-rate Up-conversion Detection Based on Convolutional Neural Network for Learning Spatiotemporal Features [7.895528973776606]
This paper proposes a frame-rate conversion detection network (FCDNet) that learns forensic features caused by FRUC in an end-to-end fashion.
FCDNet uses a stack of consecutive frames as the input and effectively learns artifacts using network blocks to learn features.
arXiv Detail & Related papers (2021-03-25T08:47:46Z) - Aurora Guard: Reliable Face Anti-Spoofing via Mobile Lighting System [103.5604680001633]
Anti-spoofing against high-resolution rendering replay of paper photos or digital videos remains an open problem.
We propose a simple yet effective face anti-spoofing system, termed Aurora Guard (AG).
arXiv Detail & Related papers (2021-02-01T09:17:18Z) - Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design.
Our proposed method obtains the frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
arXiv Detail & Related papers (2020-11-05T11:34:12Z) - Sharp Multiple Instance Learning for DeepFake Video Detection [54.12548421282696]
We introduce a new problem of partial face attack in DeepFake video, where only video-level labels are provided but not all the faces in the fake videos are manipulated.
A sharp MIL (S-MIL) is proposed which builds direct mapping from instance embeddings to bag prediction.
Experiments on FFPMS and the widely used DFDC dataset verify that S-MIL is superior to other counterparts for partially attacked DeepFake video detection.
arXiv Detail & Related papers (2020-08-11T08:52:17Z) - Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos [82.02074241700728]
In this paper, we present a spatio-temporal action recognition model that is trained with only video-level labels.
Our method leverages per-frame person detectors that have been trained on large image datasets, within a Multiple Instance Learning framework.
We show how we can apply our method in cases where the standard Multiple Instance Learning assumption, that each bag contains at least one instance with the specified label, is invalid.
arXiv Detail & Related papers (2020-07-21T10:45:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.