TransRPPG: Remote Photoplethysmography Transformer for 3D Mask Face
Presentation Attack Detection
- URL: http://arxiv.org/abs/2104.07419v1
- Date: Thu, 15 Apr 2021 12:33:13 GMT
- Title: TransRPPG: Remote Photoplethysmography Transformer for 3D Mask Face
Presentation Attack Detection
- Authors: Zitong Yu, Xiaobai Li, Pichao Wang, Guoying Zhao
- Abstract summary: 3D mask face presentation attack detection (PAD) plays a vital role in securing face recognition systems from 3D mask attacks.
We propose a pure rPPG transformer (TransRPPG) framework for learning intrinsic liveness representation efficiently.
Our TransRPPG is lightweight and efficient (with only 547K parameters and 763M FLOPs), which is promising for mobile-level applications.
- Score: 53.98866801690342
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: 3D mask face presentation attack detection (PAD) plays a vital role in
securing face recognition systems from emergent 3D mask attacks. Recently,
remote photoplethysmography (rPPG) has been developed as an intrinsic liveness
clue for 3D mask PAD without relying on the mask appearance. However, rPPG
features for 3D mask PAD still require expert knowledge to design manually,
which limits their further progress in the deep learning and big data era. In
this letter, we propose a pure rPPG transformer (TransRPPG) framework for
learning intrinsic liveness representation efficiently. At first, rPPG-based
multi-scale spatial-temporal maps (MSTmap) are constructed from facial skin and
background regions. Then the transformer fully mines the global relationship
within MSTmaps for liveness representation, and gives a binary prediction for
3D mask detection. Comprehensive experiments are conducted on two benchmark
datasets to demonstrate the efficacy of the TransRPPG on both intra- and
cross-dataset testings. Our TransRPPG is lightweight and efficient (with only
547K parameters and 763M FLOPs), which is promising for mobile-level
applications.
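The MSTmap construction described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the ROI layout, the use of a single color channel, and the simple subset-averaging multi-scale scheme are assumptions made here for brevity (MSTmap-style work typically uses multiple color channels over facial skin and background regions, with normalization).

```python
from itertools import combinations

def roi_means(frames, rois):
    """Average pixel intensity inside each ROI for every frame.
    frames: list of 2D lists (H x W) of single-channel values (illustrative;
    real rPPG pipelines use multiple color channels). rois: list of
    (top, left, bottom, right) boxes. Returns one temporal signal per ROI."""
    series = []
    for top, left, bottom, right in rois:
        per_frame = []
        for f in frames:
            vals = [f[r][c] for r in range(top, bottom)
                            for c in range(left, right)]
            per_frame.append(sum(vals) / len(vals))
        series.append(per_frame)
    return series

def mst_map(series):
    """Multi-scale spatial-temporal map: one row per non-empty subset of
    ROIs, each row the mean of that subset's temporal signals (normalization
    and channel stacking omitted for brevity)."""
    n = len(series)
    t_len = len(series[0])
    rows = []
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            rows.append([sum(series[i][t] for i in subset) / len(subset)
                         for t in range(t_len)])
    return rows  # shape: (2^n - 1) x T
```

With n ROIs the map has 2^n - 1 rows, so n is kept small in practice; each row is then a token-like temporal signal that a transformer can attend over globally to produce the binary live/mask prediction.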
Related papers
- Triple Point Masking [49.39218611030084]
Existing 3D mask learning methods encounter performance bottlenecks under limited data.
We introduce a triple point masking scheme, named TPM, which serves as a scalable framework for pre-training of masked autoencoders.
Extensive experiments show that the four baselines equipped with the proposed TPM achieve comprehensive performance improvements on various downstream tasks.
arXiv Detail & Related papers (2024-09-26T05:33:30Z)
- A transfer learning approach with convolutional neural network for Face Mask Detection [0.30693357740321775]
We propose a mask recognition system based on transfer learning and Inception v3 architecture.
In addition to masked and unmasked faces, it can also detect cases of incorrect mask use.
arXiv Detail & Related papers (2023-10-29T07:38:33Z)
- Flow-Attention-based Spatio-Temporal Aggregation Network for 3D Mask Detection [12.160085404239446]
We propose a novel 3D mask detection framework called FASTEN.
We tailor the network to focus more on fine details in large movements, which can eliminate redundant temporal feature interference.
FASTEN only requires five frames input and outperforms eight competitors for both intra-dataset and cross-dataset evaluations.
arXiv Detail & Related papers (2023-10-25T11:54:21Z)
- Masked Motion Predictors are Strong 3D Action Representation Learners [143.9677635274393]
In 3D human action recognition, limited supervised data makes it challenging to fully tap into the modeling potential of powerful networks such as transformers.
We show that instead of following the prevalent pretext to perform masked self-component reconstruction in human joints, explicit contextual motion modeling is key to the success of learning effective feature representation for 3D action recognition.
arXiv Detail & Related papers (2023-08-14T11:56:39Z)
- Mask Attack Detection Using Vascular-weighted Motion-robust rPPG Signals [21.884783786547782]
rPPG-based face anti-spoofing methods often suffer from performance degradation due to unstable face alignment in the video sequence.
A landmark-anchored face stitching method is proposed to align the faces robustly and precisely at the pixel-wise level by using both SIFT keypoints and facial landmarks.
A lightweight EfficientNet with a Gated Recurrent Unit (GRU) is designed to extract both spatial and temporal features for classification.
arXiv Detail & Related papers (2023-05-25T11:22:17Z)
- Total Scale: Face-to-Body Detail Reconstruction from Sparse RGBD Sensors [52.38220261632204]
Flat facial surfaces frequently occur in the PIFu-based reconstruction results.
We propose a two-scale PIFu representation to enhance the quality of the reconstructed facial details.
Experiments demonstrate the effectiveness of our approach in vivid facial details and deforming body shapes.
arXiv Detail & Related papers (2021-12-03T18:46:49Z)
- Contrastive Context-Aware Learning for 3D High-Fidelity Mask Face Presentation Attack Detection [103.7264459186552]
Face presentation attack detection (PAD) is essential to secure face recognition systems.
Most existing 3D mask PAD benchmarks suffer from several drawbacks.
We introduce a large-scale High-Fidelity Mask dataset to bridge the gap to real-world applications.
arXiv Detail & Related papers (2021-04-13T12:48:38Z)
- A 3D model-based approach for fitting masks to faces in the wild [9.958467179573235]
We present a 3D model-based approach called WearMask3D for augmenting face images of various poses to the masked face counterparts.
Our method proceeds by first fitting a 3D morphable model on the input image, second overlaying the mask surface onto the face model and warping the respective mask texture, and last projecting the 3D mask back to 2D.
Experimental results demonstrate WearMask3D produces more realistic masked images, and utilizing these images for training leads to improved recognition accuracy of masked faces.
arXiv Detail & Related papers (2021-03-01T06:50:18Z)
- Revisiting 3D Context Modeling with Supervised Pre-training for Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.