M3FAS: An Accurate and Robust MultiModal Mobile Face Anti-Spoofing System
- URL: http://arxiv.org/abs/2301.12831v3
- Date: Thu, 21 Mar 2024 05:39:44 GMT
- Title: M3FAS: An Accurate and Robust MultiModal Mobile Face Anti-Spoofing System
- Authors: Chenqi Kong, Kexin Zheng, Yibing Liu, Shiqi Wang, Anderson Rocha, Haoliang Li
- Abstract summary: Face presentation attacks (FPA) have brought increasing concerns to the public through various malicious applications.
We devise an accurate and robust MultiModal Mobile Face Anti-Spoofing system named M3FAS.
- Score: 39.37647248710612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face presentation attacks (FPA), also known as face spoofing, have brought increasing concerns to the public through various malicious applications, such as financial fraud and privacy leakage. Therefore, safeguarding face recognition systems against FPA is of utmost importance. Although existing learning-based face anti-spoofing (FAS) models can achieve outstanding detection performance, they lack generalization capability and suffer significant performance drops in unforeseen environments. Many methodologies seek to use auxiliary modality data (e.g., depth and infrared maps) during presentation attack detection (PAD) to address this limitation. However, these methods can be limited since (1) they require specific sensors such as depth and infrared cameras for data capture, which are rarely available on commodity mobile devices, and (2) they cannot work properly in practical scenarios when either modality is missing or of poor quality. In this paper, we devise an accurate and robust MultiModal Mobile Face Anti-Spoofing system named M3FAS to overcome the issues above. The primary innovation of this work lies in the following aspects: (1) To achieve robust PAD, our system combines visual and auditory modalities using three commonly available sensors: camera, speaker, and microphone; (2) We design a novel two-branch neural network with three hierarchical feature aggregation modules to perform cross-modal feature fusion; (3) We propose a multi-head training strategy, allowing the model to output predictions from the vision, acoustic, and fusion heads, resulting in a more flexible PAD. Extensive experiments have demonstrated the accuracy, robustness, and flexibility of M3FAS under various challenging experimental settings. The source code and dataset are available at: https://github.com/ChenqiKONG/M3FAS/
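The multi-head strategy described in the abstract (separate vision, acoustic, and fusion heads, so the system still works when one modality is missing or degraded) can be sketched in plain Python. Note this is an illustrative sketch only: the function name `predict_liveness`, the 0.5 decision threshold, and the logit-averaging "fusion head" are assumptions made here for clarity; the actual M3FAS fusion head is a learned network branch, not an average.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a live-face probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_liveness(vision_logit=None, acoustic_logit=None):
    """Flexible PAD decision in the spirit of M3FAS's multi-head training:
    use the fusion head when both modalities are available, otherwise fall
    back to whichever single-modality head produced a logit.

    The 'fusion head' here is mocked as a simple average of the two logits;
    the real system learns a dedicated cross-modal fusion branch.
    """
    if vision_logit is not None and acoustic_logit is not None:
        logit = 0.5 * (vision_logit + acoustic_logit)  # stand-in fusion head
        head = "fusion"
    elif vision_logit is not None:
        logit, head = vision_logit, "vision"
    elif acoustic_logit is not None:
        logit, head = acoustic_logit, "acoustic"
    else:
        raise ValueError("at least one modality is required")

    score = sigmoid(logit)
    return {"head": head, "live_score": score, "is_live": score >= 0.5}
```

The graceful-degradation behavior falls out of the head selection: a phone with a muffled microphone can still be served by the vision head alone, which is the flexibility the abstract claims over depth/infrared-dependent methods.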
Related papers
- BIG-MoE: Bypass Isolated Gating MoE for Generalized Multimodal Face Anti-Spoofing [45.59998158610864]
Multimodal Face Anti-Spoofing (FAS) is essential for countering presentation attacks.
Existing technologies encounter challenges due to modality biases and imbalances, as well as domain shifts.
Our research introduces a Mixture of Experts (MoE) model to address these issues effectively.
arXiv Detail & Related papers (2024-12-24T00:28:28Z)
- A Multi-Modal Approach for Face Anti-Spoofing in Non-Calibrated Systems using Disparity Maps [0.6144680854063939]
Face recognition technologies are vulnerable to face spoofing attacks.
Stereo-depth cameras can detect such attacks effectively, but their high cost limits their widespread adoption.
We propose a method to overcome this challenge by leveraging facial attributes to derive disparity information.
arXiv Detail & Related papers (2024-10-31T15:29:51Z)
- Robust Multimodal 3D Object Detection via Modality-Agnostic Decoding and Proximity-based Modality Ensemble [15.173314907900842]
Existing 3D object detection methods rely heavily on the LiDAR sensor.
We propose MEFormer to address the LiDAR over-reliance problem.
Our MEFormer achieves state-of-the-art performance of 73.9% NDS and 71.5% mAP on the nuScenes validation set.
arXiv Detail & Related papers (2024-07-27T03:21:44Z)
- Flow-Attention-based Spatio-Temporal Aggregation Network for 3D Mask Detection [12.160085404239446]
We propose a novel 3D mask detection framework called FASTEN.
We tailor the network to focus more on fine details in large movements, which can eliminate redundant temporal feature interference.
FASTEN requires only five frames of input and outperforms eight competitors in both intra-dataset and cross-dataset evaluations.
arXiv Detail & Related papers (2023-10-25T11:54:21Z)
- Towards Effective Adversarial Textured 3D Meshes on Physical Face Recognition [42.60954035488262]
The goal of this work is to develop a more reliable technique that can carry out an end-to-end evaluation of adversarial robustness for commercial systems.
We design adversarial textured 3D meshes (AT3D) with an elaborate topology on a human face, which can be 3D-printed and pasted on the attacker's face to evade the defenses.
To deviate from the mesh-based space, we propose to perturb the low-dimensional coefficient space based on 3D Morphable Model.
arXiv Detail & Related papers (2023-03-28T08:42:54Z)
- Face Presentation Attack Detection [59.05779913403134]
Face recognition technology has been widely used in daily interactive applications such as checking-in and mobile payment.
However, its vulnerability to presentation attacks (PAs) limits its reliable use in ultra-secure application scenarios.
arXiv Detail & Related papers (2022-12-07T14:51:17Z)
- Dual Spoof Disentanglement Generation for Face Anti-spoofing with Depth Uncertainty Learning [54.15303628138665]
Face anti-spoofing (FAS) plays a vital role in preventing face recognition systems from presentation attacks.
Existing face anti-spoofing datasets lack diversity owing to insufficient subject identities and limited intra-class variation.
We propose the Dual Spoof Disentanglement Generation framework to tackle this challenge through "anti-spoofing via generation".
arXiv Detail & Related papers (2021-12-01T15:36:59Z)
- Face Anti-Spoofing with Human Material Perception [76.4844593082362]
Face anti-spoofing (FAS) plays a vital role in securing the face recognition systems from presentation attacks.
We rephrase face anti-spoofing as a material recognition problem and combine it with classical human material perception.
We propose the Bilateral Convolutional Networks (BCN), which is able to capture intrinsic material-based patterns.
arXiv Detail & Related papers (2020-07-04T18:25:53Z)
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques and a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.