Mixture-of-Attack-Experts with Class Regularization for Unified Physical-Digital Face Attack Detection
- URL: http://arxiv.org/abs/2504.00458v1
- Date: Tue, 01 Apr 2025 06:33:30 GMT
- Title: Mixture-of-Attack-Experts with Class Regularization for Unified Physical-Digital Face Attack Detection
- Authors: Shunxin Chen, Ajian Liu, Junze Zheng, Jun Wan, Kailai Peng, Sergio Escalera, Zhen Lei
- Abstract summary: Facial recognition systems in real-world scenarios are susceptible to both digital and physical attacks. Previous methods have attempted to achieve classification by learning a comprehensive feature space. We propose the Fine-Grained MoE with Class-Aware Regularization CLIP framework (FG-MoE-CLIP-CAR). Experimental results on two unified physical-digital attack datasets demonstrate that the proposed method achieves state-of-the-art (SOTA) performance.
- Score: 45.59451145116377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial recognition systems in real-world scenarios are susceptible to both digital and physical attacks. Previous methods have attempted to achieve classification by learning a comprehensive feature space. However, these methods have not adequately accounted for the inherent characteristics of physical and digital attack data, particularly the large intra-class variation in attacks and the small inter-class variation between live and fake faces. To address these limitations, we propose the Fine-Grained MoE with Class-Aware Regularization CLIP framework (FG-MoE-CLIP-CAR), incorporating key improvements at both the feature and loss levels. At the feature level, we employ a Soft Mixture of Experts (Soft MoE) architecture to leverage different experts for specialized feature processing. Additionally, we refine the Soft MoE to capture more subtle differences among various types of fake faces. At the loss level, we introduce two constraint modules: the Disentanglement Module (DM) and the Cluster Distillation Module (CDM). The DM enhances class separability by increasing the distance between the centers of live and fake face classes. However, center-to-center constraints alone are insufficient to ensure distinctive representations for individual features. Thus, we propose the CDM to further cluster features around their respective class centers while maintaining separation from other classes. Moreover, specific attacks that significantly deviate from common attack patterns are often overlooked. To address this issue, our distance calculation prioritizes more distant features. Experimental results on two unified physical-digital attack datasets demonstrate that the proposed method achieves state-of-the-art (SOTA) performance.
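The abstract names the two loss-level ideas without giving formulas, so the following is only a rough sketch of one possible reading, not the authors' implementation. It assumes Euclidean distance, a hinge with a hypothetical `margin` for the center-separation term, and weights proportional to each feature's distance from its class center for the clustering term. The function names and toy features are illustrative assumptions; the Soft MoE layer and the CDM's separation from other classes are left out.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def center_separation_loss(live, fake, margin=5.0):
    """DM-style term (assumed form): hinge penalty that reaches zero once
    the live and fake class centers are at least `margin` apart."""
    gap = euclidean(mean_vector(live), mean_vector(fake))
    return max(0.0, margin - gap)


def weighted_cluster_loss(features, center):
    """CDM-style term (assumed form): pull features toward their class
    center, weighting each distance by its share of the total distance so
    that features far from the center contribute more."""
    dists = [euclidean(f, center) for f in features]
    total = sum(dists)
    if total == 0.0:
        return 0.0  # every feature already sits on the center
    return sum((d / total) * d for d in dists)


# Toy 2-D features (made up for illustration only).
live = [[0.0, 0.0], [0.0, 2.0]]
fake = [[3.0, 0.0], [3.0, 2.0], [3.0, 7.0]]  # last point is an outlier

fake_center = mean_vector(fake)
dm_term = center_separation_loss(live, fake)
cdm_term = weighted_cluster_loss(fake, fake_center)
plain_mean = sum(euclidean(f, fake_center) for f in fake) / len(fake)

print(dm_term, cdm_term, plain_mean)
```

In this toy setup the distance-weighted term exceeds the plain mean distance because the outlying fake feature receives the largest weight, which is one way to read "prioritizes more distant features".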
Related papers
- SUEDE: Shared Unified Experts for Physical-Digital Face Attack Detection Enhancement [19.140558657697866]
Face recognition systems are vulnerable to physical attacks and digital threats.
The inherent differences among various attack types present significant challenges in identifying a common feature space.
We propose SUEDE, the Shared Unified Experts for Physical-Digital Face Attack Detection Enhancement.
arXiv Detail & Related papers (2025-04-07T08:17:54Z)
- La-SoftMoE CLIP for Unified Physical-Digital Face Attack Detection [27.020392407198948]
Facial recognition systems are susceptible to both physical and digital attacks.
We propose a novel approach that uses the sparse model to handle sparse data.
We introduce a flexible self-adapting weighting mechanism, enabling the model to better fit and adapt.
arXiv Detail & Related papers (2024-08-23T02:12:13Z)
- Unified Physical-Digital Face Attack Detection [66.14645299430157]
Face Recognition (FR) systems can suffer from physical (i.e., print photo) and digital (i.e., DeepFake) attacks.
Previous related work rarely considers both situations at the same time.
We propose a Unified Attack Detection framework based on Vision-Language Models (VLMs).
arXiv Detail & Related papers (2024-01-31T09:38:44Z)
- Hyperbolic Face Anti-Spoofing [21.981129022417306]
We propose to learn richer hierarchical and discriminative spoofing cues in hyperbolic space.
For unimodal FAS learning, the feature embeddings are projected into the Poincaré ball, and then the hyperbolic binary logistic regression layer is cascaded for classification.
To alleviate the vanishing gradient problem in hyperbolic space, a new feature clipping method is proposed to enhance the training stability of hyperbolic models.
arXiv Detail & Related papers (2023-08-17T17:18:21Z)
- Multi-spectral Class Center Network for Face Manipulation Detection and Localization [52.569170436393165]
We propose a novel Multi-Spectral Class Center Network (MSCCNet) for face manipulation detection and localization.
Based on the features of different frequency bands, the MSCC module collects multi-spectral class centers and computes pixel-to-class relations.
Applying multi-spectral class-level representations suppresses the semantic information of the visual concepts which is insensitive to manipulated regions of forgery images.
arXiv Detail & Related papers (2023-05-18T08:09:20Z)
- DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection [39.937724871284665]
Generalized few-shot object detection aims to achieve precise detection on both base classes with abundant annotations and novel classes with limited training data.
Existing approaches enhance few-shot generalization with the sacrifice of base-class performance.
We propose a new training framework, DiGeo, to learn Geometry-aware features of inter-class separation and intra-class compactness.
arXiv Detail & Related papers (2023-03-16T22:37:09Z)
- Dual Contrastive Learning for General Face Forgery Detection [64.41970626226221]
We propose a novel face forgery detection framework, named Dual Contrastive Learning (DCL), which constructs positive and negative paired data.
To explore the essential discrepancies, Intra-Instance Contrastive Learning (Intra-ICL) is introduced to focus on the local content inconsistencies prevalent in the forged faces.
arXiv Detail & Related papers (2021-12-27T05:44:40Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z)
- Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection [89.43987367139724]
Face forgery detection is raising ever-increasing interest in computer vision.
Recent works have reached sound achievements, but there are still unignorable problems.
A novel frequency-aware discriminative feature learning framework is proposed in this paper.
arXiv Detail & Related papers (2021-03-16T14:17:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.