Benchmarking Unified Face Attack Detection via Hierarchical Prompt Tuning
- URL: http://arxiv.org/abs/2505.13327v2
- Date: Tue, 20 May 2025 02:07:48 GMT
- Title: Benchmarking Unified Face Attack Detection via Hierarchical Prompt Tuning
- Authors: Ajian Liu, Haocheng Yuan, Xiao Guo, Hui Ma, Wanyi Zhuang, Changtao Miao, Yan Hong, Chuanbiao Song, Jun Lan, Qi Chu, Tao Gong, Yanyan Liang, Weiqiang Wang, Jun Wan, Xiaoming Liu, Zhen Lei
- Abstract summary: Presentation Attack Detection and Face Forgery Detection are designed to protect face data from physical media-based Presentation Attacks and digital editing-based DeepFakes, respectively. Separate training of these two models makes them vulnerable to unknown attacks and burdens deployment environments. We present a novel Visual-Language Model-based Hierarchical Prompt Tuning Framework (HiPTune) that adaptively explores multiple classification criteria from different semantic spaces.
- Score: 58.16354555208417
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Presentation Attack Detection and Face Forgery Detection are designed to protect face data from physical media-based Presentation Attacks and digital editing-based DeepFakes, respectively. However, training these two models separately makes them vulnerable to unknown attacks and burdens deployment environments. The lack of a Unified Face Attack Detection (UAD) model able to handle both types of attacks is mainly due to two factors. First, adequate benchmarks for models to explore are lacking: existing UAD datasets cover only limited attack types and samples, restricting a model's ability to address advanced threats. To address this, we propose UniAttackDataPlus (UniAttackData+), the most extensive and sophisticated collection of forgery techniques to date. It includes 2,875 identities and their 54 kinds of falsified samples, totaling 697,347 videos. Second, a reliable classification criterion is missing: current methods try to find an arbitrary criterion within a single semantic space, which fails when encountering diverse attacks. We therefore present a novel Visual-Language Model-based Hierarchical Prompt Tuning Framework (HiPTune) that adaptively explores multiple classification criteria from different semantic spaces. We build a Visual Prompt Tree to explore various classification rules hierarchically. Then, by adaptively pruning the prompts, the model selects the most suitable prompts to guide the encoder to extract discriminative features at different levels in a coarse-to-fine way. Finally, to help the model understand the classification criteria in visual space, we propose a Dynamic Prompt Integration module that projects the visual prompts to the text encoder for more accurate semantics. Experiments on 12 datasets show the potential to inspire further innovation in the UAD field.
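The paper's implementation is not reproduced here, but the coarse-to-fine idea behind HiPTune can be sketched in a few lines. Everything below is an illustrative assumption: the tree layout, the dot-product scoring rule, and the `keep=1` pruning rule are stand-ins, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical prompt tree: coarse criteria at level 0 (live vs. attack),
# finer criteria deeper down (physical vs. digital, then attack type).
# Each node holds a learnable prompt embedding; here they are random stand-ins.
prompt_tree = {
    0: rng.normal(size=(2, 64)),   # level 0: 2 coarse prompts
    1: rng.normal(size=(4, 64)),   # level 1: 4 mid-level prompts
    2: rng.normal(size=(8, 64)),   # level 2: 8 fine-grained prompts
}

def select_prompts(image_feat, tree, keep=1):
    """Adaptive-pruning sketch: at each level, score prompts against the
    image feature and keep only the top-`keep` to guide the encoder."""
    selected = []
    for level in sorted(tree):
        prompts = tree[level]
        scores = prompts @ image_feat            # similarity score per prompt
        top = np.argsort(scores)[-keep:]         # prune all but the best
        selected.append(prompts[top])
    return np.concatenate(selected, axis=0)      # coarse-to-fine prompt stack

feat = rng.normal(size=64)
picked = select_prompts(feat, prompt_tree, keep=1)
print(picked.shape)  # one surviving prompt per level -> (3, 64)
```

In the actual framework the pruning decision and the projection of visual prompts into the text encoder are learned; this sketch only shows the control flow of hierarchical selection.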
Related papers
- CoPS: Conditional Prompt Synthesis for Zero-Shot Anomaly Detection [6.1568149026052374]
Conditional Prompt Synthesis (CoPS) is a novel framework that synthesizes dynamic prompts conditioned on visual features to enhance ZSAD performance. CoPS surpasses state-of-the-art methods by 2.5% AUROC in both classification and segmentation across 13 industrial and medical datasets.
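"Dynamic prompts conditioned on visual features" can be made concrete with a minimal sketch: instead of a fixed text prompt, a small learned head maps the image embedding to prompt tokens. The dimensions and the linear head below are illustrative assumptions, not CoPS's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative dimensions only.
d_visual, d_text, n_tokens = 128, 64, 4

# A hypothetical synthesis head: one weight matrix per prompt token,
# so the prompt becomes a function of the image rather than a fixed string.
W = rng.normal(size=(n_tokens, d_text, d_visual)) * 0.01

def synthesize_prompt(visual_feat):
    """Map a visual feature to n_tokens prompt embeddings: a linear sketch
    of prompt synthesis conditioned on visual features."""
    return np.einsum('tdv,v->td', W, visual_feat)

prompt = synthesize_prompt(rng.normal(size=d_visual))
print(prompt.shape)  # (4, 64): four image-conditioned prompt tokens
```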
arXiv Detail & Related papers (2025-08-05T13:47:45Z) - Adaptation Method for Misinformation Identification [8.581136866856255]
We propose ADOSE, an Active Domain Adaptation (ADA) framework for multimodal fake news detection. ADOSE actively annotates a small subset of target samples to improve detection performance. ADOSE outperforms existing ADA methods by 2.72%~14.02%, indicating the superiority of our model.
arXiv Detail & Related papers (2025-04-19T04:18:32Z) - Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection [52.490375806093745]
The objective of few-shot object detection (FSOD) is to detect novel objects with few training samples. We introduce side information to alleviate the negative influences derived from the feature space and sample viewpoints. Our model outperforms the previous state-of-the-art methods, significantly improving the ability of FSOD in most shots/splits.
arXiv Detail & Related papers (2025-04-09T17:24:05Z) - AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with >99% detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
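The intuition behind detecting query-based attacks is that an attacker submits many near-duplicate images of the same target, so their embeddings stay abnormally similar. The toy detector below illustrates only that intuition; the encoder, the similarity rule, and the threshold are stand-ins, not ACPT's tuned CLIP encoder.

```python
import numpy as np

rng = np.random.default_rng(2)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_attack_sequence(embeddings, threshold=0.9):
    """Toy detector: flag a query stream whose consecutive embeddings are
    all highly similar, as produced by iterative query-based attacks."""
    sims = [cosine(embeddings[i], embeddings[i + 1])
            for i in range(len(embeddings) - 1)]
    return min(sims) > threshold

# Benign traffic: unrelated images -> near-orthogonal random embeddings.
benign = rng.normal(size=(5, 256))

# Attack traffic: one image plus tiny perturbations -> nearly identical embeddings.
base = rng.normal(size=256)
attack = np.stack([base + 0.01 * rng.normal(size=256) for _ in range(5)])

print(is_attack_sequence(benign))  # False
print(is_attack_sequence(attack))  # True
```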
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - UNICAD: A Unified Approach for Attack Detection, Noise Reduction and Novel Class Identification [5.570086931219838]
UNICAD is proposed as a novel framework that integrates a variety of techniques to provide an adaptive solution.
For the targeted image classification, UNICAD achieves accurate image classification, detects unseen classes, and recovers from adversarial attacks.
Our experiments performed on the CIFAR-10 dataset highlight UNICAD's effectiveness in adversarial mitigation and unseen class classification, outperforming traditional models.
arXiv Detail & Related papers (2024-06-24T10:10:03Z) - Unified Physical-Digital Face Attack Detection [66.14645299430157]
Face Recognition (FR) systems can suffer from physical (e.g., printed photo) and digital (e.g., DeepFake) attacks.
Previous related work rarely considers both situations at the same time.
We propose a Unified Attack Detection framework based on Vision-Language Models (VLMs).
arXiv Detail & Related papers (2024-01-31T09:38:44Z) - Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to model stealing attacks, in which an adversary duplicates the target model using only query access.
We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z) - Hyperbolic Face Anti-Spoofing [21.981129022417306]
We propose to learn richer hierarchical and discriminative spoofing cues in hyperbolic space.
For unimodal FAS learning, the feature embeddings are projected into the Poincaré ball, and then a hyperbolic binary logistic regression layer is cascaded for classification.
To alleviate the vanishing gradient problem in hyperbolic space, a new feature clipping method is proposed to enhance the training stability of hyperbolic models.
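The two operations named here are standard enough to sketch: a norm cap on the Euclidean embedding (feature clipping, which keeps points away from the ball boundary where gradients vanish) followed by the exponential map at the origin of the Poincaré ball. The clipping radius and curvature below are arbitrary assumptions; the paper's exact clipping rule may differ.

```python
import numpy as np

def clip_feature(x, r=1.0):
    """Feature-clipping sketch: cap the Euclidean norm of the embedding
    before mapping it into the ball, stabilizing hyperbolic training."""
    norm = np.linalg.norm(x)
    return x if norm <= r else x * (r / norm)

def exp_map_origin(v, c=1.0):
    """Exponential map at the origin of a Poincare ball with curvature -c:
    exp_0(v) = tanh(sqrt(c) * ||v||) * v / (sqrt(c) * ||v||)."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    sc = np.sqrt(c)
    return np.tanh(sc * norm) * v / (sc * norm)

rng = np.random.default_rng(3)
feat = rng.normal(size=16) * 10            # a large Euclidean embedding
z = exp_map_origin(clip_feature(feat, r=2.0))
print(np.linalg.norm(z) < 1.0)             # True: point lies strictly inside the unit ball
```

Because tanh is bounded by 1, every mapped point lands inside the unit ball, and clipping keeps it away from the boundary where the hyperbolic metric blows up.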
arXiv Detail & Related papers (2023-08-17T17:18:21Z) - Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z) - Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part.
We propose a novel Dynamic Prototype Mask (DPM) based on two pieces of self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
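A prototype-based mask can be illustrated with a minimal sketch: patch features that are far from every learned part prototype are treated as occluders and zeroed out before matching. The cosine similarity, the threshold, and the random "prototypes" below are illustrative assumptions, not the DPM architecture.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical setup: 8 patch features from a pedestrian image and 4 learned
# part prototypes (random stand-ins here).
patches = rng.normal(size=(8, 32))
prototypes = rng.normal(size=(4, 32))

def prototype_mask(patches, prototypes, tau=0.0):
    """Toy prototype mask: keep a patch only if its best prototype
    similarity exceeds tau, suppressing occluded regions before matching."""
    norm_p = patches / np.linalg.norm(patches, axis=1, keepdims=True)
    norm_q = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    best = (norm_p @ norm_q.T).max(axis=1)   # best cosine per patch
    return (best > tau).astype(float)        # 1.0 = visible, 0.0 = masked

mask = prototype_mask(patches, prototypes)
print(mask.shape)  # (8,): one keep/drop weight per patch
```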
arXiv Detail & Related papers (2022-07-19T03:31:13Z) - Differentiable Language Model Adversarial Attacks on Categorical Sequence Classifiers [0.0]
An adversarial attack paradigm explores various scenarios for the vulnerability of deep learning models.
We use a fine-tuning of a language model for adversarial attacks as a generator of adversarial examples.
Our model works for diverse datasets on bank transactions, electronic health records, and NLP datasets.
arXiv Detail & Related papers (2020-06-19T11:25:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.