Adaptive Transformers for Robust Few-shot Cross-domain Face Anti-spoofing
- URL: http://arxiv.org/abs/2203.12175v2
- Date: Fri, 28 Jul 2023 18:21:00 GMT
- Title: Adaptive Transformers for Robust Few-shot Cross-domain Face Anti-spoofing
- Authors: Hsin-Ping Huang, Deqing Sun, Yaojie Liu, Wen-Sheng Chu, Taihong Xiao, Jinwei Yuan, Hartwig Adam, Ming-Hsuan Yang
- Abstract summary: We present adaptive vision transformers (ViT) for robust cross-domain face anti-spoofing.
We adopt ViT as a backbone to exploit its strength to account for long-range dependencies among pixels.
Experiments on several benchmark datasets show that the proposed models achieve both robust and competitive performance.
- Score: 71.06718651013965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While recent face anti-spoofing methods perform well under the intra-domain
setups, an effective approach needs to account for much larger appearance
variations of images acquired in complex scenes with different sensors for
robust performance. In this paper, we present adaptive vision transformers
(ViT) for robust cross-domain face anti-spoofing. Specifically, we adopt ViT as
a backbone to exploit its strength to account for long-range dependencies among
pixels. We further introduce the ensemble adapters module and feature-wise
transformation layers in the ViT to adapt to different domains for robust
performance with a few samples. Experiments on several benchmark datasets show
that the proposed models achieve both robust and competitive performance
against the state-of-the-art methods for cross-domain face anti-spoofing using
a few samples.
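The abstract names two architectural pieces: ensemble adapters inside the ViT blocks and feature-wise transformation (FWT) layers. Below is a minimal PyTorch sketch of how such components are commonly realized; the bottleneck width, number of adapters, and initial noise scales are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of an adapter ensemble and a feature-wise transformation layer,
# as typically inserted into transformer blocks for domain adaptation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Adapter(nn.Module):
    """Residual bottleneck adapter inserted into a transformer block."""
    def __init__(self, dim, bottleneck=64):  # bottleneck width is an assumption
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(F.gelu(self.down(x)))

class EnsembleAdapters(nn.Module):
    """Average the outputs of K parallel adapters."""
    def __init__(self, dim, num_adapters=2):  # number of adapters is an assumption
        super().__init__()
        self.adapters = nn.ModuleList(Adapter(dim) for _ in range(num_adapters))

    def forward(self, x):
        return torch.stack([a(x) for a in self.adapters], dim=0).mean(dim=0)

class FeatureWiseTransform(nn.Module):
    """Perturb features with sampled per-channel affine noise during training."""
    def __init__(self, dim):
        super().__init__()
        # Learnable scales for the sampled gamma/beta; init values are assumptions.
        self.gamma_scale = nn.Parameter(torch.full((dim,), 0.3))
        self.beta_scale = nn.Parameter(torch.full((dim,), 0.5))

    def forward(self, x):  # x: (batch, tokens, dim)
        if not self.training:
            return x
        b, d = x.size(0), x.size(-1)
        gamma = 1.0 + torch.randn(b, 1, d, device=x.device) * F.softplus(self.gamma_scale)
        beta = torch.randn(b, 1, d, device=x.device) * F.softplus(self.beta_scale)
        return gamma * x + beta
```

In the few-shot cross-domain setting the abstract describes, one would typically freeze most of the pre-trained ViT and update only such lightweight modules on the handful of target-domain samples.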
Related papers
- MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection [54.545054873239295]
Deepfakes have recently raised significant trust issues and security concerns among the public.
ViT-based methods take advantage of the expressivity of transformers, achieving superior detection performance.
This work introduces Mixture-of-Experts modules for Face Forgery Detection (MoE-FFD), a generalized yet parameter-efficient ViT-based approach.
arXiv Detail & Related papers (2024-04-12T13:02:08Z) - Fine-Grained Unsupervised Cross-Modality Domain Adaptation for Vestibular Schwannoma Segmentation [3.0081059328558624]
We introduce a fine-grained unsupervised framework for domain adaptation.
We propose using a vector to control the generator so that it synthesizes a fake image with given features.
Various augmentations can then be applied to the dataset by searching the feature dictionary.
arXiv Detail & Related papers (2023-11-25T18:08:59Z) - FLIP: Cross-domain Face Anti-spoofing with Language Guidance [19.957293190322332]
Face anti-spoofing (FAS) or presentation attack detection is an essential component of face recognition systems.
Recent vision transformer (ViT) models have been shown to be effective for the FAS task.
We propose a novel approach for robust cross-domain FAS by grounding visual representations with the help of natural language.
arXiv Detail & Related papers (2023-09-28T17:53:20Z) - MVP: Meta Visual Prompt Tuning for Few-Shot Remote Sensing Image Scene Classification [15.780372479483235]
PMF has achieved promising results in few-shot image classification by utilizing pre-trained vision transformer models.
We propose the Meta Visual Prompt Tuning (MVP) method, which updates only the newly added prompt parameters while keeping the pre-trained backbone frozen (see the sketch after this list).
We introduce a novel data augmentation strategy based on patch embedding recombination to enhance the representation and diversity of scenes for classification purposes.
arXiv Detail & Related papers (2023-09-17T13:51:05Z) - ViTransPAD: Video Transformer using convolution and self-attention for Face Presentation Attack Detection [15.70621878093133]
Face Presentation Attack Detection (PAD) is an important measure to prevent spoof attacks for face biometric systems.
Many works based on Convolutional Neural Networks (CNNs) for face PAD formulate the problem as an image-level binary task without considering the context.
We propose a Video-based Transformer for face PAD (ViTransPAD) with short/long-range spatio-temporal attention, which can not only focus on local details with short attention within a frame but also capture long-range dependencies over frames.
arXiv Detail & Related papers (2022-03-03T08:23:20Z) - Adaptive Image Transformations for Transfer-based Adversarial Attack [73.74904401540743]
We propose a novel architecture, called the Adaptive Image Transformation Learner (AITL).
Our elaborately designed learner adaptively selects the most effective combination of image transformations specific to the input image.
Our method significantly improves the attack success rates on both normally trained models and defense models under various settings.
arXiv Detail & Related papers (2021-11-27T08:15:44Z) - PnP-DETR: Towards Efficient Visual Analysis with Transformers [146.55679348493587]
Recently, DETR pioneered solving vision tasks with transformers; it directly translates the image feature map into the object detection result.
Applied to the recent transformer-based image recognition model ViT, the approach shows a consistent efficiency gain.
arXiv Detail & Related papers (2021-09-15T01:10:30Z) - PIT: Position-Invariant Transform for Cross-FoV Domain Adaptation [53.428312630479816]
We observe that the Field of View (FoV) gap induces noticeable instance appearance differences between the source and target domains.
Motivated by the observations, we propose the Position-Invariant Transform (PIT) to better align images in different domains.
arXiv Detail & Related papers (2021-08-16T15:16:47Z) - Visual Saliency Transformer [127.33678448761599]
We develop a novel unified model based on a pure transformer, the Visual Saliency Transformer (VST), for both RGB and RGB-D salient object detection (SOD).
It takes image patches as inputs and leverages the transformer to propagate global contexts among image patches.
Experimental results show that our model outperforms existing state-of-the-art results on both RGB and RGB-D SOD benchmark datasets.
arXiv Detail & Related papers (2021-04-25T08:24:06Z)
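As referenced in the MVP entry above, the following is a minimal PyTorch sketch of the visual-prompt-tuning idea: learnable prompt tokens are prepended to the patch embeddings while the pre-trained backbone stays frozen. It assumes a timm-style ViT exposing patch_embed, cls_token, pos_embed, blocks, and norm; the backbone name and prompt count are illustrative assumptions, and the meta-training part of MVP is not covered.

```python
# Sketch of visual prompt tuning on a frozen pre-trained ViT.
import torch
import torch.nn as nn
import timm

class PromptTunedViT(nn.Module):
    """Freeze a pre-trained ViT; train only prompt tokens and a new head."""
    def __init__(self, num_classes, num_prompts=10):  # prompt count is an assumption
        super().__init__()
        # Hypothetical backbone choice; any timm ViT with these attributes works.
        self.backbone = timm.create_model("vit_base_patch16_224",
                                          pretrained=True, num_classes=0)
        for p in self.backbone.parameters():
            p.requires_grad = False                    # keep the backbone frozen
        dim = self.backbone.embed_dim
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)
        self.head = nn.Linear(dim, num_classes)        # new task head, trained

    def forward(self, x):
        B = x.size(0)
        patches = self.backbone.patch_embed(x)                   # (B, N, D)
        cls = self.backbone.cls_token.expand(B, -1, -1)          # (B, 1, D)
        tokens = torch.cat([cls, patches], dim=1) + self.backbone.pos_embed
        # Insert the prompt tokens between [CLS] and the patch tokens.
        prompts = self.prompts.expand(B, -1, -1)
        tokens = torch.cat([tokens[:, :1], prompts, tokens[:, 1:]], dim=1)
        for blk in self.backbone.blocks:
            tokens = blk(tokens)
        tokens = self.backbone.norm(tokens)
        return self.head(tokens[:, 0])                           # classify from [CLS]

# Only the prompts and the head receive gradients:
model = PromptTunedViT(num_classes=10)
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3)
```

Because the backbone never changes, only the small prompt and head tensors are stored per task, which is what makes this style of tuning parameter-efficient in few-shot settings.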
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.