ViT Unified: Joint Fingerprint Recognition and Presentation Attack Detection
- URL: http://arxiv.org/abs/2305.07602v1
- Date: Fri, 12 May 2023 16:51:14 GMT
- Title: ViT Unified: Joint Fingerprint Recognition and Presentation Attack Detection
- Authors: Steven A. Grosz, Kanishka P. Wijewardena, and Anil K. Jain
- Abstract summary: We leverage a vision transformer architecture for joint spoof detection and matching.
We report competitive results with state-of-the-art (SOTA) models for both a sequential system and a unified architecture.
We demonstrate the capability of our unified model to achieve an average integrated matching (IM) accuracy of 98.87% across LivDet 2013 and 2015 CrossMatch sensors.
- Score: 36.05807963935458
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A secure fingerprint recognition system must contain both a presentation
attack (i.e., spoof) detection module and a recognition module in order to protect
users against unauthorized access. Traditionally, these tasks would be
carried out by two independent systems; however, recent studies have
demonstrated the potential to have one unified system architecture in order to
reduce the computational burdens on the system, while maintaining high
accuracy. In this work, we leverage a vision transformer architecture for joint
spoof detection and matching and report competitive results with
state-of-the-art (SOTA) models for both a sequential system (two ViT models
operating independently) and a unified architecture (a single ViT model for
both tasks). ViT models are particularly well suited for this task as the ViT's
global embedding encodes features useful for recognition, whereas the
individual, local embeddings are useful for spoof detection. We demonstrate the
capability of our unified model to achieve an average integrated matching (IM)
accuracy of 98.87% across LivDet 2013 and 2015 CrossMatch sensors. This is
comparable to IM accuracy of 98.95% of our sequential dual-ViT system, but with
~50% of the parameters and ~58% of the latency.
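The division of labor described in the abstract — the ViT's global (class-token) embedding serving recognition while the local patch embeddings serve spoof detection — can be sketched in a few lines. The following is a minimal, illustrative NumPy sketch, not the authors' implementation: random arrays stand in for the output tokens of a shared ViT backbone, cosine similarity over the class token stands in for matching, and a mean-pool plus linear-sigmoid head over the patch tokens stands in for spoof detection. All dimensions, head weights, and function names here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 14x14 patch grid and a 64-dim embedding width.
num_patches, dim = 196, 64

def split_heads(tokens):
    """Split a ViT output token sequence into the two task-specific inputs.

    tokens: (1 + num_patches, dim) array -- row 0 is the class token,
    the remaining rows are per-patch embeddings from the final encoder layer.
    """
    global_emb = tokens[0]    # class token -> recognition / matching
    local_embs = tokens[1:]   # patch tokens -> spoof detection
    return global_emb, local_embs

def match_score(emb_a, emb_b):
    """Cosine similarity between two global embeddings (matching score)."""
    return float(emb_a @ emb_b /
                 (np.linalg.norm(emb_a) * np.linalg.norm(emb_b)))

def spoof_score(local_embs, w, b):
    """Mean-pool the patch embeddings, then apply a linear head and a
    sigmoid to produce a liveness probability (illustrative PAD head)."""
    pooled = local_embs.mean(axis=0)
    return float(1.0 / (1.0 + np.exp(-(pooled @ w + b))))

# Two fingerprint impressions encoded by the same (shared) backbone.
tokens_a = rng.standard_normal((1 + num_patches, dim))
tokens_b = rng.standard_normal((1 + num_patches, dim))
w, b = rng.standard_normal(dim), 0.0   # made-up spoof-head parameters

g_a, l_a = split_heads(tokens_a)
g_b, _ = split_heads(tokens_b)

print("match score:", match_score(g_a, g_b))
print("liveness prob:", spoof_score(l_a, w, b))
```

Because both outputs come from a single forward pass of one backbone, this structure is what lets the unified model roughly halve the parameter count and latency relative to running two independent ViTs in sequence.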
Related papers
- Visual Agents as Fast and Slow Thinkers [88.6691504568041]
We introduce FaST, which incorporates the Fast and Slow Thinking mechanism into visual agents.
FaST employs a switch adapter to dynamically select between System 1/2 modes.
It tackles uncertain and unseen objects by adjusting model confidence and integrating new contextual data.
arXiv Detail & Related papers (2024-08-16T17:44:02Z)
- Towards Robust Vision Transformer via Masked Adaptive Ensemble [23.986968861837813]
Adversarial training (AT) can help improve the robustness of Vision Transformers (ViT) against adversarial attacks.
This paper proposes a novel ViT architecture, including a detector and a classifier bridged by our newly developed adaptive ensemble.
Experimental results show that, on CIFAR-10, our ViT architecture achieves the best standard accuracy (90.3%) and adversarial robustness (49.8%).
arXiv Detail & Related papers (2024-07-22T05:28:29Z)
- Joint Identity Verification and Pose Alignment for Partial Fingerprints [33.05877729161858]
We propose a novel framework for joint identity verification and pose alignment of partial fingerprint pairs.
Our method achieves state-of-the-art performance in both partial fingerprint verification and relative pose estimation.
arXiv Detail & Related papers (2024-05-07T02:45:50Z)
- Evaluating the Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security Applications [2.8161155726745237]
Large Multimodal Models (LMMs) are designed to interpret and analyze complex data by integrating multiple modalities such as text and images.
This paper investigates the applicability and effectiveness of prompt-engineered LMMs that process both images and text, compared to fine-tuned Vision Transformer (ViT) models.
For the visually non-evident task, the results highlight a significant divergence in performance, with ViT models achieving F1-scores of 97.11% in predicting 25 malware classes and 97.61% in predicting 5 malware families.
arXiv Detail & Related papers (2024-03-26T15:20:49Z)
- Generalized Face Forgery Detection via Adaptive Learning for Pre-trained Vision Transformer [54.32283739486781]
We present a Forgery-aware Adaptive Vision Transformer (FA-ViT) under the adaptive learning paradigm.
FA-ViT achieves 93.83% and 78.32% AUC scores on Celeb-DF and DFDC datasets in the cross-dataset evaluation.
arXiv Detail & Related papers (2023-09-20T06:51:11Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- AFR-Net: Attention-Driven Fingerprint Recognition Network [47.87570819350573]
We improve initial studies on the use of vision transformers (ViT) for biometric recognition, including fingerprint recognition.
We propose a realignment strategy using local embeddings extracted from intermediate feature maps within the networks to refine the global embeddings in low certainty situations.
This strategy can be applied as a wrapper to any existing deep learning network (including attention-based, CNN-based, or both) to boost its performance.
arXiv Detail & Related papers (2022-11-25T05:10:39Z)
- Fingerprint recognition with embedded presentation attacks detection: are we ready? [6.0168714922994075]
The widespread deployment of fingerprint verification systems for security applications makes it urgent to investigate the embedding of software-based presentation attack detection (PAD) algorithms into such systems.
However, existing research says little about the effectiveness of PAD algorithms when embedded in fingerprint verification systems.
This paper proposes a performance simulator based on the probabilistic modeling of the relationships among the Receiver Operating Characteristics (ROC) of the two individual systems when PAD and verification stages are implemented sequentially.
arXiv Detail & Related papers (2021-10-20T13:53:16Z)
- Vision Transformers are Robust Learners [65.91359312429147]
We study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.
We present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners.
arXiv Detail & Related papers (2021-05-17T02:39:22Z)
- A Unified Model for Fingerprint Authentication and Presentation Attack Detection [1.9703625025720706]
We reformulate the workings of a typical fingerprint recognition system.
We propose a joint model for spoof detection and matching to simultaneously perform both tasks.
This reduces the time and memory requirements of the fingerprint recognition system by 50% and 40%, respectively.
arXiv Detail & Related papers (2021-04-07T16:57:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.