AFR-Net: Attention-Driven Fingerprint Recognition Network
- URL: http://arxiv.org/abs/2211.13897v1
- Date: Fri, 25 Nov 2022 05:10:39 GMT
- Title: AFR-Net: Attention-Driven Fingerprint Recognition Network
- Authors: Steven A. Grosz and Anil K. Jain
- Abstract summary: We improve on initial studies of vision transformers (ViT) for biometric recognition, including fingerprint recognition.
We propose a realignment strategy using local embeddings extracted from intermediate feature maps within the networks to refine the global embeddings in low certainty situations.
This strategy can be applied as a wrapper to any existing deep learning network (including attention-based, CNN-based, or both) to boost its performance.
- Score: 47.87570819350573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of vision transformers (ViT) in computer vision is increasing due to
limited inductive biases (e.g., locality, weight sharing, etc.) and increased
scalability compared to other deep learning methods (e.g., convolutional neural
networks (CNN)). This has led to some initial studies on the use of ViT for
biometric recognition, including fingerprint recognition. In this work, we
improve on these initial studies for transformers in fingerprint recognition by
i.) evaluating additional attention-based architectures in addition to vanilla
ViT, ii.) scaling to larger and more diverse training and evaluation datasets,
and iii.) combining the complementary representations of attention-based and
CNN-based embeddings for improved state-of-the-art (SOTA) fingerprint
recognition for both authentication (1:1 comparisons) and identification (1:N
comparisons). Our combined architecture, AFR-Net (Attention-Driven Fingerprint
Recognition Network), outperforms several baseline transformer and CNN-based
models, including a SOTA commercial fingerprint system, Verifinger v12.3,
across many intra-sensor, cross-sensor (including contact to contactless), and
latent to rolled fingerprint matching datasets. Additionally, we propose a
realignment strategy using local embeddings extracted from intermediate feature
maps within the networks to refine the global embeddings in low certainty
situations, which boosts the overall recognition accuracy significantly for all
the evaluations across each of the models. This realignment strategy requires
no additional training and can be applied as a wrapper to any existing deep
learning network (including attention-based, CNN-based, or both) to boost its
performance.
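The idea of refining a global comparison only in low-certainty cases can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the uncertainty band (`low`, `high`), and the fusion weight are hypothetical, and the refinement step is a simple stand-in (averaging best local-embedding matches into the score), not the realignment procedure of the paper, which the abstract does not spell out.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def local_match_score(local_a, local_b):
    """Mean, over local embeddings in A, of the best cosine match in B.

    local_a, local_b: arrays of shape (n_a, d) and (n_b, d).
    This is a stand-in for a refinement step, not the paper's realignment.
    """
    a = np.asarray(local_a, dtype=float)
    b = np.asarray(local_b, dtype=float)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    sim = a @ b.T  # pairwise cosine similarities, shape (n_a, n_b)
    return float(sim.max(axis=1).mean())


def gated_score(global_a, global_b, local_a, local_b,
                low=0.3, high=0.7, weight=0.5):
    """Return (score, refined_flag).

    The global score is used as-is when it is confidently low or high.
    Inside the assumed band [low, high], the local embeddings are
    consulted and blended into the score with the assumed weight.
    """
    s = cosine(global_a, global_b)
    if low <= s <= high:
        refined = (1.0 - weight) * s + weight * local_match_score(local_a, local_b)
        return refined, True
    return s, False
```

Because the gate only inspects scores and embeddings, a wrapper of this shape needs no retraining and is agnostic to whether the embeddings come from an attention-based model, a CNN, or both; in practice the band would be tuned on held-out comparisons.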
Related papers
- Combined CNN and ViT features off-the-shelf: Another astounding baseline for recognition [49.14350399025926]
We apply pre-trained architectures, originally developed for the ImageNet Large Scale Visual Recognition Challenge, for periocular recognition.
Middle-layer features from CNNs and ViTs are a suitable way to recognize individuals based on periocular images.
arXiv Detail & Related papers (2024-07-28T11:52:36Z)
- AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer [27.921949273217468]
Vision Transformers (ViTs) demonstrate remarkable performance in image classification through visual-token interaction learning.
We propose Neural Cellular Automata (NCA) for Vision Transformers that uses NCA as plug-and-play adaptors between ViT layers.
With less than a 3% increase in parameters, AdaNCA contributes to more than 10% absolute improvement in accuracy under adversarial attacks.
arXiv Detail & Related papers (2024-06-12T14:59:12Z)
- Joint Identity Verification and Pose Alignment for Partial Fingerprints [33.05877729161858]
We propose a novel framework for joint identity verification and pose alignment of partial fingerprint pairs.
Our method achieves state-of-the-art performance in both partial fingerprint verification and relative pose estimation.
arXiv Detail & Related papers (2024-05-07T02:45:50Z)
- FingerNet: EEG Decoding of A Fine Motor Imagery with Finger-tapping Task Based on A Deep Neural Network [4.613725465729454]
This study introduces FingerNet, a specialized network for fine MI classification.
Performance showed significantly higher accuracy in classifying five finger-tapping tasks.
Biased predictions, particularly for the thumb and index classes, led us to implement weighted cross-entropy.
arXiv Detail & Related papers (2024-03-06T08:05:53Z)
- GanFinger: GAN-Based Fingerprint Generation for Deep Neural Network Ownership Verification [8.00359513511764]
We propose a network fingerprinting approach, named GanFinger, that constructs network fingerprints based on network behavior.
GanFinger significantly outperforms the state-of-the-arts in efficiency, stealthiness, and discriminability.
It is 6.57 times faster in fingerprint generation and boosts the ARUC value by 0.175, a relative improvement of about 26%.
arXiv Detail & Related papers (2023-12-25T05:35:57Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
- NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning [25.197394237526865]
We propose a modified Transformer-based universal neural network representation learning model NAR-Former V2.
Specifically, we take the network as a graph and design a straightforward tokenizer to encode the network into a sequence.
We incorporate the inductive representation learning capability of GNN into Transformer, enabling Transformer to generalize better when encountering unseen architecture.
arXiv Detail & Related papers (2023-06-19T09:11:04Z)
- A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
- Fusion of CNNs and statistical indicators to improve image classification [65.51757376525798]
Convolutional Networks have dominated the field of computer vision for the last ten years.
The main strategy to prolong this trend relies on further upscaling networks in size.
We hypothesise that adding heterogeneous sources of information may be more cost-effective to a CNN than building a bigger network.
arXiv Detail & Related papers (2020-12-20T23:24:31Z)
- Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in the vision-sensor modality (videos).
The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.