Finger Multimodal Feature Fusion and Recognition Based on Channel
Spatial Attention
- URL: http://arxiv.org/abs/2209.02368v1
- Date: Tue, 6 Sep 2022 10:48:30 GMT
- Title: Finger Multimodal Feature Fusion and Recognition Based on Channel
Spatial Attention
- Authors: Jian Guo, Jiaxiang Tu, Hengyi Ren, Chong Han, Lijuan Sun
- Abstract summary: We propose a multimodal biometric fusion recognition algorithm based on fingerprints and finger veins.
For each pair of fingerprint and finger vein images, we first propose a simple and effective Convolutional Neural Network (CNN) to extract features.
Then, we build a multimodal feature fusion module (Channel Spatial Attention Fusion Module, CSAFM) to fully fuse the complementary information between fingerprints and finger veins.
- Score: 8.741051302995755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the instability and limitations of unimodal biometric systems,
multimodal systems have attracted increasing attention from researchers.
However, how to exploit the independent and complementary information between
different modalities remains a key and challenging problem. In this paper, we
propose a multimodal biometric fusion recognition algorithm based on
fingerprints and finger veins (Fingerprint Finger Veins-Channel Spatial
Attention Fusion Module, FPV-CSAFM). Specifically, for each pair of fingerprint
and finger vein images, we first propose a simple and effective Convolutional
Neural Network (CNN) to extract features. Then, we build a multimodal feature
fusion module (Channel Spatial Attention Fusion Module, CSAFM) to fully fuse
the complementary information between fingerprints and finger veins. Unlike
existing fusion strategies, our fusion method dynamically adjusts the fusion
weights according to the importance of each modality in the channel and
spatial dimensions, so that information from the two modalities is better
combined and the overall recognition performance improves. To
evaluate the performance of our method, we conduct a series of experiments on
multiple public datasets. Experimental results show that the proposed FPV-CSAFM
achieves excellent recognition performance on three multimodal datasets based
on fingerprints and finger veins.
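For readers who want a concrete picture of the fusion step described above, the PyTorch-style sketch below fuses a fingerprint feature map and a finger-vein feature map with modality weights learned along both the channel and spatial dimensions. It is a minimal illustration of the idea, not the authors' released CSAFM implementation; the module name, layer sizes, reduction ratio, and the softmax over the two modalities are assumptions made here for clarity.

```python
# Minimal sketch of a channel-spatial attention fusion module (assumptions:
# both CNN branches output feature maps of identical shape (B, C, H, W)).
import torch
import torch.nn as nn


class ChannelSpatialAttentionFusion(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel branch: squeeze the concatenated features with global average
        # pooling, then predict one weight per channel and per modality.
        self.channel_mlp = nn.Sequential(
            nn.Linear(2 * channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, 2 * channels),
        )
        # Spatial branch: predict a 2-channel map, i.e. one weight per modality
        # at every spatial location.
        self.spatial_conv = nn.Conv2d(2 * channels, 2, kernel_size=7, padding=3)

    def forward(self, fp_feat: torch.Tensor, fv_feat: torch.Tensor) -> torch.Tensor:
        concat = torch.cat([fp_feat, fv_feat], dim=1)              # (B, 2C, H, W)
        num_channels = concat.size(1) // 2

        # Channel attention: two modality weights per channel, softmax-normalized.
        squeezed = concat.mean(dim=(2, 3))                          # (B, 2C)
        ch_logits = self.channel_mlp(squeezed).view(-1, 2, num_channels)
        ch_w = torch.softmax(ch_logits, dim=1).unsqueeze(-1).unsqueeze(-1)
        ch_fused = ch_w[:, 0] * fp_feat + ch_w[:, 1] * fv_feat

        # Spatial attention: two modality weights per pixel, softmax-normalized.
        sp_w = torch.softmax(self.spatial_conv(concat), dim=1)      # (B, 2, H, W)
        sp_fused = sp_w[:, 0:1] * fp_feat + sp_w[:, 1:2] * fv_feat

        # Combine the channel-wise and spatial-wise fused features.
        return ch_fused + sp_fused


# Example: fuse two 64-channel feature maps from the modality-specific CNNs.
fusion = ChannelSpatialAttentionFusion(channels=64)
fused = fusion(torch.randn(4, 64, 32, 32), torch.randn(4, 64, 32, 32))
print(fused.shape)  # torch.Size([4, 64, 32, 32])
```

In this sketch the softmax forces the two modality weights to sum to one for every channel and every location, which is one simple way to realize the dynamic adjustment of fusion weights described in the abstract.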
Related papers
- Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation [61.91492500828508]
Few-shot 3D point cloud segmentation (FS-PCS) aims at generalizing models to segment novel categories with minimal support samples.
We introduce a cost-free multimodal FS-PCS setup, utilizing textual labels and the potentially available 2D image modality.
We propose a simple yet effective Test-time Adaptive Cross-modal Seg (TACC) technique to mitigate training bias.
arXiv Detail & Related papers (2024-10-29T19:28:41Z) - AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection [23.91870504363899]
Double-stream networks in multispectral detection employ two separate feature extraction branches for multi-modal data.
The extra computation this entails has hindered the widespread deployment of multispectral pedestrian detection on embedded devices for autonomous systems.
We introduce the Adaptive Modal Fusion Distillation (AMFD) framework, which can fully utilize the original modal features of the teacher network.
arXiv Detail & Related papers (2024-05-21T17:17:17Z) - Fusion-Mamba for Cross-modality Object Detection [63.56296480951342]
Fusing information from different modalities effectively improves object detection performance.
We design a Fusion-Mamba block (FMB) to map cross-modal features into a hidden state space for interaction.
Our proposed approach outperforms state-of-the-art methods in mAP, by 5.9% on the M3FD dataset and 4.9% on the FLIR-Aligned dataset.
arXiv Detail & Related papers (2024-04-14T05:28:46Z) - Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z) - DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and
Authentication [50.017055360261665]
We introduce DiffVein, a unified diffusion model-based framework which simultaneously addresses vein segmentation and authentication tasks.
For better feature interaction between these two branches, we introduce two specialized modules.
In this way, our framework allows for a dynamic interplay between diffusion and segmentation embeddings.
arXiv Detail & Related papers (2024-02-03T06:49:42Z) - Just Noticeable Visual Redundancy Forecasting: A Deep Multimodal-driven
Approach [11.600496805298778]
Just noticeable difference (JND) refers to the maximum visual change that human eyes cannot perceive.
In this article, we investigate JND modeling from an end-to-end multimodal perspective, via a network named hmJND-Net.
arXiv Detail & Related papers (2023-03-18T09:36:59Z) - Multimodal Object Detection via Bayesian Fusion [59.31437166291557]
We study multimodal object detection with RGB and thermal cameras, since the latter can provide much stronger object signatures under poor illumination.
Our key contribution is a non-learned late-fusion method that fuses together bounding box detections from different modalities.
We apply our approach to benchmarks containing both aligned (KAIST) and unaligned (FLIR) multimodal sensor data.
arXiv Detail & Related papers (2021-04-07T04:03:20Z) - MSAF: Multimodal Split Attention Fusion [6.460517449962825]
We propose a novel multimodal fusion module that learns to emphasize more contributive features across all modalities.
Our approach achieves competitive results in each task and outperforms other application-specific networks and multimodal fusion benchmarks.
arXiv Detail & Related papers (2020-12-13T22:42:41Z) - Deep Multimodal Fusion by Channel Exchanging [87.40768169300898]
This paper proposes a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities.
The validity of such an exchanging process is guaranteed by sharing convolutional filters while keeping separate BN layers across modalities, which, as an added benefit, keeps the multimodal architecture almost as compact as a unimodal network (a minimal sketch of this channel-exchanging idea appears after this list).
arXiv Detail & Related papers (2020-11-10T09:53:20Z) - Multi-modal Fusion for Single-Stage Continuous Gesture Recognition [45.19890687786009]
We introduce a single-stage continuous gesture recognition framework, called Temporal Multi-Modal Fusion (TMMF).
TMMF can detect and classify multiple gestures in a video via a single model.
This approach learns the natural transitions between gestures and non-gestures without the need for a pre-processing segmentation step.
arXiv Detail & Related papers (2020-11-10T07:09:35Z)
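To make the channel-exchanging idea from "Deep Multimodal Fusion by Channel Exchanging" (listed above) more concrete, here is a minimal, hypothetical sketch: channels whose batch-norm scaling factor is close to zero in one modality are overwritten with the corresponding channels of the other modality, while the shared convolutions sit outside this module. The threshold value, module layout, and names are illustrative assumptions, not the paper's released code.

```python
# Illustrative sketch of channel exchanging between two modality streams
# (assumptions: equal channel counts; the upstream convolutions are shared).
import torch
import torch.nn as nn


class ChannelExchange(nn.Module):
    def __init__(self, channels: int, threshold: float = 1e-2):
        super().__init__()
        # Separate BN layers per modality, as described in the paper summary.
        self.bn_a = nn.BatchNorm2d(channels)
        self.bn_b = nn.BatchNorm2d(channels)
        self.threshold = threshold

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor):
        out_a, out_b = self.bn_a(feat_a), self.bn_b(feat_b)
        # A channel with a near-zero BN scale carries little information for
        # its own modality, so it is replaced by the other modality's channel.
        swap_a = (self.bn_a.weight.abs() < self.threshold).view(1, -1, 1, 1)
        swap_b = (self.bn_b.weight.abs() < self.threshold).view(1, -1, 1, 1)
        fused_a = torch.where(swap_a, out_b, out_a)
        fused_b = torch.where(swap_b, out_a, out_b)
        return fused_a, fused_b
```

Because a near-zero BN scale marks a channel that contributes little to its own stream, swapping it in from the other modality adds no extra parameters, which matches the parameter-free claim in that entry.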
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.