Related papers: Neural Architecture Search based Global-local Vision Mamba for Palm-Vein Recognition

Neural Architecture Search based Global-local Vision Mamba for Palm-Vein Recognition

URL: http://arxiv.org/abs/2408.05743v4
Date: Tue, 10 Sep 2024 09:07:35 GMT
Title: Neural Architecture Search based Global-local Vision Mamba for Palm-Vein Recognition
Authors: Huafeng Qin, Yuming Fu, Jing Chen, Mounim A. El-Yacoubi, Xinbo Gao, Feng Xi,
Abstract summary: We propose a hybrid network structure named Global-local Vision Mamba (GLVM) to learn the local correlations in images explicitly and global dependencies among tokens for vein feature representation. Thirdly, to learn the complementary features, we propose a ConvMamba block consisting of three branches, named Multi-head Mamba branch (MHMamba), Feature Iteration Unit branch (FIU), and Convolutional Neural Network (CNN) branch. Finally, a Globallocal Alternate Neural Architecture Search (GLNAS) method is proposed to search the optimal architecture of GLVM alternately with the evolutionary algorithm.
Score: 42.4241558556591
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Due to the advantages such as high security, high privacy, and liveness recognition, vein recognition has been received more and more attention in past years. Recently, deep learning models, e.g., Mamba has shown robust feature representation with linear computational complexity and successfully applied for visual tasks. However, vision Manba can capture long-distance feature dependencies but unfortunately deteriorate local feature details. Besides, manually designing a Mamba architecture based on human priori knowledge is very time-consuming and error-prone. In this paper, first, we propose a hybrid network structure named Global-local Vision Mamba (GLVM), to learn the local correlations in images explicitly and global dependencies among tokens for vein feature representation. Secondly, we design a Multi-head Mamba to learn the dependencies along different directions, so as to improve the feature representation ability of vision Mamba. Thirdly, to learn the complementary features, we propose a ConvMamba block consisting of three branches, named Multi-head Mamba branch (MHMamba), Feature Iteration Unit branch (FIU), and Convolutional Neural Network (CNN) branch, where the Feature Iteration Unit branch aims to fuse convolutional local features with Mamba-based global representations. Finally, a Globallocal Alternate Neural Architecture Search (GLNAS) method is proposed to search the optimal architecture of GLVM alternately with the evolutionary algorithm, thereby improving the recognition performance for vein recognition tasks. We conduct rigorous experiments on three public palm-vein databases to estimate the performance. The experimental results demonstrate that the proposed method outperforms the representative approaches and achieves state-of-the-art recognition accuracy.

Related papers

Samba+: General and Accurate Salient Object Detection via A More Unified Mamba-based Framework [66.2103745798444]
Saliency Mamba (Samba) is a pure Mamba-based architecture that flexibly handles various distinct salient object detection tasks.<n>Samba individually outperforms existing methods across six SOD tasks on 22 datasets with lower computational cost.<n>Samba+ achieves even superior results on these tasks and datasets by using a single trained versatile model.
arXiv Detail & Related papers (2026-02-02T03:34:25Z)
ReIDMamba: Learning Discriminative Features with Visual State Space Model for Person Re-Identification [5.546676157182037]
We propose a pure Mamba-based person ReID framework named ReIDMamba.<n>Our proposed ReIDMamba model boasts only one-third the parameters of TransReID, along with lower GPU memory usage and faster inference throughput.
arXiv Detail & Related papers (2025-11-11T08:01:16Z)
GCRPNet: Graph-Enhanced Contextual and Regional Perception Network for Salient Object Detection in Optical Remote Sensing Images [68.33481681452675]
We propose a graph-enhanced contextual and regional perception network (GCRPNet)<n>It builds upon the Mamba architecture to simultaneously capture long-range dependencies and enhance regional feature representation.<n>It performs adaptive patch scanning on feature maps processed via multi-scale convolutions, thereby capturing rich local region information.
arXiv Detail & Related papers (2025-08-14T11:31:43Z)
AtrousMamaba: An Atrous-Window Scanning Visual State Space Model for Remote Sensing Change Detection [29.004019252136565]
We propose a novel model, AtrousMamba, which balances the extraction of fine-grained local details with the integration of global contextual information.<n>By leveraging the atrous window scan visual state space (AWVSS) module, we design dedicated end-to-end Mamba-based frameworks for binary change detection (BCD) and semantic change detection (SCD)<n> Experimental results on six benchmark datasets show that the proposed framework outperforms existing CNN-based, Transformer-based, and Mamba-based methods.
arXiv Detail & Related papers (2025-07-22T02:36:16Z)
RD-UIE: Relation-Driven State Space Modeling for Underwater Image Enhancement [59.364418120895]
Underwater image enhancement (UIE) is a critical preprocessing step for marine vision applications.<n>We develop a novel relation-driven Mamba framework for effective UIE (RD-UIE)<n>Experiments on underwater enhancement benchmarks demonstrate RD-UIE outperforms the state-of-the-art approach WMamba.
arXiv Detail & Related papers (2025-05-02T12:21:44Z)
TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba [88.31117598044725]
We explore cross-architecture training to transfer the ready knowledge in existing Transformer models to alternative architecture Mamba, termed TransMamba. Our approach employs a two-stage strategy to expedite training new Mamba models, ensuring effectiveness in across uni-modal and cross-modal tasks. For cross-modal learning, we propose a cross-Mamba module that integrates language awareness into Mamba's visual features, enhancing the cross-modal interaction capabilities of Mamba architecture.
arXiv Detail & Related papers (2025-02-21T01:22:01Z)
MambaGlue: Fast and Robust Local Feature Matching With Mamba [9.397265252815115]
We propose a novel Mamba-based local feature matching approach, called MambaGlue. Mamba is an emerging state-of-the-art architecture rapidly gaining recognition for its superior speed in both training and inference. Our MambaGlue achieves a balance between robustness and efficiency in real-world applications.
arXiv Detail & Related papers (2025-02-01T15:43:03Z)
Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement [54.427965535613886]
Mamba, as a novel state-space model (SSM), has gained widespread application in natural language processing and computer vision. In this work, we introduce Mamba-SEUNet, an innovative architecture that integrates Mamba with U-Net for SE tasks.
arXiv Detail & Related papers (2024-12-21T13:43:51Z)
Bidirectional Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction. We introduce a new framework named Selective Gated Mamba ( SIGMA) for Sequential Recommendation. Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z)
MambaVision: A Hybrid Mamba-Transformer Vision Backbone [54.965143338206644]
We propose a novel hybrid Mamba-Transformer backbone, denoted as MambaVision, which is specifically tailored for vision applications. Our core contribution includes redesigning the Mamba formulation to enhance its capability for efficient modeling of visual features. We conduct a comprehensive ablation study on the feasibility of integrating Vision Transformers (ViT) with Mamba.
arXiv Detail & Related papers (2024-07-10T23:02:45Z)
MambaDepth: Enhancing Long-range Dependency for Self-Supervised Fine-Structured Monocular Depth Estimation [0.0]
MambaDepth is a versatile network tailored for self-supervised depth estimation. MambaDepth combines the U-Net's effectiveness in self-supervised depth estimation with the advanced capabilities of Mamba. MambaDepth proves its superior generalization capacities on other datasets such as Make3D and Cityscapes.
arXiv Detail & Related papers (2024-06-06T22:08:48Z)
Vision Mamba: A Comprehensive Survey and Taxonomy [11.025533218561284]
State Space Model (SSM) is a mathematical model used to describe and analyze the behavior of dynamic systems. Based on the latest state-space models, Mamba merges time-varying parameters into SSMs and formulates a hardware-aware algorithm for efficient training and inference. Mamba is expected to become a new AI architecture that may outperform Transformer.
arXiv Detail & Related papers (2024-05-07T15:30:14Z)
MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection [72.46396769642787]
We develop a nested structure, Mamba-in-Mamba (MiM-ISTD), for efficient infrared small target detection. MiM-ISTD is $8 times$ faster than the SOTA method and reduces GPU memory usage by 62.2$%$ when testing on $2048 times 2048$ images.
arXiv Detail & Related papers (2024-03-04T15:57:29Z)
Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation [13.748446415530937]
This paper introduces Weak-Mamba-UNet, an innovative weakly-supervised learning (WSL) framework for medical image segmentation. WSL strategy incorporates three distinct architecture but same symmetrical encoder-decoder networks: a CNN-based UNet for detailed local feature extraction, a Swin Transformer-based SwinUNet for comprehensive global context understanding, and a VMamba-based Mamba-UNet for efficient long-range dependency modeling. The effectiveness of Weak-Mamba-UNet is validated on a publicly available MRI cardiac segmentation dataset with processed annotations, where it surpasses the performance of a similar WSL
arXiv Detail & Related papers (2024-02-16T18:43:39Z)
Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation [21.1787366866505]
We propose Mamba-UNet, a novel architecture that synergizes the U-Net in medical image segmentation with Mamba's capability. Mamba-UNet adopts a pure Visual Mamba (VMamba)-based encoder-decoder structure, infused with skip connections to preserve spatial information across different scales of the network.
arXiv Detail & Related papers (2024-02-07T18:33:04Z)
Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining [85.08169822181685]
This paper introduces a novel Mamba-based model, Swin-UMamba, designed specifically for medical image segmentation tasks. Swin-UMamba demonstrates superior performance with a large margin compared to CNNs, ViTs, and latest Mamba-based models.
arXiv Detail & Related papers (2024-02-05T18:58:11Z)
ResNeSt: Split-Attention Networks [86.25490825631763]
We present a modularized architecture, which applies the channel-wise attention on different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations. Our model, named ResNeSt, outperforms EfficientNet in accuracy and latency trade-off on image classification.
arXiv Detail & Related papers (2020-04-19T20:40:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.