MambaGlue: Fast and Robust Local Feature Matching With Mamba
- URL: http://arxiv.org/abs/2502.00462v1
- Date: Sat, 01 Feb 2025 15:43:03 GMT
- Title: MambaGlue: Fast and Robust Local Feature Matching With Mamba
- Authors: Kihwan Ryoo, Hyungtae Lim, Hyun Myung,
- Abstract summary: We propose a novel Mamba-based local feature matching approach, called MambaGlue.<n>Mamba is an emerging state-of-the-art architecture rapidly gaining recognition for its superior speed in both training and inference.<n>Our MambaGlue achieves a balance between robustness and efficiency in real-world applications.
- Score: 9.397265252815115
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, robust matching methods using deep learning-based approaches have been actively studied and improved in computer vision tasks. However, there remains a persistent demand for both robust and fast matching techniques. To address this, we propose a novel Mamba-based local feature matching approach, called MambaGlue, where Mamba is an emerging state-of-the-art architecture rapidly gaining recognition for its superior speed in both training and inference, and promising performance compared with Transformer architectures. In particular, we propose two modules: a) MambaAttention mixer to simultaneously and selectively understand the local and global context through the Mamba-based self-attention structure and b) deep confidence score regressor, which is a multi-layer perceptron (MLP)-based architecture that evaluates a score indicating how confidently matching predictions correspond to the ground-truth correspondences. Consequently, our MambaGlue achieves a balance between robustness and efficiency in real-world applications. As verified on various public datasets, we demonstrate that our MambaGlue yields a substantial performance improvement over baseline approaches while maintaining fast inference speed. Our code will be available on https://github.com/url-kaist/MambaGlue
Related papers
- LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models [1.249658136570244]
State space models (SSMs) have emerged as an efficient alternative to transformers for long-context sequence modeling.
SSMs lack the interpretability tools that have been crucial for understanding and improving attention-based architectures.
We introduce LaTIM, a novel token-level decomposition method for both Mamba-1 and Mamba-2 that enables fine-grained interpretability.
arXiv Detail & Related papers (2025-02-21T17:33:59Z) - TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba [88.31117598044725]
We explore cross-architecture training to transfer the ready knowledge in existing Transformer models to alternative architecture Mamba, termed TransMamba.
Our approach employs a two-stage strategy to expedite training new Mamba models, ensuring effectiveness in across uni-modal and cross-modal tasks.
For cross-modal learning, we propose a cross-Mamba module that integrates language awareness into Mamba's visual features, enhancing the cross-modal interaction capabilities of Mamba architecture.
arXiv Detail & Related papers (2025-02-21T01:22:01Z) - From Markov to Laplace: How Mamba In-Context Learns Markov Chains [36.22373318908893]
We study in-context learning on Markov chains and uncover a surprising phenomenon.
Unlike transformers, even a single-layer Mamba efficiently learns the in-context Laplacian smoothing estimator.
These theoretical insights align strongly with empirical results and represent the first formal connection between Mamba and optimal statistical estimators.
arXiv Detail & Related papers (2025-02-14T14:13:55Z) - Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement [54.427965535613886]
Mamba, as a novel state-space model (SSM), has gained widespread application in natural language processing and computer vision.<n>In this work, we introduce Mamba-SEUNet, an innovative architecture that integrates Mamba with U-Net for SE tasks.
arXiv Detail & Related papers (2024-12-21T13:43:51Z) - MobileMamba: Lightweight Multi-Receptive Visual Mamba Network [51.33486891724516]
Previous research on lightweight models has primarily focused on CNNs and Transformer-based designs.
We propose the MobileMamba framework, which balances efficiency and performance.
MobileMamba achieves up to 83.6% on Top-1, surpassing existing state-of-the-art methods.
arXiv Detail & Related papers (2024-11-24T18:01:05Z) - ReMamba: Equip Mamba with Effective Long-Sequence Modeling [50.530839868893786]
We propose ReMamba, which enhances Mamba's ability to comprehend long contexts.<n>ReMamba incorporates selective compression and adaptation techniques within a two-stage re-forward process.
arXiv Detail & Related papers (2024-08-28T02:47:27Z) - Neural Architecture Search based Global-local Vision Mamba for Palm-Vein Recognition [42.4241558556591]
We propose a hybrid network structure named Global-local Vision Mamba (GLVM) to learn the local correlations in images explicitly and global dependencies among tokens for vein feature representation.
Thirdly, to learn the complementary features, we propose a ConvMamba block consisting of three branches, named Multi-head Mamba branch (MHMamba), Feature Iteration Unit branch (FIU), and Convolutional Neural Network (CNN) branch.
Finally, a Globallocal Alternate Neural Architecture Search (GLNAS) method is proposed to search the optimal architecture of GLVM alternately with the evolutionary algorithm.
arXiv Detail & Related papers (2024-08-11T10:42:22Z) - MambaVision: A Hybrid Mamba-Transformer Vision Backbone [54.965143338206644]
We propose a novel hybrid Mamba-Transformer backbone, MambaVision, specifically tailored for vision applications.
We show that equipping the Mamba architecture with self-attention blocks in the final layers greatly improves its capacity to capture long-range spatial dependencies.
For classification on the ImageNet-1K dataset, MambaVision variants achieve state-of-the-art (SOTA) performance in terms of both Top-1 accuracy and throughput.
arXiv Detail & Related papers (2024-07-10T23:02:45Z) - MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection [72.46396769642787]
We develop a nested structure, Mamba-in-Mamba (MiM-ISTD), for efficient infrared small target detection.
MiM-ISTD is $8 times$ faster than the SOTA method and reduces GPU memory usage by 62.2$%$ when testing on $2048 times 2048$ images.
arXiv Detail & Related papers (2024-03-04T15:57:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.