JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba
- URL: http://arxiv.org/abs/2503.03437v1
- Date: Wed, 05 Mar 2025 12:12:51 GMT
- Title: JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba
- Authors: Xiaoyong Lu, Songlin Du
- Abstract summary: We propose an ultra-lightweight Mamba-based matcher, named JamMa, which converges on a single GPU and achieves an impressive performance-efficiency balance in inference. To unlock the potential of Mamba for feature matching, we propose Joint Mamba with a scan-merge strategy named JEGO, which enables: (1) Joint scan of two images to achieve high-frequency mutual interaction, (2) Efficient scan with skip steps to reduce sequence length, (3) Global receptive field, and (4) Omnidirectional feature representation.
- Score: 8.878053726388075
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Existing state-of-the-art feature matchers capture long-range dependencies with Transformers but are hindered by high spatial complexity, leading to demanding training and high-latency inference. Striking a better balance between performance and efficiency remains a challenge in feature matching. Inspired by the linear complexity O(N) of Mamba, we propose an ultra-lightweight Mamba-based matcher, named JamMa, which converges on a single GPU and achieves an impressive performance-efficiency balance in inference. To unlock the potential of Mamba for feature matching, we propose Joint Mamba with a scan-merge strategy named JEGO, which enables: (1) Joint scan of two images to achieve high-frequency mutual interaction, (2) Efficient scan with skip steps to reduce sequence length, (3) Global receptive field, and (4) Omnidirectional feature representation. With the above properties, the JEGO strategy significantly outperforms the scan-merge strategies proposed in VMamba and EVMamba in the feature matching task. Compared to attention-based sparse and semi-dense matchers, JamMa demonstrates a superior balance between performance and efficiency, delivering better performance with less than 50% of the parameters and FLOPs.
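The abstract's four JEGO properties can be illustrated with a small sketch. The NumPy snippet below is a minimal, hedged interpretation of the scan-merge idea as described above, not the authors' implementation: the joint scan is rendered as token interleaving of the two images, the efficient scan as skip-step subsampling, and the Mamba recurrence itself is stubbed with a causal cumulative mean. The function names, the skip parameter, and the forward/backward merge are illustrative assumptions.

```python
# Illustrative sketch of a JEGO-style joint scan (NOT the authors' code).
# Assumptions: both feature maps are flattened in raster order to (H*W, C),
# the joint scan interleaves tokens of the two images, and the efficient scan
# keeps every `skip`-th token. The selective-scan (Mamba) recurrence is
# stubbed with a causal cumulative mean.
import numpy as np

def joint_skip_scan(feat_a: np.ndarray, feat_b: np.ndarray, skip: int = 2) -> np.ndarray:
    """feat_a, feat_b: (H*W, C) flattened feature maps of the two images."""
    # (1) Joint scan: alternate tokens from image A and image B so the
    #     sequence switches between views (high-frequency mutual interaction).
    joint = np.empty((feat_a.shape[0] + feat_b.shape[0], feat_a.shape[1]), feat_a.dtype)
    joint[0::2], joint[1::2] = feat_a, feat_b
    # (2) Efficient scan: subsample with a skip step to shorten the sequence.
    short_seq = joint[::skip]
    # (3) Stand-in for the Mamba recurrence: a cumulative mean gives every
    #     token a causal receptive field over all earlier tokens of both images.
    scanned = np.cumsum(short_seq, axis=0) / np.arange(1, short_seq.shape[0] + 1)[:, None]
    return scanned

def omni_merge(feat_a: np.ndarray, feat_b: np.ndarray, skip: int = 2) -> np.ndarray:
    # (4) One plausible reading of "omnidirectional": run the scan in
    #     forward and reversed order and merge the two results.
    fwd = joint_skip_scan(feat_a, feat_b, skip)
    bwd = joint_skip_scan(feat_a[::-1], feat_b[::-1], skip)[::-1]
    return 0.5 * (fwd + bwd)
```

A real matcher would replace the cumulative-mean stub with a selective-scan (SSM) layer and run the scan along several spatial orderings before merging; the sketch only shows how interleaving and skip sampling shape the sequence that such a layer would consume.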
Related papers
- VMatcher: State-Space Semi-Dense Local Feature Matching [0.0]
VMatcher is a hybrid Mamba-Transformer network for semi-dense feature matching between image pairs. VMatcher integrates Mamba's highly efficient long-sequence processing with the Transformer's attention mechanism.
arXiv Detail & Related papers (2025-07-31T09:39:16Z) - Routing Mamba: Scaling State Space Models with Mixture-of-Experts Projection [88.47928738482719]
Linear State Space Models (SSMs) offer remarkable performance gains in sequence modeling. Recent advances, such as Mamba, further enhance SSMs with input-dependent gating and hardware-aware implementations. We introduce Routing Mamba (RoM), a novel approach that scales SSM parameters using sparse mixtures of linear projection experts (an illustrative sketch of this routing idea follows the related-papers list).
arXiv Detail & Related papers (2025-06-22T19:26:55Z) - MambaGlue: Fast and Robust Local Feature Matching With Mamba [9.397265252815115]
We propose a novel Mamba-based local feature matching approach, called MambaGlue. Mamba is an emerging state-of-the-art architecture rapidly gaining recognition for its superior speed in both training and inference. Our MambaGlue achieves a balance between robustness and efficiency in real-world applications.
arXiv Detail & Related papers (2025-02-01T15:43:03Z) - Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity [56.0251572416922]
State Space Models (SSMs) have emerged as efficient alternatives to Transformers for sequential modeling. We propose a novel SSM architecture that introduces modality-aware sparsity through modality-specific parameterization of the Mamba block. We evaluate Mixture-of-Mamba across three multi-modal pretraining settings.
arXiv Detail & Related papers (2025-01-27T18:35:05Z) - Detail Matters: Mamba-Inspired Joint Unfolding Network for Snapshot Spectral Compressive Imaging [40.80197280147993]
We propose a Mamba-inspired Joint Unfolding Network (MiJUN) to overcome the inherent nonlinear and ill-posed characteristics of HSI reconstruction. We introduce an accelerated unfolding network scheme, which reduces the reliance on initial optimization stages. We refine the scanning strategy with Mamba by integrating the tensor mode-$k$ unfolding into the Mamba network.
arXiv Detail & Related papers (2025-01-02T13:56:23Z) - Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence [33.38031167119682]
In few-shot action recognition, long sub-sequences of video naturally express entire actions more effectively. Recent Mamba demonstrates efficiency in modeling long sequences, but directly applying Mamba to FSAR overlooks the importance of local feature modeling and alignment. We propose a Matryoshka MAmba and CoNtrasTive LeArning framework (Manta) to solve these challenges. Manta achieves new state-of-the-art performance on prominent benchmarks, including SSv2, Kinetics, UCF101, and HMDB51.
arXiv Detail & Related papers (2024-12-10T13:03:42Z) - MobileMamba: Lightweight Multi-Receptive Visual Mamba Network [51.33486891724516]
Previous research on lightweight models has primarily focused on CNNs and Transformer-based designs.
We propose the MobileMamba framework, which balances efficiency and performance.
MobileMamba achieves up to 83.6% Top-1 accuracy, surpassing existing state-of-the-art methods.
arXiv Detail & Related papers (2024-11-24T18:01:05Z) - Hi-Mamba: Hierarchical Mamba for Efficient Image Super-Resolution [42.259283231048954]
State Space Models (SSMs) have shown strong representation ability in modeling long-range dependencies with linear complexity.
We propose a novel Hierarchical Mamba network, namely Hi-Mamba, for image super-resolution (SR).
arXiv Detail & Related papers (2024-10-14T04:15:04Z) - ReMamba: Equip Mamba with Effective Long-Sequence Modeling [50.530839868893786]
We propose ReMamba, which enhances Mamba's ability to comprehend long contexts.
ReMamba incorporates selective compression and adaptation techniques within a two-stage re-forward process.
arXiv Detail & Related papers (2024-08-28T02:47:27Z) - SIGMA: Selective Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction. We introduce a new framework named Selective Gated Mamba (SIGMA) for Sequential Recommendation. Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z) - LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba [54.85262314960038]
Local Attentional Mamba blocks capture both global contexts and local details with linear complexity.
Our model exhibits exceptional scalability and surpasses the performance of DiT across various model scales on ImageNet at 256x256 resolution.
Compared to state-of-the-art diffusion models on ImageNet 256x256 and 512x512, our largest model presents notable advantages, such as a reduction of up to 62% in GFLOPs.
arXiv Detail & Related papers (2024-08-05T16:39:39Z) - EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba [19.062950348441426]
This work proposes to explore the potential of visual state space models in light-weight model design and introduce a novel efficient model variant dubbed EfficientVMamba.
Our EfficientVMamba integrates an atrous-based selective scan approach via efficient skip sampling, constituting building blocks designed to harness both global and local representational features.
Experimental results show that EfficientVMamba scales down computational complexity while yielding competitive results across a variety of vision tasks.
arXiv Detail & Related papers (2024-03-15T02:48:47Z) - MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection [72.46396769642787]
We develop a nested structure, Mamba-in-Mamba (MiM-ISTD), for efficient infrared small target detection.
MiM-ISTD is 8× faster than the SOTA method and reduces GPU memory usage by 62.2% when testing on 2048×2048 images.
arXiv Detail & Related papers (2024-03-04T15:57:29Z)
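Several of the entries above name concrete mechanisms; the Routing Mamba summary, for example, describes scaling SSM parameters with sparse mixtures of linear projection experts under input-dependent gating. The snippet below is a hedged NumPy sketch of that general top-k routing idea, not the RoM paper's actual architecture; the function name, expert count, and softmax-over-selected-experts gating are illustrative assumptions.

```python
# Hedged sketch of sparse mixture-of-experts routing over linear projections,
# in the spirit of the Routing Mamba (RoM) summary above; not the paper's code.
import numpy as np

def moe_projection(tokens: np.ndarray, experts: np.ndarray,
                   gate_w: np.ndarray, top_k: int = 2) -> np.ndarray:
    """tokens: (N, C); experts: (E, C, C) linear projections; gate_w: (C, E)."""
    logits = tokens @ gate_w                          # (N, E) input-dependent gating scores
    top = np.argsort(-logits, axis=1)[:, :top_k]      # indices of the top-k experts per token
    out = np.zeros_like(tokens)
    for i, idx in enumerate(top):
        w = np.exp(logits[i, idx]); w /= w.sum()      # softmax over the selected experts only
        out[i] = sum(wk * (tokens[i] @ experts[k]) for wk, k in zip(w, idx))
    return out

# Usage with random weights (purely illustrative):
# x = np.random.randn(16, 32); E = np.random.randn(4, 32, 32); G = np.random.randn(32, 4)
# y = moe_projection(x, E, G)
```

Only the selected experts are evaluated per token, which is how such routing grows parameter count without a proportional increase in per-token compute.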
This list is automatically generated from the titles and abstracts of the papers on this site.