MambaFlow: A Mamba-Centric Architecture for End-to-End Optical Flow Estimation
- URL: http://arxiv.org/abs/2503.07046v1
- Date: Mon, 10 Mar 2025 08:33:54 GMT
- Title: MambaFlow: A Mamba-Centric Architecture for End-to-End Optical Flow Estimation
- Authors: Juntian Du, Yuan Sun, Zhihu Zhou, Pinyi Chen, Runzhe Zhang, Keji Mao
- Abstract summary: The proposed method is the first Mamba-centric architecture for end-to-end optical flow estimation. On the Sintel benchmark, MambaFlow achieves an EPE-all of 1.60, surpassing the leading 1.74 of GMFlow, and significantly improves inference speed with a runtime of 0.113 seconds, making it 18% faster than GMFlow.
- Score: 1.5828557827183316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optical flow estimation based on deep learning, particularly recent top-performing methods that incorporate the Transformer, has demonstrated impressive performance thanks to the Transformer's powerful global modeling capabilities. However, the quadratic computational complexity of the attention mechanism in Transformers makes training and inference time-consuming. To alleviate these issues, we propose a novel MambaFlow framework that leverages the high accuracy and efficiency of the Mamba architecture to capture features with local correlation while preserving global information, achieving remarkable performance. To the best of our knowledge, the proposed method is the first Mamba-centric architecture for end-to-end optical flow estimation. It comprises two primary contributed components, both Mamba-centric: a feature enhancement Mamba (FEM) module designed to optimize feature representation quality, and a flow propagation Mamba (FPM) module engineered to address occlusion by facilitating effective dissemination of flow information. Extensive experiments demonstrate that our approach achieves state-of-the-art results, even in occluded regions. On the Sintel benchmark, MambaFlow achieves an EPE-all of 1.60, surpassing the leading 1.74 of GMFlow. Additionally, MambaFlow significantly improves inference speed with a runtime of 0.113 seconds, making it 18% faster than GMFlow. The source code will be made publicly available upon acceptance of the paper.
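The abstract names two Mamba-centric modules (FEM and FPM) operating over image features but does not spell out their internals here, so the following is a minimal, hypothetical PyTorch sketch of such a pipeline. The names `ToyMambaBlock` and `MambaFlowSketch` and all shapes are assumptions; the recurrence below is a toy gated linear scan standing in for a real Mamba selective-scan block, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMambaBlock(nn.Module):
    """Gated linear recurrence over a token sequence (toy stand-in for Mamba)."""
    def __init__(self, dim: int):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim)
        self.decay = nn.Parameter(torch.full((dim,), -1.0))  # per-channel decay logits
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x):                        # x: (B, L, C)
        u, gate = self.in_proj(x).chunk(2, dim=-1)
        a = torch.sigmoid(self.decay)            # decay rate in (0, 1)
        h = torch.zeros_like(u[:, 0])
        outs = []
        for t in range(u.shape[1]):              # sequential scan, linear in L
            h = a * h + (1 - a) * u[:, t]        # per-channel state update
            outs.append(h)
        y = torch.stack(outs, dim=1) * F.silu(gate)
        return self.out_proj(y) + x              # residual connection

class MambaFlowSketch(nn.Module):
    """Hypothetical skeleton: encode, enhance (FEM), propagate (FPM), predict flow."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.encoder = nn.Conv2d(3, dim, kernel_size=8, stride=8)  # patchify frames
        self.fem = ToyMambaBlock(dim)        # feature enhancement Mamba (FEM)
        self.fpm = ToyMambaBlock(dim)        # flow propagation Mamba (FPM)
        self.flow_head = nn.Linear(dim, 2)   # per-token (u, v) flow

    def forward(self, img1, img2):           # each: (B, 3, H, W)
        f1 = self.encoder(img1).flatten(2).transpose(1, 2)  # (B, L, C) tokens
        f2 = self.encoder(img2).flatten(2).transpose(1, 2)
        f1, f2 = self.fem(f1), self.fem(f2)  # enhance per-frame features
        # Crude fusion of the two frames before propagation; the real model
        # would match features across frames, which is omitted here.
        tokens = self.fpm(f1 + f2)
        return self.flow_head(tokens)        # (B, L, 2) coarse flow per token

flow = MambaFlowSketch()(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
print(flow.shape)  # torch.Size([1, 64, 2])
```

The linear-in-sequence-length scan is the point of the sketch: it is what lets a Mamba-style model avoid the quadratic cost of attention that the abstract identifies as the bottleneck.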
Related papers
- MambaGlue: Fast and Robust Local Feature Matching With Mamba [9.397265252815115]
We propose a novel Mamba-based local feature matching approach, called MambaGlue.
Mamba is an emerging state-of-the-art architecture rapidly gaining recognition for its superior speed in both training and inference.
Our MambaGlue achieves a balance between robustness and efficiency in real-world applications.
arXiv Detail & Related papers (2025-02-01T15:43:03Z) - FlowMamba: Learning Point Cloud Scene Flow with Global Motion Propagation [14.293476753863272]
We propose a novel global-aware scene flow estimation network with global motion propagation, named FlowMamba.
FlowMamba is the first method to achieve millimeter-level prediction accuracy on the FlyingThings3D and KITTI datasets.
arXiv Detail & Related papers (2024-12-23T08:03:59Z) - Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement [54.427965535613886]
Mamba, a novel state-space model (SSM), has been widely applied in natural language processing and computer vision.
In this work, we introduce Mamba-SEUNet, an innovative architecture that integrates Mamba with U-Net for speech enhancement (SE) tasks.
arXiv Detail & Related papers (2024-12-21T13:43:51Z) - MobileMamba: Lightweight Multi-Receptive Visual Mamba Network [51.33486891724516]
Previous research on lightweight models has primarily focused on CNNs and Transformer-based designs.
We propose the MobileMamba framework, which balances efficiency and performance.
MobileMamba achieves up to 83.6% Top-1 accuracy, surpassing existing state-of-the-art methods.
arXiv Detail & Related papers (2024-11-24T18:01:05Z) - Mamba for Scalable and Efficient Personalized Recommendations [0.135975510645475]
We present a novel hybrid model that replaces Transformer layers with Mamba layers within the FT-Transformer architecture.
We evaluate FT-Mamba in comparison to a traditional Transformer-based model within a Two-Tower architecture on three datasets.
arXiv Detail & Related papers (2024-09-11T14:26:14Z) - ReMamba: Equip Mamba with Effective Long-Sequence Modeling [50.530839868893786]
We propose ReMamba, which enhances Mamba's ability to comprehend long contexts.
ReMamba incorporates selective compression and adaptation techniques within a two-stage re-forward process.
arXiv Detail & Related papers (2024-08-28T02:47:27Z) - SIGMA: Selective Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction.
We introduce a new framework named Selective Gated Mamba (SIGMA) for Sequential Recommendation.
Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z) - MambaUIE&SR: Unraveling the Ocean's Secrets with Only 2.8 GFLOPs [1.7648680700685022]
Underwater Image Enhancement (UIE) techniques aim to address the problem of underwater image degradation due to light absorption and scattering.
In recent years, both Convolutional Neural Network (CNN)-based and Transformer-based methods have been widely explored.
MambaUIE is able to efficiently synthesize global and local information and maintains a very small number of parameters with high accuracy.
arXiv Detail & Related papers (2024-04-22T05:12:11Z) - MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection [72.46396769642787]
We develop a nested structure, Mamba-in-Mamba (MiM-ISTD), for efficient infrared small target detection.
MiM-ISTD is 8× faster than the SOTA method and reduces GPU memory usage by 62.2% when testing on 2048×2048 images.
arXiv Detail & Related papers (2024-03-04T15:57:29Z) - GMFlow: Learning Optical Flow via Global Matching [124.57850500778277]
We propose a GMFlow framework for learning optical flow estimation.
It consists of three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for global feature matching, and a self-attention layer for flow propagation (a minimal sketch of the matching step appears after this list).
Our new framework outperforms 32-iteration RAFT on the challenging Sintel benchmark.
arXiv Detail & Related papers (2021-11-26T18:59:56Z)
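Since the GMFlow entry above frames optical flow as global feature matching, here is a minimal sketch of the correlation-and-softmax step it describes: the flow at each source pixel is the softmax-weighted average target coordinate minus the source coordinate. Feature extraction, the Transformer enhancement, and the self-attention propagation layer are omitted, and the function name and tensor shapes are assumptions rather than GMFlow's actual API.

```python
import torch

def global_matching_flow(feat1, feat2):
    """feat1, feat2: (B, C, H, W) feature maps of two consecutive frames."""
    b, c, h, w = feat1.shape
    f1 = feat1.flatten(2).transpose(1, 2)              # (B, HW, C) source tokens
    f2 = feat2.flatten(2)                              # (B, C, HW) target tokens
    corr = torch.bmm(f1, f2) / c ** 0.5                # (B, HW, HW) global correlation
    prob = corr.softmax(dim=-1)                        # matching distribution per pixel
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).float().view(-1, 2)  # (HW, 2) as (x, y)
    matched = prob @ coords                            # expected target coordinates
    flow = matched - coords                            # displacement per source pixel
    return flow.view(b, h, w, 2).permute(0, 3, 1, 2)   # (B, 2, H, W)

flow = global_matching_flow(torch.rand(2, 128, 32, 32), torch.rand(2, 128, 32, 32))
print(flow.shape)  # torch.Size([2, 2, 32, 32])
```

Because every pixel is compared against every other pixel, the correlation volume costs O((HW)^2); this dense global matching is what lets such methods handle large displacements without iterative local lookups.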
This list is automatically generated from the titles and abstracts of the papers on this site.