FaRMamba: Frequency-based learning and Reconstruction aided Mamba for Medical Segmentation
- URL: http://arxiv.org/abs/2507.20056v1
- Date: Sat, 26 Jul 2025 20:41:53 GMT
- Title: FaRMamba: Frequency-based learning and Reconstruction aided Mamba for Medical Segmentation
- Authors: Ze Rong, ZiYue Zhao, Zhaoxin Wang, Lei Ma
- Abstract summary: Vision Mamba employs one-dimensional causal state-space recurrence to efficiently model global dependencies. Its patch tokenization and 1D serialization disrupt local pixel adjacency and impose a low-pass filtering effect. We propose FaRMamba, a novel extension that explicitly addresses LHICD and 2D-SSD through two complementary modules.
- Score: 3.5790602918760586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate medical image segmentation remains challenging due to blurred lesion boundaries (LBA), loss of high-frequency details (LHD), and difficulty in modeling long-range anatomical structures (DC-LRSS). Vision Mamba employs one-dimensional causal state-space recurrence to efficiently model global dependencies, thereby substantially mitigating DC-LRSS. However, its patch tokenization and 1D serialization disrupt local pixel adjacency and impose a low-pass filtering effect, resulting in Local High-frequency Information Capture Deficiency (LHICD) and two-dimensional Spatial Structure Degradation (2D-SSD), which in turn exacerbate LBA and LHD. In this work, we propose FaRMamba, a novel extension that explicitly addresses LHICD and 2D-SSD through two complementary modules. A Multi-Scale Frequency Transform Module (MSFM) restores attenuated high-frequency cues by isolating and reconstructing multi-band spectra via wavelet, cosine, and Fourier transforms. A Self-Supervised Reconstruction Auxiliary Encoder (SSRAE) enforces pixel-level reconstruction on the shared Mamba encoder to recover full 2D spatial correlations, enhancing both fine textures and global context. Extensive evaluations on CAMUS echocardiography, MRI-based Mouse-cochlea, and Kvasir-Seg endoscopy demonstrate that FaRMamba consistently outperforms competitive CNN-Transformer hybrids and existing Mamba variants, delivering superior boundary accuracy, detail preservation, and global coherence without prohibitive computational overhead. This work provides a flexible frequency-aware framework for future segmentation models that directly mitigates core challenges in medical imaging.
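The two effects named in the abstract, loss of 2D adjacency under 1D serialization and edge blurring under low-pass filtering, can be illustrated with a small pure-Python sketch. This is not the FaRMamba implementation: the function names, the image width of 224, and the toy edge signal are all hypothetical, and a one-level Haar split merely stands in for the wavelet branch that the abstract attributes to MSFM.

```python
# Illustrative sketch only; NOT the authors' code. All names are hypothetical.

def row_major_index(row, col, width):
    """Position of pixel (row, col) after row-major flattening."""
    return row * width + col


def serialized_distance(p, q, width):
    """Distance between two pixels in the flattened 1D sequence."""
    return abs(row_major_index(p[0], p[1], width) - row_major_index(q[0], q[1], width))


def haar_split(signal):
    """One-level Haar transform of an even-length 1D signal.

    Returns (low, high): pairwise averages and pairwise half-differences.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def haar_merge(low, high):
    """Inverse of haar_split: rebuild the signal from both bands."""
    out = []
    for a, d in zip(low, high):
        out.extend([a + d, a - d])
    return out


if __name__ == "__main__":
    width = 224  # assumed image width, chosen only for illustration
    # Horizontal neighbours stay adjacent; vertical neighbours end up `width` apart.
    print(serialized_distance((10, 5), (10, 6), width))
    print(serialized_distance((10, 5), (11, 5), width))

    edge = [0, 0, 0, 1, 1, 1, 1, 1]  # toy signal with one sharp boundary
    low, high = haar_split(edge)
    print(haar_merge(low, high))               # both bands: edge recovered exactly
    print(haar_merge(low, [0.0] * len(high)))  # high band dropped: edge is smeared
```

Dropping the high band turns the sharp step into a ramp, which is the kind of boundary blurring that a frequency-aware module would aim to counteract by keeping or restoring that band.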
Related papers
- Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification [69.87877580725768]
Multimodal Visual Surrogate Compression (MVSC) learns to compress and adapt large 3D sMRI volumes into compact 2D features. MVSC has two key components: a Volume Context that captures global cross-slice context under textual guidance, and an Adaptive Slice Fusion module that aggregates slice-level information in a text-enhanced, patch-wise manner.
arXiv Detail & Related papers (2026-01-29T13:05:46Z)
- Frequency Error-Guided Under-sampling Optimization for Multi-Contrast MRI Reconstruction [24.246450246745905]
Multi-contrast MRI reconstruction has emerged as a promising direction by leveraging complementary information from fully-sampled reference scans. Existing approaches suffer from three major limitations: (1) superficial reference fusion strategies, (2) insufficient utilization of the complementary information provided by the reference contrast, and (3) fixed under-sampling patterns. We propose an efficient and interpretable frequency error-guided reconstruction framework to tackle these issues.
arXiv Detail & Related papers (2026-01-14T09:40:34Z)
- HiFi-MambaV2: Hierarchical Shared-Routed MoE for High-Fidelity MRI Reconstruction [9.831136414187448]
HiFi-MambaV2 is a hierarchical shared-routed Mixture-of-Experts architecture that couples frequency decomposition with content-adaptive computation. We show that HiFi-MambaV2 consistently outperforms CNN-, Transformer-, and prior Mamba-based baselines in PSNR, SSIM, and NMSE.
arXiv Detail & Related papers (2025-11-23T16:58:15Z)
- Versatile and Efficient Medical Image Super-Resolution Via Frequency-Gated Mamba [10.69081892501522]
We propose FGMamba, a novel frequency-aware gated state-space model that unifies global dependency modeling and fine-detail enhancement into a lightweight architecture. Our results validate the effectiveness of frequency-aware state-space modeling for scalable and accurate medical image enhancement.
arXiv Detail & Related papers (2025-10-31T09:12:12Z)
- MedVSR: Medical Video Super-Resolution with Cross State-Space Propagation [63.38824041721275]
Low-resolution (LR) medical videos present unique challenges for video super-resolution (VSR) models. We propose MedVSR, a tailored framework for medical VSR. We show that MedVSR significantly outperforms existing VSR models in reconstruction performance and efficiency.
arXiv Detail & Related papers (2025-09-25T14:56:59Z)
- HiFi-Mamba: Dual-Stream W-Laplacian Enhanced Mamba for High-Fidelity MRI Reconstruction [5.899756063964437]
High-Fidelity Mamba (HiFi-Mamba) is a novel dual-stream Mamba-based architecture for MRI reconstruction. HiFi-Mamba consistently outperforms state-of-the-art CNN-based, Transformer-based, and other Mamba-based models in reconstruction accuracy.
arXiv Detail & Related papers (2025-08-07T10:08:18Z)
- SAMba-UNet: Synergizing SAM2 and Mamba in UNet with Heterogeneous Aggregation for Cardiac MRI Segmentation [6.451534509235736]
This study proposes an innovative dual-encoder architecture named SAMba-UNet. The framework achieves cross-modal feature collaborative learning by integrating the vision foundation model SAM2, the state-space model Mamba, and the classical UNet. Experiments on the ACDC cardiac MRI dataset demonstrate that the proposed model achieves a Dice coefficient of 0.9103 and an HD95 boundary error of 1.0859 mm.
arXiv Detail & Related papers (2025-05-22T06:57:03Z)
- DH-Mamba: Exploring Dual-domain Hierarchical State Space Models for MRI Reconstruction [6.341065683872316]
This paper explores selective state space models (Mamba) for efficient and effective MRI reconstruction. Mamba typically flattens 2D images into distinct 1D sequences along rows and columns, disrupting k-space's unique spectrum. Existing approaches adopt multi-directional lengthy scanning to unfold images at the pixel level, leading to long-range forgetting and high computational burden.
arXiv Detail & Related papers (2025-01-14T14:41:51Z)
- Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding. Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z)
- Enhanced Masked Image Modeling to Avoid Model Collapse on Multi-modal MRI Datasets [6.3467517115551875]
Masked image modeling (MIM) has shown promise in utilizing unlabeled data. We analyze and address model collapse in two types: complete collapse and dimensional collapse. We construct the enhanced MIM (E-MIM) with HMP and PBT modules to avoid model collapse on multi-modal MRI.
arXiv Detail & Related papers (2024-07-15T01:11:30Z)
- MMR-Mamba: Multi-Modal MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion [17.084083262801737]
We propose MMR-Mamba, a novel framework that thoroughly and efficiently integrates multi-modal features for MRI reconstruction.
Specifically, we first design a Target modality-guided Cross Mamba (TCM) module in the spatial domain.
Then, we introduce a Selective Frequency Fusion (SFF) module to efficiently integrate global information in the Fourier domain.
arXiv Detail & Related papers (2024-06-27T07:30:54Z)
- Dual Hyperspectral Mamba for Efficient Spectral Compressive Imaging [102.35787741640749]
We propose a novel Dual Hyperspectral Mamba (DHM) to explore both global long-range dependencies and local contexts for efficient HSI reconstruction.
Specifically, our DHM consists of multiple dual hyperspectral S4 blocks (DHSBs) to restore original HSIs.
arXiv Detail & Related papers (2024-06-01T14:14:40Z)
- Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution [49.902047563260496]
We make the first attempt to integrate the Vision State Space Model (Mamba) for remote sensing image (RSI) super-resolution.
To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR.
Our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM).
arXiv Detail & Related papers (2024-05-08T11:09:24Z)
- Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model [6.392575673488379]
We introduce Swin-Res-Net, a specialized module designed to enhance the precision of retinal vessel segmentation.
Swin-Res-Net utilizes the Swin transformer which uses shifted windows with displacement for partitioning.
Our proposed architecture produces outstanding results, either meeting or surpassing those of other published models.
arXiv Detail & Related papers (2024-03-03T01:36:11Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
- Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refinement, is devoted to learning the transformation and refinement of the phase spectrum.
arXiv Detail & Related papers (2023-08-14T08:23:58Z)
- Cross-Modal Causal Intervention for Medical Report Generation [107.76649943399168]
Radiology Report Generation (RRG) is essential for computer-aided diagnosis and medication guidance. Generating accurate lesion descriptions remains challenging due to spurious correlations from visual-linguistic biases. We propose a two-stage framework named CrossModal Causal Representation Learning (CMCRL). Experiments on IU-Xray and MIMIC-CXR show that our CMCRL pipeline significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-03-16T07:23:55Z)
- AliasNet: Alias Artefact Suppression Network for Accelerated Phase-Encode MRI [4.752084030395196]
Sparse reconstruction is an important aspect of MRI, helping to reduce acquisition time and improve spatial-temporal resolution.
Experiments conducted on retrospectively under-sampled brain and knee data demonstrate that combination of the proposed 1D AliasNet modules with existing 2D deep learned (DL) recovery techniques leads to an improvement in image quality.
arXiv Detail & Related papers (2023-02-17T13:16:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.