Versatile and Efficient Medical Image Super-Resolution Via Frequency-Gated Mamba
- URL: http://arxiv.org/abs/2510.27296v1
- Date: Fri, 31 Oct 2025 09:12:12 GMT
- Title: Versatile and Efficient Medical Image Super-Resolution Via Frequency-Gated Mamba
- Authors: Wenfeng Huang, Xiangyun Liao, Wei Cao, Wenjing Jia, Weixin Si
- Abstract summary: We propose FGMamba, a novel frequency-aware gated state-space model that unifies global dependency modeling and fine-detail enhancement into a lightweight architecture. Our results validate the effectiveness of frequency-aware state-space modeling for scalable and accurate medical image enhancement.
- Score: 10.69081892501522
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical image super-resolution (SR) is essential for enhancing diagnostic accuracy while reducing acquisition cost and scanning time. However, modeling both long-range anatomical structures and fine-grained frequency details with low computational overhead remains challenging. We propose FGMamba, a novel frequency-aware gated state-space model that unifies global dependency modeling and fine-detail enhancement into a lightweight architecture. Our method introduces two key innovations: a Gated Attention-enhanced State-Space Module (GASM) that integrates efficient state-space modeling with dual-branch spatial and channel attention, and a Pyramid Frequency Fusion Module (PFFM) that captures high-frequency details across multiple resolutions via FFT-guided fusion. Extensive evaluations across five medical imaging modalities (Ultrasound, OCT, MRI, CT, and Endoscopic) demonstrate that FGMamba achieves superior PSNR/SSIM while maintaining a compact parameter footprint (<0.75M), outperforming CNN-based and Transformer-based SOTAs. Our results validate the effectiveness of frequency-aware state-space modeling for scalable and accurate medical image enhancement.
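The abstract's notion of isolating high-frequency detail in the Fourier domain can be illustrated with a minimal sketch. This is not the authors' PFFM: the function names (`dft`, `idft`, `split_bands`), the hard cutoff mask, and the 1D toy signal are all assumptions chosen for illustration, and a naive O(n^2) transform stands in for a real FFT so that the example needs only the standard library.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (illustrative stand-in for an FFT)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_bands(signal, cutoff):
    """Split a real signal into low- and high-frequency parts.

    A bin counts as "low" when its distance from DC is <= cutoff.
    The hard mask is a hypothetical choice, not taken from the paper.
    """
    n = len(signal)
    low_spec, high_spec = [], []
    for k, coeff in enumerate(dft(signal)):
        freq = min(k, n - k)  # treats bin k and bin n-k as the same frequency
        if freq <= cutoff:
            low_spec.append(coeff)
            high_spec.append(0)
        else:
            low_spec.append(0)
            high_spec.append(coeff)
    # The mask is symmetric, so both parts are real up to rounding error.
    low = [v.real for v in idft(low_spec)]
    high = [v.real for v in idft(high_spec)]
    return low, high


# Toy "image row": a gentle ramp with a sharp edge in the middle.
row = [0.0, 0.1, 0.2, 0.3, 1.0, 1.0, 0.3, 0.2]
low, high = split_bands(row, cutoff=1)
```

Because the two masks partition the spectrum, `low` and `high` sum back to `row` sample by sample, and `high` carries the sharp edge while `low` keeps the smooth trend. A multi-resolution variant would repeat the split on downsampled copies of the input; how FGMamba fuses such bands is not specified in this listing.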
Related papers
- Accelerating 3D Photoacoustic Computed Tomography with End-to-End Physics-Aware Neural Operators [74.65171736966131]
Photoacoustic computed tomography (PACT) combines optical contrast with ultrasonic resolution, achieving deep-tissue imaging beyond the optical diffusion limit. Current implementations require dense transducer arrays and prolonged acquisition times, limiting clinical translation. We introduce Pano, an end-to-end physics-aware model that directly learns the inverse acoustic mapping from sensor measurements to volumetric reconstructions.
arXiv Detail & Related papers (2025-09-11T23:12:55Z) - SpectMamba: Integrating Frequency and State Space Models for Enhanced Medical Image Detection [11.43227481199105]
We present SpectMamba, the first Mamba-based architecture designed for medical image detection. A key component of SpectMamba is the Hybrid Spatial-Frequency Attention (HSFA) block, which separately learns high- and low-frequency features. We show that SpectMamba achieves state-of-the-art performance while being both effective and efficient across various medical image detection tasks.
arXiv Detail & Related papers (2025-09-01T02:56:45Z) - FaRMamba: Frequency-based learning and Reconstruction aided Mamba for Medical Segmentation [3.5790602918760586]
Vision Mamba employs one-dimensional causal state-space recurrence to efficiently model global dependencies. Its patch tokenization and 1D serialization disrupt local pixel adjacency and impose a low-pass filtering effect. We propose FaRMamba, a novel extension that explicitly addresses LHICD and 2D-SSD through two complementary modules.
arXiv Detail & Related papers (2025-07-26T20:41:53Z) - FADPNet: Frequency-Aware Dual-Path Network for Face Super-Resolution [70.61549422952193]
Face super-resolution (FSR) under limited computational costs remains an open problem. Existing approaches typically treat all facial pixels equally, resulting in suboptimal allocation of computational resources. We propose FADPNet, a Frequency-Aware Dual-Path Network that decomposes facial features into low- and high-frequency components.
arXiv Detail & Related papers (2025-06-17T02:33:42Z) - Structure-Accurate Medical Image Translation via Dynamic Frequency Balance and Knowledge Guidance [60.33892654669606]
Diffusion models are a powerful strategy for synthesizing the required medical images. Existing approaches still suffer from anatomical structure distortion due to overfitting of high-frequency information. We propose a novel method based on dynamic frequency balance and knowledge guidance.
arXiv Detail & Related papers (2025-04-13T05:48:13Z) - Delta-WKV: A Novel Meta-in-Context Learner for MRI Super-Resolution [0.7864304771129751]
We propose Delta-WKV, a novel MRI super-resolution model that combines Meta-in-Context Learning (MiCL) with the Delta rule to better recognize both local and global patterns in MRI images. Tests show that Delta-WKV outperforms existing methods, improving PSNR by 0.06 dB and SSIM by 0.001, while reducing training and inference times by over 15%.
arXiv Detail & Related papers (2025-02-28T08:49:46Z) - FgC2F-UDiff: Frequency-guided and Coarse-to-fine Unified Diffusion Model for Multi-modality Missing MRI Synthesis [6.475175425060296]
We propose a novel unified synthesis model, the Frequency-guided and Coarse-to-fine Unified Diffusion Model (FgC2F-UDiff).
arXiv Detail & Related papers (2025-01-07T04:42:45Z) - A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
We propose a unified MRI reconstruction model robust to various measurement undersampling patterns and image resolutions. Our model improves SSIM by 11% and PSNR by 4 dB over a state-of-the-art CNN (End-to-End VarNet) with 600× faster inference than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z) - NT-ViT: Neural Transcoding Vision Transformers for EEG-to-fMRI Synthesis [7.542742087154667]
This paper introduces the Neural Transcoding Vision Transformer (NT-ViT).
NT-ViT is a generative model designed to estimate high-resolution functional Magnetic Resonance Imaging (fMRI) samples from simultaneous Electroencephalography (EEG) data.
arXiv Detail & Related papers (2024-09-18T09:38:08Z) - On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input.
DL models are sensitive to varying artifacts, which change the input data distribution between the training and testing phases.
We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z) - Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.