Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation
- URL: http://arxiv.org/abs/2403.17701v4
- Date: Fri, 3 May 2024 10:12:09 GMT
- Title: Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation
- Authors: Hao Tang, Lianglun Cheng, Guoheng Huang, Zhengguang Tan, Junhao Lu, Kaihong Wu,
- Abstract summary: We propose Triplet Mamba-UNet as a new type of image segmentation network.
Our model achieves a one-third reduction in parameters compared to the previous VM-UNet.
- Score: 8.686237221268584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image segmentation holds a vital position in the realms of diagnosis and treatment within the medical domain. Traditional convolutional neural networks (CNNs) and Transformer models have made significant advancements in this realm, but they still encounter challenges because of limited receptive field or high computing complexity. Recently, State Space Models (SSMs), particularly Mamba and its variants, have demonstrated notable performance in the field of vision. However, their feature extraction methods may not be sufficiently effective and retain some redundant structures, leaving room for parameter reduction. Motivated by previous spatial and channel attention methods, we propose Triplet Mamba-UNet. The method leverages residual VSS Blocks to extract intensive contextual features, while Triplet SSM is employed to fuse features across spatial and channel dimensions. We conducted experiments on ISIC17, ISIC18, CVC-300, CVC-ClinicDB, Kvasir-SEG, CVC-ColonDB, and Kvasir-Instrument datasets, demonstrating the superior segmentation performance of our proposed TM-UNet. Additionally, compared to the previous VM-UNet, our model achieves a one-third reduction in parameters.
Related papers
- EM-Net: Efficient Channel and Frequency Learning with Mamba for 3D Medical Image Segmentation [3.6813810514531085]
We introduce a novel 3D medical image segmentation model called EM-Net. Inspired by its success, we introduce a novel Mamba-based 3D medical image segmentation model called EM-Net.
Comprehensive experiments on two challenging multi-organ datasets with other state-of-the-art (SOTA) algorithms show that our method exhibits better segmentation accuracy while requiring nearly half the parameter size of SOTA models and 2x faster training speed.
arXiv Detail & Related papers (2024-09-26T09:34:33Z) - MambaClinix: Hierarchical Gated Convolution and Mamba-Based U-Net for Enhanced 3D Medical Image Segmentation [6.673169053236727]
We propose MambaClinix, a novel U-shaped architecture for medical image segmentation.
MambaClinix integrates a hierarchical gated convolutional network with Mamba in an adaptive stage-wise framework.
Our results show that MambaClinix achieves high segmentation accuracy while maintaining low model complexity.
arXiv Detail & Related papers (2024-09-19T07:51:14Z) - GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model [66.35608254724566]
State-space models (SSMs) have showcased effective performance in modeling long-range dependencies with subquadratic complexity.
However, pure SSM-based models still face challenges related to stability and achieving optimal performance on computer vision tasks.
Our paper addresses the challenges of scaling SSM-based models for computer vision, particularly the instability and inefficiency of large model sizes.
arXiv Detail & Related papers (2024-07-18T17:59:58Z) - Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification [4.389334324926174]
This study introduces the innovative Mamba-in-Mamba (MiM) architecture for HSI classification, the first attempt of deploying State Space Model (SSM) in this task.
MiM model includes 1) A novel centralized Mamba-Cross-Scan (MCS) mechanism for transforming images into sequence-data, 2) A Tokenized Mamba (T-Mamba) encoder, and 3) A Weighted MCS Fusion (WMF) module.
Experimental results from three public HSI datasets demonstrate that our method outperforms existing baselines and state-of-the-art approaches.
arXiv Detail & Related papers (2024-05-20T13:19:02Z) - VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation [8.278068663433261]
We propose Vison Mamba-UNetV2, inspired by Mamba architecture, to capture contextual information in images.
VM-UNetV2 exhibits competitive performance in medical image segmentation tasks.
We conduct comprehensive experiments on the ISIC17, ISIC18, CVC-300, CVC-ClinicDB, Kvasir CVC-ColonDB and ETIS-LaribPolypDB public datasets.
arXiv Detail & Related papers (2024-03-14T08:12:39Z) - SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion
Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z) - VM-UNet: Vision Mamba UNet for Medical Image Segmentation [2.3876474175791302]
We propose a U-shape architecture model for medical image segmentation, named Vision Mamba UNet (VM-UNet)
We conduct comprehensive experiments on the ISIC17, ISIC18, and Synapse datasets, and the results indicate that VM-UNet performs competitively in medical image segmentation tasks.
arXiv Detail & Related papers (2024-02-04T13:37:21Z) - 3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation [52.699139151447945]
We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation.
Our model can outperform domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, specifically by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, colon cancer segmentation, and achieve similar performance for liver tumor segmentation.
arXiv Detail & Related papers (2023-06-23T12:09:52Z) - Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, coarse-to-fine sparse Transformer (CST)
CST embedding HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selecting. Then the selected patches are fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z) - Automatic size and pose homogenization with spatial transformer network
to improve and accelerate pediatric segmentation [51.916106055115755]
We propose a new CNN architecture that is pose and scale invariant thanks to the use of Spatial Transformer Network (STN)
Our architecture is composed of three sequential modules that are estimated together during training.
We test the proposed method in kidney and renal tumor segmentation on abdominal pediatric CT scanners.
arXiv Detail & Related papers (2021-07-06T14:50:03Z) - FetReg: Placental Vessel Segmentation and Registration in Fetoscopy
Challenge Dataset [57.30136148318641]
Fetoscopy laser photocoagulation is a widely used procedure for the treatment of Twin-to-Twin Transfusion Syndrome (TTTS)
This may lead to increased procedural time and incomplete ablation, resulting in persistent TTTS.
Computer-assisted intervention may help overcome these challenges by expanding the fetoscopic field of view through video mosaicking and providing better visualization of the vessel network.
We present a large-scale multi-centre dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms for the fetal environment with a focus on creating drift-free mosaics from long duration fetoscopy videos.
arXiv Detail & Related papers (2021-06-10T17:14:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.