U-Mamba: Enhancing Long-range Dependency for Biomedical Image
Segmentation
- URL: http://arxiv.org/abs/2401.04722v1
- Date: Tue, 9 Jan 2024 18:53:20 GMT
- Title: U-Mamba: Enhancing Long-range Dependency for Biomedical Image
Segmentation
- Authors: Jun Ma, Feifei Li, and Bo Wang
- Abstract summary: We introduce U-Mamba, a general-purpose network for biomedical image segmentation.
Inspired by the State Space Sequence Models (SSMs), a new family of deep sequence models, we design a hybrid CNN-SSM block.
We conduct experiments on four diverse tasks, including the 3D abdominal organ segmentation in CT and MR images, instrument segmentation in endoscopy images, and cell segmentation in microscopy images.
- Score: 10.083902382768406
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Convolutional Neural Networks (CNNs) and Transformers have been the most
popular architectures for biomedical image segmentation, but both of them have
limited ability to handle long-range dependencies because of inherent locality
or computational complexity. To address this challenge, we introduce U-Mamba, a
general-purpose network for biomedical image segmentation. Inspired by the
State Space Sequence Models (SSMs), a new family of deep sequence models known
for their strong capability in handling long sequences, we design a hybrid
CNN-SSM block that integrates the local feature extraction power of
convolutional layers with the abilities of SSMs for capturing the long-range
dependency. Moreover, U-Mamba enjoys a self-configuring mechanism, allowing it
to automatically adapt to various datasets without manual intervention. We
conduct extensive experiments on four diverse tasks, including the 3D abdominal
organ segmentation in CT and MR images, instrument segmentation in endoscopy
images, and cell segmentation in microscopy images. The results reveal that
U-Mamba outperforms state-of-the-art CNN-based and Transformer-based
segmentation networks across all tasks. This opens new avenues for efficient
long-range dependency modeling in biomedical image analysis. The code, models,
and data are publicly available at https://wanglab.ai/u-mamba.html.
Related papers
- HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image Segmentation [1.5574423250822542]
We propose a U-shape architecture model for medical image segmentation, named Hybird Transformer vision Mamba UNet (HTM-UNet)
We conduct comprehensive experiments on the ISIC17, ISIC18, CVC-300, CVC-ClinicDB, Kvasir, CVC-ColonDB, ETIS-Larib PolypDB public datasets and ZD-LCI-GIM private dataset.
arXiv Detail & Related papers (2024-08-21T02:25:14Z) - I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling [8.48392350084504]
We propose a novel adversarial model for medical image synthesis, I2I-Mamba, to efficiently capture long-range context.
I2I-Mamba offers superior performance against state-of-the-art CNN- and transformer-based methods in synthesizing target-modality images.
arXiv Detail & Related papers (2024-05-22T21:55:58Z) - LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation [9.862277278217045]
In this paper, we introduce a Large Kernel Vision Mamba U-shape Network, or LKM-UNet, for medical image segmentation.
A distinguishing feature of our LKM-UNet is its utilization of large Mamba kernels, excelling in locally spatial modeling compared to small kernel-based CNNs and Transformers.
Comprehensive experiments demonstrate the feasibility and the effectiveness of using large-size Mamba kernels to achieve large receptive fields.
arXiv Detail & Related papers (2024-03-12T05:34:51Z) - Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation [21.1787366866505]
We propose Mamba-UNet, a novel architecture that synergizes the U-Net in medical image segmentation with Mamba's capability.
Mamba-UNet adopts a pure Visual Mamba (VMamba)-based encoder-decoder structure, infused with skip connections to preserve spatial information across different scales of the network.
arXiv Detail & Related papers (2024-02-07T18:33:04Z) - nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark
Detection with State Space Model [24.955052600683423]
In this paper, we introduce nnMamba, a novel architecture that integrates the strengths of CNNs and the advanced long-range modeling capabilities of State Space Sequence Models (SSMs)
Experiments on 6 datasets demonstrate nnMamba's superiority over state-of-the-art methods in a suite of challenging tasks, including 3D image segmentation, classification, and landmark detection.
arXiv Detail & Related papers (2024-02-05T21:28:47Z) - Vivim: a Video Vision Mamba for Medical Video Segmentation [52.11785024350253]
This paper presents a Video Vision Mamba-based framework, dubbed as Vivim, for medical video segmentation tasks.
Our Vivim can effectively compress the long-term representation into sequences at varying scales.
Experiments on thyroid segmentation, breast lesion segmentation in ultrasound videos, and polyp segmentation in colonoscopy videos demonstrate the effectiveness and efficiency of our Vivim.
arXiv Detail & Related papers (2024-01-25T13:27:03Z) - Learning from partially labeled data for multi-organ and tumor
segmentation [102.55303521877933]
We propose a Transformer based dynamic on-demand network (TransDoDNet) that learns to segment organs and tumors on multiple datasets.
A dynamic head enables the network to accomplish multiple segmentation tasks flexibly.
We create a large-scale partially labeled Multi-Organ and Tumor benchmark, termed MOTS, and demonstrate the superior performance of our TransDoDNet over other competitors.
arXiv Detail & Related papers (2022-11-13T13:03:09Z) - MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image
Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - Medical Transformer: Gated Axial-Attention for Medical Image
Segmentation [73.98974074534497]
We study the feasibility of using Transformer-based network architectures for medical image segmentation tasks.
We propose a Gated Axial-Attention model which extends the existing architectures by introducing an additional control mechanism in the self-attention module.
To train the model effectively on medical images, we propose a Local-Global training strategy (LoGo) which further improves the performance.
arXiv Detail & Related papers (2021-02-21T18:35:14Z) - TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.