Vision Mamba for Classification of Breast Ultrasound Images
- URL: http://arxiv.org/abs/2407.03552v2
- Date: Tue, 17 Sep 2024 04:37:16 GMT
- Title: Vision Mamba for Classification of Breast Ultrasound Images
- Authors: Ali Nasiri-Sarvi, Mahdi S. Hosseini, Hassan Rivaz,
- Abstract summary: Mamba-based models, VMamba and Vim, are a recent family of vision encoders that offer promising performance improvements in many computer vision tasks.
This paper compares Mamba-based models with traditional Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) using the breast ultrasound BUSI dataset and Breast Ultrasound B dataset.
- Score: 9.90112908284836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mamba-based models, VMamba and Vim, are a recent family of vision encoders that offer promising performance improvements in many computer vision tasks. This paper compares Mamba-based models with traditional Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) using the breast ultrasound BUSI dataset and Breast Ultrasound B dataset. Our evaluation, which includes multiple runs of experiments and statistical significance analysis, demonstrates that some of the Mamba-based architectures often outperform CNN and ViT models with statistically significant results. For example, in the B dataset, the best Mamba-based models have a 1.98\% average AUC and a 5.0\% average Accuracy improvement compared to the best non-Mamba-based model in this study. These Mamba-based models effectively capture long-range dependencies while maintaining some inductive biases, making them suitable for applications with limited data. The code is available at \url{https://github.com/anasiri/BU-Mamba}
Related papers
- MedSegMamba: 3D CNN-Mamba Hybrid Architecture for Brain Segmentation [15.514511820130474]
We develop a 3D patch-based hybrid CNN-Mamba model for subcortical brain segmentation.
Our model's performance was validated against several benchmarks.
arXiv Detail & Related papers (2024-09-12T02:19:19Z) - ReMamba: Equip Mamba with Effective Long-Sequence Modeling [50.530839868893786]
We propose ReMamba, which enhances Mamba's ability to comprehend long contexts.
ReMamba incorporates selective compression and adaptation techniques within a two-stage re-forward process.
arXiv Detail & Related papers (2024-08-28T02:47:27Z) - MambaVision: A Hybrid Mamba-Transformer Vision Backbone [54.965143338206644]
We propose a novel hybrid Mamba-Transformer backbone, denoted as MambaVision, which is specifically tailored for vision applications.
Our core contribution includes redesigning the Mamba formulation to enhance its capability for efficient modeling of visual features.
We conduct a comprehensive ablation study on the feasibility of integrating Vision Transformers (ViT) with Mamba.
arXiv Detail & Related papers (2024-07-10T23:02:45Z) - Demystify Mamba in Vision: A Linear Attention Perspective [72.93213667713493]
Mamba is an effective state space model with linear computation complexity.
We show that Mamba shares surprising similarities with linear attention Transformer.
We propose a Mamba-Like Linear Attention (MLLA) model by incorporating the merits of these two key designs into linear attention.
arXiv Detail & Related papers (2024-05-26T15:31:09Z) - Vision Mamba: A Comprehensive Survey and Taxonomy [11.025533218561284]
State Space Model (SSM) is a mathematical model used to describe and analyze the behavior of dynamic systems.
Based on the latest state-space models, Mamba merges time-varying parameters into SSMs and formulates a hardware-aware algorithm for efficient training and inference.
Mamba is expected to become a new AI architecture that may outperform Transformer.
arXiv Detail & Related papers (2024-05-07T15:30:14Z) - CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation [18.383760896304604]
This report introduces the first attempt to train a Mamba model utilizing contrastive technical-image pretraining (CLIP)
A Mamba model 67 million parameters is on par with a 307 million- parameters Vision Transformer (ViT) model in zero-shot classification tasks.
arXiv Detail & Related papers (2024-04-30T09:40:07Z) - MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection [72.46396769642787]
We develop a nested structure, Mamba-in-Mamba (MiM-ISTD), for efficient infrared small target detection.
MiM-ISTD is $8 times$ faster than the SOTA method and reduces GPU memory usage by 62.2$%$ when testing on $2048 times 2048$ images.
arXiv Detail & Related papers (2024-03-04T15:57:29Z) - PointMamba: A Simple State Space Model for Point Cloud Analysis [65.59944745840866]
We propose PointMamba, transferring the success of Mamba, a recent representative state space model (SSM), from NLP to point cloud analysis tasks.
Unlike traditional Transformers, PointMamba employs a linear complexity algorithm, presenting global modeling capacity while significantly reducing computational costs.
arXiv Detail & Related papers (2024-02-16T14:56:13Z) - Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining [85.08169822181685]
This paper introduces a novel Mamba-based model, Swin-UMamba, designed specifically for medical image segmentation tasks.
Swin-UMamba demonstrates superior performance with a large margin compared to CNNs, ViTs, and latest Mamba-based models.
arXiv Detail & Related papers (2024-02-05T18:58:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.