Medical Image Segmentation Using Advanced Unet: VMSE-Unet and VM-Unet CBAM+
- URL: http://arxiv.org/abs/2507.00511v2
- Date: Wed, 09 Jul 2025 04:27:07 GMT
- Title: Medical Image Segmentation Using Advanced Unet: VMSE-Unet and VM-Unet CBAM+
- Authors: Sayandeep Kanrar, Raja Piyush, Qaiser Razi, Debanshi Chakraborty, Vikas Hassija, GSS Chalapathi
- Abstract summary: We present the VMSE U-Net and VM-Unet CBAM+ model, two cutting-edge deep learning architectures designed to enhance medical image segmentation. Our approach integrates Squeeze-and-Excitation (SE) and Convolutional Block Attention Module (CBAM) techniques into the traditional VM U-Net framework. Both models show superior performance compared to the baseline VM-Unet across multiple datasets.
- Score: 1.1056622446799464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present the VMSE U-Net and VM-Unet CBAM+ model, two cutting-edge deep learning architectures designed to enhance medical image segmentation. Our approach integrates Squeeze-and-Excitation (SE) and Convolutional Block Attention Module (CBAM) techniques into the traditional VM U-Net framework, significantly improving segmentation accuracy, feature localization, and computational efficiency. Both models show superior performance compared to the baseline VM-Unet across multiple datasets. Notably, VMSEUnet achieves the highest accuracy, IoU, precision, and recall while maintaining low loss values. It also exhibits exceptional computational efficiency with faster inference times and lower memory usage on both GPU and CPU. Overall, the study suggests that the enhanced architecture VMSE-Unet is a valuable tool for medical image analysis. These findings highlight its potential for real-world clinical applications, emphasizing the importance of further research to optimize accuracy, robustness, and computational efficiency.
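The abstract names Squeeze-and-Excitation (SE) as one of the two attention mechanisms added to VM-Unet. As a rough illustration of what an SE block computes, here is a minimal pure-Python sketch. It is not the authors' implementation: the function name `se_block`, the nested-list tensor layout, the omission of bias terms, and the random weights are all assumptions made for this example, and the abstract does not say where in VM-Unet the block is placed.

```python
import math
import random


def se_block(x, w1, w2):
    """Apply a Squeeze-and-Excitation gate to a feature map.

    x  : feature map as nested lists with shape [C][H][W]
    w1 : first fully connected layer, shape [C // r][C]
    w2 : second fully connected layer, shape [C][C // r]
    Bias terms are omitted for brevity (an assumption of this sketch).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, part 1: reduce to C // r units, then ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, part 2: expand back to C units, then sigmoid gate in (0, 1).
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h)))) for row in w2]
    # Scale: reweight every channel by its gate value.
    return [[[g * v for v in row] for row in ch] for g, ch in zip(s, x)]


if __name__ == "__main__":
    random.seed(0)
    channels, height, width, reduction = 4, 2, 2, 2
    x = [[[random.random() for _ in range(width)] for _ in range(height)]
         for _ in range(channels)]
    # Hypothetical random weights; in a trained network these are learned.
    w1 = [[random.uniform(-1, 1) for _ in range(channels)]
          for _ in range(channels // reduction)]
    w2 = [[random.uniform(-1, 1) for _ in range(channels // reduction)]
          for _ in range(channels)]
    y = se_block(x, w1, w2)
    print(len(y), len(y[0]), len(y[0][0]))  # output keeps the input shape
```

The output has the same shape as the input; only the relative weighting of channels changes. CBAM, the other mechanism named in the abstract, additionally applies a spatial attention map, which this sketch does not cover.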
Related papers
- DSVM-UNet : Enhancing VM-UNet with Dual Self-distillation for Medical Image Segmentation [18.35953332045796]
We propose a simple yet effective approach to improve the model by Dual Self-distillation for VM-UNet (DSVM-UNet) without any complex architectural designs. Our approach achieves state-of-the-art performance while maintaining computational efficiency.
arXiv Detail & Related papers (2026-01-27T15:06:38Z) - EfficientGFormer: Multimodal Brain Tumor Segmentation via Pruned Graph-Augmented Transformer [0.0]
EfficientGFormer is a novel architecture that integrates pretrained foundation models with graph-based reasoning. Experiments on the MSD Task01 and BraTS 2021 datasets demonstrate that EfficientGFormer achieves state-of-the-art accuracy with significantly reduced memory and inference time.
arXiv Detail & Related papers (2025-08-02T18:52:59Z) - BiVM: Accurate Binarized Neural Network for Efficient Video Matting [56.000594826508504]
Deep neural networks for real-time video matting suffer significant computational limitations on edge devices. We present BiVM, an accurate and resource-efficient Binarized neural network for Video Matting. BiVM surpasses alternative binarized video matting networks, including state-of-the-art (SOTA) binarization methods, by a substantial margin.
arXiv Detail & Related papers (2025-07-06T16:32:37Z) - CIM-NET: A Video Denoising Deep Neural Network Model Optimized for Computing-in-Memory Architectures [4.1888033476195226]
CIM chips offer a promising solution by integrating computation within memory cells. Existing DNN models are often designed without considering CIM architectural constraints. We propose a hardware-algorithm co-design framework incorporating two innovations.
arXiv Detail & Related papers (2025-05-23T02:26:56Z) - DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs [124.52164183968145]
We present DyMU, an efficient, training-free framework that reduces the computational burden of vision-language models (VLMs). Our approach comprises two key components. First, Dynamic Token Merging (DToMe) reduces the number of visual token embeddings by merging similar tokens based on image complexity. Second, Virtual Token Unmerging (VTU) simulates the expected token sequence for large language models (LLMs) by efficiently reconstructing the attention dynamics of a full sequence.
arXiv Detail & Related papers (2025-04-23T18:38:18Z) - ContextFormer: Redefining Efficiency in Semantic Segmentation [48.81126061219231]
Convolutional methods, although capturing local dependencies well, struggle with long-range relationships. Vision Transformers (ViTs) excel in global context capture but are hindered by high computational demands. We propose ContextFormer, a hybrid framework leveraging the strengths of CNNs and ViTs in the bottleneck to balance efficiency, accuracy, and robustness for real-time semantic segmentation.
arXiv Detail & Related papers (2025-01-31T16:11:04Z) - Granular Ball Twin Support Vector Machine with Universum Data [4.573310303307945]
We propose a novel Granular Ball Twin Support Vector Machine with Universum Data (GBU-TSVM). The proposed GBU-TSVM represents data instances as hyper-balls rather than points in the feature space. By grouping data points into granular balls, the model achieves superior computational efficiency, increased noise resistance, and enhanced interpretability.
arXiv Detail & Related papers (2024-12-04T15:02:28Z) - Improved Unet brain tumor image segmentation based on GSConv module and ECA attention mechanism [0.0]
An improved model of medical image segmentation for brain tumor is discussed, which is a deep learning algorithm based on U-Net architecture.
Based on the traditional U-Net, we introduce GSConv module and ECA attention mechanism to improve the performance of the model in medical image segmentation tasks.
arXiv Detail & Related papers (2024-09-20T16:35:19Z) - PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System [21.09681871279162]
Modern Machine Learning (ML) training on large-scale datasets is a time-consuming workload.
It relies on the optimization algorithm Stochastic Gradient Descent (SGD) due to its effectiveness, simplicity, and generalization performance.
Processor-centric architectures suffer from low performance and high energy consumption while executing ML training workloads.
Processing-In-Memory (PIM) is a promising solution to alleviate the data movement bottleneck.
arXiv Detail & Related papers (2024-04-10T17:00:04Z) - Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model [48.233300343211205]
We propose a new generic vision backbone with bidirectional Mamba blocks (Vim).
Vim marks the image sequences with position embeddings and compresses the visual representation with bidirectional state space models.
The results demonstrate that Vim is capable of overcoming the computation & memory constraints on performing Transformer-style understanding for high-resolution images.
arXiv Detail & Related papers (2024-01-17T18:56:18Z) - A Heterogeneous In-Memory Computing Cluster For Flexible End-to-End Inference of Real-World Deep Neural Networks [12.361842554233558]
Deployment of modern TinyML tasks on small battery-constrained IoT devices requires high computational energy efficiency.
Analog In-Memory Computing (IMC) using non-volatile memory (NVM) promises major efficiency improvements in deep neural network (DNN) inference.
We present a heterogeneous tightly-coupled architecture integrating 8 RISC-V cores, an in-memory computing accelerator (IMA), and digital accelerators.
arXiv Detail & Related papers (2022-01-04T11:12:01Z) - Image-specific Convolutional Kernel Modulation for Single Image Super-resolution [85.09413241502209]
We propose a novel image-specific convolutional kernel modulation (IKM).
We exploit the global contextual information of image or feature to generate an attention weight for adaptively modulating the convolutional kernels.
Experiments on single image super-resolution show that the proposed methods achieve superior performances over state-of-the-art methods.
arXiv Detail & Related papers (2021-11-16T11:05:10Z) - KiU-Net: Towards Accurate Segmentation of Biomedical Images using Over-complete Representations [59.65174244047216]
We propose an over-complete architecture (Ki-Net) which involves projecting the data onto higher dimensions.
This network, when augmented with U-Net, results in significant improvements in the case of segmenting small anatomical landmarks.
We evaluate the proposed method on the task of brain anatomy segmentation from 2D Ultrasound of preterm neonates.
arXiv Detail & Related papers (2020-06-08T18:59:24Z) - On Coresets for Support Vector Machines [61.928187390362176]
A coreset is a small, representative subset of the original data points.
We show that our algorithm can be used to extend the applicability of any off-the-shelf SVM solver to streaming, distributed, and dynamic data settings.
arXiv Detail & Related papers (2020-02-15T23:25:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.