Related papers: A Study on the Performance of U-Net Modifications in Retroperitoneal Tumor Segmentation

URL: http://arxiv.org/abs/2502.00314v1
Date: Sat, 01 Feb 2025 04:25:28 GMT
Title: A Study on the Performance of U-Net Modifications in Retroperitoneal Tumor Segmentation
Authors: Moein Heidari, Ehsan Khodapanah Aghdam, Alexander Manzella, Daniel Hsu, Rebecca Scalabrino, Wenjin Chen, David J. Foran, Ilker Hacihaliloglu
Abstract summary: The retroperitoneum hosts a variety of tumors, including rare benign and malignant types, which pose diagnostic and treatment challenges. Estimating tumor volume is difficult due to their irregular shapes, and manual segmentation is time-consuming. This study evaluates U-Net enhancements, including CNN, ViT, Mamba, and xLSTM, on a new in-house CT dataset and a public organ segmentation dataset.
Score: 45.39707664801522
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The retroperitoneum hosts a variety of tumors, including rare benign and malignant types, which pose diagnostic and treatment challenges due to their infrequency and proximity to vital structures. Estimating tumor volume is difficult due to their irregular shapes, and manual segmentation is time-consuming. Automatic segmentation using U-Net and its variants, incorporating Vision Transformer (ViT) elements, has shown promising results but struggles with high computational demands. To address this, architectures like the Mamba State Space Model (SSM) and Extended Long-Short Term Memory (xLSTM) offer efficient solutions by handling long-range dependencies with lower resource consumption. This study evaluates U-Net enhancements, including CNN, ViT, Mamba, and xLSTM, on a new in-house CT dataset and a public organ segmentation dataset. The proposed ViLU-Net model integrates Vi-blocks for improved segmentation. Results highlight xLSTM's efficiency in the U-Net framework. The code is publicly accessible on GitHub.

Related papers

ContextFormer: Redefining Efficiency in Semantic Segmentation [48.81126061219231]
Convolutional methods, although capturing local dependencies well, struggle with long-range relationships. Vision Transformers (ViTs) excel in global context capture but are hindered by high computational demands. We propose ContextFormer, a hybrid framework leveraging the strengths of CNNs and ViTs in the bottleneck to balance efficiency, accuracy, and robustness for real-time semantic segmentation.
arXiv Detail & Related papers (2025-01-31T16:11:04Z)
MBDRes-U-Net: Multi-Scale Lightweight Brain Tumor Segmentation Network [0.0]
This study proposes the MBDRes-U-Net model using the three-dimensional (3D) U-Net framework, which integrates multibranch residual blocks and fused attention into the model. The computational burden of the model is reduced by the branch strategy, which effectively uses the rich local features in multimodal images.
arXiv Detail & Related papers (2024-11-04T09:03:43Z)
Multiscale Encoder and Omni-Dimensional Dynamic Convolution Enrichment in nnU-Net for Brain Tumor Segmentation [9.39565041325745]
This study introduces a novel segmentation algorithm utilizing a modified nnU-Net architecture. We enhance conventional convolution layers by incorporating omni-dimensional dynamic convolution layers, resulting in improved feature representation. Our model's efficacy is demonstrated on diverse datasets from the BraTS-2023 challenge.
arXiv Detail & Related papers (2024-09-20T05:25:46Z)
xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart [13.812935743270517]
We propose xLSTM-UNet, a UNet structured deep learning neural network that leverages Vision-LSTM (xLSTM) as its backbone for medical image segmentation. xLSTM was recently proposed as the successor of Long Short-Term Memory (LSTM) networks. Our findings demonstrate that xLSTM-UNet consistently surpasses the performance of leading CNN-based, Transformer-based, and Mamba-based segmentation networks.
arXiv Detail & Related papers (2024-07-01T17:59:54Z)
Are Vision xLSTM Embedded UNet More Reliable in Medical 3D Image Segmentation? [3.1777394653936937]
This paper investigates the integration of CNNs with Vision Extended Long Short-Term Memory (Vision-xLSTM). Vision-xLSTM blocks capture temporal and global relationships within the patches, as extracted from the CNN feature maps. Our primary objective is to propose that Vision-xLSTM forms an appropriate backbone for medical image segmentation, offering excellent performance with reduced computational costs.
arXiv Detail & Related papers (2024-06-24T08:01:05Z)
Vivim: a Video Vision Mamba for Medical Video Segmentation [52.11785024350253]
This paper presents a Video Vision Mamba-based framework, dubbed as Vivim, for medical video segmentation tasks. Our Vivim can effectively compress the long-term representation into sequences at varying scales. Experiments on thyroid segmentation, breast lesion segmentation in ultrasound videos, and polyp segmentation in colonoscopy videos demonstrate the effectiveness and efficiency of our Vivim.
arXiv Detail & Related papers (2024-01-25T13:27:03Z)
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation [10.083902382768406]
We introduce U-Mamba, a general-purpose network for biomedical image segmentation. Inspired by the State Space Sequence Models (SSMs), a new family of deep sequence models, we design a hybrid CNN-SSM block. We conduct experiments on four diverse tasks, including the 3D abdominal organ segmentation in CT and MR images, instrument segmentation in endoscopy images, and cell segmentation in microscopy images.
arXiv Detail & Related papers (2024-01-09T18:53:20Z)
Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis. We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
Learning from partially labeled data for multi-organ and tumor segmentation [102.55303521877933]
We propose a Transformer based dynamic on-demand network (TransDoDNet) that learns to segment organs and tumors on multiple datasets. A dynamic head enables the network to accomplish multiple segmentation tasks flexibly. We create a large-scale partially labeled Multi-Organ and Tumor benchmark, termed MOTS, and demonstrate the superior performance of our TransDoDNet over other competitors.
arXiv Detail & Related papers (2022-11-13T13:03:09Z)
MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation. It simultaneously learns global semantic information and local spatial-detailed features. Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z)
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems. On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard. We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.