Adaptive Transformer Attention and Multi-Scale Fusion for Spine 3D Segmentation
- URL: http://arxiv.org/abs/2503.12853v1
- Date: Mon, 17 Mar 2025 06:27:43 GMT
- Title: Adaptive Transformer Attention and Multi-Scale Fusion for Spine 3D Segmentation
- Authors: Yanlin Xiang, Qingyuan He, Ting Xu, Ran Hao, Jiacheng Hu, Hanchao Zhang,
- Abstract summary: This study proposes a 3D semantic segmentation method for the spine based on the improved SwinUNETR.<n>Aiming at the complex anatomical structure of spinal images, this paper introduces a multi-scale fusion mechanism to enhance the feature extraction capability.<n>The experimental results show that compared with 3D CNN, 3D U-Net, and 3D U-Net + Transformer, the model of this study has achieved significant improvements.
- Score: 3.1862885335359095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study proposes a 3D semantic segmentation method for the spine based on the improved SwinUNETR to improve segmentation accuracy and robustness. Aiming at the complex anatomical structure of spinal images, this paper introduces a multi-scale fusion mechanism to enhance the feature extraction capability by using information of different scales, thereby improving the recognition accuracy of the model for the target area. In addition, the introduction of the adaptive attention mechanism enables the model to dynamically adjust the attention to the key area, thereby optimizing the boundary segmentation effect. The experimental results show that compared with 3D CNN, 3D U-Net, and 3D U-Net + Transformer, the model of this study has achieved significant improvements in mIoU, mDice, and mAcc indicators, and has better segmentation performance. The ablation experiment further verifies the effectiveness of the proposed improved method, proving that multi-scale fusion and adaptive attention mechanism have a positive effect on the segmentation task. Through the visualization analysis of the inference results, the model can better restore the real anatomical structure of the spinal image. Future research can further optimize the Transformer structure and expand the data scale to improve the generalization ability of the model. This study provides an efficient solution for the task of medical image segmentation, which is of great significance to intelligent medical image analysis.
Related papers
- Multi-Scale Transformer Architecture for Accurate Medical Image Classification [4.578375402082224]
This study introduces an AI-driven skin lesion classification algorithm built on an enhanced Transformer architecture.<n>By integrating a multi-scale feature fusion mechanism and refining the self-attention process, the model effectively extracts both global and local features.<n>Performance evaluation on the ISIC 2017 dataset demonstrates that the improved Transformer surpasses established AI models.
arXiv Detail & Related papers (2025-02-10T08:22:25Z) - J-CaPA : Joint Channel and Pyramid Attention Improves Medical Image Segmentation [2.503388496100123]
We propose a transformer-based architecture that jointly applies Channel Attention and Pyramid Attention mechanisms to improve multi-scale feature extraction.
Our proposed model demonstrates improved segmentation accuracy for complex anatomical structures, outperforming existing state-of-the-art methods.
arXiv Detail & Related papers (2024-11-25T16:52:21Z) - Improved Unet brain tumor image segmentation based on GSConv module and ECA attention mechanism [0.0]
An improved model of medical image segmentation for brain tumor is discussed, which is a deep learning algorithm based on U-Net architecture.
Based on the traditional U-Net, we introduce GSConv module and ECA attention mechanism to improve the performance of the model in medical image segmentation tasks.
arXiv Detail & Related papers (2024-09-20T16:35:19Z) - Optimizing Universal Lesion Segmentation: State Space Model-Guided Hierarchical Networks with Feature Importance Adjustment [0.0]
We introduce Mamba-Ahnet, a novel integration of State Space Model (SSM) and Advanced Hierarchical Network (AHNet) within the MAMBA framework.
Mamba-Ahnet combines SSM's feature extraction and comprehension with AHNet's attention mechanisms and image reconstruction, aiming to enhance segmentation accuracy and robustness.
arXiv Detail & Related papers (2024-04-26T08:15:43Z) - Deep Learning-Based Brain Image Segmentation for Automated Tumour Detection [0.0]
The objective is to leverage state-of-the-art convolutional neural networks (CNNs) on a large dataset of brain MRI scans for segmentation.
The proposed methodology applies pre-processing techniques for enhanced performance and generalizability.
arXiv Detail & Related papers (2024-04-06T15:09:49Z) - SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion
Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z) - S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR)
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image (DEC-Seg)
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - View-Disentangled Transformer for Brain Lesion Detection [50.4918615815066]
We propose a novel view-disentangled transformer to enhance the extraction of MRI features for more accurate tumour detection.
First, the proposed transformer harvests long-range correlation among different positions in a 3D brain scan.
Second, the transformer models a stack of slice features as multiple 2D views and enhance these features view-by-view.
Third, we deploy the proposed transformer module in a transformer backbone, which can effectively detect the 2D regions surrounding brain lesions.
arXiv Detail & Related papers (2022-09-20T11:58:23Z) - Revisiting 3D Context Modeling with Supervised Pre-training for
Universal Lesion Detection in CT Slices [48.85784310158493]
We propose a Modified Pseudo-3D Feature Pyramid Network (MP3D FPN) to efficiently extract 3D context enhanced 2D features for universal lesion detection in CT slices.
With the novel pre-training method, the proposed MP3D FPN achieves state-of-the-art detection performance on the DeepLesion dataset.
The proposed 3D pre-trained weights can potentially be used to boost the performance of other 3D medical image analysis tasks.
arXiv Detail & Related papers (2020-12-16T07:11:16Z) - Unsupervised Instance Segmentation in Microscopy Images via Panoptic
Domain Adaptation and Task Re-weighting [86.33696045574692]
We propose a Cycle Consistency Panoptic Domain Adaptive Mask R-CNN (CyC-PDAM) architecture for unsupervised nuclei segmentation in histopathology images.
We first propose a nuclei inpainting mechanism to remove the auxiliary generated objects in the synthesized images.
Secondly, a semantic branch with a domain discriminator is designed to achieve panoptic-level domain adaptation.
arXiv Detail & Related papers (2020-05-05T11:08:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.