ATM-Net: Anatomy-Aware Text-Guided Multi-Modal Fusion for Fine-Grained Lumbar Spine Segmentation
- URL: http://arxiv.org/abs/2504.03476v1
- Date: Fri, 04 Apr 2025 14:36:12 GMT
- Title: ATM-Net: Anatomy-Aware Text-Guided Multi-Modal Fusion for Fine-Grained Lumbar Spine Segmentation
- Authors: Sheng Lian, Dengfeng Pan, Jianlong Cai, Guang-Yong Chen, Zhun Zhong, Zhiming Luo, Shen Zhao, Shuo Li,
- Abstract summary: ATM-Net is an innovative framework that employs an anatomy-aware, text-guided, multi-modal fusion mechanism for fine-grained segmentation of lumbar substructures.<n> ATM-Net achieves Dice of 79.39% and HD95 of 9.91 pixels on SPIDER, outperforming the competitive SpineParseNet by 8.31% and 4.14 pixels, respectively.
- Score: 38.04249138515548
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate lumbar spine segmentation is crucial for diagnosing spinal disorders. Existing methods typically use coarse-grained segmentation strategies that lack the fine detail needed for precise diagnosis. Additionally, their reliance on visual-only models hinders the capture of anatomical semantics, leading to misclassified categories and poor segmentation details. To address these limitations, we present ATM-Net, an innovative framework that employs an anatomy-aware, text-guided, multi-modal fusion mechanism for fine-grained segmentation of lumbar substructures, i.e., vertebrae (VBs), intervertebral discs (IDs), and spinal canal (SC). ATM-Net adopts the Anatomy-aware Text Prompt Generator (ATPG) to adaptively convert image annotations into anatomy-aware prompts in different views. These insights are further integrated with image features via the Holistic Anatomy-aware Semantic Fusion (HASF) module, building a comprehensive anatomical context. The Channel-wise Contrastive Anatomy-Aware Enhancement (CCAE) module further enhances class discrimination and refines segmentation through class-wise channel-level multi-modal contrastive learning. Extensive experiments on the MRSpineSeg and SPIDER datasets demonstrate that ATM-Net significantly outperforms state-of-the-art methods, with consistent improvements regarding class discrimination and segmentation details. For example, ATM-Net achieves Dice of 79.39% and HD95 of 9.91 pixels on SPIDER, outperforming the competitive SpineParseNet by 8.31% and 4.14 pixels, respectively.
Related papers
- Organ-aware Multi-scale Medical Image Segmentation Using Text Prompt Engineering [17.273290949721975]
Existing medical image segmentation methods rely on uni-modal visual inputs, such as images or videos, requiring labor-intensive manual annotations.<n>Medical imaging techniques capture multiple intertwined organs within a single scan, further complicating segmentation accuracy.<n>To address these challenges, MedSAM was developed to enhance segmentation accuracy by integrating image features with user-provided prompts.
arXiv Detail & Related papers (2025-03-18T01:35:34Z) - MGFI-Net: A Multi-Grained Feature Integration Network for Enhanced Medical Image Segmentation [0.3108011671896571]
A major challenge in medical image segmentation is achieving accurate delineation of regions of interest in the presence of noise, low contrast, or complex anatomical structures.<n>Existing segmentation models often neglect the integration of multi-grained information and fail to preserve edge details.<n>We propose a novel image semantic segmentation model called the Multi-Grained Feature Integration Network (MGFI-Net)<n>Our MGFI-Net is designed with two dedicated modules to tackle these issues.
arXiv Detail & Related papers (2025-02-19T15:24:34Z) - SMAFormer: Synergistic Multi-Attention Transformer for Medical Image Segmentation [28.490211130600702]
We introduce SMAFormer, a Transformer-based architecture that fuses multiple attention mechanisms for enhanced segmentation of small tumors and organs.<n> SMAFormer can capture both local and global features for medical image segmentation.<n>We conduct extensive experiments on various medical image segmentation tasks, including multi-organ, liver tumor, and bladder tumor segmentation, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-08-31T04:23:33Z) - Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior [34.54360931760496]
Key anatomical features, such as the number of organs, their shapes and relative positions, are crucial for building a robust multi-organ segmentation model.
We introduce a novel architecture called the Anatomy-Informed Network (AIC-Net)
AIC-Net incorporates a learnable input termed "Anatomical Prior", which can be adapted to patient-specific anatomy.
arXiv Detail & Related papers (2024-03-27T10:46:24Z) - M3BUNet: Mobile Mean Max UNet for Pancreas Segmentation on CT-Scans [25.636974007788986]
We propose M3BUNet, a fusion of MobileNet and U-Net neural networks, equipped with a novel Mean-Max (MM) attention that operates in two stages to gradually segment pancreas CT images.
For the fine segmentation stage, we found that applying a wavelet decomposition filter to create multi-input images enhances pancreas segmentation performance.
Our approach demonstrates a considerable performance improvement, achieving an average Dice Similarity Coefficient (DSC) value of up to 89.53% and an Intersection Over Union (IOU) score of up to 81.16 for the NIH pancreas dataset.
arXiv Detail & Related papers (2024-01-18T23:10:08Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image (DEC-Seg)
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - Scale-aware Super-resolution Network with Dual Affinity Learning for
Lesion Segmentation from Medical Images [50.76668288066681]
We present a scale-aware super-resolution network to adaptively segment lesions of various sizes from low-resolution medical images.
Our proposed network achieved consistent improvements compared to other state-of-the-art methods.
arXiv Detail & Related papers (2023-05-30T14:25:55Z) - Self-Supervised Correction Learning for Semi-Supervised Biomedical Image
Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z) - Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z) - Few-shot Medical Image Segmentation using a Global Correlation Network
with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.