SA$^{2}$Net: Scale-Adaptive Structure-Affinity Transformation for Spine Segmentation from Ultrasound Volume Projection Imaging
- URL: http://arxiv.org/abs/2510.26568v1
- Date: Thu, 30 Oct 2025 14:58:16 GMT
- Title: SA$^{2}$Net: Scale-Adaptive Structure-Affinity Transformation for Spine Segmentation from Ultrasound Volume Projection Imaging
- Authors: Hao Xie, Zixun Huang, Yushen Zuo, Yakun Ju, Frank H. F. Leung, N. F. Law, Kin-Man Lam, Yong-Ping Zheng, Sai Ho Ling,
- Abstract summary: We propose a novel structure-aware network (SA$2$Net) for effective spine segmentation.<n>First, we propose a scale-adaptive complementary strategy to learn the cross-dimensional long-distance correlation features for spinal images.<n>Second, we transform semantic features with class-specific affinity and combine it with a Transformer decoder for structure-aware reasoning.
- Score: 21.660042213751794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spine segmentation, based on ultrasound volume projection imaging (VPI), plays a vital role for intelligent scoliosis diagnosis in clinical applications. However, this task faces several significant challenges. Firstly, the global contextual knowledge of spines may not be well-learned if we neglect the high spatial correlation of different bone features. Secondly, the spine bones contain rich structural knowledge regarding their shapes and positions, which deserves to be encoded into the segmentation process. To address these challenges, we propose a novel scale-adaptive structure-aware network (SA$^{2}$Net) for effective spine segmentation. First, we propose a scale-adaptive complementary strategy to learn the cross-dimensional long-distance correlation features for spinal images. Second, motivated by the consistency between multi-head self-attention in Transformers and semantic level affinity, we propose structure-affinity transformation to transform semantic features with class-specific affinity and combine it with a Transformer decoder for structure-aware reasoning. In addition, we adopt a feature mixing loss aggregation method to enhance model training. This method improves the robustness and accuracy of the segmentation process. The experimental results demonstrate that our SA$^{2}$Net achieves superior segmentation performance compared to other state-of-the-art methods. Moreover, the adaptability of SA$^{2}$Net to various backbones enhances its potential as a promising tool for advanced scoliosis diagnosis using intelligent spinal image analysis. The code and experimental demo are available at https://github.com/taetiseo09/SA2Net.
Related papers
- Subsampled Randomized Fourier GaLore for Adapting Foundation Models in Depth-Driven Liver Landmark Segmentation [6.91206648866302]
We propose a depth-guided liver landmark segmentation framework integrating semantic and geometric cues via vision foundation encoders.<n>To efficiently adapt SAM2, we introduce SRFT-GaLore, a novel low-rank gradient projection method that replaces the computationally expensive SVD with a Subsampled Randomized Fourier Transform.<n>Our method achieves a 4.85% improvement in Dice Similarity Coefficient and a 11.78-point reduction in Average Symmetric Surface Distance compared to the D2GPLand.
arXiv Detail & Related papers (2025-11-05T04:16:49Z) - SpineMamba: Enhancing 3D Spinal Segmentation in Clinical Imaging through Residual Visual Mamba Layers and Shape Priors [10.431439196002842]
We introduce a residual visual Mamba layer to model the deep semantic features and long-range spatial dependencies of 3D spinal data.
We also propose a novel spinal shape prior module that captures specific anatomical information of the spine from medical images.
SpineMamba achieves superior segmentation performance, exceeding it by up to 2 percentage points.
arXiv Detail & Related papers (2024-08-28T15:59:40Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image (DEC-Seg)
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - Shape Matters: Detecting Vertebral Fractures Using Differentiable
Point-Based Shape Decoding [51.38395069380457]
Degenerative spinal pathologies are highly prevalent among the elderly population.
Timely diagnosis of osteoporotic fractures and other degenerative deformities facilitates proactive measures to mitigate the risk of severe back pain and disability.
In this study, we specifically explore the use of shape auto-encoders for vertebrae.
arXiv Detail & Related papers (2023-12-08T18:11:22Z) - Self-supervised Semantic Segmentation: Consistency over Transformation [3.485615723221064]
We propose a novel self-supervised algorithm, textbfS$3$-Net, which integrates a robust framework based on the proposed Inception Large Kernel Attention (I-LKA) modules.
We leverage deformable convolution as an integral component to effectively capture and delineate lesion deformations for superior object boundary definition.
Our experimental results on skin lesion and lung organ segmentation tasks show the superior performance of our method compared to the SOTA approaches.
arXiv Detail & Related papers (2023-08-31T21:28:46Z) - M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation [66.89632406480949]
We propose a general multi-scale in multi-scale subtraction network (M$2$SNet) to finish diverse segmentation from medical image.<n>Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z) - Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image
Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - DONet: Dual Objective Networks for Skin Lesion Segmentation [77.9806410198298]
We propose a simple yet effective framework, named Dual Objective Networks (DONet), to improve the skin lesion segmentation.
Our DONet adopts two symmetric decoders to produce different predictions for approaching different objectives.
To address the challenge of large variety of lesion scales and shapes in dermoscopic images, we additionally propose a recurrent context encoding module (RCEM)
arXiv Detail & Related papers (2020-08-19T06:02:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.