Related papers: Anatomy-Guided Representation Learning Using a Transformer-Based Network for Thyroid Nodule Segmentation in Ultrasound Images

Anatomy-Guided Representation Learning Using a Transformer-Based Network for Thyroid Nodule Segmentation in Ultrasound Images

URL: http://arxiv.org/abs/2512.12662v1
Date: Sun, 14 Dec 2025 12:20:20 GMT
Title: Anatomy-Guided Representation Learning Using a Transformer-Based Network for Thyroid Nodule Segmentation in Ultrasound Images
Authors: Muhammad Umar Farooq, Abd Ur Rehman, Azka Rehman, Muhammad Usman, Dong-Kyu Chae, Junaid Qadir,
Abstract summary: We propose SSMT-Net, a Semi-Supervised Multi-Task Transformer-based Network that enhances Transformer-centric encoder feature extraction capability in an initial unsupervised phase.<n>In the supervised phase, the model jointly optimize nodule segmentation, gland segmentation, and size estimation, integrating both local and global contextual features.<n>In evaluations on the TN3K and DDTI datasets, SSMT-Net outperforms state-of-the-art methods, with higher accuracy and robustness, indicating its potential for real-world clinical applications.
Score: 16.78356926470714
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Accurate thyroid nodule segmentation in ultrasound images is critical for diagnosis and treatment planning. However, ambiguous boundaries between nodules and surrounding tissues, size variations, and the scarcity of annotated ultrasound data pose significant challenges for automated segmentation. Existing deep learning models struggle to incorporate contextual information from the thyroid gland and generalize effectively across diverse cases. To address these challenges, we propose SSMT-Net, a Semi-Supervised Multi-Task Transformer-based Network that leverages unlabeled data to enhance Transformer-centric encoder feature extraction capability in an initial unsupervised phase. In the supervised phase, the model jointly optimizes nodule segmentation, gland segmentation, and nodule size estimation, integrating both local and global contextual features. Extensive evaluations on the TN3K and DDTI datasets demonstrate that SSMT-Net outperforms state-of-the-art methods, with higher accuracy and robustness, indicating its potential for real-world clinical applications.

Related papers

Multi-Sequence Parotid Gland Lesion Segmentation via Expert Text-Guided Segment Anything Model [10.892625199593644]
The parotid gland segment anything model (PG-SAM) incorporates expert domain knowledge for cross-sequence parotid gland lesion segmentation.<n>The PG-SAM achieves state-of-the-art performance in parotid gland lesion segmentation across three independent clinical centers.
arXiv Detail & Related papers (2025-08-13T09:25:29Z)
GANet-Seg: Adversarial Learning for Brain Tumor Segmentation with Hybrid Generative Models [1.0456203870202954]
This work introduces a novel framework for brain tumor segmentation leveraging pre-trained GANs and Unet architectures.<n>By combining a global anomaly detection module with a refined mask generation network, the proposed model accurately identifies tumor-sensitive regions.<n>Multi-modal MRI data and synthetic image augmentation are employed to improve robustness and address the challenge of limited annotated datasets.
arXiv Detail & Related papers (2025-06-26T13:28:09Z)
Multi-encoder nnU-Net outperforms transformer models with self-supervised pretraining [0.0]
This study addresses the essential task of medical image segmentation, which involves the automatic identification and delineation of anatomical structures and pathological regions in medical images.<n>We propose a novel self-supervised learning Multi-encoder nnU-Net architecture designed to process multiple MRI modalities independently through separate encoders.<n>Our Multi-encoder nnU-Net demonstrates exceptional performance, achieving a Dice Similarity Coefficient (DSC) of 93.72%, which surpasses that of other models such as vanilla nnU-Net, SegResNet, and Swin UNETR.
arXiv Detail & Related papers (2025-04-04T14:31:06Z)
CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images. The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism. We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification. The proposed framework has been validated through comprehensive experiments on two clinical datasets. To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z)
Ultrasound Image Segmentation of Thyroid Nodule via Latent Semantic Feature Co-Registration [12.211161441766532]
The present paper proposes ASTN, a framework for thyroid nodule segmentation achieved through a new type co-registration network. By extracting latent semantic information from the atlas and target images, this framework can ensure the integrity of anatomical structure. This paper also provides an atlas selection algorithm to mitigate the difficulty of co-registration.
arXiv Detail & Related papers (2023-10-13T16:18:48Z)
Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network. We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module. Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z)
Learning from partially labeled data for multi-organ and tumor segmentation [102.55303521877933]
We propose a Transformer based dynamic on-demand network (TransDoDNet) that learns to segment organs and tumors on multiple datasets. A dynamic head enables the network to accomplish multiple segmentation tasks flexibly. We create a large-scale partially labeled Multi-Organ and Tumor benchmark, termed MOTS, and demonstrate the superior performance of our TransDoDNet over other competitors.
arXiv Detail & Related papers (2022-11-13T13:03:09Z)
TransAttUnet: Multi-level Attention-guided U-Net with Transformer for Medical Image Segmentation [33.45471457058221]
This paper proposes a novel Transformer based medical image semantic segmentation framework called TransAttUnet. In particular, we establish additional multi-scale skip connections between decoder blocks to aggregate the different semantic-scale upsampling features. Our method consistently outperforms the state-of-the-art baselines.
arXiv Detail & Related papers (2021-07-12T09:17:06Z)
Inf-Net: Automatic COVID-19 Lung Infection Segmentation from CT Images [152.34988415258988]
Automated detection of lung infections from computed tomography (CT) images offers a great potential to augment the traditional healthcare strategy for tackling COVID-19. segmenting infected regions from CT slices faces several challenges, including high variation in infection characteristics, and low intensity contrast between infections and normal tissues. To address these challenges, a novel COVID-19 Deep Lung Infection Network (Inf-Net) is proposed to automatically identify infected regions from chest CT slices.
arXiv Detail & Related papers (2020-04-22T07:30:56Z)
MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations. Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.