DualSwinUnet++: An Enhanced Swin-Unet Architecture With Dual Decoders For PTMC Segmentation
- URL: http://arxiv.org/abs/2410.18239v3
- Date: Sun, 20 Jul 2025 16:58:02 GMT
- Authors: Maryam Dialameh, Hossein Rajabzadeh, Moslem Sadeghi-Goughari, Jung Suk Sim, Hyock Ju Kwon
- Abstract summary: We propose DualSwinUnet++, a dual-decoder transformer-based architecture designed to enhance PTMC segmentation. The model is trained on a clinical ultrasound dataset with 691 annotated RFA images and evaluated against state-of-the-art models.
- Score: 0.8388591755871736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Precise segmentation of papillary thyroid microcarcinoma (PTMC) during ultrasound-guided radiofrequency ablation (RFA) is critical for effective treatment but remains challenging due to acoustic artifacts, small lesion size, and anatomical variability. In this study, we propose DualSwinUnet++, a dual-decoder transformer-based architecture designed to enhance PTMC segmentation by incorporating thyroid gland context. DualSwinUnet++ employs independent linear projection heads for each decoder and a residual information flow mechanism that passes intermediate features from the first (thyroid) decoder to the second (PTMC) decoder via concatenation and transformation. These design choices allow the model to condition tumor prediction explicitly on gland morphology without shared gradient interference. Trained on a clinical ultrasound dataset with 691 annotated RFA images and evaluated against state-of-the-art models, DualSwinUnet++ achieves superior Dice and Jaccard scores while maintaining sub-200ms inference latency. The results demonstrate the model's suitability for near real-time surgical assistance and its effectiveness in improving segmentation accuracy in challenging PTMC cases.
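The abstract describes two mechanisms that a small example can make concrete: features passed from the thyroid decoder to the PTMC decoder via concatenation and transformation, with a separate linear projection head per decoder, and evaluation by Dice and Jaccard scores. The following is a minimal NumPy sketch, not the authors' implementation. The function names (`fuse_decoder_features`, `dice_and_jaccard`), the tensor shapes, the single linear layer used as the "transformation", and the random weights are all assumptions made for illustration.

```python
import numpy as np


def fuse_decoder_features(thyroid_feat, ptmc_feat, weight, bias):
    """Hypothetical fusion step: concatenate thyroid-decoder features onto
    PTMC-decoder features along the channel axis, then apply a linear map
    that projects the result back to the PTMC decoder's channel width."""
    fused = np.concatenate([ptmc_feat, thyroid_feat], axis=-1)  # (tokens, 2C)
    return fused @ weight + bias                                # (tokens, C)


def dice_and_jaccard(pred, target, eps=1e-7):
    """Dice and Jaccard (IoU) overlap scores for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    union = total - intersection
    dice = (2 * intersection + eps) / (total + eps)
    jaccard = (intersection + eps) / (union + eps)
    return float(dice), float(jaccard)


# Assumed toy sizes: 16 tokens with 8 channels per decoder stage.
rng = np.random.default_rng(0)
tokens, channels = 16, 8
thyroid_feat = rng.normal(size=(tokens, channels))
ptmc_feat = rng.normal(size=(tokens, channels))

# Assumed transformation: one linear layer from 2C back to C channels.
weight = rng.normal(size=(2 * channels, channels)) * 0.1
bias = np.zeros(channels)
ptmc_conditioned = fuse_decoder_features(thyroid_feat, ptmc_feat, weight, bias)

# Independent projection heads: each decoder has its own weights.
thyroid_head = rng.normal(size=(channels, 1))
ptmc_head = rng.normal(size=(channels, 1))
thyroid_logits = thyroid_feat @ thyroid_head
ptmc_logits = ptmc_conditioned @ ptmc_head

# Overlap metrics on a small hand-made pair of masks.
pred_mask = np.array([[1, 1, 0], [0, 1, 0]])
true_mask = np.array([[1, 0, 0], [0, 1, 1]])
dice, jaccard = dice_and_jaccard(pred_mask, true_mask)
```

For binary masks the two scores are related by J = D / (2 - D), so they always rank a pair of predictions in the same order. Keeping the two heads as separate weight matrices mirrors the abstract's point that the decoders do not share a projection.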
Related papers
- FaRMamba: Frequency-based learning and Reconstruction aided Mamba for Medical Segmentation [3.5790602918760586]
Vision Mamba employs one-dimensional causal state-space recurrence to efficiently model global dependencies. Its patch tokenization and 1D serialization disrupt local pixel adjacency and impose a low-pass filtering effect. We propose FaRMamba, a novel extension that explicitly addresses LHICD and 2D-SSD through two complementary modules.
arXiv Detail & Related papers (2025-07-26T20:41:53Z) - CSASN: A Multitask Attention-Based Framework for Heterogeneous Thyroid Carcinoma Classification in Ultrasound Images [4.577163442985675]
Heterogeneous morphological features and data imbalance pose significant challenges in rare thyroid carcinoma classification using ultrasound imaging. We propose a novel multitask learning framework, Channel-Spatial Attention Synergy Network (CSASN), which integrates a dual-branch feature extractor.
arXiv Detail & Related papers (2025-05-04T18:23:03Z) - Multi-encoder nnU-Net outperforms transformer models with self-supervised pretraining [0.0]
This study addresses the essential task of medical image segmentation, which involves the automatic identification and delineation of anatomical structures and pathological regions in medical images. We propose a novel self-supervised learning Multi-encoder nnU-Net architecture designed to process multiple MRI modalities independently through separate encoders. Our Multi-encoder nnU-Net demonstrates exceptional performance, achieving a Dice Similarity Coefficient (DSC) of 93.72%, which surpasses that of other models such as vanilla nnU-Net, SegResNet, and Swin UNETR.
arXiv Detail & Related papers (2025-04-04T14:31:06Z) - MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts [54.915060471994686]
We propose MAST-Pro, a novel framework that integrates dynamic Mixture-of-Experts (D-MoE) and knowledge-driven prompts for pan-tumor segmentation.
Specifically, text and anatomical prompts provide domain-specific priors guiding tumor representation learning, while D-MoE dynamically selects experts to balance generic and tumor-specific feature learning.
Experiments on multi-anatomical tumor datasets demonstrate that MAST-Pro outperforms state-of-the-art approaches, achieving up to a 5.20% average improvement while reducing trainable parameters by 91.04%, without compromising accuracy.
arXiv Detail & Related papers (2025-03-18T15:39:44Z) - A Self-supervised Multimodal Deep Learning Approach to Differentiate Post-radiotherapy Progression from Pseudoprogression in Glioblastoma [5.98776969609135]
Accurate differentiation of pseudoprogression (PsP) from True Progression (TP) following radiotherapy in glioblastoma (GBM) patients is crucial for optimal treatment planning.
This study proposes a multimodal deep-learning approach utilizing complementary information from routine anatomical MR images, clinical parameters, and RT treatment planning information for improved predictive accuracy.
arXiv Detail & Related papers (2025-02-06T11:57:57Z) - ONCOPILOT: A Promptable CT Foundation Model For Solid Tumor Evaluation [3.8763197858217935]
ONCOPILOT is an interactive radiological foundation model trained on approximately 7,500 CT scans covering the whole body.
It performs 3D tumor segmentation using visual prompts like point-click and bounding boxes, outperforming state-of-the-art models.
ONCOPILOT also accelerates measurement processes and reduces inter-reader variability.
arXiv Detail & Related papers (2024-10-10T13:36:49Z) - Synthesizing Late-Stage Contrast Enhancement in Breast MRI: A Comprehensive Pipeline Leveraging Temporal Contrast Enhancement Dynamics [0.3499870393443268]
This study presents a pipeline for synthesizing late-phase DCE-MRI images from early-phase data.
The proposed approach introduces a novel loss function, Time Intensity Loss (TI-loss), leveraging the temporal behavior of contrast agents to guide the training of a generative model.
Two metrics are proposed to evaluate image quality: the Contrast Agent Pattern Score ($\mathcal{CP}_s$), which validates enhancement patterns in annotated regions, and the Average Difference in Enhancement ($\mathcal{ED}$), measuring differences between real and generated enhancements.
arXiv Detail & Related papers (2024-09-03T04:31:49Z) - Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI [58.809276442508256]
We propose a hybrid network that combines convolutional neural network (CNN) and transformer layers.
The experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network achieves superior performance compared with state-of-the-art methods.
arXiv Detail & Related papers (2024-08-11T15:46:00Z) - CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z) - Dual-Domain Coarse-to-Fine Progressive Estimation Network for Simultaneous Denoising, Limited-View Reconstruction, and Attenuation Correction of Cardiac SPECT [16.75701769113328]
Single-Photon Emission Computed Tomography (SPECT) is widely applied for the diagnosis of coronary artery diseases.
Low-dose (LD) SPECT aims to minimize radiation exposure but leads to increased image noise. Limited-view (LV) SPECT, such as the latest GE MyoSPECT ES system, enables accelerated scanning and reduces hardware expenses but degrades reconstruction accuracy.
arXiv Detail & Related papers (2024-01-23T23:28:15Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - Rotational Augmented Noise2Inverse for Low-dose Computed Tomography Reconstruction [83.73429628413773]
Supervised deep learning methods have shown the ability to remove noise in images but require accurate ground truth.
We propose a novel self-supervised framework for LDCT, in which ground truth is not required for training the convolutional neural network (CNN).
Numerical and experimental results show that the reconstruction accuracy of N2I degrades with sparse views, while the proposed rotational augmented Noise2Inverse (RAN2I) method maintains better image quality over a range of sampling angles.
arXiv Detail & Related papers (2023-12-19T22:40:51Z) - Latent Diffusion Model for Medical Image Standardization and Enhancement [11.295078152769559]
DiffusionCT is a score-based DDPM model that transforms disparate non-standard distributions into a standardized form.
The architecture comprises a U-Net-based encoder-decoder, augmented by a DDPM model integrated at the bottleneck position.
Empirical tests on patient CT images indicate notable improvements in image standardization using DiffusionCT.
arXiv Detail & Related papers (2023-10-08T17:11:14Z) - Deep Learning Framework with Multi-Head Dilated Encoders for Enhanced Segmentation of Cervical Cancer on Multiparametric Magnetic Resonance Imaging [0.6597195879147557]
T2-weighted magnetic resonance imaging (MRI) and diffusion-weighted imaging (DWI) are essential components for cervical cancer diagnosis.
We propose a novel multi-head framework that uses dilated convolutions and shared residual connections for separate encoding of multiparametric MRI images.
arXiv Detail & Related papers (2023-06-19T19:41:21Z) - Segmentation of Planning Target Volume in CT Series for Total Marrow Irradiation Using U-Net [0.0]
We present a deep learning-based auto-contouring method for segmenting Planning Target Volume (PTV) for TMLI treatment using the U-Net architecture.
Our findings are a preliminary but significant step towards developing a segmentation model that has the potential to save radiation oncologists a considerable amount of time.
arXiv Detail & Related papers (2023-04-05T10:40:37Z) - Cross-Modal Causal Intervention for Medical Report Generation [107.76649943399168]
Radiology Report Generation (RRG) is essential for computer-aided diagnosis and medication guidance. Generating accurate lesion descriptions remains challenging due to spurious correlations from visual-linguistic biases. We propose a two-stage framework named CrossModal Causal Representation Learning (CMCRL). Experiments on IU-Xray and MIMIC-CXR show that our CMCRL pipeline significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-03-16T07:23:55Z) - Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z) - PD-DWI: Predicting response to neoadjuvant chemotherapy in invasive breast cancer with Physiologically-Decomposed Diffusion-Weighted MRI machine-learning model [0.0]
We introduce PD-DWI, a physiologically decomposed DWI machine-learning model to predict pCR from DWI and clinical data.
Our model substantially improves the area under the curve (AUC), compared to the current best result on the leaderboard.
arXiv Detail & Related papers (2022-06-12T08:59:49Z) - Multiple Time Series Fusion Based on LSTM An Application to CAP A Phase Classification Using EEG [56.155331323304]
This work performs deep-learning-based feature-level fusion of electroencephalogram channels.
Channel selection, fusion, and classification procedures were optimized by two optimization algorithms.
arXiv Detail & Related papers (2021-12-18T14:17:49Z) - Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z) - Inf-Net: Automatic COVID-19 Lung Infection Segmentation from CT Images [152.34988415258988]
Automated detection of lung infections from computed tomography (CT) images offers a great potential to augment the traditional healthcare strategy for tackling COVID-19.
Segmenting infected regions from CT slices faces several challenges, including high variation in infection characteristics and low intensity contrast between infections and normal tissues.
To address these challenges, a novel COVID-19 Deep Lung Infection Network (Inf-Net) is proposed to automatically identify infected regions from chest CT slices.
arXiv Detail & Related papers (2020-04-22T07:30:56Z) - Detecting Pancreatic Ductal Adenocarcinoma in Multi-phase CT Scans via Alignment Ensemble [77.5625174267105]
Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancers among the population.
Multiple phases provide more information than single phase, but they are unaligned and inhomogeneous in texture.
We suggest an ensemble of all these alignments as a promising way to boost the performance of PDAC detection.
arXiv Detail & Related papers (2020-03-18T19:06:27Z) - Segmentation of Retinal Low-Cost Optical Coherence Tomography Images using Deep Learning [2.571523045125397]
The need for treatment is determined by the presence or change of disease-specific OCT-based biomarkers.
The monitoring frequency of current treatment schemes is not individually adapted to the patient and therefore often insufficient.
One of the key requirements of a home monitoring OCT system is a computer-aided diagnosis to automatically detect and quantify pathological changes.
arXiv Detail & Related papers (2020-01-23T12:55:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.