Rethinking Pulmonary Embolism Segmentation: A Study of Current Approaches and Challenges with an Open Weight Model
- URL: http://arxiv.org/abs/2509.18308v2
- Date: Wed, 24 Sep 2025 01:07:54 GMT
- Title: Rethinking Pulmonary Embolism Segmentation: A Study of Current Approaches and Challenges with an Open Weight Model
- Authors: Yixin Zhang, Ryan Chamberlain, Lawrence Ngo, Kevin Kramer, Maciej A. Mazurowski,
- Abstract summary: 3D models are particularly well-suited to this task given the morphological characteristics of emboli.<n>CNN-based models generally yield superior performance compared to their ViT-based counterparts in PE segmentation.<n>Central and large emboli can be segmented with satisfactory accuracy, while distal emboli remain challenging due to both task complexity and the scarcity of high-quality datasets.
- Score: 21.024556007374684
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we curated a densely annotated in-house dataset comprising 490 CTPA scans. Using this dataset, we systematically evaluated nine widely used segmentation architectures from both the CNN and Vision Transformer (ViT) families, initialized with either pretrained or random weights, under a unified testing framework as a performance audit. Our study leads to several important observations: (1) 3D U-Net with a ResNet encoder remains a highly effective architecture for PE segmentation; (2) 3D models are particularly well-suited to this task given the morphological characteristics of emboli; (3) CNN-based models generally yield superior performance compared to their ViT-based counterparts in PE segmentation; (4) classification-based pretraining, even on large PE datasets, can adversely impact segmentation performance compared to training from scratch, suggesting that PE classification and segmentation may rely on different sets of discriminative features; (5) different model architectures show a highly consistent pattern of segmentation performance when trained on the same data; and (6) while central and large emboli can be segmented with satisfactory accuracy, distal emboli remain challenging due to both task complexity and the scarcity of high-quality datasets. Besides these findings, our best-performing model achieves a mean Dice score of 0.7131 for segmentation. It detects 181 emboli with 49 false positives and 28 false negatives from 60 in-house testing scans. Its generalizability is further validated on public datasets.
Related papers
- Comparative Analysis of Custom CNN Architectures versus Pre-trained Models and Transfer Learning: A Study on Five Bangladesh Datasets [0.0]
Transfer learning with fine-tuning consistently outperforms both custom CNNs built from scratch and feature extraction methods.<n>While custom CNNs offer advantages in model size (3.4M parameters vs. 11-134M for pre-trained models), pre-trained models with transfer learning provide superior performance.
arXiv Detail & Related papers (2026-01-07T19:36:41Z) - In search of truth: Evaluating concordance of AI-based anatomy segmentation models [3.740726797046942]
AI-based methods for anatomy segmentation can help automate characterization of large imaging datasets.<n>We introduce a practical framework to assist in evaluating them on datasets that do not contain ground truth annotations.
arXiv Detail & Related papers (2025-12-17T19:33:56Z) - The role of data partitioning on the performance of EEG-based deep learning models in supervised cross-subject analysis: a preliminary study [37.69303106863453]
Deep learning is advancing the analysis of electroencephalography (EEG) data by effectively discovering highly nonlinear patterns.<n>No comprehensive guidelines for proper data partitioning and cross-validation exist in the domain.<n>This paper thoroughly investigates the role of data partitioning and cross-validation in evaluating EEG deep learning models.
arXiv Detail & Related papers (2025-05-19T12:05:28Z) - A Comparative Performance Analysis of Classification and Segmentation Models on Bangladeshi Pothole Dataset [0.953693516244499]
The study involves a comprehensive performance analysis of popular classification and segmentation models, applied over a Bangladeshi pothole dataset.<n>This custom dataset of 824 samples, collected from the streets of Dhaka and Bogura performs competitively against the existing industrial and custom datasets.<n>The experimental results exhibit that, our dataset performs on par or outperforms the similar classification models utilized in the existing literature, reaching accuracy and f1-scores over 99%.
arXiv Detail & Related papers (2025-01-11T18:03:46Z) - Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
arXiv Detail & Related papers (2024-06-26T05:30:21Z) - A Lightweight Measure of Classification Difficulty from Application Dataset Characteristics [4.220363193932374]
We propose an efficient cosine similarity-based classification difficulty measure S.
It is calculated from the number of classes and intra- and inter-class similarity metrics of the dataset.
We show how a practitioner can use this measure to help select an efficient model 6 to 29x faster than through repeated training and testing.
arXiv Detail & Related papers (2024-04-09T03:27:09Z) - PULASki: Learning inter-rater variability using statistical distances to improve probabilistic segmentation [35.34932609930401]
This work proposes the PULASki method as a computationally efficient generative tool for biomedical image segmentation.<n>It captures variability in expert annotations, even in small datasets.<n>Our experiments are also the first to present a comparative study of the computationally feasible segmentation of complex geometries using 3D patches and the traditional use of 2D slices.
arXiv Detail & Related papers (2023-12-25T10:31:22Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of challenging problem in healthcare.
Within this framework, we train predictive 15 models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - 3D Adversarial Augmentations for Robust Out-of-Domain Predictions [115.74319739738571]
We focus on improving the generalization to out-of-domain data.
We learn a set of vectors that deform the objects in an adversarial fashion.
We perform adversarial augmentation by applying the learned sample-independent vectors to the available objects when training a model.
arXiv Detail & Related papers (2023-08-29T17:58:55Z) - Interpretable 2D Vision Models for 3D Medical Images [47.75089895500738]
This study proposes a simple approach of adapting 2D networks with an intermediate feature representation for processing 3D images.
We show on all 3D MedMNIST datasets as benchmark and two real-world datasets consisting of several hundred high-resolution CT or MRI scans that our approach performs on par with existing methods.
arXiv Detail & Related papers (2023-07-13T08:27:09Z) - Robustness Analysis on Foundational Segmentation Models [28.01242494123917]
In this work, we perform a robustness analysis of Visual Foundation Models (VFMs) for segmentation tasks.
We benchmark seven state-of-the-art segmentation architectures using 2 different datasets.
Our findings reveal several key insights: VFMs exhibit vulnerabilities to compression-induced corruptions, despite not outpacing all of unimodal models in robustness, multimodal models show competitive resilience in zero-shot scenarios, and VFMs demonstrate enhanced robustness for certain object categories.
arXiv Detail & Related papers (2023-06-15T16:59:42Z) - Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z) - CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z) - MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains.
We reconcile the generalization and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.