Random forest-based out-of-distribution detection for robust lung cancer segmentation
- URL: http://arxiv.org/abs/2508.19112v1
- Date: Tue, 26 Aug 2025 15:14:29 GMT
- Title: Random forest-based out-of-distribution detection for robust lung cancer segmentation
- Authors: Aneesh Rangnekar, Harini Veeraraghavan
- Abstract summary: Transformer-based models with self-supervised pretraining can produce reliably accurate segmentation from in-distribution (ID) data but degrade when applied to out-of-distribution (OOD) datasets. We address this challenge with RF-Deep, a random forest classifier that utilizes deep features from a pretrained transformer encoder of the segmentation model to detect OOD scans and enhance segmentation reliability.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate detection and segmentation of cancerous lesions from computed tomography (CT) scans is essential for automated treatment planning and cancer treatment response assessment. Transformer-based models with self-supervised pretraining can produce reliably accurate segmentation from in-distribution (ID) data but degrade when applied to out-of-distribution (OOD) datasets. We address this challenge with RF-Deep, a random forest classifier that utilizes deep features from a pretrained transformer encoder of the segmentation model to detect OOD scans and enhance segmentation reliability. The segmentation model comprises a Swin Transformer encoder, pretrained with masked image modeling (SimMIM) on 10,432 unlabeled 3D CT scans covering cancerous and non-cancerous conditions, with a convolution decoder, trained to segment lung cancers in 317 3D scans. Independent testing was performed on 603 public 3D CT scans that included one ID dataset and four OOD datasets comprising chest CTs with pulmonary embolism (PE) and COVID-19, and abdominal CTs with kidney cancers and healthy volunteers. RF-Deep detected OOD cases with an FPR95 of 18.26%, 27.66%, and less than 0.1% on PE, COVID-19, and abdominal CTs, respectively, consistently outperforming established OOD approaches. The RF-Deep classifier provides a simple and effective approach to enhance reliability of cancer segmentation in ID and OOD scenarios.
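The RF-Deep recipe above can be sketched end to end: pool features from a frozen pretrained encoder, fit a random forest on ID-vs-OOD labels, and score new scans, reporting FPR95 (the fraction of OOD scans still accepted at a threshold that keeps 95% of ID scans). This is a minimal sketch assuming NumPy and scikit-learn; the encoder is stubbed with synthetic feature vectors, and the 768-dimensional feature size is an illustrative assumption, not a detail from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
dim = 768  # hypothetical pooled-feature dimension of the encoder

# Stand-ins for pooled encoder features of ID and OOD CT scans.
id_feats = rng.normal(0.0, 1.0, size=(200, dim))
ood_feats = rng.normal(1.5, 1.0, size=(200, dim))

X = np.vstack([id_feats, ood_feats])
y = np.array([0] * 200 + [1] * 200)  # 0 = in-distribution, 1 = out-of-distribution

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Score held-out scans: P(ID) from the forest, higher = more ID-like.
id_scores = clf.predict_proba(rng.normal(0.0, 1.0, size=(100, dim)))[:, 0]
ood_scores = clf.predict_proba(rng.normal(1.5, 1.0, size=(100, dim)))[:, 0]

# FPR95: choose the threshold that accepts 95% of ID scans, then measure
# how many OOD scans sneak past it (lower is better).
thresh = np.percentile(id_scores, 5)
fpr95 = float(np.mean(ood_scores >= thresh))
print(f"FPR95: {fpr95:.2f}")
```

On this well-separated synthetic data the forest rejects essentially all OOD samples; the real method's value lies in doing this on deep features where raw intensity statistics would not separate the distributions.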
Related papers
- Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation [2.6825994665041235]
We introduce a lightweight, architecture-agnostic approach to enhance the reliability of tumor segmentation from CT volumes. RF-Deep is a plug-and-play random forests-based OOD detection framework that leverages deep features with limited outlier exposure. RF-Deep achieved strong detection (AUROC > 93.50) on the challenging near-OOD datasets and near-perfect detection (AUROC > 99.00) on the far-OOD datasets, substantially outperforming logit-based and radiomics approaches.
arXiv Detail & Related papers (2025-12-09T03:49:50Z) - Neural Discrete Representation Learning for Sparse-View CBCT Reconstruction: From Algorithm Design to Prospective Multicenter Clinical Evaluation [64.42236775544579]
Cone beam computed tomography (CBCT)-guided puncture has become an established approach for diagnosing and treating thoracic tumours. DeepPriorCBCT is a three-stage deep learning framework that achieves diagnostic-grade reconstruction using only one-sixth of the conventional radiation dose.
arXiv Detail & Related papers (2025-11-30T12:45:02Z) - A Synthetic Data-Driven Radiology Foundation Model for Pan-tumor Clinical Diagnosis [21.212976987658415]
PASTA is a pan-tumor radiology foundation model built on PASTA-Gen. PASTA achieves state-of-the-art performance on 45 of 46 oncology tasks.
arXiv Detail & Related papers (2025-02-10T05:45:03Z) - Towards a Benchmark for Colorectal Cancer Segmentation in Endorectal Ultrasound Videos: Dataset and Model Development [59.74920439478643]
In this paper, we collected and annotated the first benchmark dataset that covers diverse ERUS scenarios.
Our ERUS-10K dataset comprises 77 videos and 10,000 high-resolution annotated frames.
We introduce a benchmark model for colorectal cancer segmentation, named the Adaptive Sparse-context TRansformer (ASTR).
arXiv Detail & Related papers (2024-08-19T15:04:42Z) - Quantifying uncertainty in lung cancer segmentation with foundation models applied to mixed-domain datasets [6.712251433139412]
Medical image foundation models have shown the ability to segment organs and tumors with minimal fine-tuning. These models are typically evaluated on task-specific in-distribution (ID) datasets. We introduce a comprehensive set of computationally fast metrics to evaluate the performance of multiple foundation models trained with self-supervised learning (SSL). SMIT produced the highest F1-score (LRAD: 0.60, 5Rater: 0.64) and lowest entropy (LRAD: 0.06, 5Rater: 0.12), indicating a higher tumor detection rate and more confident segmentations.
arXiv Detail & Related papers (2024-03-19T19:36:48Z) - TRUSTED: The Paired 3D Transabdominal Ultrasound and CT Human Data for Kidney Segmentation and Registration Research [42.90853857929316]
Inter-modal image registration (IMIR) and image segmentation with abdominal Ultrasound (US) data has many important clinical applications.
We propose TRUSTED (the Tridimensional Ultra Sound TomodEnsitometrie dataset), comprising paired transabdominal 3DUS and CT kidney images from 48 human patients.
arXiv Detail & Related papers (2023-10-19T11:09:50Z) - Classification of lung cancer subtypes on CT images with synthetic pathological priors [41.75054301525535]
Cross-scale associations exist in the image patterns between the same case's CT images and its pathological images.
We propose self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on CT images.
arXiv Detail & Related papers (2023-08-09T02:04:05Z) - High-Fidelity Image Synthesis from Pulmonary Nodule Lesion Maps using Semantic Diffusion Model [10.412300404240751]
Lung cancer has been one of the leading causes of cancer-related deaths worldwide for years.
Deep learning-based computer-assisted diagnosis (CAD) models can accelerate the screening process.
However, developing robust and accurate models often requires large-scale and diverse medical datasets with high-quality annotations.
arXiv Detail & Related papers (2023-05-02T01:04:22Z) - Improving Classification Model Performance on Chest X-Rays through Lung Segmentation [63.45024974079371]
We propose a deep learning approach to enhance abnormal chest x-ray (CXR) identification performance through segmentations.
Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing lung region in CXR images and a CXR classification model with a backbone of a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z) - EMT-NET: Efficient multitask network for computer-aided diagnosis of breast cancer [58.720142291102135]
We propose an efficient and light-weighted learning architecture to classify and segment breast tumors simultaneously.
We incorporate a segmentation task into a tumor classification network, which makes the backbone network learn representations focused on tumor regions.
The accuracy, sensitivity, and specificity of tumor classification are 88.6%, 94.1%, and 85.3%, respectively.
arXiv Detail & Related papers (2022-01-13T05:24:40Z) - CAE-Transformer: Transformer-based Model to Predict Invasiveness of Lung Adenocarcinoma Subsolid Nodules from Non-thin Section 3D CT Scans [36.093580055848186]
Lung Adenocarcinoma (LAUC) has recently been the most prevalent lung cancer subtype.
Timely and accurate knowledge of the invasiveness of lung nodules leads to a proper treatment plan and reduces the risk of unnecessary or late surgeries.
The primary imaging modality to assess and predict the invasiveness of LAUCs is the chest CT.
In this paper, a predictive transformer-based framework, referred to as the "CAE-Transformer", is developed to classify LAUCs.
arXiv Detail & Related papers (2021-10-17T04:37:24Z) - Generative Models Improve Radiomics Performance in Different Tasks and Different Datasets: An Experimental Study [3.040206021972938]
Radiomics is an area of research focusing on high throughput feature extraction from medical images.
Generative models can improve the performance of low dose CT-based radiomics in different tasks.
arXiv Detail & Related papers (2021-09-06T06:01:21Z) - COVID-MTL: Multitask Learning with Shift3D and Random-weighted Loss for Automated Diagnosis and Severity Assessment of COVID-19 [39.57518533765393]
There is an urgent need for automated methods to assist accurate and effective assessment of COVID-19.
We present an end-to-end multitask learning framework (COVID-MTL) that is capable of automated and simultaneous detection (against both radiology and NAT) and severity assessment of COVID-19.
arXiv Detail & Related papers (2020-12-10T08:30:46Z)
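Several entries above (notably the uncertainty-quantification paper reporting low entropy for SMIT) score segmentations by the entropy of the model's per-voxel class probabilities. A minimal sketch of that metric, assuming NumPy and illustrative probability values rather than real model outputs:

```python
import numpy as np

def mean_prediction_entropy(probs, eps=1e-12):
    # probs: (N, C) softmax probabilities over C classes for N voxels.
    # Lower mean entropy indicates more confident segmentations, as in
    # the entropy numbers reported above (e.g. LRAD: 0.06 vs 0.12).
    return float(np.mean(-np.sum(probs * np.log(probs + eps), axis=1)))

confident = np.array([[0.99, 0.01], [0.98, 0.02]])   # peaked predictions
uncertain = np.array([[0.55, 0.45], [0.50, 0.50]])   # near-uniform predictions
print(mean_prediction_entropy(confident) < mean_prediction_entropy(uncertain))  # prints True
```

The `eps` guard avoids `log(0)` for hard one-hot predictions; entropy is maximized at `log(C)` for a uniform distribution, so values are comparable only across models with the same number of classes.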
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.