CIRDataset: A large-scale Dataset for Clinically-Interpretable lung
nodule Radiomics and malignancy prediction
- URL: http://arxiv.org/abs/2206.14903v1
- Date: Wed, 29 Jun 2022 20:46:12 GMT
- Title: CIRDataset: A large-scale Dataset for Clinically-Interpretable lung
nodule Radiomics and malignancy prediction
- Authors: Wookjin Choi, Navdeep Dahiya, Saad Nadeem
- Abstract summary: Spiculations/lobulations, sharp/curved spikes on the surface of lung nodules, are good predictors of lung cancer malignancy.
No public datasets exist to date for probing the importance of these clinically-reported features in the SOTA malignancy prediction algorithms.
We release a large-scale Clinically-Interpretable Radiomics dataset, CIRDataset, containing 956 radiologist QA/QC'ed spiculation/lobulation annotations.
- Score: 4.00916638804083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spiculations/lobulations, sharp/curved spikes on the surface of lung nodules,
are good predictors of lung cancer malignancy and hence, are routinely assessed
and reported by radiologists as part of the standardized Lung-RADS clinical
scoring criteria. Given the 3D geometry of the nodule and 2D slice-by-slice
assessment by radiologists, manual spiculation/lobulation annotation is a
tedious task and thus no public datasets exist to date for probing the
importance of these clinically-reported features in the SOTA malignancy
prediction algorithms. As part of this paper, we release a large-scale
Clinically-Interpretable Radiomics Dataset, CIRDataset, containing 956
radiologist QA/QC'ed spiculation/lobulation annotations on segmented lung
nodules from two public datasets, LIDC-IDRI (N=883) and LUNGx (N=73). We also
present an end-to-end deep learning model based on multi-class Voxel2Mesh
extension to segment nodules (while preserving spikes), classify spikes
(sharp/spiculation and curved/lobulation), and perform malignancy prediction.
Previous methods have performed malignancy prediction for LIDC and LUNGx
datasets but without robust attribution to any clinically reported/actionable
features (due to known hyperparameter sensitivity issues with general
attribution schemes). With the release of this comprehensively-annotated
CIRDataset and end-to-end deep learning baseline, we hope that malignancy
prediction methods can validate their explanations, benchmark against our
baseline, and provide clinically-actionable insights. Dataset, code, pretrained
models, and docker containers are available at
https://github.com/nadeemlab/CIR.
Related papers
- Prediction of Lung Metastasis from Hepatocellular Carcinoma using the SEER Database [0.9055332067000195]
Hepatocellular carcinoma (HCC) is a leading cause of cancer-related mortality.
Predictive models for lung metastasis in HCC remain limited in scope and clinical applicability.
We develop and validate an end-to-end machine learning pipeline using data from the Surveillance, Epidemiology, and End Results (SEER) database.
arXiv Detail & Related papers (2025-01-20T20:06:31Z) - Towards a Benchmark for Colorectal Cancer Segmentation in Endorectal Ultrasound Videos: Dataset and Model Development [59.74920439478643]
In this paper, we collect and annotate the first benchmark dataset that covers diverse ERUS scenarios.
Our ERUS-10K dataset comprises 77 videos and 10,000 high-resolution annotated frames.
We introduce a benchmark model for colorectal cancer segmentation, named the Adaptive Sparse-context TRansformer (ASTR).
arXiv Detail & Related papers (2024-08-19T15:04:42Z) - Lung-CADex: Fully automatic Zero-Shot Detection and Classification of Lung Nodules in Thoracic CT Images [45.29301790646322]
Computer-aided diagnosis can help with early lung nodule detection and facilitate subsequent nodule characterization.
We propose CADe for segmenting lung nodules in a zero-shot manner using a variant of the Segment Anything Model called MedSAM.
We also propose CADx, a method for characterizing nodules as benign or malignant by building a gallery of radiomic features and aligning image-feature pairs through contrastive learning.
arXiv Detail & Related papers (2024-07-02T19:30:25Z) - A Lung Nodule Dataset with Histopathology-based Cancer Type Annotation [12.617587827105496]
This research aims to bridge the gap by providing publicly accessible datasets and reliable tools for medical diagnosis.
We curated a diverse dataset of lung Computed Tomography (CT) images, comprising 330 annotated nodules (nodules are labeled as bounding boxes) from 95 distinct patients.
These promising results demonstrate that the dataset has feasible applications and can further facilitate intelligent auxiliary diagnosis.
arXiv Detail & Related papers (2024-06-26T06:39:11Z) - How Does Pruning Impact Long-Tailed Multi-Label Medical Image
Classifiers? [49.35105290167996]
Pruning has emerged as a powerful technique for compressing deep neural networks, reducing memory usage and inference time without significantly affecting overall performance.
This work represents a first step toward understanding the impact of pruning on model behavior in deep long-tailed, multi-label medical image classification.
arXiv Detail & Related papers (2023-08-17T20:40:30Z) - Longitudinal Data and a Semantic Similarity Reward for Chest X-Ray Report Generation [7.586632627817609]
Radiologists face high burnout rates, partly due to the increasing volume of Chest X-rays (CXRs) requiring interpretation and reporting.
Our proposed CXR report generator integrates elements of the workflow and introduces a novel reward for reinforcement learning.
Results from our study demonstrate that the proposed model generates reports that are more aligned with radiologists' reports than state-of-the-art models.
arXiv Detail & Related papers (2023-07-19T05:41:14Z) - Revisiting Computer-Aided Tuberculosis Diagnosis [56.80999479735375]
Tuberculosis (TB) is a major global health threat, causing millions of deaths annually.
Computer-aided tuberculosis diagnosis (CTD) using deep learning has shown promise, but progress is hindered by limited training data.
We establish a large-scale dataset, namely the Tuberculosis X-ray (TBX11K) dataset, which contains 11,200 chest X-ray (CXR) images with corresponding bounding box annotations for TB areas.
This dataset enables the training of sophisticated detectors for high-quality CTD.
arXiv Detail & Related papers (2023-07-06T08:27:48Z) - Enhancing Cancer Prediction in Challenging Screen-Detected Incident Lung
Nodules Using Time-Series Deep Learning [2.744770849264355]
Lung cancer screening (LCS) using annual low-dose computed tomography (CT) scanning has been proven to significantly reduce lung cancer mortality.
Risk stratification of malignancy in lung nodules can be enhanced using machine/deep learning algorithms.
Here we show the performance of our time-series deep learning model (DeepCAD-NLM-L) which integrates multi-model information across three longitudinal data domains.
arXiv Detail & Related papers (2022-03-30T18:40:36Z) - Improving Classification Model Performance on Chest X-Rays through Lung
Segmentation [63.45024974079371]
We propose a deep learning approach to enhance abnormal chest x-ray (CXR) identification performance through segmentations.
Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing lung region in CXR images and a CXR classification model with a backbone of a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z) - The pitfalls of using open data to develop deep learning solutions for
COVID-19 detection in chest X-rays [64.02097860085202]
Deep learning models have been developed to identify COVID-19 from chest X-rays.
Results have been exceptional when training and testing on open-source data.
Data analysis and model evaluations show that the popular open-source dataset COVIDx is not representative of the real clinical problem.
arXiv Detail & Related papers (2021-09-14T10:59:11Z) - Learning Tumor Growth via Follow-Up Volume Prediction for Lung Nodules [15.069141581681016]
Follow-up serves an important role in the management of pulmonary nodules for lung cancer.
Recent deep learning studies that use convolutional neural networks (CNNs) to predict the malignancy score of nodules only provide clinicians with black-box predictions.
We propose a unified framework, named Nodule Follow-Up Prediction Network (NoFoNet), which predicts the growth of pulmonary nodules with high-quality visual appearances and accurate quantitative results.
arXiv Detail & Related papers (2020-06-24T17:18:46Z) - Predicting COVID-19 Pneumonia Severity on Chest X-ray with Deep Learning [57.00601760750389]
We present a severity score prediction model for COVID-19 pneumonia for frontal chest X-ray images.
Such a tool can gauge the severity of COVID-19 lung infections, which can inform escalation or de-escalation of care.
arXiv Detail & Related papers (2020-05-24T23:13:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.