Deep learning-based COVID-19 pneumonia classification using chest CT
images: model generalizability
- URL: http://arxiv.org/abs/2102.09616v1
- Date: Thu, 18 Feb 2021 21:14:52 GMT
- Title: Deep learning-based COVID-19 pneumonia classification using chest CT
images: model generalizability
- Authors: Dan Nguyen, Fernando Kay, Jun Tan, Yulong Yan, Yee Seng Ng, Puneeth
Iyengar, Ron Peshock, Steve Jiang
- Abstract summary: Deep learning (DL) classification models were trained to identify COVID-19-positive patients on 3D computed tomography (CT) datasets from different countries.
We trained nine identical DL-based classification models by using combinations of the datasets with a 72% train, 8% validation, and 20% test data split.
The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better.
- Score: 54.86482395312936
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since the outbreak of the COVID-19 pandemic, worldwide research efforts have
focused on using artificial intelligence (AI) technologies on various medical
data of COVID-19-positive patients in order to identify or classify various
aspects of the disease, with promising reported results. However, concerns have
been raised over their generalizability, given the heterogeneous factors in
training datasets. This study aims to examine the severity of this problem by
evaluating deep learning (DL) classification models trained to identify
COVID-19-positive patients on 3D computed tomography (CT) datasets from
different countries. We collected one dataset at UT Southwestern (UTSW), and
three external datasets from different countries: CC-CCII Dataset (China),
COVID-CTset (Iran), and MosMedData (Russia). We divided the data into 2
classes: COVID-19-positive and COVID-19-negative patients. We trained nine
identical DL-based classification models by using combinations of the datasets
with a 72% train, 8% validation, and 20% test data split. The models trained on
a single dataset achieved accuracy/area under the receiver operating
characteristics curve (AUC) values of 0.87/0.826 (UTSW), 0.97/0.988 (CC-CCII),
and 0.86/0.873 (COVID-CTset) when evaluated on their own dataset. The models
trained on multiple datasets and evaluated on a test set from one of the
datasets used for training performed better. However, the performance dropped
close to an AUC of 0.5 (random guess) for all models when evaluated on a
different dataset outside of their training datasets. Including the MosMedData,
which only contained positive labels, into the training did not necessarily
help the performance on the other datasets. Multiple factors likely contribute
to these results, such as patient demographics and differences in image
acquisition or reconstruction, causing a data shift among different study
cohorts.
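The split and the metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_indices` and `auc` are hypothetical, the random shuffling is an assumption (the abstract does not say how patients were assigned), and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than any particular library routine.

```python
import random


def split_indices(n, train=0.72, val=0.08, seed=0):
    """Shuffle range(n) and cut it into train/val/test index lists.

    The test set receives whatever remains after train and val
    (20% with the default fractions). Random shuffling is an assumption.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        # AUC needs both classes; a positives-only set cannot be scored.
        raise ValueError("AUC is undefined when only one class is present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    train_idx, val_idx, test_idx = split_indices(1000)
    print(len(train_idx), len(val_idx), len(test_idx))  # 720 80 200

    # Scores unrelated to the labels give an AUC near 0.5 (random guess).
    rng = random.Random(1)
    labels = [i % 2 for i in range(1000)]
    scores = [rng.random() for _ in labels]
    print(auc(labels, scores))
```

The `ValueError` branch reflects a general property of the metric: AUC cannot be computed on a test set that contains only one class, which is the situation for a dataset with positive labels only.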
Related papers
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Virtual imaging trials improved the transparency and reliability of AI systems in COVID-19 imaging [1.6040478776985583]
This study focuses on using convolutional neural networks (CNNs) for COVID-19 diagnosis using computed tomography (CT) and chest radiography (CXR).
We developed and tested multiple AI models, 3D ResNet-like and 2D EfficientNetv2 architectures, across diverse datasets.
Models trained on the most diverse datasets showed the highest external testing performance, with AUC values ranging from 0.73 to 0.76 for CT and 0.70 to 0.73 for CXR.
arXiv Detail & Related papers (2023-08-17T19:12:32Z)
- Evaluating Generalizability of Deep Learning Models Using
Indian-COVID-19 CT Dataset [5.398550081886242]
Machine learning (ML) approaches for automatic processing of CT scan images in clinical settings are trained on limited and biased subsets of publicly available COVID-19 data.
This has raised concerns regarding the generalizability of these models on external datasets, not seen by the model during training.
To address some of these issues, in this work CT scan images from confirmed COVID-19 data obtained from one of the largest public repositories, COVIDx CT 2A, were used for training and internal validation of machine learning models.
arXiv Detail & Related papers (2022-12-28T16:23:18Z)
- A Generalizable Artificial Intelligence Model for COVID-19
Classification Task Using Chest X-ray Radiographs: Evaluated Over Four
Clinical Datasets with 15,097 Patients [6.209420804714487]
The generalizability of the trained model was retrospectively evaluated using four different real-world clinical datasets.
The AI model trained using a single-source clinical dataset achieved an AUC of 0.82 when applied to the internal temporal test set.
An AUC of 0.79 was achieved when applied to a multi-institutional COVID-19 dataset collected by the Medical Imaging and Data Resource Center.
arXiv Detail & Related papers (2022-10-04T04:12:13Z)
- The pitfalls of using open data to develop deep learning solutions for
COVID-19 detection in chest X-rays [64.02097860085202]
Deep learning models have been developed to identify COVID-19 from chest X-rays.
Results have been exceptional when training and testing on open-source data.
Data analysis and model evaluations show that the popular open-source dataset COVIDx is not representative of the real clinical problem.
arXiv Detail & Related papers (2021-09-14T10:59:11Z)
- Systematic investigation into generalization of COVID-19 CT deep
learning models with Gabor ensemble for lung involvement scoring [9.94980188821453]
This study investigates the generalizability of key published models using the publicly available COVID-19 Computed Tomography data.
We then assess the predictive ability of these models for COVID-19 severity using an independent new dataset.
arXiv Detail & Related papers (2021-04-20T03:49:48Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Federated Semi-Supervised Learning for COVID Region Segmentation in
Chest CT using Multi-National Data from China, Italy, Japan [14.776338073000526]
COVID-19 has led to urgent needs for reliable diagnosis and management of SARS-CoV-2 infection.
Recent efforts have focused on computer-aided characterization and diagnosis.
Domain shift of data across clinical data centers poses a serious challenge when deploying learning-based models.
arXiv Detail & Related papers (2020-11-23T21:51:26Z)
- Classification of COVID-19 in CT Scans using Multi-Source Transfer
Learning [91.3755431537592]
We propose the use of Multi-Source Transfer Learning to improve upon traditional Transfer Learning for the classification of COVID-19 from CT scans.
With our multi-source fine-tuning approach, our models outperformed baseline models fine-tuned with ImageNet.
Our best performing model was able to achieve an accuracy of 0.893 and a Recall score of 0.897, outperforming its baseline Recall score by 9.3%.
arXiv Detail & Related papers (2020-09-22T11:53:06Z)
- Deep Mining External Imperfect Data for Chest X-ray Disease Screening [57.40329813850719]
We argue that incorporating an external CXR dataset leads to imperfect training data, which raises challenges.
We formulate the multi-label disease classification problem as weighted independent binary tasks according to the categories.
Our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability.
arXiv Detail & Related papers (2020-06-06T06:48:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.