COVID-19-CT-CXR: a freely accessible and weakly labeled chest X-ray and
CT image collection on COVID-19 from biomedical literature
- URL: http://arxiv.org/abs/2006.06177v2
- Date: Thu, 22 Oct 2020 03:03:55 GMT
- Title: COVID-19-CT-CXR: a freely accessible and weakly labeled chest X-ray and
CT image collection on COVID-19 from biomedical literature
- Authors: Yifan Peng, Yu-Xing Tang, Sungwon Lee, Yingying Zhu, Ronald M.
Summers, Zhiyong Lu
- Abstract summary: We present COVID-19-CT-CXR, a public database of COVID-19 CXR and CT images automatically extracted from COVID-19-relevant articles.
The final database includes 1,327 CT and 263 CXR images with their relevant text.
- Score: 19.00121006721942
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The latest threat to global health is the COVID-19 outbreak. Although there
exist large datasets of chest X-rays (CXR) and computed tomography (CT) scans,
few COVID-19 image collections are currently available due to patient privacy.
At the same time, there is a rapid growth of COVID-19-relevant articles in the
biomedical literature. Here, we present COVID-19-CT-CXR, a public database of
COVID-19 CXR and CT images, which are automatically extracted from
COVID-19-relevant articles from the PubMed Central Open Access (PMC-OA) Subset.
We extracted figures, associated captions, and relevant figure descriptions in
the article and separated compound figures into subfigures. We also designed a
deep-learning model to distinguish CXR and CT images from other figure types and
to classify them accordingly. The final database includes 1,327 CT and 263 CXR images (as
of May 9, 2020) with their relevant text. To demonstrate the utility of
COVID-19-CT-CXR, we conducted four case studies. (1) We show that
COVID-19-CT-CXR, when used as additional training data, is able to contribute
to improved DL performance for the classification of COVID-19 and non-COVID-19
CT. (2) We collected CT images of influenza and trained a DL baseline to
distinguish among COVID-19, influenza, and normal or other types of
diseases on CT. (3) We trained an unsupervised one-class classifier from
non-COVID-19 CXR and performed anomaly detection to detect COVID-19 CXR. (4)
From text-mined captions and figure descriptions, we compared clinical symptoms
and clinical findings of COVID-19 vs. those of influenza to demonstrate the
disease differences in the scientific publications. We believe that our work is
complementary to existing resources and hope that it will contribute to medical
image analysis of the COVID-19 pandemic. The dataset, code, and DL models are
publicly available at https://github.com/ncbi-nlp/COVID-19-CT-CXR.
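Case study (3) in the abstract describes one-class anomaly detection: a model is fit on non-COVID-19 CXR only, and images that deviate from it are flagged. The sketch below illustrates that general idea on synthetic feature vectors. It is not the authors' model; the function names, the z-score scoring rule, and the 95% threshold are all assumptions made for illustration.

```python
import math
import random
import statistics


def fit_one_class(normal_features):
    """Estimate per-dimension mean and std from normal-class samples only."""
    columns = list(zip(*normal_features))
    means = [statistics.fmean(c) for c in columns]
    # Guard against zero variance so the z-score stays finite.
    stds = [max(statistics.pstdev(c), 1e-8) for c in columns]
    return means, stds


def anomaly_score(x, means, stds):
    """Root-mean-square z-score of x relative to the normal-class statistics."""
    total = sum(((v - m) / s) ** 2 for v, m, s in zip(x, means, stds))
    return math.sqrt(total / len(x))


def pick_threshold(scores, quantile=0.95):
    """Use a quantile of the training scores as the decision threshold (assumed rule)."""
    ordered = sorted(scores)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


# Synthetic stand-ins for image features (hypothetical; real inputs would be
# features extracted from non-COVID-19 CXR images).
rng = random.Random(0)
normal_train = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]

means, stds = fit_one_class(normal_train)
threshold = pick_threshold([anomaly_score(x, means, stds) for x in normal_train])

typical = [0.0] * 8   # close to the normal-class mean
shifted = [5.0] * 8   # far from anything seen during fitting

is_anomaly_typical = anomaly_score(typical, means, stds) > threshold
is_anomaly_shifted = anomaly_score(shifted, means, stds) > threshold
```

The key property is that no COVID-19 examples are needed at fitting time; only the threshold choice determines how many normal images are falsely flagged.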
Related papers
- COVIDx CXR-4: An Expanded Multi-Institutional Open-Source Benchmark
Dataset for Chest X-ray Image-Based Computer-Aided COVID-19 Diagnostics [79.90346960083775]
We introduce COVIDx CXR-4, an expanded multi-institutional open-source benchmark dataset for chest X-ray image-based computer-aided COVID-19 diagnostics.
COVIDx CXR-4 expands significantly on the previous COVIDx CXR-3 dataset by increasing the total patient cohort size by greater than 2.66 times.
We provide extensive analysis on the diversity of the patient demographic, imaging metadata, and disease distributions to highlight potential dataset biases.
(2023-11-29T14:40:31Z)
- Optimising Chest X-Rays for Image Analysis by Identifying and Removing
Confounding Factors [49.005337470305584]
During the COVID-19 pandemic, the sheer volume of imaging performed in an emergency setting for COVID-19 diagnosis has resulted in a wide variability of clinical CXR acquisitions.
The variable quality of clinically-acquired CXRs within publicly available datasets could have a profound effect on algorithm performance.
We propose a simple and effective step-wise approach to pre-processing a COVID-19 chest X-ray dataset to remove undesired biases.
(2022-08-22T13:57:04Z)
- Multi-scale alignment and Spatial ROI Module for COVID-19 Diagnosis [13.31017458409054]
We propose a deep spatial pyramid pooling (D-SPP) module to integrate contextual information over different resolutions.
We also propose a COVID-19 infection detection (CID) module to draw attention to the lesion area and remove interference from irrelevant information.
Our method produces higher accuracy of detecting COVID-19 lesions in CT and CXR images.
(2022-07-04T12:07:17Z)
- COVIDx CXR-3: A Large-Scale, Open-Source Benchmark Dataset of Chest
X-ray Images for Computer-Aided COVID-19 Diagnostics [69.55060769611916]
The use of chest X-ray (CXR) imaging as a complementary screening strategy to RT-PCR testing is increasing.
Many visual perception models have been proposed for COVID-19 screening based on CXR imaging.
We introduce COVIDx CXR-3, a large-scale benchmark dataset of CXR images for supporting COVID-19 computer vision research.
(2022-06-08T04:39:44Z)
- COVIDx CT-3: A Large-scale, Multinational, Open-Source Benchmark Dataset
for Computer-aided COVID-19 Screening from Chest CT Images [82.74877848011798]
We introduce COVIDx CT-3, a large-scale benchmark dataset for detection of COVID-19 cases from chest CT images.
COVIDx CT-3 includes 431,205 CT slices from 6,068 patients across at least 17 countries.
We examine the data diversity and potential biases of the COVIDx CT-3 dataset, finding significant geographic and class imbalances.
(2022-06-07T06:35:48Z)
- HRCTCov19 -- A High-Resolution Chest CT Scan Image Dataset for COVID-19
Diagnosis and Differentiation [0.0]
During the COVID-19 pandemic, computed tomography (CT) was a popular method for diagnosing COVID-19 patients.
Publicly accessible COVID-19 CT image datasets are difficult to come by due to privacy concerns.
We have introduced HRCTCov19, a new COVID-19 high-resolution chest CT scan image dataset.
(2022-05-06T12:49:18Z)
- Medical-VLBERT: Medical Visual Language BERT for COVID-19 CT Report
Generation With Alternate Learning [70.71564065885542]
We propose to use the medical visual language BERT (Medical-VLBERT) model to identify the abnormality on the COVID-19 scans.
This model adopts an alternate learning strategy with two procedures: knowledge pretraining and transferring.
For automatic medical report generation on the COVID-19 cases, we constructed a dataset of 368 medical findings in Chinese and 1104 chest CT scans.
(2021-08-11T07:12:57Z)
- Medical Imaging with Deep Learning for COVID-19 Diagnosis: A
Comprehensive Review [1.7205106391379026]
The paper focuses on the application of deep learning (DL) models to medical imaging and drug discovery for managing COVID-19 disease.
We detail studies based on medical imaging, such as X-ray and computed tomography (CT), along with DL methods for distinguishing COVID-19 from pneumonia.
(2021-07-13T16:49:49Z)
- Screening COVID-19 Based on CT/CXR Images & Building a Publicly
Available CT-scan Dataset of COVID-19 [6.142272540492935]
This study builds a large publicly available CT-scan dataset of more than 13k CT images from more than 1000 individuals, of which 8k images come from 500 patients infected with COVID-19.
We propose a deep learning model for screening COVID-19 using our proposed CT dataset and report the baseline results.
Finally, we extend the proposed CT model for screening COVID-19 from CXR images using a transfer learning approach.
(2020-12-28T11:52:33Z)
- COVIDNet-CT: A Tailored Deep Convolutional Neural Network Design for
Detection of COVID-19 Cases from Chest CT Images [75.74756992992147]
We introduce COVIDNet-CT, a deep convolutional neural network architecture that is tailored for detection of COVID-19 cases from chest CT images.
We also introduce COVIDx-CT, a benchmark CT image dataset derived from CT imaging data collected by the China National Center for Bioinformation.
(2020-09-08T15:49:55Z)
- BIMCV COVID-19+: a large annotated dataset of RX and CT images from
COVID-19 patients [2.927469685126833]
This first iteration of the database includes 1,380 CX, 885 DX and 163 CT studies from 1,311 COVID-19+ patients.
This is, to the best of our knowledge, the largest COVID-19+ dataset of images available in an open format.
(2020-06-01T18:06:21Z)