A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities
- URL: http://arxiv.org/abs/2403.17834v1
- Date: Tue, 26 Mar 2024 16:19:56 GMT
- Title: A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities
- Authors: Ibrahim Ethem Hamamci, Sezgin Er, Furkan Almas, Ayse Gulnihan Simsek, Sevval Nil Esirgun, Irem Dogan, Muhammed Furkan Dasdelen, Bastian Wittmann, Enis Simsar, Mehmet Simsar, Emine Bensu Erdemir, Abdullah Alanbay, Anjany Sekuboyina, Berkan Lafci, Mehmet K. Ozdemir, Bjoern Menze
- Abstract summary: A major challenge in computational research in 3D medical imaging is the lack of comprehensive datasets.
CT-RATE is the first 3D medical imaging dataset that pairs images with textual reports.
We developed CT-CLIP, a CT-focused contrastive language-image pre-training framework.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A major challenge in computational research in 3D medical imaging is the lack of comprehensive datasets. Addressing this issue, our study introduces CT-RATE, the first 3D medical imaging dataset that pairs images with textual reports. CT-RATE consists of 25,692 non-contrast chest CT volumes, expanded to 50,188 through various reconstructions, from 21,304 unique patients, along with corresponding radiology text reports. Leveraging CT-RATE, we developed CT-CLIP, a CT-focused contrastive language-image pre-training framework. As a versatile, self-supervised model, CT-CLIP is designed for broad application and does not require task-specific training. Remarkably, CT-CLIP outperforms state-of-the-art, fully supervised methods in multi-abnormality detection across all key metrics, thus eliminating the need for manual annotation. We also demonstrate its utility in case retrieval, whether using imagery or textual queries, thereby advancing knowledge dissemination. The open-source release of CT-RATE and CT-CLIP marks a significant advancement in medical AI, enhancing 3D imaging analysis and fostering innovation in healthcare.
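The CLIP-style training and zero-shot inference that the abstract describes can be sketched in a few lines: a contrastive objective pulls each CT volume embedding toward its paired report embedding, and at inference an abnormality is scored by comparing the image embedding to two candidate text prompts. This is a minimal NumPy sketch under stated assumptions, not the released CT-CLIP code; the function names, prompt wording, and temperature value are illustrative.

```python
import numpy as np

def _log_softmax(x, axis):
    """Numerically stable log-softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of (volume, report) pairs.

    Matched pairs sit on the diagonal of the cosine-similarity matrix;
    the loss pulls them together and pushes mismatched pairs apart.
    """
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (batch, batch) scaled similarities
    loss_i2t = -np.diag(_log_softmax(logits, axis=1)).mean()  # volume -> report
    loss_t2i = -np.diag(_log_softmax(logits, axis=0)).mean()  # report -> volume
    return 0.5 * (loss_i2t + loss_t2i)

def zero_shot_probability(img_emb, present_prompt_emb, absent_prompt_emb,
                          temperature=0.07):
    """Zero-shot abnormality score: softmax over similarities to two text
    prompts (e.g. "Pleural effusion." vs. "No pleural effusion.")."""
    img = img_emb / np.linalg.norm(img_emb)
    pos = present_prompt_emb / np.linalg.norm(present_prompt_emb)
    neg = absent_prompt_emb / np.linalg.norm(absent_prompt_emb)
    s = np.array([img @ pos, img @ neg]) / temperature
    p = np.exp(s - s.max())
    p /= p.sum()
    return p[0]  # probability that the abnormality is present
```

On correctly paired embeddings the loss is low; shuffling the reports within a batch raises it, which is the signal the encoders learn from. The zero-shot step needs no task-specific training, matching the paper's claim of annotation-free multi-abnormality detection.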
Related papers
- RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis [56.57177181778517]
RadGenome-Chest CT is a large-scale, region-guided 3D chest CT interpretation dataset based on CT-RATE.
We leverage the latest powerful universal segmentation and large language models to extend the original datasets.
arXiv Detail & Related papers (2024-04-25T17:11:37Z)
- CT-GLIP: 3D Grounded Language-Image Pretraining with CT Scans and Radiology Reports for Full-Body Scenarios [53.94122089629544]
We introduce CT-GLIP (Grounded Language-Image Pretraining with CT scans), a novel method that constructs organ-level image-text pairs to enhance multimodal contrastive learning.
Our method, trained on a multimodal CT dataset comprising 44,011 organ-level vision-text pairs from 17,702 patients across 104 organs, demonstrates it can identify organs and abnormalities in a zero-shot manner using natural languages.
arXiv Detail & Related papers (2024-04-23T17:59:01Z)
- Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models [17.75505740079875]
We explore the feasibility of leveraging language as a naturally high-quality supervision for chest CT imaging.
We bootstrap the understanding of 3D chest CT images by distilling chest-related diagnostic knowledge from an extensively pre-trained 2D X-ray expert model.
We train our model with over 12,000 pairs of chest CT images and radiology reports.
arXiv Detail & Related papers (2024-04-07T12:17:40Z)
- A Unified Multi-Phase CT Synthesis and Classification Framework for Kidney Cancer Diagnosis with Incomplete Data [18.15801599933636]
We propose a unified framework for kidney cancer diagnosis with incomplete multi-phase CT.
It simultaneously recovers missing CT images and classifies cancer subtypes using the completed set of images.
The proposed framework is based on fully 3D convolutional neural networks.
arXiv Detail & Related papers (2023-12-09T11:34:14Z)
- Geometry-Aware Attenuation Field Learning for Sparse-View CBCT Reconstruction [61.48254686722434]
Cone Beam Computed Tomography (CBCT) is the most widely used imaging method in dentistry.
Sparse-view CBCT reconstruction has become a main focus for reducing radiation dose.
This paper proposes a novel attenuation field encoder-decoder framework by first encoding the volumetric feature from multi-view X-ray projections.
arXiv Detail & Related papers (2023-03-26T14:38:42Z)
- COVIDx CT-3: A Large-scale, Multinational, Open-Source Benchmark Dataset for Computer-aided COVID-19 Screening from Chest CT Images [82.74877848011798]
We introduce COVIDx CT-3, a large-scale benchmark dataset for detection of COVID-19 cases from chest CT images.
COVIDx CT-3 includes 431,205 CT slices from 6,068 patients across at least 17 countries.
We examine the data diversity and potential biases of the COVIDx CT-3 dataset, finding significant geographic and class imbalances.
arXiv Detail & Related papers (2022-06-07T06:35:48Z)
- CT-SGAN: Computed Tomography Synthesis GAN [4.765541373485143]
We propose the CT-SGAN model that generates large-scale 3D synthetic CT-scan volumes when trained on a small dataset of chest CT-scans.
We show that CT-SGAN can significantly improve lung nodule detection accuracy by pre-training a nodule detector on a vast amount of synthetic data.
arXiv Detail & Related papers (2021-10-14T22:20:40Z)
- COVIDNet-CT: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest CT Images [75.74756992992147]
We introduce COVIDNet-CT, a deep convolutional neural network architecture that is tailored for detection of COVID-19 cases from chest CT images.
We also introduce COVIDx-CT, a benchmark CT image dataset derived from CT imaging data collected by the China National Center for Bioinformation.
arXiv Detail & Related papers (2020-09-08T15:49:55Z)
- Synergistic Learning of Lung Lobe Segmentation and Hierarchical Multi-Instance Classification for Automated Severity Assessment of COVID-19 in CT Images [61.862364277007934]
We propose a synergistic learning framework for automated severity assessment of COVID-19 in 3D CT images.
A multi-task deep network (called M$^2$UNet) is developed to assess the severity of COVID-19 patients.
Our M$^2$UNet consists of a patch-level encoder, a segmentation sub-network for lung lobe segmentation, and a classification sub-network for severity assessment.
arXiv Detail & Related papers (2020-05-08T03:16:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.