Vision Foundation Models for Computed Tomography
- URL: http://arxiv.org/abs/2501.09001v1
- Date: Wed, 15 Jan 2025 18:30:58 GMT
- Title: Vision Foundation Models for Computed Tomography
- Authors: Suraj Pai, Ibrahim Hadzic, Dennis Bontempi, Keno Bressem, Benjamin H. Kann, Andriy Fedorov, Raymond H. Mak, Hugo J. W. L. Aerts,
- Abstract summary: Foundation models (FMs) have shown transformative potential in radiology by performing diverse, complex tasks across imaging modalities.
Here, we developed CT-FM, a large-scale 3D image-based pre-trained model designed explicitly for various radiological tasks.
CT-FM was pre-trained using 148,000 computed tomography (CT) scans from the Imaging Data Commons through label-agnostic contrastive learning.
- Score: 0.5320113414681007
- License:
- Abstract: Foundation models (FMs) have shown transformative potential in radiology by performing diverse, complex tasks across imaging modalities. Here, we developed CT-FM, a large-scale 3D image-based pre-trained model designed explicitly for various radiological tasks. CT-FM was pre-trained using 148,000 computed tomography (CT) scans from the Imaging Data Commons through label-agnostic contrastive learning. We evaluated CT-FM across four categories of tasks, namely, whole-body and tumor segmentation, head CT triage, medical image retrieval, and semantic understanding, showing superior performance against state-of-the-art models. Beyond quantitative success, CT-FM demonstrated the ability to cluster regions anatomically and identify similar anatomical and structural concepts across scans. Furthermore, it remained robust across test-retest settings and indicated reasonable salient regions attached to its embeddings. This study demonstrates the value of large-scale medical imaging foundation models and by open-sourcing the model weights, code, and data, aims to support more adaptable, reliable, and interpretable AI solutions in radiology.
Related papers
- 3D Foundation AI Model for Generalizable Disease Detection in Head Computed Tomography [8.896955286474991]
We introduce FM-CT: a Foundation Model for Head CT for generalizable disease detection, trained using self-supervised learning.
Our approach pre-trains a deep learning model on a large, diverse dataset of 361,663 non-contrast 3D head CT scans without the need for manual annotations.
Our results demonstrate that the self-supervised foundation model significantly improves performance on downstream diagnostic tasks.
arXiv Detail & Related papers (2025-02-04T23:42:18Z) - 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z) - ShapeMamba-EM: Fine-Tuning Foundation Model with Local Shape Descriptors and Mamba Blocks for 3D EM Image Segmentation [49.42525661521625]
This paper presents ShapeMamba-EM, a specialized fine-tuning method for 3D EM segmentation.
It is tested over a wide range of EM images, covering five segmentation tasks and 10 datasets.
arXiv Detail & Related papers (2024-08-26T08:59:22Z) - CC-DCNet: Dynamic Convolutional Neural Network with Contrastive Constraints for Identifying Lung Cancer Subtypes on Multi-modality Images [13.655407979403945]
We propose a novel deep learning network designed to accurately classify lung cancer subtype with multi-dimensional and multi-modality images.
The strength of the proposed model lies in its ability to dynamically process both paired CT-pathological image sets and independent CT image sets.
We also develop a contrastive constraint module, which quantitatively maps the cross-modality associations through network training.
arXiv Detail & Related papers (2024-07-18T01:42:00Z) - RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis [56.57177181778517]
RadGenome-Chest CT is a large-scale, region-guided 3D chest CT interpretation dataset based on CT-RATE.
We leverage the latest powerful universal segmentation and large language models to extend the original datasets.
arXiv Detail & Related papers (2024-04-25T17:11:37Z) - Classification of lung cancer subtypes on CT images with synthetic
pathological priors [41.75054301525535]
Cross-scale associations exist in the image patterns between the same case's CT images and its pathological images.
We propose self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on CT images.
arXiv Detail & Related papers (2023-08-09T02:04:05Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - MedNeRF: Medical Neural Radiance Fields for Reconstructing 3D-aware
CT-Projections from a Single X-ray [14.10611608681131]
Excessive ionising radiation can lead to deterministic and harmful effects on the body.
This paper proposes a Deep Learning model that learns to reconstruct CT projections from a few or even a single-view X-ray.
arXiv Detail & Related papers (2022-02-02T13:25:23Z) - Body Part Regression for CT Images [0.0]
Self-supervised body part regression model for CT volumes is developed and trained on a heterogeneous collection of CT studies.
It is demonstrated how the algorithm can contribute to the robust and reliable transfer of medical models into the clinic.
arXiv Detail & Related papers (2021-10-18T10:03:42Z) - A Multi-Stage Attentive Transfer Learning Framework for Improving
COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis.
Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains.
Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z) - SAG-GAN: Semi-Supervised Attention-Guided GANs for Data Augmentation on
Medical Images [47.35184075381965]
We present a data augmentation method for generating synthetic medical images using cycle-consistency Generative Adversarial Networks (GANs)
The proposed GANs-based model can generate a tumor image from a normal image, and in turn, it can also generate a normal image from a tumor image.
We train the classification model using real images with classic data augmentation methods and classification models using synthetic images.
arXiv Detail & Related papers (2020-11-15T14:01:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.