Related papers: 3D Foundation AI Model for Generalizable Disease Detection in Head Computed Tomography

3D Foundation AI Model for Generalizable Disease Detection in Head Computed Tomography

URL: http://arxiv.org/abs/2502.02779v1
Date: Tue, 04 Feb 2025 23:42:18 GMT
Title: 3D Foundation AI Model for Generalizable Disease Detection in Head Computed Tomography
Authors: Weicheng Zhu, Haoxu Huang, Huanze Tang, Rushabh Musthyala, Boyang Yu, Long Chen, Emilio Vega, Thomas O'Donnell, Seena Dehkharghani, Jennifer A. Frontera, Arjun V. Masurkar, Kara Melmed, Narges Razavian,
Abstract summary: We introduce FM-CT: a Foundation Model for Head CT for generalizable disease detection, trained using self-supervised learning.<n>Our approach pre-trains a deep learning model on a large, diverse dataset of 361,663 non-contrast 3D head CT scans without the need for manual annotations.<n>Our results demonstrate that the self-supervised foundation model significantly improves performance on downstream diagnostic tasks.
Score: 8.896955286474991
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Head computed tomography (CT) imaging is a widely-used imaging modality with multitudes of medical indications, particularly in assessing pathology of the brain, skull, and cerebrovascular system. It is commonly the first-line imaging in neurologic emergencies given its rapidity of image acquisition, safety, cost, and ubiquity. Deep learning models may facilitate detection of a wide range of diseases. However, the scarcity of high-quality labels and annotations, particularly among less common conditions, significantly hinders the development of powerful models. To address this challenge, we introduce FM-CT: a Foundation Model for Head CT for generalizable disease detection, trained using self-supervised learning. Our approach pre-trains a deep learning model on a large, diverse dataset of 361,663 non-contrast 3D head CT scans without the need for manual annotations, enabling the model to learn robust, generalizable features. To investigate the potential of self-supervised learning in head CT, we employed both discrimination with self-distillation and masked image modeling, and we construct our model in 3D rather than at the slice level (2D) to exploit the structure of head CT scans more comprehensively and efficiently. The model's downstream classification performance is evaluated using internal and three external datasets, encompassing both in-distribution (ID) and out-of-distribution (OOD) data. Our results demonstrate that the self-supervised foundation model significantly improves performance on downstream diagnostic tasks compared to models trained from scratch and previous 3D CT foundation models on scarce annotated datasets. This work highlights the effectiveness of self-supervised learning in medical imaging and sets a new benchmark for head CT image analysis in 3D, enabling broader use of artificial intelligence for head CT-based diagnosis.

Related papers

Vision Foundation Models for Computed Tomography [0.5320113414681007]
Foundation models (FMs) have shown transformative potential in radiology by performing diverse, complex tasks across imaging modalities.<n>Here, we developed CT-FM, a large-scale 3D image-based pre-trained model designed explicitly for various radiological tasks.<n>CT-FM was pre-trained using 148,000 computed tomography (CT) scans from the Imaging Data Commons through label-agnostic contrastive learning.
arXiv Detail & Related papers (2025-01-15T18:30:58Z)
Abnormality-Driven Representation Learning for Radiology Imaging [0.8321462983924758]
We introduce lesion-enhanced contrastive learning (LeCL), a novel approach to obtain visual representations driven by abnormalities in 2D axial slices across different locations of the CT scans. We evaluate our approach across three clinical tasks: tumor lesion location, lung disease detection, and patient staging, benchmarking against four state-of-the-art foundation models.
arXiv Detail & Related papers (2024-11-25T13:53:26Z)
3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans. Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z)
LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets. We have collected approximately 1.3 million medical images from 55 publicly available datasets. LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
Self-supervised Model Based on Masked Autoencoders Advance CT Scans Classification [0.0]
This paper is inspired by the self-supervised learning algorithm MAE. It uses the MAE model pre-trained on ImageNet to perform transfer learning on CT Scans dataset. This method improves the generalization performance of the model and avoids the risk of overfitting on small datasets.
arXiv Detail & Related papers (2022-10-11T00:52:05Z)
Slice-level Detection of Intracranial Hemorrhage on CT Using Deep Descriptors of Adjacent Slices [0.31317409221921133]
We propose a new strategy to train emphslice-level classifiers on CT scans based on the descriptors of the adjacent slices along the axis. We obtain a single model in the top 4% best-performing solutions of the RSNA Intracranial Hemorrhage dataset challenge. The proposed method is general and can be applied to other 3D medical diagnosis tasks such as MRI imaging.
arXiv Detail & Related papers (2022-08-05T23:20:37Z)
ROCT-Net: A new ensemble deep convolutional model with improved spatial resolution learning for detecting common diseases from retinal OCT images [0.0]
This paper presents a new enhanced deep ensemble convolutional neural network for detecting retinal diseases from OCT images. Our model generates rich and multi-resolution features by employing the learning architectures of two robust convolutional models. Our experiments on two datasets and comparing our model with some other well-known deep convolutional neural networks have proven that our architecture can increase the classification accuracy up to 5%.
arXiv Detail & Related papers (2022-03-03T17:51:01Z)
Explainable multiple abnormality classification of chest CT volumes with AxialNet and HiResCAM [89.2175350956813]
We introduce the challenging new task of explainable multiple abnormality classification in volumetric medical images. We propose a multiple instance learning convolutional neural network, AxialNet, that allows identification of top slices for each abnormality. We then aim to improve the model's learning through a novel mask loss that leverages HiResCAM and 3D allowed regions.
arXiv Detail & Related papers (2021-11-24T01:14:33Z)
Application of Homomorphic Encryption in Medical Imaging [60.51436886110803]
We show how HE can be used to make predictions over medical images while preventing unauthorized secondary use of data. We report some experiments using 3D chest CT-Scans for a nodule detection task.
arXiv Detail & Related papers (2021-10-12T19:57:12Z)
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans [72.04652116817238]
We propose a differentiable neural architecture search (DNAS) framework to automatically search for the 3D DL models for 3D chest CT scans classification. We also exploit the Class Activation Mapping (CAM) technique on our models to provide the interpretability of the results.
arXiv Detail & Related papers (2021-01-14T03:45:01Z)
A Multi-Stage Attentive Transfer Learning Framework for Improving COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis. Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains. Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.