X2CT-CLIP: Enable Multi-Abnormality Detection in Computed Tomography from Chest Radiography via Tri-Modal Contrastive Learning
- URL: http://arxiv.org/abs/2503.02162v2
- Date: Tue, 11 Mar 2025 00:50:53 GMT
- Title: X2CT-CLIP: Enable Multi-Abnormality Detection in Computed Tomography from Chest Radiography via Tri-Modal Contrastive Learning
- Authors: Jianzhong You, Yuan Gao, Sangwook Kim, Chris McIntosh
- Abstract summary: We propose X2CT-CLIP, a tri-modal knowledge transfer learning framework that bridges the modality gap between CT and CXR. Our approach is the first to enable multi-abnormality classification in CT, using CXR, by transferring knowledge from 3D CT volumes and associated radiology reports to a CXR encoder.
- Score: 18.954939735299963
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Computed tomography (CT) is a key imaging modality for diagnosis, yet its clinical utility is marred by high radiation exposure and long turnaround times, restricting its use for large-scale screening. Although chest radiography (CXR) is more accessible and safer, existing CXR foundation models focus primarily on detecting diseases that are readily visible on the CXR. Recently, works have explored training disease classification models on simulated CXRs, but they remain limited to recognizing a single disease type from CT. CT foundation models have also emerged with significantly improved detection of pathologies in CT. However, the generalized application of CT-derived labels on CXR has remained elusive. In this study, we propose X2CT-CLIP, a tri-modal knowledge transfer learning framework that bridges the modality gap between CT and CXR while reducing the computational burden of model training. Our approach is the first work to enable multi-abnormality classification in CT, using CXR, by transferring knowledge from 3D CT volumes and associated radiology reports to a CXR encoder via a carefully designed tri-modal alignment mechanism in latent space. Extensive evaluations on three multi-label CT datasets demonstrate that our method outperforms state-of-the-art baselines in cross-modal retrieval, few-shot adaptation, and external validation. These results highlight the potential of CXR, enriched with knowledge derived from CT, as a viable, efficient alternative for disease detection in resource-limited settings.
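The abstract describes a tri-modal alignment mechanism in latent space. As a rough, hypothetical sketch only (not the authors' released code), such an alignment can be pictured as CLIP-style contrastive losses that pull a trainable CXR encoder's embeddings toward frozen CT-volume and report embeddings; all function names, weights, and the exact loss form below are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' implementation) of a tri-modal
# contrastive alignment step: a trainable CXR encoder is pulled toward
# frozen CT-volume and radiology-report embeddings in a shared latent space.
import torch
import torch.nn.functional as F


def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss between two batches of paired embeddings."""
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature                    # (B, B) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)  # matching pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


def tri_modal_loss(cxr_emb, ct_emb, report_emb, w_ct: float = 1.0, w_txt: float = 1.0):
    """Align CXR embeddings with both CT and report embeddings.

    ct_emb and report_emb are assumed to come from frozen, pre-trained
    encoders, so only the CXR encoder receives gradients.
    """
    return w_ct * info_nce(cxr_emb, ct_emb) + w_txt * info_nce(cxr_emb, report_emb)


if __name__ == "__main__":
    # Toy usage with random tensors standing in for encoder outputs.
    B, D = 8, 512
    cxr = torch.randn(B, D, requires_grad=True)  # trainable CXR encoder output
    ct = torch.randn(B, D)                       # frozen CT-volume embeddings
    rpt = torch.randn(B, D)                      # frozen report embeddings
    loss = tri_modal_loss(cxr, ct, rpt)
    loss.backward()
    print(f"tri-modal loss: {loss.item():.4f}")
```

In this sketch only the CXR encoder would be updated, with CT and report embeddings precomputed by a pretrained CT foundation model; that division of labor is one plausible reading of the abstract's claim of reduced training cost, not a description of the paper's actual mechanism.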
Related papers
- 3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models [51.855377054763345]
This paper introduces 3D-CT-GPT, a Visual Question Answering (VQA)-based medical visual language model for generating radiology reports from 3D CT scans.
Experiments on both public and private datasets demonstrate that 3D-CT-GPT significantly outperforms existing methods in terms of report accuracy and quality.
arXiv Detail & Related papers (2024-09-28T12:31:07Z) - X-Recon: Learning-based Patient-specific High-Resolution CT Reconstruction from Orthogonal X-Ray Images [14.04604990570727]
X-Recon is a reconstruction network based on ortho-lateral chest X-ray images.
PTX-Seg is a zero-shot pneumothorax segmentation algorithm.
The reconstruction achieved state-of-the-art performance on several metrics, including peak signal-to-noise ratio.
arXiv Detail & Related papers (2024-07-22T03:55:36Z) - X-ray2CTPA: Generating 3D CTPA scans from 2D X-ray conditioning [24.233484690096898]
Chest X-rays, or chest radiography (CXR), provide more limited imaging than computed tomography (CT) scans.
CT scans entail higher costs, greater radiation exposure, and are less accessible than CXRs.
In this work, we explore cross-modal translation from a 2D low contrast-resolution X-ray input to a 3D high contrast and spatial-resolution CTPA scan.
arXiv Detail & Related papers (2024-06-23T13:53:35Z) - Low-Resolution Chest X-ray Classification via Knowledge Distillation and Multi-task Learning [46.75992018094998]
This research addresses the challenges of diagnosing chest X-rays (CXRs) at low resolutions.
High-resolution CXR imaging is crucial for identifying small but critical anomalies, such as nodules or opacities.
This paper presents the Multilevel Collaborative Attention Knowledge (MLCAK) method.
arXiv Detail & Related papers (2024-05-22T06:10:54Z) - Developing Generalist Foundation Models from a Multimodal Dataset for 3D Computed Tomography [1.8424705673580284]
We introduce CT-RATE, the first dataset that pairs 3D medical images with corresponding textual reports.
We develop CT-CLIP, a CT-focused contrastive language-image pretraining framework.
We create CT-CHAT, a vision-language foundational chat model for 3D chest CT volumes.
arXiv Detail & Related papers (2024-03-26T16:19:56Z) - CT Reconstruction from Few Planar X-rays with Application towards Low-resource Radiotherapy [20.353246282326943]
We propose a method to generate CT volumes from few (5) planar X-ray observations using a prior data distribution.
To focus the generation task on clinically-relevant features, our model can also leverage anatomical guidance during training.
Our method outperforms recent sparse CT reconstruction baselines on standard pixel- and structure-level metrics.
arXiv Detail & Related papers (2023-08-04T01:17:57Z) - Revisiting Computer-Aided Tuberculosis Diagnosis [56.80999479735375]
Tuberculosis (TB) is a major global health threat, causing millions of deaths annually.
Computer-aided tuberculosis diagnosis (CTD) using deep learning has shown promise, but progress is hindered by limited training data.
We establish a large-scale dataset, namely the Tuberculosis X-ray (TBX11K) dataset, which contains 11,200 chest X-ray (CXR) images with corresponding bounding box annotations for TB areas.
This dataset enables the training of sophisticated detectors for high-quality CTD.
arXiv Detail & Related papers (2023-07-06T08:27:48Z) - Accurate Fine-Grained Segmentation of Human Anatomy in Radiographs via Volumetric Pseudo-Labeling [66.75096111651062]
We created a large-scale dataset of 10,021 thoracic CTs with 157 labels.
We applied an ensemble of 3D anatomy segmentation models to extract anatomical pseudo-labels.
Our resulting segmentation models demonstrated remarkable performance on CXR.
arXiv Detail & Related papers (2023-06-06T18:01:08Z) - Chest X-ray Image Classification: A Causal Perspective [49.87607548975686]
We propose a causal approach to address the CXR classification problem, which constructs a structural causal model (SCM) and uses the backdoor adjustment to select effective visual information for CXR classification.
Experimental results demonstrate that our proposed method outperforms existing approaches on the open-source NIH ChestX-ray14 dataset in terms of classification performance.
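For background only (the paper's own choice of adjustment variables for CXR features is specific to its SCM), the backdoor adjustment referred to above is the standard causal-inference identity P(Y | do(X)) = Σ_z P(Y | X, Z = z) · P(Z = z), which replaces the confounded conditional P(Y | X) with an estimate that controls for the confounders Z.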
arXiv Detail & Related papers (2023-05-20T03:17:44Z) - Improving Computed Tomography (CT) Reconstruction via 3D Shape Induction [3.1498833540989413]
We propose shape induction, that is, learning the shape of 3D CT from X-ray without CT supervision, as a novel technique to incorporate realistic X-ray distributions during training of a reconstruction model.
Our experiments demonstrate that this process improves both the perceptual quality of generated CT and the accuracy of down-stream classification of pulmonary infectious diseases.
arXiv Detail & Related papers (2022-08-23T13:06:02Z) - Optimising Chest X-Rays for Image Analysis by Identifying and Removing Confounding Factors [49.005337470305584]
During the COVID-19 pandemic, the sheer volume of imaging performed in an emergency setting for COVID-19 diagnosis has resulted in a wide variability of clinical CXR acquisitions.
The variable quality of clinically-acquired CXRs within publicly available datasets could have a profound effect on algorithm performance.
We propose a simple and effective step-wise approach to pre-processing a COVID-19 chest X-ray dataset to remove undesired biases.
arXiv Detail & Related papers (2022-08-22T13:57:04Z) - COVIDx CT-3: A Large-scale, Multinational, Open-Source Benchmark Dataset for Computer-aided COVID-19 Screening from Chest CT Images [82.74877848011798]
We introduce COVIDx CT-3, a large-scale benchmark dataset for detection of COVID-19 cases from chest CT images.
COVIDx CT-3 includes 431,205 CT slices from 6,068 patients across at least 17 countries.
We examine the data diversity and potential biases of the COVIDx CT-3 dataset, finding significant geographic and class imbalances.
arXiv Detail & Related papers (2022-06-07T06:35:48Z) - COVIDNet-CT: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest CT Images [75.74756992992147]
We introduce COVIDNet-CT, a deep convolutional neural network architecture that is tailored for detection of COVID-19 cases from chest CT images.
We also introduce COVIDx-CT, a benchmark CT image dataset derived from CT imaging data collected by the China National Center for Bioinformation.
arXiv Detail & Related papers (2020-09-08T15:49:55Z) - Multi-modality super-resolution loss for GAN-based super-resolution of clinical CT images using micro CT image database [1.5247645805472543]
This paper introduces a multi-modality loss function for GAN-based super-resolution.
It maintains image structure and intensity on an unpaired training dataset of clinical CT and micro CT volumes.
arXiv Detail & Related papers (2019-12-30T07:49:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.