Time-to-Event Pretraining for 3D Medical Imaging
- URL: http://arxiv.org/abs/2411.09361v1
- Date: Thu, 14 Nov 2024 11:08:54 GMT
- Title: Time-to-Event Pretraining for 3D Medical Imaging
- Authors: Zepeng Huo, Jason Alan Fries, Alejandro Lozano, Jeya Maria Jose Valanarasu, Ethan Steinberg, Louis Blankemeier, Akshay S. Chaudhari, Curtis Langlotz, Nigam H. Shah
- Abstract summary: We introduce time-to-event pretraining, a pretraining framework for 3D medical imaging models.
We use a dataset of 18,945 CT scans (4.2 million 2D images) and time-to-event distributions across thousands of EHR-derived tasks.
Our method improves outcome prediction, achieving an average AUROC increase of 23.7% and a 29.4% gain in Harrell's C-index across 8 benchmark tasks.
- Score: 44.46415168541444
- Abstract: With the rise of medical foundation models and the growing availability of imaging data, scalable pretraining techniques offer a promising way to identify imaging biomarkers predictive of future disease risk. While current self-supervised methods for 3D medical imaging models capture local structural features like organ morphology, they fail to link pixel biomarkers with long-term health outcomes due to a missing context problem. Current approaches lack the temporal context necessary to identify biomarkers correlated with disease progression, as they rely on supervision derived only from images and concurrent text descriptions. To address this, we introduce time-to-event pretraining, a pretraining framework for 3D medical imaging models that leverages large-scale temporal supervision from paired, longitudinal electronic health records (EHRs). Using a dataset of 18,945 CT scans (4.2 million 2D images) and time-to-event distributions across thousands of EHR-derived tasks, our method improves outcome prediction, achieving an average AUROC increase of 23.7% and a 29.4% gain in Harrell's C-index across 8 benchmark tasks. Importantly, these gains are achieved without sacrificing diagnostic classification performance. This study lays the foundation for integrating longitudinal EHR and 3D imaging data to advance clinical risk prediction.
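The headline survival metric, Harrell's C-index, is the fraction of comparable subject pairs in which the model assigns the higher risk score to the subject whose event occurs earlier. A minimal sketch of the standard (uncensored-pair) definition, not taken from the paper itself; variable names are illustrative:

```python
def concordance_index(times, events, scores):
    """Harrell's C-index.

    times  : observed event or censoring times
    events : 1 if the event was observed, 0 if censored
    scores : model-predicted risk (higher = earlier predicted event)

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's time. The pair is concordant when the
    model also scores i as higher risk; ties in score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Perfectly ranked risks over four subjects (subject 2 censored):
c = concordance_index(times=[1, 2, 3, 4],
                      events=[1, 1, 0, 1],
                      scores=[4.0, 3.0, 2.0, 1.0])
# -> 1.0
```

Production survival libraries (e.g. lifelines or scikit-survival) provide vectorized, censoring-aware implementations of the same statistic; the quadratic loop above is only for clarity.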
Related papers
- Brain Latent Progression: Individual-based Spatiotemporal Disease Progression on 3D Brain MRIs via Latent Diffusion [2.7853513988338108]
Brain Latent Progression (BrLP) is a spatiotemporal model designed to predict individual-level disease progression in 3D brain MRIs.
We train and evaluate BrLP on 11, T1-weighted (T1w) brain MRIs from 2,805 subjects and validate its generalizability on an external test set comprising 2,257 MRIs from 962 subjects.
arXiv Detail & Related papers (2025-02-12T16:47:41Z)
- Trustworthy image-to-image translation: evaluating uncertainty calibration in unpaired training scenarios [0.0]
Mammographic screening is an effective method for detecting breast cancer, facilitating early diagnosis.
Deep neural networks have been shown effective in some studies, but their tendency to overfit leaves considerable risk for poor generalisation and misdiagnosis.
Data augmentation schemes based on unpaired neural style transfer models have been proposed that improve generalisability.
We evaluate their performance when trained on image patches parsed from three open access mammography datasets and one non-medical image dataset.
arXiv Detail & Related papers (2025-01-29T11:09:50Z)
- A CNN-Transformer for Classification of Longitudinal 3D MRI Images -- A Case Study on Hepatocellular Carcinoma Prediction [0.0]
HCCNet is a novel model architecture that integrates a 3D adaptation of the ConvNeXt CNN architecture with a Transformer encoder.
Our results show that HCCNet significantly improves predictive accuracy and reliability over baseline models.
arXiv Detail & Related papers (2025-01-18T11:39:46Z)
- Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Scaling by training on large datasets has been shown to enhance the quality and fidelity of image generation and manipulation with diffusion models.
Latent Drifting enables diffusion models to be conditioned on medical images for the complex task of counterfactual image generation.
Our results demonstrate significant performance gains in various scenarios when combined with different fine-tuning schemes.
arXiv Detail & Related papers (2024-12-30T01:59:34Z)
- Multi-task Learning Approach for Intracranial Hemorrhage Prognosis [0.0]
We propose a 3D multi-task image model to predict prognosis, Glasgow Coma Scale and age, improving accuracy and interpretability.
Our method outperforms current state-of-the-art baseline image models, and demonstrates superior performance in ICH prognosis compared to four board-certified neuroradiologists using only CT scans as input.
arXiv Detail & Related papers (2024-08-16T14:56:17Z)
- Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling [49.52787013516891]
Our proposed Longitudinal Transformer for Survival Analysis (LTSA) enables dynamic disease prognosis from longitudinal medical imaging.
A temporal attention analysis also suggested that, while the most recent image is typically the most influential, prior imaging still provides additional prognostic value.
arXiv Detail & Related papers (2024-05-14T17:15:28Z)
- Towards Enhanced Analysis of Lung Cancer Lesions in EBUS-TBNA -- A Semi-Supervised Video Object Detection Method [0.0]
This study aims to establish a computer-aided diagnostic system for lung lesions using endobronchial ultrasound (EBUS).
Object detection models have not previously been applied to EBUS-TBNA.
arXiv Detail & Related papers (2024-04-02T13:23:21Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge the sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.