Temporal Context Matters: Enhancing Single Image Prediction with Disease
Progression Representations
- URL: http://arxiv.org/abs/2203.01933v1
- Date: Wed, 2 Mar 2022 22:11:07 GMT
- Title: Temporal Context Matters: Enhancing Single Image Prediction with Disease
Progression Representations
- Authors: Aishik Konwer, Xuan Xu, Joseph Bae, Chao Chen, Prateek Prasanna
- Abstract summary: We present a deep learning approach that leverages temporal progression information to improve clinical outcome predictions from single-timepoint images.
In our method, a self-attention based Temporal Convolutional Network (TCN) is used to learn a representation that is most reflective of the disease trajectory.
A Vision Transformer is pretrained in a self-supervised fashion to extract features from single-timepoint images.
- Score: 8.396615243014768
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Clinical outcome or severity prediction from medical images has largely
focused on learning representations from single-timepoint or snapshot scans. It
has been shown that disease progression can be better characterized by temporal
imaging. We therefore hypothesized that outcome predictions can be improved by
utilizing the disease progression information from sequential images. We
present a deep learning approach that leverages temporal progression
information to improve clinical outcome predictions from single-timepoint
images. In our method, a self-attention based Temporal Convolutional Network
(TCN) is used to learn a representation that is most reflective of the disease
trajectory. Meanwhile, a Vision Transformer is pretrained in a self-supervised
fashion to extract features from single-timepoint images. The key contribution
is to design a recalibration module that employs maximum mean discrepancy loss
(MMD) to align distributions of the above two contextual representations. We
train our system to predict clinical outcomes and severity grades from
single-timepoint images. Experiments on chest and osteoarthritis radiography
datasets demonstrate that our approach outperforms other state-of-the-art
techniques.
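The paper's key contribution is aligning the distributions of the temporal (TCN) and single-timepoint (ViT) feature representations with a maximum mean discrepancy (MMD) loss. The paper does not include code; as a rough illustration only, a biased estimate of the squared MMD between two feature batches can be computed with an RBF kernel (the kernel choice, bandwidth, and NumPy implementation are assumptions here, not the authors' implementation):

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel between rows of x and rows of y.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    # Biased estimate of the squared maximum mean discrepancy:
    # MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)].
    kxx = rbf_kernel(x, x, sigma).mean()
    kyy = rbf_kernel(y, y, sigma).mean()
    kxy = rbf_kernel(x, y, sigma).mean()
    return kxx + kyy - 2.0 * kxy

# Toy check: features drawn from the same distribution should yield a
# much smaller MMD than features drawn from shifted distributions.
rng = np.random.default_rng(0)
same = mmd2(rng.normal(0, 1, (64, 8)), rng.normal(0, 1, (64, 8)))
shifted = mmd2(rng.normal(0, 1, (64, 8)), rng.normal(3, 1, (64, 8)))
print(same < shifted)  # prints True
```

In the paper's setting, minimizing such a term during training would pull the single-timepoint features toward the distribution of the trajectory-aware features, which is what the recalibration module is described as doing.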
Related papers
- ST-NeRP: Spatial-Temporal Neural Representation Learning with Prior Embedding for Patient-specific Imaging Study [16.383405461343678]
We propose a strategy of spatial-temporal Neural Representation learning with Prior embedding (ST-NeRP) for patient-specific imaging study.
Our strategy involves leveraging an Implicit Neural Representation (INR) network to encode the image at the reference time point into a prior embedding.
This network is trained using the whole patient-specific image sequence, enabling the prediction of deformation fields at various target time points.
arXiv Detail & Related papers (2024-10-25T03:33:17Z)
- Multi-task Learning Approach for Intracranial Hemorrhage Prognosis [0.0]
We propose a 3D multi-task image model to predict prognosis, Glasgow Coma Scale and age, improving accuracy and interpretability.
Our method outperforms current state-of-the-art baseline image models, and demonstrates superior performance in ICH prognosis compared to four board-certified neuroradiologists using only CT scans as input.
arXiv Detail & Related papers (2024-08-16T14:56:17Z)
- Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective by forward mapping classification (FMC) and reverse mapping regression (RMR).
arXiv Detail & Related papers (2024-05-30T03:15:09Z)
- Multi-scale Spatio-temporal Transformer-based Imbalanced Longitudinal Learning for Glaucoma Forecasting from Irregular Time Series Images [45.894671834869975]
Glaucoma is one of the major eye diseases that leads to progressive optic nerve fiber damage and irreversible blindness.
We introduce the Multi-scale Spatio-temporal Transformer Network (MST-former) based on the transformer architecture tailored for sequential image inputs.
Our method shows excellent generalization capability on the Alzheimer's Disease Neuroimaging Initiative (ADNI) MRI dataset, with an accuracy of 90.3% for mild cognitive impairment and Alzheimer's disease prediction.
arXiv Detail & Related papers (2024-02-21T02:16:59Z)
- PIE: Simulating Disease Progression via Progressive Image Editing [27.658116659009025]
Progressive Image Editing (PIE) enables controlled manipulation of disease-related image features.
We leverage recent advancements in text-to-image generative models to simulate disease progression accurately and personalize it for each patient.
PIE is the first of its kind to generate disease progression images meeting real-world standards.
arXiv Detail & Related papers (2023-09-21T02:46:32Z)
- Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z)
- Metadata-enhanced contrastive learning from retinal optical coherence tomography images [7.932410831191909]
We extend conventional contrastive frameworks with a novel metadata-enhanced strategy.
Our approach employs widely available patient metadata to approximate the true set of inter-image contrastive relationships.
Our approach outperforms both standard contrastive methods and a retinal image foundation model in five out of six image-level downstream tasks.
arXiv Detail & Related papers (2022-08-04T08:53:15Z)
- On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading.
We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions.
Our results indicate that models from ImageNet pretraining report a significant increase in performance, generalization and robustness to image distortions.
arXiv Detail & Related papers (2021-06-25T08:32:45Z)
- An Interpretable Multiple-Instance Approach for the Detection of referable Diabetic Retinopathy from Fundus Images [72.94446225783697]
We propose a machine learning system for the detection of referable Diabetic Retinopathy in fundus images.
By extracting local information from image patches and combining it efficiently through an attention mechanism, our system is able to achieve high classification accuracy.
We evaluate our approach on publicly available retinal image datasets, in which it exhibits near state-of-the-art performance.
arXiv Detail & Related papers (2021-03-02T13:14:15Z)
- Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
- Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights.
It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness.
In recent years, there has been a significant effort to automate the diagnosis using deep learning.
This paper builds upon the success of previous models and develops a novel architecture that combines object segmentation and convolutional neural networks (CNNs).
Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.