Stochastic Siamese MAE Pretraining for Longitudinal Medical Images
- URL: http://arxiv.org/abs/2512.23441v1
- Date: Mon, 29 Dec 2025 13:00:12 GMT
- Title: Stochastic Siamese MAE Pretraining for Longitudinal Medical Images
- Authors: Taha Emre, Arunava Chakravarty, Thomas Pinetz, Dmitrii Lachinov, Martin J. Menten, Hendrik Scholl, Sobha Sivaprasad, Daniel Rueckert, Andrew Lotery, Stefan Sacu, Ursula Schmidt-Erfurth, Hrvoje Bogunović
- Abstract summary: STAMP is a Siamese MAE framework that encodes temporal information through a stochastic process by conditioning on the time difference between the two input volumes. Unlike deterministic approaches, which compare scans from different time points but fail to account for the inherent uncertainty in disease evolution, STAMP learns temporal dynamics stochastically by reframing the MAE reconstruction loss as a conditional variational inference objective. We evaluated STAMP on two OCT and one MRI datasets with multiple visits per patient. STAMP-pretrained ViT models outperformed both existing temporal MAE methods and foundation models on different late-stage Age-Related Macular Degeneration and Alzheimer's Disease progression prediction tasks.
- Score: 18.38706070993135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Temporally aware image representations are crucial for capturing disease progression in 3D volumes of longitudinal medical datasets. However, recent state-of-the-art self-supervised learning approaches like Masked Autoencoding (MAE), despite their strong representation learning capabilities, lack temporal awareness. In this paper, we propose STAMP (Stochastic Temporal Autoencoder with Masked Pretraining), a Siamese MAE framework that encodes temporal information through a stochastic process by conditioning on the time difference between the 2 input volumes. Unlike deterministic Siamese approaches, which compare scans from different time points but fail to account for the inherent uncertainty in disease evolution, STAMP learns temporal dynamics stochastically by reframing the MAE reconstruction loss as a conditional variational inference objective. We evaluated STAMP on two OCT and one MRI datasets with multiple visits per patient. STAMP pretrained ViT models outperformed both existing temporal MAE methods and foundation models on different late stage Age-Related Macular Degeneration and Alzheimer's Disease progression prediction which require models to learn the underlying non-deterministic temporal dynamics of the diseases.
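The abstract describes reframing the MAE reconstruction loss as a conditional variational inference objective. The following is a minimal sketch of what such an objective can look like in general, not the authors' implementation: it assumes diagonal Gaussian prior and posterior distributions, a mean-squared reconstruction error over masked patches, and a weighting factor `beta`. All function names, including the hypothetical `toy_prior` that stands in for a learned prior network conditioned on the time difference, are illustrative assumptions.

```python
import math


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def masked_mse(pred, target, mask):
    """Mean squared error computed only on masked patches (mask value 1)."""
    errs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errs) / len(errs) if errs else 0.0


def toy_prior(delta_t, dim=2):
    """Hypothetical stand-in for a learned prior network.

    Assumption for illustration: the prior variance grows with the time gap,
    so longer intervals allow more uncertainty about the later scan.
    """
    return [0.0] * dim, [math.log(1.0 + delta_t)] * dim


def conditional_variational_loss(pred, target, mask, mu_q, logvar_q, delta_t, beta=1.0):
    """Masked reconstruction error plus a beta-weighted KL term.

    The posterior (mu_q, logvar_q) would come from an encoder that sees both
    scans; the prior depends only on the time difference in this toy version.
    """
    mu_p, logvar_p = toy_prior(delta_t, dim=len(mu_q))
    recon = masked_mse(pred, target, mask)
    kl = gaussian_kl(mu_q, logvar_q, mu_p, logvar_p)
    return recon + beta * kl


if __name__ == "__main__":
    loss = conditional_variational_loss(
        pred=[1.0, 2.0, 3.0],
        target=[1.0, 0.0, 3.0],
        mask=[0, 1, 0],          # only the second patch is masked
        mu_q=[1.0, 0.0],
        logvar_q=[0.0, 0.0],
        delta_t=0.0,             # zero time gap gives a unit-variance prior here
    )
    print(loss)
```

In this toy setup the KL term penalizes a posterior that strays from what the time gap alone would predict, while the reconstruction term is evaluated only on masked patches, as in standard MAE training.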
Related papers
- Conditional Neural ODE for Longitudinal Parkinson's Disease Progression Forecasting [51.906871559732245]
Parkinson's disease (PD) shows heterogeneous, evolving brain-morphometry patterns. Modeling these longitudinal trajectories enables mechanistic insight, treatment development, and individualized 'digital-twin' forecasting. We propose CNODE, a novel framework for continuous, individualized PD progression forecasting.
arXiv Detail & Related papers (2025-11-06T20:16:33Z)
- Multi-Task Diffusion Approach For Prediction of Glioma Tumor Progression [0.6978367196609415]
Glioma is an aggressive brain malignancy that poses significant challenges for accurate evolution prediction. In this paper, we present a multitask diffusion framework for time-agnostic, pixel-wise prediction of glioma progression.
arXiv Detail & Related papers (2025-09-13T14:42:46Z)
- Temporally-Aware Diffusion Model for Brain Progression Modelling with Bidirectional Temporal Regularisation [7.097850157718258]
Current approaches fail to explicitly capture the relationship between structural changes and time intervals. Most approaches rely on 2D slice-based architectures, thereby disregarding full 3D anatomical context. We propose a 3D Temporally-Aware Diffusion Model (TADM-3D), which accurately predicts brain progression on MRI volumes.
arXiv Detail & Related papers (2025-09-03T08:51:38Z)
- Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation. We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation.
arXiv Detail & Related papers (2024-12-30T01:59:34Z)
- CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis [50.56875995511431]
We introduce a Cross-Modal Temporal Pattern Discovery (CTPD) framework, designed to efficiently extract meaningful cross-modal temporal patterns from multimodal EHR data. Our approach introduces shared initial temporal pattern representations which are refined using slot attention to generate temporal semantic embeddings.
arXiv Detail & Related papers (2024-11-01T15:54:07Z)
- Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective by forward mapping classification (FMC) and reverse mapping regression (RMR).
arXiv Detail & Related papers (2024-05-30T03:15:09Z)
- Temporally Adjustable Longitudinal Fluid-Attenuated Inversion Recovery MRI Estimation / Synthesis for Multiple Sclerosis [0.0]
Multiple Sclerosis (MS) is a chronic progressive neurological disease characterized by the development of lesions in the white matter of the brain.
FLAIR brain magnetic resonance imaging (MRI) provides superior visualization and characterization of MS lesions, relative to other MRI modalities.
Longitudinal brain FLAIR MRI in MS, involving repetitively imaging a patient over time, provides helpful information for clinicians towards monitoring disease progression.
Predicting future whole brain MRI examinations with variable time lag has only been attempted in limited applications, such as healthy aging and structural degeneration in Alzheimer's Disease.
arXiv Detail & Related papers (2022-09-09T12:42:00Z)
- TINC: Temporally Informed Non-Contrastive Learning for Disease Progression Modeling in Retinal OCT Volumes [4.397304270654923]
Non-contrastive methods implicitly incorporate negatives in the loss, allowing different images and modalities as pairs.
We exploited already existing temporal information in a longitudinal optical coherence tomography dataset using temporally informed non-contrastive loss.
Our model outperforms existing models in predicting the risk of conversion within a time frame from intermediate age-related macular degeneration (AMD) to the late wet-AMD stage.
arXiv Detail & Related papers (2022-06-30T13:42:09Z)
- Efficient Learning and Decoding of the Continuous-Time Hidden Markov Model for Disease Progression Modeling [119.50438407358862]
We present the first complete characterization of efficient EM-based learning methods for CT-HMM models.
We show that EM-based learning consists of two challenges: the estimation of posterior state probabilities and the computation of end-state conditioned statistics.
We demonstrate the use of CT-HMMs with more than 100 states to visualize and predict disease progression using a glaucoma dataset and an Alzheimer's disease dataset.
arXiv Detail & Related papers (2021-10-26T20:06:05Z)
- Deep Recurrent Model for Individualized Prediction of Alzheimer's Disease Progression [4.034948808542701]
Alzheimer's disease (AD) is one of the major causes of dementia and is characterized by slow progression over several years.
We propose a novel computational framework that can predict the phenotypic measurements of MRI biomarkers and trajectories of clinical status.
arXiv Detail & Related papers (2020-05-06T08:08:00Z)
- Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes [102.02672176520382]
Comorbid diseases co-occur and progress via complex temporal patterns that vary among individuals.
In electronic health records we can observe the different diseases a patient has, but can only infer the temporal relationship between each co-morbid condition.
We develop deep diffusion processes to model "dynamic comorbidity networks".
arXiv Detail & Related papers (2020-01-08T15:47:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.