Multimodal Disease Progression Modeling via Spatiotemporal Disentanglement and Multiscale Alignment
- URL: http://arxiv.org/abs/2510.11112v1
- Date: Mon, 13 Oct 2025 08:02:36 GMT
- Authors: Chen Liu, Wenfang Yao, Kejing Yin, William K. Cheung, Jing Qin
- Abstract summary: $\texttt{DiPro}$ is a novel framework that addresses challenges through region-aware disentanglement and multi-timescale alignment. Experiments on the MIMIC dataset demonstrate that $\texttt{DiPro}$ could effectively extract temporal clinical dynamics.
- Score: 16.824692012617334
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Longitudinal multimodal data, including electronic health records (EHR) and sequential chest X-rays (CXRs), is critical for modeling disease progression, yet remains underutilized due to two key challenges: (1) redundancy in consecutive CXR sequences, where static anatomical regions dominate over clinically-meaningful dynamics, and (2) temporal misalignment between sparse, irregular imaging and continuous EHR data. We introduce $\texttt{DiPro}$, a novel framework that addresses these challenges through region-aware disentanglement and multi-timescale alignment. First, we disentangle static (anatomy) and dynamic (pathology progression) features in sequential CXRs, prioritizing disease-relevant changes. Second, we hierarchically align these static and dynamic CXR features with asynchronous EHR data via local (pairwise interval-level) and global (full-sequence) synchronization to model coherent progression pathways. Extensive experiments on the MIMIC dataset demonstrate that $\texttt{DiPro}$ could effectively extract temporal clinical dynamics and achieve state-of-the-art performance on both disease progression identification and general ICU prediction tasks.
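To make the local (pairwise interval-level) and global (full-sequence) alignment in the abstract concrete, here is a minimal sketch. It is not the authors' implementation. It assumes timestamps are hours since admission and that the EHR measurement times are sorted; the helper names `interval_windows` and `global_window` are hypothetical. It only shows how sparse, irregular CXR times could be mapped onto slices of a denser EHR series.

```python
from bisect import bisect_left, bisect_right


def interval_windows(cxr_times, ehr_times):
    """Local alignment sketch: for each consecutive CXR pair (t_i, t_{i+1}),
    return the half-open index range [lo, hi) of EHR rows with
    t_i <= time < t_{i+1}. Both inputs must be sorted ascending."""
    windows = []
    for start, end in zip(cxr_times, cxr_times[1:]):
        lo = bisect_left(ehr_times, start)
        hi = bisect_left(ehr_times, end)
        windows.append((lo, hi))
    return windows


def global_window(cxr_times, ehr_times):
    """Global alignment sketch: index range [lo, hi) of EHR rows between the
    first and last CXR, inclusive of both endpoints."""
    lo = bisect_left(ehr_times, cxr_times[0])
    hi = bisect_right(ehr_times, cxr_times[-1])
    return (lo, hi)


# Hypothetical example: three CXRs taken at irregular times, hourly EHR rows.
cxr_times = [2.0, 9.5, 30.0]
ehr_times = [float(h) for h in range(36)]  # 0.0, 1.0, ..., 35.0

print(interval_windows(cxr_times, ehr_times))  # [(2, 10), (10, 30)]
print(global_window(cxr_times, ehr_times))     # (2, 31)
```

In a full model, each local window would presumably feed an EHR encoder paired with the dynamic CXR features for that interval, while the global window would pair with the sequence-level representation; that pairing is an assumption here, not a detail stated in the abstract.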
Related papers
- Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics [51.85385061275941]
Molecular dynamics (MD) simulations remain the gold standard for studying protein dynamics.
Recent generative models have shown promise in accelerating simulations, yet they struggle with long-horizon generation.
We present STAR-MD, a scalable diffusion model that generates physically plausible protein trajectories over micro-scale timescales.
arXiv Detail & Related papers (2026-02-02T14:13:28Z)
- NeuroSSM: Multiscale Differential State-Space Modeling for Context-Aware fMRI Analysis [4.753690672619091]
We propose NeuroSSM, a selective state-space architecture designed for end-to-end analysis of raw BOLD signals in fMRI time series.
NeuroSSM addresses the above limitations through two complementary design components.
Experiments on clinical and non-clinical datasets demonstrate that NeuroSSM achieves competitive performance and efficiency against state-of-the-art fMRI analysis methods.
arXiv Detail & Related papers (2026-01-03T16:35:45Z)
- Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation [17.33607122354623]
Understanding disease progression is a central clinical challenge with implications for early diagnosis and personalized treatment.
We propose to treat the disease dynamic as a velocity field and leverage Flow Matching (FM) to align the temporal evolution of patient data.
We present $$-LFM, a framework for modeling patient-specific latent progression with flow matching.
arXiv Detail & Related papers (2025-12-09T23:13:54Z)
- Machine Learning Approaches to Clinical Risk Prediction: Multi-Scale Temporal Alignment in Electronic Health Records [2.9576397177561087]
This study proposes a risk prediction method based on a Multi-Scale Temporal Alignment Network (MSTAN).
It addresses the challenges of temporal irregularity, sampling interval differences, and multi-scale dynamic dependencies in Electronic Health Records (EHR).
Experiments conducted on publicly available EHR datasets show that the proposed model outperforms mainstream baselines in accuracy, recall, precision, and F1-Score.
arXiv Detail & Related papers (2025-11-26T16:33:59Z)
- Conditional Neural ODE for Longitudinal Parkinson's Disease Progression Forecasting [51.906871559732245]
Parkinson's disease (PD) shows heterogeneous, evolving brain-morphometry patterns.
Modeling these longitudinal trajectories enables mechanistic insight, treatment development, and individualized 'digital-twin' forecasting.
We propose CNODE, a novel framework for continuous, individualized PD progression forecasting.
arXiv Detail & Related papers (2025-11-06T20:16:33Z)
- Biaxialformer: Leveraging Channel Independence and Inter-Channel Correlations in EEG Signal Decoding for Predicting Neurological Outcomes [1.7656489005289302]
Accurate decoding of EEG signals requires modeling of both temporal dynamics within individual channels and spatial dependencies across channels.
Transformer-based models utilizing channel-independence (CI) strategies have demonstrated strong performance in various time series tasks.
We propose Biaxialformer, characterized by a meticulously engineered two-stage attention-based framework.
arXiv Detail & Related papers (2025-06-18T11:47:26Z)
- X$^{2}$-Gaussian: 4D Radiative Gaussian Splatting for Continuous-time Tomographic Reconstruction [64.2059940799033]
Current methods discretize temporal resolution into fixed phases with respiratory gating devices.
X$^{2}$-Gaussian, a novel framework, enables continuous-time 4DCT reconstruction by integrating dynamic radiative splatting with self-supervised respiratory motion learning.
arXiv Detail & Related papers (2025-03-27T17:59:57Z)
- CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis [50.56875995511431]
We introduce a Cross-Modal Temporal Pattern Discovery (CTPD) framework, designed to efficiently extract meaningful cross-modal temporal patterns from multimodal EHR data.
Our approach introduces shared initial temporal pattern representations which are refined using slot attention to generate temporal semantic embeddings.
arXiv Detail & Related papers (2024-11-01T15:54:07Z)
- Trajectory Flow Matching with Applications to Clinical Time Series Modeling [77.58277281319253]
Trajectory Flow Matching (TFM) trains a Neural SDE in a simulation-free manner, bypassing backpropagation through the dynamics.
We demonstrate improved performance on three clinical time series datasets in terms of absolute performance and uncertainty prediction.
arXiv Detail & Related papers (2024-10-28T15:54:50Z)
- Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation [14.658627367126009]
We propose DDL-CXR, a method that dynamically generates an up-to-date latent representation of the individualized chest X-ray images.
Our approach leverages latent diffusion models for patient-specific generation strategically conditioned on a previous CXR image and EHR time series.
Experiments using MIMIC datasets show that the proposed model could effectively address asynchronicity in multimodal fusion and consistently outperform existing methods.
arXiv Detail & Related papers (2024-10-23T14:34:39Z)
- PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z)
- Temporal Cross-Attention for Dynamic Embedding and Tokenization of Multimodal Electronic Health Records [1.6609516435725236]
We introduce a dynamic embedding and tokenization framework for precise representation of multimodal clinical time series.
Our framework outperformed baseline approaches on the task of predicting the occurrence of nine postoperative complications.
arXiv Detail & Related papers (2024-03-06T19:46:44Z)
- Cross-Modal Causal Intervention for Medical Report Generation [107.76649943399168]
Radiology Report Generation (RRG) is essential for computer-aided diagnosis and medication guidance.
Generating accurate lesion descriptions remains challenging due to spurious correlations from visual-linguistic biases.
We propose a two-stage framework named CrossModal Causal Representation Learning (CMCRL).
Experiments on IU-Xray and MIMIC-CXR show that our CMCRL pipeline significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-03-16T07:23:55Z)
- An Adversarial Domain Separation Framework for Septic Shock Early Prediction Across EHR Systems [7.058760708627898]
We propose a general domain adaptation (DA) framework that tackles two categories of discrepancies in EHRs collected from different medical systems.
We evaluate our framework for early diagnosis of an extremely challenging condition, septic shock, using two real-world EHRs from distinct medical systems in the U.S.
arXiv Detail & Related papers (2020-10-26T23:41:33Z)
- Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes [102.02672176520382]
Comorbid diseases co-occur and progress via complex temporal patterns that vary among individuals.
In electronic health records we can observe the different diseases a patient has, but can only infer the temporal relationship between each co-morbid condition.
We develop deep diffusion processes to model "dynamic comorbidity networks"
arXiv Detail & Related papers (2020-01-08T15:47:08Z)