Bayesian Event-Based Model for Disease Subtype and Stage Inference
- URL: http://arxiv.org/abs/2512.03467v1
- Date: Wed, 03 Dec 2025 05:45:16 GMT
- Title: Bayesian Event-Based Model for Disease Subtype and Stage Inference
- Authors: Hongtao Hao, Joseph L. Austerweil,
- Abstract summary: Subtype and Stage Inference Event-Based Model (SuStaIn) has been widely applied to uncover the subtypes of many diseases.<n>We develop a principled Bayesian subtype variant of the event-based model (BEBMS) and compare its performance to SuStaIn.<n>We find BEBMS has results that are more consistent with the scientific consensus of Alzheimer's disease progression than SuStaIn.
- Score: 1.2318267573115809
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Chronic diseases often progress differently across patients. Rather than randomly varying, there are typically a small number of subtypes for how a disease progresses across patients. To capture this structured heterogeneity, the Subtype and Stage Inference Event-Based Model (SuStaIn) estimates the number of subtypes, the order of disease progression for each subtype, and assigns each patient to a subtype from primarily cross-sectional data. It has been widely applied to uncover the subtypes of many diseases and inform our understanding of them. But how robust is its performance? In this paper, we develop a principled Bayesian subtype variant of the event-based model (BEBMS) and compare its performance to SuStaIn in a variety of synthetic data experiments with varied levels of model misspecification. BEBMS substantially outperforms SuStaIn across ordering, staging, and subtype assignment tasks. Further, we apply BEBMS and SuStaIn to a real-world Alzheimer's data set. We find BEBMS has results that are more consistent with the scientific consensus of Alzheimer's disease progression than SuStaIn.
Related papers
- Disease Progression and Subtype Modeling for Combined Discrete and Continuous Input Data [37.93788591991097]
We propose a novel disease progression model that handles both discrete and continuous data types.<n>We demonstrate the effectiveness of Mixed-SuStaIn through simulation experiments and real-world data from the Alzheimer's Disease Neuroimaging Initiative.
arXiv Detail & Related papers (2026-02-25T15:31:30Z) - Clustering Alzheimer's Disease Subtypes via Similarity Learning and Graph Diffusion [14.536841566365048]
Alzheimer's disease (AD) is a complex neurodegenerative disorder that affects millions of people worldwide.
In this study, we aim to identify subtypes of AD that represent distinctive clinical features and underlying pathology.
arXiv Detail & Related papers (2024-10-04T21:38:14Z) - Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold [83.18058549195855]
We argue that multiple processes in natural sciences have to be represented as vector fields on the Wasserstein manifold of probability densities.<n>In particular, this is crucial for personalized medicine where the development of diseases and their respective treatment response depend on the microenvironment of cells specific to each patient.<n>We propose Meta Flow Matching (MFM), a practical approach to integrate along these vector fields on the Wasserstein manifold by amortizing the flow model over the initial populations.
arXiv Detail & Related papers (2024-08-26T20:05:31Z) - Multimodal Neurodegenerative Disease Subtyping Explained by ChatGPT [15.942849233189664]
Alzheimer's disease is the most prevalent neurodegenerative disease.
Current data driven approaches are able to classify the subtypes at later stages of AD or related disorders, but struggle when predicting at the asymptomatic or prodromal stage.
We propose a multimodal framework that uses early-stage indicators such as imaging, genetics and clinical assessments to classify AD patients into subtypes at early stages.
arXiv Detail & Related papers (2024-01-31T19:30:04Z) - Cascaded Multi-Modal Mixing Transformers for Alzheimer's Disease
Classification with Incomplete Data [8.536869574065195]
Multi-Modal Mixing Transformer (3MAT) is a disease classification transformer that not only leverages multi-modal data but also handles missing data scenarios.
We propose a novel modality dropout mechanism to ensure an unprecedented level of modality independence and robustness to handle missing data scenarios.
arXiv Detail & Related papers (2022-10-01T11:31:02Z) - Robust Hierarchical Patterns for identifying MDD patients: A Multisite
Study [3.4561220135252264]
We look at hierarchical Sparse Connectivity Patterns (h SCPs) as biomarkers for major depressive disorder (MDD)
We propose a novel model based on h SCPs to predict MDD patients from functional connectivity matrices extracted from resting-state fMRI data.
Our results show the impact of diversity on prediction performance. Our model can reduce diversity and improve the predictive and generalizing capability of the components.
arXiv Detail & Related papers (2022-02-22T19:40:32Z) - SANSformers: Self-Supervised Forecasting in Electronic Health Records
with Attention-Free Models [48.07469930813923]
This work aims to forecast the demand for healthcare services, by predicting the number of patient visits to healthcare facilities.
We introduce SANSformer, an attention-free sequential model designed with specific inductive biases to cater for the unique characteristics of EHR data.
Our results illuminate the promising potential of tailored attention-free models and self-supervised pretraining in refining healthcare utilization predictions across various patient demographics.
arXiv Detail & Related papers (2021-08-31T08:23:56Z) - Relational Subsets Knowledge Distillation for Long-tailed Retinal
Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge.
It enforces the model to focus on learning the subset-specific knowledge.
The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z) - Adversarial Sample Enhanced Domain Adaptation: A Case Study on
Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and the generality on different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z) - Personalized pathology test for Cardio-vascular disease: Approximate
Bayesian computation with discriminative summary statistics learning [48.7576911714538]
We propose a platelet deposition model and an inferential scheme to estimate the biologically meaningful parameters using approximate computation.
This work opens up an unprecedented opportunity of personalized pathology test for CVD detection and medical treatment.
arXiv Detail & Related papers (2020-10-13T15:20:21Z) - Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z) - Predicting Clinical Diagnosis from Patients Electronic Health Records
Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in medical community.
We present a modification of Bidirectional Representations from Transformers (BERT) model for classification sequence.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.