Clustering Left-Censored Multivariate Time-Series
- URL: http://arxiv.org/abs/2102.07005v1
- Date: Sat, 13 Feb 2021 21:22:40 GMT
- Title: Clustering Left-Censored Multivariate Time-Series
- Authors: Irene Y. Chen, Rahul G. Krishnan, David Sontag
- Abstract summary: We focus on mitigating the interference of left-censorship in the task of clustering.
We develop a deep generative, continuous-time model of time-series data that clusters while correcting for censorship time.
We demonstrate accurate, stable, and interpretable results on synthetic data.
- Score: 2.4493299476776778
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unsupervised learning seeks to uncover patterns in data. However, different
kinds of noise may impede the discovery of useful substructure from real-world
time-series data. In this work, we focus on mitigating the interference of
left-censorship in the task of clustering. We provide conditions under which
clusters and left-censorship may be identified; motivated by this result, we
develop a deep generative, continuous-time model of time-series data that
clusters while correcting for censorship time. We demonstrate accurate, stable,
and interpretable results on synthetic data that outperform several benchmarks.
To showcase the utility of our framework on real-world problems, we study how
left-censorship can adversely affect the task of disease phenotyping, resulting
in the often incorrect assumption that longitudinal patient data are aligned by
disease stage. In reality, patients at the time of diagnosis are at different
stages of the disease, both late and early, because they seek medical care at
different times, and this discrepancy can confound unsupervised
learning algorithms. On two clinical datasets, our model corrects for this form
of censorship and recovers known clinical subtypes.
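The misalignment described in the abstract can be illustrated with a small toy sketch. This is not the paper's deep generative model: it assumes a known sigmoid biomarker curve and recovers each patient's delay by grid search, and the names trajectory, observe, and best_shift are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def trajectory(t, rate=1.0, midpoint=5.0):
    """Hypothetical sigmoid biomarker curve over disease time t (an assumption)."""
    return 1.0 / (1.0 + np.exp(-rate * (t - midpoint)))

def observe(delay, n_obs=10, step=0.5):
    """Observations of a patient first seen `delay` time units after onset."""
    obs_time = np.arange(n_obs) * step  # time since the first visit
    return obs_time, trajectory(obs_time + delay)

def best_shift(obs_time, values, candidate_delays):
    """Grid-search the delay that best explains the observations under the assumed curve."""
    errors = [np.mean((trajectory(obs_time + d) - values) ** 2)
              for d in candidate_delays]
    return candidate_delays[int(np.argmin(errors))]

# Two patients of the same subtype: one seen early, one seen late.
t_a, y_a = observe(delay=1.0)
t_b, y_b = observe(delay=6.0)

# Indexed by time since first visit, the two series look very different.
naive_gap = np.mean((y_a - y_b) ** 2)

# Searching over candidate delays recovers each patient's offset,
# after which both series lie on the same underlying curve.
grid = np.arange(0.0, 10.5, 0.5)
d_a = best_shift(t_a, y_a, grid)
d_b = best_shift(t_b, y_b, grid)
print(naive_gap, d_a, d_b)
```

In this toy setting the curve is known in advance, so alignment reduces to a one-dimensional search; the paper instead learns the trajectories, the cluster assignments, and the censorship times jointly.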
Related papers
- Contrastive Learning of Temporal Distinctiveness for Survival Analysis in Electronic Health Records [10.192973297290136]
We propose a novel Ontology-aware Temporality-based Contrastive Survival (OTCSurv) analysis framework.
OTCSurv uses survival durations from both censored and observed data to define temporal distinctiveness.
We conduct experiments using a large EHR dataset to forecast the risk of hospitalized patients who are in danger of developing acute kidney injury (AKI).
arXiv Detail & Related papers (2023-08-24T22:36:22Z)
- T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression [82.85825388788567]
We develop a novel temporal clustering method, T-Phenotype, to discover phenotypes of predictive temporal patterns from labeled time-series data.
We show that T-Phenotype achieves the best phenotype discovery performance over all the evaluated baselines.
arXiv Detail & Related papers (2023-02-24T13:30:35Z)
- LifeLonger: A Benchmark for Continual Disease Classification [59.13735398630546]
We introduce LifeLonger, a benchmark for continual disease classification on the MedMNIST collection.
Task and class incremental learning of diseases address the issue of classifying new samples without re-training the models from scratch.
Cross-domain incremental learning addresses the issue of dealing with datasets originating from different institutions while retaining the previously obtained knowledge.
arXiv Detail & Related papers (2022-04-12T12:25:05Z)
- Temporal Clustering with External Memory Network for Disease Progression Modeling [8.015263440307631]
Disease progression modeling (DPM) uses mathematical frameworks to quantitatively measure how severely a given disease progresses.
DPM is useful for tasks such as predicting health state, categorizing disease stages, and assessing patients' disease trajectories.
arXiv Detail & Related papers (2021-09-29T02:32:06Z)
- Phenotyping Clusters of Patient Trajectories suffering from Chronic Complex Disease [3.1564542805009332]
We evaluate three different clustering models on a large hospital dataset of vital-sign observations from patients suffering from COPD.
We propose novel modifications to deal with unevenly sampled time-series data and unbalanced class distribution to improve phenotype separation.
arXiv Detail & Related papers (2020-11-17T01:18:33Z)
- Trajectories, bifurcations and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values.
The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)
- Temporal Phenotyping using Deep Predictive Clustering of Disease Progression [97.88605060346455]
We develop a deep learning approach for clustering time-series data, where each cluster comprises patients who share similar future outcomes of interest.
Experiments on two real-world datasets show that our model achieves superior clustering performance over state-of-the-art benchmarks.
arXiv Detail & Related papers (2020-06-15T20:48:43Z)
- Deep Mining External Imperfect Data for Chest X-ray Disease Screening [57.40329813850719]
We argue that incorporating an external CXR dataset leads to imperfect training data, which raises challenges.
We formulate the multi-label disease classification problem as weighted independent binary tasks according to the categories.
Our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability.
arXiv Detail & Related papers (2020-06-06T06:48:40Z)
- Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes [102.02672176520382]
Comorbid diseases co-occur and progress via complex temporal patterns that vary among individuals.
In electronic health records we can observe the different diseases a patient has, but can only infer the temporal relationship between each co-morbid condition.
We develop deep diffusion processes to model "dynamic comorbidity networks".
arXiv Detail & Related papers (2020-01-08T15:47:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.