Risk factor identification for incident heart failure using neural
network distillation and variable selection
- URL: http://arxiv.org/abs/2102.12936v2
- Date: Mon, 1 Mar 2021 10:09:33 GMT
- Authors: Yikuan Li, Shishir Rao, Mohammad Mamouei, Gholamreza Salimi-Khorshidi,
Dexter Canoy, Abdelaali Hassaine, Thomas Lukasiewicz, Kazem Rahimi
- Abstract summary: We propose two methods to untangle hidden patterns learned by an established deep learning model for risk association identification.
A cohort with 788,880 (8.3% incident heart failure) patients was considered for the study.
Model distillation identified 598 and 379 diseases that were associated and dissociated with heart failure at the population level, respectively.
In addition to these important population-level insights, we developed an approach to individual-level interpretation to take account of varying manifestation of heart failure in clinical practice.
- Score: 24.366241122862473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent evidence shows that deep learning models trained on electronic health
records from millions of patients can deliver substantially more accurate
predictions of risk compared to their statistical counterparts. While this
provides an important opportunity for improving clinical decision-making, the
lack of interpretability is a major barrier to the incorporation of these
black-box models in routine care, limiting their trustworthiness and preventing
further hypothesis-testing investigations. In this study, we propose two
methods, namely, model distillation and variable selection, to untangle hidden
patterns learned by an established deep learning model (BEHRT) for risk
association identification. Due to the clinical importance and diversity of
heart failure as a phenotype, it was used to showcase the merits of the
proposed methods. A cohort with 788,880 (8.3% incident heart failure) patients
was considered for the study. Model distillation identified 598 and 379
diseases that were associated and dissociated with heart failure at the
population level, respectively. While the associations were broadly consistent
with prior knowledge, our method also highlighted several less appreciated
links that are worth further investigation. In addition to these important
population-level insights, we developed an approach to individual-level
interpretation to take account of varying manifestation of heart failure in
clinical practice. This was achieved through variable selection by detecting a
minimal set of encounters that can maximally preserve the accuracy of
prediction for individuals. Our proposed work provides a discovery-enabling
tool to identify risk factors at both the population and individual levels from a
data-driven perspective. This helps to generate new hypotheses and guides
further investigations on causal links.
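As a rough illustration of the model distillation idea mentioned in the abstract, the sketch below fits a simple linear "student" to the outputs of a black-box "teacher" and reads association or dissociation from the sign of each student coefficient. It is not the authors' implementation: `teacher_predict` is a hypothetical stand-in for a trained risk model (it is not BEHRT), the data are synthetic binary flags, and the 0.1 threshold is an arbitrary assumption.

```python
import numpy as np


def teacher_predict(x):
    """Hypothetical stand-in for a black-box risk model (not BEHRT)."""
    # Hidden rule: feature 0 raises risk, feature 1 lowers it, feature 2 is ignored.
    logits = 2.0 * x[:, 0] - 1.5 * x[:, 1] - 1.0
    return 1.0 / (1.0 + np.exp(-logits))


def distill_linear(x, teacher_probs, eps=1e-6):
    """Fit a linear student to the teacher's log-odds by least squares."""
    p = np.clip(teacher_probs, eps, 1.0 - eps)
    target = np.log(p / (1.0 - p))  # teacher log-odds used as soft labels
    design = np.hstack([x, np.ones((x.shape[0], 1))])  # append intercept column
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[:-1], coef[-1]


rng = np.random.default_rng(0)
# Synthetic binary "diagnosis present/absent" flags for 500 made-up patients.
x = rng.integers(0, 2, size=(500, 3)).astype(float)

weights, intercept = distill_linear(x, teacher_predict(x))

# Assumed threshold of 0.1 on the student weights; purely illustrative.
associated = [i for i, w in enumerate(weights) if w > 0.1]
dissociated = [i for i, w in enumerate(weights) if w < -0.1]
print("associated features:", associated)
print("dissociated features:", dissociated)
```

The student here is deliberately linear so that its coefficients can be inspected directly; a real teacher would not be exactly linear in log-odds, so the student would only approximate it.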
Related papers
- Deep State-Space Generative Model For Correlated Time-to-Event Predictions [54.3637600983898]
We propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events.
Our method also uncovers meaningful insights about the latent correlations among mortality and different types of organ failures.
(2024-07-28T02:42:36Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
(2024-03-02T00:56:05Z)
- Towards a Transportable Causal Network Model Based on Observational Healthcare Data [1.333879175460266]
We propose a novel approach that combines selection diagrams, missingness graphs, causal discovery and prior knowledge into a single graphical model.
We learn this model from data comprising two different cohorts of patients.
The resulting causal network model is validated by expert clinicians in terms of risk assessment, accuracy and explainability.
(2023-11-13T13:23:31Z)
- Interpretable Survival Analysis for Heart Failure Risk Prediction [50.64739292687567]
We propose a novel survival analysis pipeline that is both interpretable and competitive with state-of-the-art survival models.
Our pipeline achieves state-of-the-art performance and provides interesting and novel insights about risk factors for heart failure.
(2023-10-24T02:56:05Z)
- CARNA: Characterizing Advanced heart failure Risk and hemodyNAmic phenotypes using learned multi-valued decision diagrams [6.599394944440605]
CARNA is a hemodynamic risk stratification and phenotyping framework for advanced heart failure.
It takes advantage of the explainability and expressivity of machine learned Multi-Valued Decision Diagrams (MVDDs).
It incorporates invasive hemodynamics and can make predictions on missing data.
(2023-06-11T22:56:59Z)
- Statistical and Computational Phase Transitions in Group Testing [73.55361918807883]
We study the group testing problem where the goal is to identify a set of k infected individuals carrying a rare disease.
We consider two different simple random procedures for assigning individuals tests.
(2022-06-15T16:38:50Z)
- A k-mer Based Approach for SARS-CoV-2 Variant Identification [55.78588835407174]
We show that preserving the order of the amino acids helps the underlying classifiers to achieve better performance.
We also show the importance of the different amino acids which play a key role in identifying variants and how they coincide with those reported by the USA's Centers for Disease Control and Prevention (CDC).
(2021-08-07T15:08:15Z)
- Development of an accessible 10-year Digital CArdioVAscular (DiCAVA) risk assessment: a UK Biobank study [0.46180371154032895]
The aim was to develop a new risk model (DiCAVA) using statistical and machine learning techniques.
A secondary goal was to identify new patient-centric variables that could be incorporated into CVD risk assessments.
(2021-04-20T16:01:50Z)
- An explainable Transformer-based deep learning model for the prediction of incident heart failure [22.513476932615845]
We developed a novel Transformer deep-learning model for prediction of incident heart failure involving 100,071 patients.
The model achieved 0.93 and 0.93 area under the receiver operator curve and 0.69 and 0.70 area under the precision-recall curve.
The importance of contextualised medical information was revealed in sensitivity analyses.
(2021-01-27T12:45:15Z)
- Deep Learning Applied to Chest X-Rays: Exploiting and Preventing Shortcuts [11.511323714777298]
This paper studies the case of spurious class skew in which patients with a particular attribute are spuriously more likely to have the outcome of interest.
We show that deep nets can accurately identify many patient attributes including sex (AUROC = 0.96) and age (AUROC >= 0.90) when learning to predict a diagnosis.
A simple transfer learning approach is surprisingly effective at preventing the shortcut and promoting good performance.
(2020-09-21T18:52:43Z)
- A General Framework for Survival Analysis and Multi-State Modelling [70.31153478610229]
We use neural ordinary differential equations as a flexible and general method for estimating multi-state survival models.
We show that our model exhibits state-of-the-art performance on popular survival data sets and demonstrate its efficacy in a multi-state setting.
(2020-06-08T19:24:54Z)