Targeted-BEHRT: Deep learning for observational causal inference on
longitudinal electronic health records
- URL: http://arxiv.org/abs/2202.03487v1
- Date: Mon, 7 Feb 2022 20:05:05 GMT
- Title: Targeted-BEHRT: Deep learning for observational causal inference on
longitudinal electronic health records
- Authors: Shishir Rao, Mohammad Mamouei, Gholamreza Salimi-Khorshidi, Yikuan Li,
Rema Ramakrishnan, Abdelaali Hassaine, Dexter Canoy, Kazem Rahimi
- Abstract summary: We investigate causal modelling of an RCT-established null causal association: the effect of antihypertensive use on incident cancer risk.
We develop a dataset for our observational study and a Transformer-based model, Targeted BEHRT, coupled with doubly robust estimation.
We find that our model provides more accurate estimates of the risk ratio (RR) than benchmark methods on high-dimensional EHR.
- Score: 1.3192560874022086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Observational causal inference is useful for decision making in medicine when
randomized clinical trials (RCT) are infeasible or non-generalizable. However,
traditional approaches fail to deliver unconfounded causal conclusions in
practice. The rise of "doubly robust" non-parametric tools, coupled with the
growth of deep learning for capturing rich representations of multimodal data,
offers a unique opportunity to develop and test such models for causal
inference on comprehensive electronic health records (EHR). In this paper, we
investigate causal modelling of an RCT-established null causal association: the
effect of antihypertensive use on incident cancer risk. We develop a dataset
for our observational study and a Transformer-based model, Targeted BEHRT.
Coupling the model with doubly robust estimation, we estimate the average risk ratio (RR). We
compare our model to benchmark statistical and deep learning models for causal
inference in multiple experiments on semi-synthetic derivations of our dataset
with various types and intensities of confounding. To further test the
reliability of our approach, we evaluate our model in situations of limited data.
We find that our model provides more accurate estimates of RR (lowest sum of
absolute errors from ground truth) than benchmarks for risk ratio
estimation on high-dimensional EHR across experiments. Finally, we apply our
model to investigate the original case study, antihypertensives' effect on
cancer, and demonstrate that our model generally captures the validated null
association.
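
The abstract combines three ideas: doubly robust estimation, a risk ratio estimand, and scoring by sum of absolute errors against a known ground truth. As a rough illustration only, the sketch below uses augmented inverse probability weighting (AIPW), a standard doubly robust estimator, on simulated data with a true RR of 1. It is not the Targeted-BEHRT model or the paper's estimator: plain logistic regressions stand in for the Transformer, and the simulation and all function names (`simulate_null`, `aipw_risk_ratio`, `sum_abs_error`, and so on) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Logistic regression with intercept, fitted by Newton's method."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ beta)
        grad = Xb.T @ (y - p) - ridge * beta
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + ridge * np.eye(Xb.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict(beta, X):
    return sigmoid(np.column_stack([np.ones(len(X)), X]) @ beta)

def naive_risk_ratio(t, y):
    """Unadjusted RR: risk among treated divided by risk among untreated."""
    return y[t == 1].mean() / y[t == 0].mean()

def aipw_risk_ratio(X, t, y, clip=0.01):
    """AIPW estimate of the marginal RR, E[Y(1)] / E[Y(0)].

    Consistent if either the propensity model or the outcome models
    are correctly specified (the "doubly robust" property).
    """
    # Propensity score e(x) = P(T=1 | x), clipped to avoid extreme weights.
    e = np.clip(predict(fit_logistic(X, t), X), clip, 1.0 - clip)
    # Outcome models fitted separately in each arm, predicted for everyone.
    mu1 = predict(fit_logistic(X[t == 1], y[t == 1]), X)
    mu0 = predict(fit_logistic(X[t == 0], y[t == 0]), X)
    # Model prediction plus an inverse-probability-weighted residual correction.
    risk1 = np.mean(mu1 + t * (y - mu1) / e)
    risk0 = np.mean(mu0 + (1 - t) * (y - mu0) / (1.0 - e))
    return risk1 / risk0

def simulate_null(n=20000, seed=0):
    """Hypothetical confounded data where treatment has no effect (true RR = 1)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    t = rng.binomial(1, sigmoid(0.8 * X[:, 0] - 0.5 * X[:, 1]))
    # The outcome depends on covariates only, never on t.
    y = rng.binomial(1, sigmoid(-2.0 + 1.0 * X[:, 0] + 0.5 * X[:, 2]))
    return X, t, y

def sum_abs_error(estimates, truth):
    """Sum of absolute errors of RR estimates against the ground-truth RR."""
    return float(np.sum(np.abs(np.asarray(estimates) - truth)))

if __name__ == "__main__":
    naive, aipw = [], []
    for seed in range(3):
        X, t, y = simulate_null(seed=seed)
        naive.append(naive_risk_ratio(t, y))
        aipw.append(aipw_risk_ratio(X, t, y))
    print("naive SAE:", sum_abs_error(naive, 1.0))
    print("AIPW  SAE:", sum_abs_error(aipw, 1.0))
```

Because the first covariate drives both treatment and outcome in this simulation, the unadjusted ratio drifts away from 1, while the adjusted estimate should stay close to the null.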
Related papers
- Deep State-Space Generative Model For Correlated Time-to-Event Predictions [54.3637600983898]
We propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events.
Our method also uncovers meaningful insights about the latent correlations among mortality and different types of organ failures.
arXiv Detail & Related papers (2024-07-28T02:42:36Z)
- Towards a Transportable Causal Network Model Based on Observational
Healthcare Data [1.333879175460266]
We propose a novel approach that combines selection diagrams, missingness graphs, causal discovery and prior knowledge into a single graphical model.
We learn this model from data comprising two different cohorts of patients.
The resulting causal network model is validated by expert clinicians in terms of risk assessment, accuracy and explainability.
arXiv Detail & Related papers (2023-11-13T13:23:31Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data
Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Deep Stable Representation Learning on Electronic Health Records [8.256340233221112]
Causal Healthcare Embedding (CHE) aims at eliminating the spurious statistical relationship by removing the dependencies between diagnoses and procedures.
Our proposed CHE method can be used as a flexible plug-and-play module that can enhance existing deep learning models on EHR.
arXiv Detail & Related papers (2022-09-03T04:10:45Z)
- SurvLatent ODE: A Neural ODE based time-to-event model with competing
risks for longitudinal data improves cancer-associated Deep Vein Thrombosis
(DVT) prediction [68.8204255655161]
We propose a generative time-to-event model, SurvLatent ODE, which parameterizes a latent representation under irregularly sampled data.
Our model then utilizes the latent representation to flexibly estimate survival times for multiple competing events without specifying shapes of event-specific hazard function.
SurvLatent ODE outperforms the current clinical standard Khorana Risk scores for stratifying DVT risk groups.
arXiv Detail & Related papers (2022-04-20T17:28:08Z)
- Statistical quantification of confounding bias in predictive modelling [0.0]
I propose the partial and full confounder tests, which probe the null hypotheses of unconfounded and fully confounded models.
The tests provide a strict control for Type I errors and high statistical power, even for non-normally and non-linearly dependent predictions.
arXiv Detail & Related papers (2021-11-01T10:35:24Z)
- Harmonization with Flow-based Causal Inference [12.739380441313022]
This paper presents a normalizing-flow-based method to perform counterfactual inference upon a structural causal model (SCM) to harmonize medical data.
We evaluate on multiple, large, real-world medical datasets to observe that this method leads to better cross-domain generalization compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2021-06-12T19:57:35Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection, up to 0.609 in PR-AUC for NASH detection, and outperforms the best state-of-the-art baseline by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- A General Framework for Survival Analysis and Multi-State Modelling [70.31153478610229]
We use neural ordinary differential equations as a flexible and general method for estimating multi-state survival models.
We show that our model exhibits state-of-the-art performance on popular survival data sets and demonstrate its efficacy in a multi-state setting.
arXiv Detail & Related papers (2020-06-08T19:24:54Z)