Looking for Out-of-Distribution Environments in Critical Care: A case
study with the eICU Database
- URL: http://arxiv.org/abs/2205.13398v1
- Date: Thu, 26 May 2022 14:46:13 GMT
- Authors: Dimitris Spathis, Stephanie L. Hyland
- Abstract summary: Generalizing to new populations and domains in machine learning is still an open problem which has seen increased interest recently.
Recently proposed models for domain generalisation promise to alleviate this problem by learning invariant characteristics across environments.
We take a principled approach to identifying Out of Distribution environments, motivated by the problem of cross-hospital performance generalization in critical care.
- Score: 4.915029686150194
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Generalizing to new populations and domains in machine learning is still an
open problem which has seen increased interest recently. In particular,
clinical models show a significant performance drop when tested in settings not
seen during training, e.g., new hospitals or population demographics. Recently
proposed models for domain generalisation promise to alleviate this problem by
learning invariant characteristics across environments; however, there is still
scepticism about whether they improve over traditional training. In this work,
we take a principled approach to identifying Out of Distribution (OoD)
environments, motivated by the problem of cross-hospital generalization in
critical care. We propose model-based and heuristic approaches to identify OoD
environments and systematically compare models with different levels of
held-out information. In particular, based on the assumption that models with
access to OoD data should outperform other models, we train models across a
range of experimental setups that include leave-one-hospital-out training and
cross-sectional feature splits. We find that access to OoD data does not
translate to increased performance, pointing to inherent limitations in
defining potential OoD environments in the eICU Database, possibly due to
data harmonisation and sampling. Echoing similar results on other popular
clinical benchmarks in the literature, our findings suggest that new approaches
are required to evaluate robust models in critical care.
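The leave-one-hospital-out setup mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `leave_one_hospital_out` and the toy hospital labels are hypothetical, and the sketch only shows how each hospital in turn becomes the held-out test environment while all other hospitals form the training set.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_one_hospital_out(
    hospital_ids: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_hospital, train_indices, test_indices) per hospital.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Group sample indices by the hospital they come from.
    by_hospital: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, hospital in enumerate(hospital_ids):
        by_hospital[hospital].append(idx)

    # Each hospital is held out once; all remaining hospitals are used for training.
    for held_out, test_idx in by_hospital.items():
        train_idx = sorted(
            i
            for hospital, indices in by_hospital.items()
            if hospital != held_out
            for i in indices
        )
        yield held_out, train_idx, list(test_idx)


# Toy example (assumed data): one hospital label per ICU stay.
hospital_ids = ["A", "A", "B", "C", "B", "C", "C"]
splits = list(leave_one_hospital_out(hospital_ids))
for held_out, train_idx, test_idx in splits:
    print(f"held out {held_out}: train={train_idx} test={test_idx}")
```

In a real experiment, a model would be fitted on the training indices of each split and evaluated on the held-out hospital, so that per-hospital performance can be compared across setups.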
Related papers
- Deep State-Space Generative Model For Correlated Time-to-Event Predictions [54.3637600983898]
We propose a deep latent state-space generative model to capture the interactions among different types of correlated clinical events.
Our method also uncovers meaningful insights about the latent correlations among mortality and different types of organ failures.
Date: 2024-07-28T02:42:36Z
- Addressing Data Heterogeneity in Federated Learning of Cox Proportional Hazards Models [8.798959872821962]
This paper outlines an approach in the domain of federated survival analysis, specifically the Cox Proportional Hazards (CoxPH) model.
We present an FL approach that employs feature-based clustering to enhance model accuracy across synthetic datasets and real-world applications.
Date: 2024-07-20T18:34:20Z
- Meta-Learners for Partially-Identified Treatment Effects Across Multiple Environments [67.80453452949303]
Estimating the conditional average treatment effect (CATE) from observational data is relevant for many applications such as personalized medicine.
Here, we focus on the widespread setting where the observational data come from multiple environments.
We propose different model-agnostic learners (so-called meta-learners) to estimate the bounds that can be used in combination with arbitrary machine learning models.
Date: 2024-06-04T16:31:43Z
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
Date: 2023-10-04T01:36:30Z
- On the Importance of Clinical Notes in Multi-modal Learning for EHR Data [0.0]
Previous research has shown that jointly using clinical notes with electronic health record data improved predictive performance for patient monitoring.
We first confirm that performance significantly improves over state-of-the-art EHR data models when combining EHR data and clinical notes.
We then provide an analysis showing improvements arise almost exclusively from a subset of notes containing broader context on patient state rather than clinician notes.
Date: 2022-12-06T15:18:57Z
- Deep Stable Representation Learning on Electronic Health Records [8.256340233221112]
Causal Healthcare Embedding (CHE) aims at eliminating the spurious statistical relationship by removing the dependencies between diagnoses and procedures.
Our proposed CHE method can be used as a flexible plug-and-play module that can enhance existing deep learning models on EHR.
Date: 2022-09-03T04:10:45Z
- A comparison of approaches to improve worst-case predictive model performance over patient subpopulations [14.175321968797252]
Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations.
We identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations.
We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures.
Date: 2021-08-27T13:10:00Z
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
Date: 2021-04-07T06:02:04Z
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and the generality on different tasks.
Date: 2021-01-13T03:20:20Z
- Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making addressed at reducing the incidence rate of infections.
Date: 2020-05-07T16:13:12Z