From Single-Hospital to Multi-Centre Applications: Enhancing the
Generalisability of Deep Learning Models for Adverse Event Prediction in the
ICU
- URL: http://arxiv.org/abs/2303.15354v2
- Date: Fri, 7 Apr 2023 18:03:56 GMT
- Title: From Single-Hospital to Multi-Centre Applications: Enhancing the
Generalisability of Deep Learning Models for Adverse Event Prediction in the
ICU
- Authors: Patrick Rockenschaub, Adam Hilbert, Tabea Kossen, Falk von Dincklage,
Vince Istvan Madai, Dietmar Frey
- Abstract summary: Deep learning (DL) can aid doctors in detecting worsening patient states early, affording them time to react and prevent bad outcomes.
While DL-based early warning models usually work well in the hospitals they were trained for, they tend to be less reliable when applied at new hospitals.
We systematically assessed the reliability of DL models for three common adverse events: death, acute kidney injury (AKI), and sepsis.
We found that models achieved high AUROC for mortality (0.838-0.869), AKI (0.823-0.866), and sepsis (0.749-0.824) at the training hospital.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning (DL) can aid doctors in detecting worsening patient states
early, affording them time to react and prevent bad outcomes. While DL-based
early warning models usually work well in the hospitals they were trained for,
they tend to be less reliable when applied at new hospitals. This makes it
difficult to deploy them at scale. Using carefully harmonised intensive care
data from four data sources across Europe and the US (totalling 334,812 stays),
we systematically assessed the reliability of DL models for three common
adverse events: death, acute kidney injury (AKI), and sepsis. We tested whether
using more than one data source and/or explicitly optimising for
generalisability during training improves model performance at new hospitals.
We found that models achieved high AUROC for mortality (0.838-0.869), AKI
(0.823-0.866), and sepsis (0.749-0.824) at the training hospital. As expected,
performance dropped at new hospitals, sometimes by as much as 0.200 in AUROC. Using
more than one data source for training mitigated the performance drop, with
multi-source models performing roughly on par with the best single-source
model. This suggests that as data from more hospitals become available for
training, model robustness is likely to increase, with the performance of the
most applicable data source in the training data acting as a lower bound.
Dedicated methods promoting generalisability did not noticeably improve
performance in our experiments.
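The comparison described in the abstract (AUROC at the training hospital versus AUROC at a new hospital) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `auroc` and `auroc_change` and all labels and scores below are hypothetical toy values chosen for illustration. AUROC is computed here from its pairwise definition, the probability that a randomly chosen positive stay is scored higher than a randomly chosen negative stay.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auroc_change(internal: float, external: float) -> float:
    """Signed AUROC change when moving from the training hospital to a new one."""
    return external - internal


# Hypothetical toy predictions (assumed values, not taken from the paper).
internal_labels = [0, 0, 0, 1, 1, 1]
internal_scores = [0.1, 0.2, 0.6, 0.5, 0.8, 0.9]
external_labels = [0, 0, 0, 1, 1, 1]
external_scores = [0.2, 0.6, 0.7, 0.5, 0.4, 0.9]

internal_auc = auroc(internal_labels, internal_scores)
external_auc = auroc(external_labels, external_scores)
change = auroc_change(internal_auc, external_auc)

print(f"AUROC at training hospital: {internal_auc:.3f}")
print(f"AUROC at new hospital:      {external_auc:.3f}")
print(f"Change in AUROC:            {change:+.3f}")
```

A negative change indicates that the model discriminates worse at the new hospital than at the hospital it was trained on. The pairwise loop is quadratic in the number of stays, so a rank-based implementation would be preferable for datasets of realistic size.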
Related papers
- Comparing Federated Stochastic Gradient Descent and Federated Averaging for Predicting Hospital Length of Stay [0.0]
Predicting hospital length of stay (LOS) reliably is an essential need for efficient resource allocation at hospitals.
Traditional predictive modeling tools frequently have difficulty acquiring sufficient and diverse data because healthcare institutions have privacy rules in place.
Federated learning facilitates collaborative model training across decentralized data sources from different hospitals without moving sensitive data outside the hospitals.
(2024-07-17T17:00:20Z)
- Federated learning model for predicting major postoperative complications [2.565552377354702]
We developed federated learning models to predict nine major postoperative complications.
We compared federated learning models with local learning models trained on a single site and central learning models trained on a pooled dataset from two centers.
Our federated learning model obtained comparable performance to the best local learning model at each center, demonstrating strong generalizability.
(2024-04-09T22:31:10Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
(2023-03-20T14:12:55Z)
- Density-Aware Personalized Training for Risk Prediction in Imbalanced Medical Data [89.79617468457393]
Training models on data with an imbalanced event rate (class density discrepancy) may lead to suboptimal prediction.
We propose a framework for training models that addresses this imbalance issue.
We demonstrate our model's improved performance in real-world medical datasets.
(2022-07-23T00:39:53Z)
- BERT WEAVER: Using WEight AVERaging to enable lifelong learning for transformer-based models in biomedical semantic search engines [49.75878234192369]
We present WEAVER, a simple, yet efficient post-processing method that infuses old knowledge into the new model.
We show that applying WEAVER in a sequential manner results in word embedding distributions similar to those obtained by combined training on all data at once.
(2022-02-21T10:34:41Z)
- Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data [57.19441629270029]
In this paper, we take advantage of the inherent properties of neural networks to federate the process of training of survival analysis models.
In the realistic setting of small medical datasets and only a few data centers, the noise added for differential privacy makes it harder for the models to converge.
We propose DPFed-post which adds a post-processing stage to the private federated learning scheme.
(2022-02-08T10:03:24Z)
- A comparison of approaches to improve worst-case predictive model performance over patient subpopulations [14.175321968797252]
Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations.
We identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations.
We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures.
(2021-08-27T13:10:00Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best of various state-of-the-art baselines by up to 19%.
(2020-10-22T02:28:11Z)
- A Machine Learning Early Warning System: Multicenter Validation in Brazilian Hospitals [4.659599449441919]
Early recognition of clinical deterioration is one of the main steps for reducing inpatient morbidity and mortality.
Since hospital wards receive less attention than the Intensive Care Unit (ICU), we hypothesized that connecting a platform to a stream of EHR data would drastically improve awareness of dangerous situations.
With machine learning, the system can consider each patient's full history, and high-performing predictive models enable an intelligent early warning system.
(2020-06-09T21:21:38Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
(2020-05-03T02:36:00Z)
- An Adversarial Approach for the Robust Classification of Pneumonia from Chest Radiographs [9.462808515258464]
Deep learning models often exhibit performance loss due to dataset shift.
Models trained using data from one hospital system achieve high predictive performance when tested on data from the same hospital, but perform significantly worse when tested in different hospital systems.
We propose an approach based on adversarial optimization, which allows us to learn more robust models that do not depend on confounders.
(2020-01-13T03:49:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.